Unicode

0000 -- 1FFF   8K  Alphabets
 0000--00FF Latin-1
 0100--02FF Latin extensions, IPA extensions, Spacing Modifier Letters
 0300--03FF Combining Diacritical Marks, Greek (based on ISO 8859-7)
 0400--04FF Cyrillic (based on ISO 8859-5)
 0500--05FF Armenian, Hebrew (based on ISO 8859-8)
 0600--06FF Arabic (based on ISO 8859-6)

 0900--09FF Devanagari, Bengali (based on ISCII 1988)
 0A00--0AFF Gurmukhi, Gujarati (based on ISCII 1988) 
 0B00--0BFF Oriya, Tamil (based on ISCII 1988) 
 0C00--0CFF Telugu, Kannada (based on ISCII 1988) 
 0D00--0DFF Malayalam, Sinhala (based on ISCII 1988) 
 0E00--0EFF Thai, Lao
 0F00--0FB9 Tibetan
 10A0--10FB Georgian
 1100--11FF Hangul Jamo
 1E00--1EFF Latin extended
 1F00--1FFF Greek extended

2000 -- 2FFF   4K  Symbols and Punctuation
 2000--20FF General Punctuation, Superscripts and Subscripts, Currency Symbols, Combining Diacritical Marks for Symbols
 2100--21FF Letterlike Symbols, Number Forms, Arrows
 2200--22FF Mathematical operators
 2300--23FF Miscellaneous Technical
 2400--24FF Control Pictures, Optical Character Recognition, Enclosed Alphanumerics 
 2500--25FF Box Drawing, Block Elements, Geometric Shapes
 2600--26FF Miscellaneous Symbols
 2700--27BF Dingbats

3000 -- 33FF   1K
 3000--30FF CJK Symbols and Punctuation, Hiragana, Katakana
 3100--31FF Bopomofo, Hangul Compatibility Jamo, Kanbun
 3200--32FF Enclosed CJK Letters and Months
 3300--33FF CJK Compatibility

4E00 -- 9FFF  20.5K  CJK Unified Ideographs

AC00 -- D7FF  11K  Hangul Syllables

D800 -- DFFF   2K  Surrogates

E000 -- FFFF   8K  Miscellaneous
 E000--F8FF  Private Use
 F900--FAFF  CJK Compatibility Ideographs
 FB00--FDFF  Alphabetic Presentation Forms, Arabic Presentation Forms-A
 FE00--FEFF  Combining Half Marks, CJK Compatibility Forms, Small Form Variants, Arabic Presentation Forms-B
 FF00--FFFF  Halfwidth and Fullwidth Forms, Specials
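
The top-level rows of the table can be checked mechanically. The following is a minimal Python sketch, not part of the original page, that copies the seven top-level ranges, recomputes each size in units of 1K (1024 code points), and looks up which range a given code point falls in. The names AREAS, size_in_k, and area_of are hypothetical, and the label used for 3000--33FF is an assumption because the table gives that row no name.

```python
# Top-level ranges copied from the allocation table: (first, last, label).
AREAS = [
    (0x0000, 0x1FFF, "Alphabets"),
    (0x2000, 0x2FFF, "Symbols and Punctuation"),
    # The table gives no label for this row; the name below is an assumption.
    (0x3000, 0x33FF, "CJK symbols and phonetics"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0xAC00, 0xD7FF, "Hangul Syllables"),
    (0xD800, 0xDFFF, "Surrogates"),
    (0xE000, 0xFFFF, "Miscellaneous"),
]


def size_in_k(first, last):
    """Number of code points in an inclusive range, in units of 1024."""
    return (last - first + 1) / 1024


def area_of(code_point):
    """Return the table label covering code_point, or None for a gap."""
    for first, last, label in AREAS:
        if first <= code_point <= last:
            return label
    return None


if __name__ == "__main__":
    # Reprint the top-level rows with recomputed sizes.
    for first, last, label in AREAS:
        print(f"{first:04X} -- {last:04X}  {size_in_k(first, last):>5}K  {label}")
    # Classify a Latin letter, an arrow, a CJK ideograph, and a Hangul syllable.
    for ch in "A\u2192\u4e2d\ud55c":
        print(f"U+{ord(ch):04X}  {area_of(ord(ch))}")
```

Code points in ranges the table does not list, such as 3400--4DFF or A000--ABFF, make area_of return None rather than guess at an allocation.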

Ryan Stansifer <ryan@cs.fit.edu>
Last modified: Wed Sep 10 12:31:40 EDT 2003