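1. The Unicode Standard
1.1. Character sets such as ASCII, the ISO character sets, and GBK each have a limited capacity and are not compatible with one another in multilingual environments, so the Unicode Consortium developed the Unicode Standard.
1.2. The Unicode Standard aims to cover the characters, punctuation marks, and symbols of all the world's writing systems.
1.3. Unicode allows text data to be processed, stored, and exchanged regardless of platform, program, or language.
1.4. The Unicode Standard has been widely adopted: it is implemented in XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, and WML, and it is supported by many operating systems and by all modern browsers.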
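To make the idea of platform-independent text concrete, here is a minimal Python sketch. It assumes only the standard library; the helper names `describe` and `round_trip` are made up for illustration. Each character has one code point, and the encodings UTF-8 and UTF-16 are just different byte serializations of that same code point.

```python
def describe(text):
    """Return (char, code point label, UTF-8 hex, UTF-16BE hex) for each character."""
    rows = []
    for ch in text:
        rows.append((
            ch,
            f"U+{ord(ch):04X}",            # the Unicode code point
            ch.encode("utf-8").hex(),      # bytes when stored as UTF-8
            ch.encode("utf-16-be").hex(),  # bytes when stored as big-endian UTF-16
        ))
    return rows


def round_trip(text, encoding):
    """Encode text to bytes and decode it again; the result should be unchanged."""
    return text.encode(encoding).decode(encoding)


if __name__ == "__main__":
    for row in describe("A中😀"):
        print(row)
    print(round_trip("A中😀", "utf-8") == "A中😀")
```

Characters above U+FFFF, such as the emoji in the sample string, take four bytes in both UTF-8 and UTF-16; in UTF-16 those four bytes form a surrogate pair.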
2. Unicode characters
2.1. Website: https://home.unicode.org/
2.2. Character blocks in version 13 of the Unicode Standard
No. | Block name | Range |
1 | C0 Controls and Basic Latin | 0000–007F |
2 | C1 Controls and Latin-1 Supplement | 0080–00FF |
3 | Latin Extended-A | 0100–017F |
4 | Latin Extended-B | 0180–024F |
5 | IPA Extensions | 0250–02AF |
6 | Spacing Modifier Letters | 02B0–02FF |
7 | Combining Diacritical Marks | 0300–036F |
8 | Greek and Coptic | 0370–03FF |
9 | Cyrillic | 0400–04FF |
10 | Cyrillic Supplement | 0500–052F |
11 | Armenian | 0530–058F |
12 | Hebrew | 0590–05FF |
13 | Arabic | 0600–06FF |
14 | Syriac | 0700–074F |
15 | Arabic Supplement | 0750–077F |
16 | Thaana | 0780–07BF |
17 | NKo | 07C0–07FF |
18 | Samaritan | 0800–083F |
19 | Mandaic | 0840–085F |
20 | Syriac Supplement | 0860–086F |
21 | Arabic Extended-A | 08A0–08FF |
22 | Devanagari | 0900–097F |
23 | Bengali | 0980–09FF |
24 | Gurmukhi | 0A00–0A7F |
25 | Gujarati | 0A80–0AFF |
26 | Oriya | 0B00–0B7F |
27 | Tamil | 0B80–0BFF |
28 | Telugu | 0C00–0C7F |
29 | Kannada | 0C80–0CFF |
30 | Malayalam | 0D00–0D7F |
31 | Sinhala | 0D80–0DFF |
32 | Thai | 0E00–0E7F |
33 | Lao | 0E80–0EFF |
34 | Tibetan | 0F00–0FFF |
35 | Myanmar | 1000–109F |
36 | Georgian | 10A0–10FF |
37 | Hangul Jamo | 1100–11FF |
38 | Ethiopic | 1200–137F |
39 | Ethiopic Supplement | 1380–139F |
40 | Cherokee | 13A0–13FF |
41 | Unified Canadian Aboriginal Syllabics | 1400–167F |
42 | Ogham | 1680–169F |
43 | Runic | 16A0–16FF |
44 | Tagalog | 1700–171F |
45 | Hanunoo | 1720–173F |
46 | Buhid | 1740–175F |
47 | Tagbanwa | 1760–177F |
48 | Khmer | 1780–17FF |
49 | Mongolian | 1800–18AF |
50 | Unified Canadian Aboriginal Syllabics Extended | 18B0–18FF |
51 | Limbu | 1900–194F |
52 | Tai Le | 1950–197F |
53 | New Tai Lue | 1980–19DF |
54 | Khmer Symbols | 19E0–19FF |
55 | Buginese | 1A00–1A1F |
56 | Tai Tham | 1A20–1AAF |
57 | Combining Diacritical Marks Extended | 1AB0–1AFF |
58 | Balinese | 1B00–1B7F |
59 | Sundanese | 1B80–1BBF |
60 | Batak | 1BC0–1BFF |
61 | Lepcha | 1C00–1C4F |
62 | Ol Chiki | 1C50–1C7F |
63 | Cyrillic Extended-C | 1C80–1C8F |
64 | Georgian Extended | 1C90–1CBF |
65 | Sundanese Supplement | 1CC0–1CCF |
66 | Vedic Extensions | 1CD0–1CFF |
67 | Phonetic Extensions | 1D00–1D7F |
68 | Phonetic Extensions Supplement | 1D80–1DBF |
69 | Combining Diacritical Marks Supplement | 1DC0–1DFF |
70 | Latin Extended Additional | 1E00–1EFF |
71 | Greek Extended | 1F00–1FFF |
72 | General Punctuation | 2000–206F |
73 | Superscripts and Subscripts | 2070–209F |
74 | Currency Symbols | 20A0–20CF |
75 | Combining Diacritical Marks for Symbols | 20D0–20FF |
76 | Letterlike Symbols | 2100–214F |
77 | Number Forms | 2150–218F |
78 | Arrows | 2190–21FF |
79 | Mathematical Operators | 2200–22FF |
80 | Miscellaneous Technical | 2300–23FF |
81 | Control Pictures | 2400–243F |
82 | Optical Character Recognition | 2440–245F |
83 | Enclosed Alphanumerics | 2460–24FF |
84 | Box Drawing | 2500–257F |
85 | Block Elements | 2580–259F |
86 | Geometric Shapes | 25A0–25FF |
87 | Miscellaneous Symbols | 2600–26FF |
88 | Dingbats | 2700–27BF |
89 | Miscellaneous Mathematical Symbols-A | 27C0–27EF |
90 | Supplemental Arrows-A | 27F0–27FF |
91 | Braille Patterns | 2800–28FF |
92 | Supplemental Arrows-B | 2900–297F |
93 | Miscellaneous Mathematical Symbols-B | 2980–29FF |
94 | Supplemental Mathematical Operators | 2A00–2AFF |
95 | Miscellaneous Symbols and Arrows | 2B00–2BFF |
96 | Glagolitic | 2C00–2C5F |
97 | Latin Extended-C | 2C60–2C7F |
98 | Coptic | 2C80–2CFF |
99 | Georgian Supplement | 2D00–2D2F |
100 | Tifinagh | 2D30–2D7F |
101 | Ethiopic Extended | 2D80–2DDF |
102 | Cyrillic Extended-A | 2DE0–2DFF |
103 | Supplemental Punctuation | 2E00–2E7F |
104 | CJK Radicals Supplement | 2E80–2EFF |
105 | Kangxi Radicals | 2F00–2FDF |
106 | Ideographic Description Characters | 2FF0–2FFF |
107 | CJK Symbols and Punctuation | 3000–303F |
108 | Hiragana | 3040–309F |
109 | Katakana | 30A0–30FF |
110 | Bopomofo | 3100–312F |
111 | Hangul Compatibility Jamo | 3130–318F |
112 | Kanbun | 3190–319F |
113 | Bopomofo Extended | 31A0–31BF |
114 | CJK Strokes | 31C0–31EF |
115 | Katakana Phonetic Extensions | 31F0–31FF |
116 | Enclosed CJK Letters and Months | 3200–32FF |
117 | CJK Compatibility | 3300–33FF |
118 | CJK Unified Ideographs Extension A | 3400–4DBF |
119 | Yijing Hexagram Symbols | 4DC0–4DFF |
120 | CJK Unified Ideographs (basic Han characters) | 4E00–9FFC |
121 | Yi Syllables | A000–A48F |
122 | Yi Radicals | A490–A4CF |
123 | Lisu | A4D0–A4FF |
124 | Vai | A500–A63F |
125 | Cyrillic Extended-B | A640–A69F |
126 | Bamum | A6A0–A6FF |
127 | Modifier Tone Letters | A700–A71F |
128 | Latin Extended-D | A720–A7FF |
129 | Syloti Nagri | A800–A82F |
130 | Common Indic Number Forms | A830–A83F |
131 | Phags-pa | A840–A87F |
132 | Saurashtra | A880–A8DF |
133 | Devanagari Extended | A8E0–A8FF |
134 | Kayah Li | A900–A92F |
135 | Rejang | A930–A95F |
136 | Hangul Jamo Extended-A | A960–A97F |
137 | Javanese | A980–A9DF |
138 | Myanmar Extended-B | A9E0–A9FF |
139 | Cham | AA00–AA5F |
140 | Myanmar Extended-A | AA60–AA7F |
141 | Tai Viet | AA80–AADF |
142 | Meetei Mayek Extensions | AAE0–AAFF |
143 | Ethiopic Extended-A | AB00–AB2F |
144 | Latin Extended-E | AB30–AB6F |
145 | Cherokee Supplement | AB70–ABBF |
146 | Meetei Mayek | ABC0–ABFF |
147 | Hangul Syllables | AC00–D7AF |
148 | Hangul Jamo Extended-B | D7B0–D7FF |
149 | High Surrogate Area (reserved for UTF-16 high surrogates) | D800–DBFF |
150 | Low Surrogate Area (reserved for UTF-16 low surrogates) | DC00–DFFF |
151 | Private Use Area | E000–F8FF |
152 | CJK Compatibility Ideographs | F900–FAD9 |
153 | Alphabetic Presentation Forms | FB00–FB4F |
154 | Arabic Presentation Forms-A | FB50–FDFF |
155 | Variation Selectors | FE00–FE0F |
156 | Vertical Forms | FE10–FE1F |
157 | Combining Half Marks | FE20–FE2F |
158 | CJK Compatibility Forms | FE30–FE4F |
159 | Small Form Variants | FE50–FE6F |
160 | Arabic Presentation Forms-B | FE70–FEFF |
161 | Halfwidth and Fullwidth Forms | FF00–FFEF |
162 | Specials | FFF0–FFFF |
163 | Linear B Syllabary | 10000–1007F |
164 | Linear B Ideograms | 10080–100FF |
165 | Aegean Numbers | 10100–1013F |
166 | Ancient Greek Numbers | 10140–1018F |
167 | Ancient Symbols | 10190–101CF |
168 | Phaistos Disc | 101D0–101FF |
169 | Lycian | 10280–1029F |
170 | Carian | 102A0–102DF |
171 | Coptic Epact Numbers | 102E0–102FF |
172 | Old Italic | 10300–1032F |
173 | Gothic | 10330–1034F |
174 | Old Permic | 10350–1037F |
175 | Ugaritic | 10380–1039F |
176 | Old Persian | 103A0–103DF |
177 | Deseret | 10400–1044F |
178 | Shavian | 10450–1047F |
179 | Osmanya | 10480–104AF |
180 | Osage | 104B0–104FF |
181 | Elbasan | 10500–1052F |
182 | Caucasian Albanian | 10530–1056F |
183 | Linear A | 10600–1077F |
184 | Cypriot Syllabary | 10800–1083F |
185 | Imperial Aramaic | 10840–1085F |
186 | Palmyrene | 10860–1087F |
187 | Nabataean | 10880–108AF |
188 | Hatran | 108E0–108FF |
189 | Phoenician | 10900–1091F |
190 | Lydian | 10920–1093F |
191 | Meroitic Hieroglyphs | 10980–1099F |
192 | Meroitic Cursive | 109A0–109FF |
193 | Kharoshthi | 10A00–10A5F |
194 | Old South Arabian | 10A60–10A7F |
195 | Old North Arabian | 10A80–10A9F |
196 | Manichaean | 10AC0–10AFF |
197 | Avestan | 10B00–10B3F |
198 | Inscriptional Parthian | 10B40–10B5F |
199 | Inscriptional Pahlavi | 10B60–10B7F |
200 | Psalter Pahlavi | 10B80–10BAF |
201 | Old Turkic | 10C00–10C4F |
202 | Old Hungarian | 10C80–10CFF |
203 | Hanifi Rohingya | 10D00–10D3F |
204 | Rumi Numeral Symbols | 10E60–10E7F |
205 | Yezidi | 10E80–10EBF |
206 | Old Sogdian | 10F00–10F2F |
207 | Sogdian | 10F30–10F6F |
208 | Chorasmian | 10FB0–10FDF |
209 | Elymaic | 10FE0–10FFF |
210 | Brahmi | 11000–1107F |
211 | Kaithi | 11080–110CF |
212 | Sora Sompeng | 110D0–110FF |
213 | Chakma | 11100–1114F |
214 | Mahajani | 11150–1117F |
215 | Sharada | 11180–111DF |
216 | Sinhala Archaic Numbers | 111E0–111FF |
217 | Khojki | 11200–1124F |
218 | Multani | 11280–112AF |
219 | Khudawadi | 112B0–112FF |
220 | Grantha | 11300–1137F |
221 | Newa | 11400–1147F |
222 | Tirhuta | 11480–114DF |
223 | Siddham | 11580–115FF |
224 | Modi | 11600–1165F |
225 | Mongolian Supplement | 11660–1167F |
226 | Takri | 11680–116CF |
227 | Ahom | 11700–1173F |
228 | Dogra | 11800–1184F |
229 | Warang Citi | 118A0–118FF |
230 | Dives Akuru | 11900–1195F |
231 | Nandinagari | 119A0–119FF |
232 | Zanabazar Square | 11A00–11A4F |
233 | Soyombo | 11A50–11AAF |
234 | Pau Cin Hau | 11AC0–11AFF |
235 | Bhaiksuki | 11C00–11C6F |
236 | Marchen | 11C70–11CBF |
237 | Masaram Gondi | 11D00–11D5F |
238 | Gunjala Gondi | 11D60–11DAF |
239 | Makasar | 11EE0–11EFF |
240 | Lisu Supplement | 11FB0–11FBF |
241 | Tamil Supplement | 11FC0–11FFF |
242 | Cuneiform | 12000–123FF |
243 | Cuneiform Numbers and Punctuation | 12400–1247F |
244 | Early Dynastic Cuneiform | 12480–1254F |
245 | Egyptian Hieroglyphs | 13000–1342F |
246 | Egyptian Hieroglyph Format Controls | 13430–1343F |
247 | Anatolian Hieroglyphs | 14400–1467F |
248 | Bamum Supplement | 16800–16A3F |
249 | Mro | 16A40–16A6F |
250 | Bassa Vah | 16AD0–16AFF |
251 | Pahawh Hmong | 16B00–16B8F |
252 | Medefaidrin | 16E40–16E9F |
253 | Miao | 16F00–16F9F |
254 | Ideographic Symbols and Punctuation | 16FE0–16FFF |
255 | Tangut | 17000–187F7 |
256 | Tangut Components | 18800–18AFF |
257 | Khitan Small Script | 18B00–18CFF |
258 | Tangut Supplement | 18D00–18D08 |
259 | Kana Supplement | 1B000–1B0FF |
260 | Kana Extended-A | 1B100–1B12F |
261 | Small Kana Extension | 1B130–1B16F |
262 | Nushu | 1B170–1B2FF |
263 | Duployan | 1BC00–1BC9F |
264 | Shorthand Format Controls | 1BCA0–1BCAF |
265 | Byzantine Musical Symbols | 1D000–1D0FF |
266 | Musical Symbols | 1D100–1D1FF |
267 | Ancient Greek Musical Notation | 1D200–1D24F |
268 | Mayan Numerals | 1D2E0–1D2FF |
269 | Tai Xuan Jing Symbols | 1D300–1D35F |
270 | Counting Rod Numerals | 1D360–1D37F |
271 | Mathematical Alphanumeric Symbols | 1D400–1D7FF |
272 | Sutton SignWriting | 1D800–1DAAF |
273 | Glagolitic Supplement | 1E000–1E02F |
274 | Nyiakeng Puachue Hmong | 1E100–1E14F |
275 | Wancho | 1E2C0–1E2FF |
276 | Mende Kikakui | 1E800–1E8DF |
277 | Adlam | 1E900–1E95F |
278 | Indic Siyaq Numbers | 1EC70–1ECBF |
279 | Ottoman Siyaq Numbers | 1ED00–1ED4F |
280 | Arabic Mathematical Alphabetic Symbols | 1EE00–1EEFF |
281 | Mahjong Tiles | 1F000–1F02F |
282 | Domino Tiles | 1F030–1F09F |
283 | Playing Cards | 1F0A0–1F0FF |
284 | Enclosed Alphanumeric Supplement | 1F100–1F1FF |
285 | Enclosed Ideographic Supplement | 1F200–1F2FF |
286 | Miscellaneous Symbols and Pictographs | 1F300–1F5FF |
287 | Emoticons | 1F600–1F64F |
288 | Ornamental Dingbats | 1F650–1F67F |
289 | Transport and Map Symbols | 1F680–1F6FF |
290 | Alchemical Symbols | 1F700–1F77F |
291 | Geometric Shapes Extended | 1F780–1F7FF |
292 | Supplemental Arrows-C | 1F800–1F8FF |
293 | Supplemental Symbols and Pictographs | 1F900–1F9FF |
294 | Chess Symbols | 1FA00–1FA6F |
295 | Symbols and Pictographs Extended-A | 1FA70–1FAFF |
296 | Symbols for Legacy Computing | 1FB00–1FBFF |
297 | Unassigned | 1FF80–1FFFF |
298 | CJK Unified Ideographs Extension B | 20000–2A6DD |
299 | CJK Unified Ideographs Extension C | 2A700–2B734 |
300 | CJK Unified Ideographs Extension D | 2B740–2B81D |
301 | CJK Unified Ideographs Extension E | 2B820–2CEA1 |
302 | CJK Unified Ideographs Extension F | 2CEB0–2EBE0 |
303 | CJK Compatibility Ideographs Supplement | 2F800–2FA1D |
304 | Unassigned | 2FF80–2FFFF |
305 | CJK Unified Ideographs Extension G | 30000–3134A |
306 | Unassigned | 3FF80–3FFFF |
307 | Unassigned | 4FF80–4FFFF |
308 | Unassigned | 5FF80–5FFFF |
309 | Unassigned | 6FF80–6FFFF |
310 | Unassigned | 7FF80–7FFFF |
311 | Unassigned | 8FF80–8FFFF |
312 | Unassigned | 9FF80–9FFFF |
313 | Unassigned | AFF80–AFFFF |
314 | Unassigned | BFF80–BFFFF |
315 | Unassigned | CFF80–CFFFF |
316 | Unassigned | DFF80–DFFFF |
317 | Tags | E0000–E007F |
318 | Variation Selectors Supplement | E0100–E01EF |
319 | Unassigned | EFF80–EFFFF |
320 | Supplementary Private Use Area-A | F0000–FFFFF |
321 | Supplementary Private Use Area-B | 100000–10FFFF |
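
A range table like the one above can be used to look up which block a character falls in. The following Python sketch is only an illustration under stated assumptions: it copies a handful of rows from the table rather than all of them, and the names `BLOCKS`, `block_of`, and `to_surrogates` are hypothetical helpers, not part of any library. The second function shows how a code point above FFFF maps into the two surrogate areas listed in the table when it is encoded as UTF-16.

```python
from bisect import bisect_right

# A small excerpt of the table above: (first, last, block name), sorted by first.
BLOCKS = [
    (0x0000, 0x007F, "C0 Controls and Basic Latin"),
    (0x0080, 0x00FF, "C1 Controls and Latin-1 Supplement"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x3040, 0x309F, "Hiragana"),
    (0x4E00, 0x9FFC, "CJK Unified Ideographs"),
    (0xAC00, 0xD7AF, "Hangul Syllables"),
    (0x1F600, 0x1F64F, "Emoticons"),
]
STARTS = [first for first, _, _ in BLOCKS]


def block_of(ch):
    """Return the block name for a single character, or None if not in the excerpt."""
    cp = ord(ch)
    # Find the last block whose start is <= cp, then check its upper bound.
    i = bisect_right(STARTS, cp) - 1
    if i >= 0:
        _, last, name = BLOCKS[i]
        if cp <= last:
            return name
    return None


def to_surrogates(cp):
    """Split a supplementary code point into its UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF use surrogate pairs")
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # lands in the High Surrogate Area
    low = 0xDC00 + (offset & 0x3FF)   # lands in the Low Surrogate Area
    return high, low


if __name__ == "__main__":
    for ch in "Aя中😀":
        print(f"U+{ord(ch):04X}", block_of(ch))
    print([hex(unit) for unit in to_surrogates(0x1F600)])
```

A binary search over the sorted start values keeps each lookup fast even if all rows of the table are loaded. For real applications, the machine-readable `Blocks.txt` file published with the Unicode Character Database is a more reliable source than a hand-copied list.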