034_The Unicode Standard

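The Unicode Standard aims to solve the compatibility and extensibility problems of the character sets used for different languages, and to cover the characters, punctuation, and symbols of the world's writing systems. It is widely used in technologies such as XML, Java, and JavaScript, and in most operating systems and browsers. The standard includes many scripts, such as Latin, Greek, Cyrillic, and Han characters, and also covers phonetic symbols, mathematical symbols, arrows, emoji, and other categories.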


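1. The Unicode Standard

1.1. Character sets such as ASCII, the ISO character sets, and GBK each have limited capacity and are not compatible with multilingual environments, so the Unicode Consortium developed the Unicode Standard.

1.2. The Unicode Standard aims to cover the characters, punctuation, and symbols of all the world's writing systems.

1.3. Regardless of platform, program, or language, Unicode allows text data to be processed, stored, and exchanged.

1.4. The Unicode Standard has been a success: it is implemented in XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, and WML, and it is supported by many operating systems and all modern browsers.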
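As a rough illustration of the capacity limits in 1.1 and the cross-language handling in 1.3, the sketch below checks which encodings can represent a mixed-language string. The sample string and the helper name `can_encode` are made up for this example, and it assumes Python 3 with its built-in `ascii`, `gbk`, and `utf-8` codecs.

```python
# Hypothetical sample text: Latin letters, Chinese characters, and Korean Hangul.
SAMPLE = "Unicode 统一码 한국어"


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character in text can be represented in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # Compare two legacy character sets with a Unicode encoding form.
    for codec in ("ascii", "gbk", "utf-8"):
        print(codec, can_encode(SAMPLE, codec))
```

On a standard CPython build, only `utf-8` should report `True` here: ASCII has no Chinese characters, and GBK has no Hangul.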

2. Unicode Characters

2.1. Website: https://home.unicode.org/

2.2. Archived character code charts for version 13 of the Unicode Standard

| No. | Block name | Range (hex) |
| --- | --- | --- |
| 1 | C0 Controls and Basic Latin | 0000–007F |
| 2 | C1 Controls and Latin-1 Supplement | 0080–00FF |
| 3 | Latin Extended-A | 0100–017F |
| 4 | Latin Extended-B | 0180–024F |
| 5 | IPA Extensions | 0250–02AF |
| 6 | Spacing Modifier Letters | 02B0–02FF |
| 7 | Combining Diacritical Marks | 0300–036F |
| 8 | Greek and Coptic | 0370–03FF |
| 9 | Cyrillic | 0400–04FF |
| 10 | Cyrillic Supplement | 0500–052F |
| 11 | Armenian | 0530–058F |
| 12 | Hebrew | 0590–05FF |
| 13 | Arabic | 0600–06FF |
| 14 | Syriac | 0700–074F |
| 15 | Arabic Supplement | 0750–077F |
| 16 | Thaana | 0780–07BF |
| 17 | NKo | 07C0–07FF |
| 18 | Samaritan | 0800–083F |
| 19 | Mandaic | 0840–085F |
| 20 | Syriac Supplement | 0860–086F |
| 21 | Arabic Extended-A | 08A0–08FF |
| 22 | Devanagari | 0900–097F |
| 23 | Bengali | 0980–09FF |
| 24 | Gurmukhi | 0A00–0A7F |
| 25 | Gujarati | 0A80–0AFF |
| 26 | Oriya | 0B00–0B7F |
| 27 | Tamil | 0B80–0BFF |
| 28 | Telugu | 0C00–0C7F |
| 29 | Kannada | 0C80–0CFF |
| 30 | Malayalam | 0D00–0D7F |
| 31 | Sinhala | 0D80–0DFF |
| 32 | Thai | 0E00–0E7F |
| 33 | Lao | 0E80–0EFF |
| 34 | Tibetan | 0F00–0FFF |
| 35 | Myanmar | 1000–109F |
| 36 | Georgian | 10A0–10FF |
| 37 | Hangul Jamo | 1100–11FF |
| 38 | Ethiopic | 1200–137F |
| 39 | Ethiopic Supplement | 1380–139F |
| 40 | Cherokee | 13A0–13FF |
| 41 | Unified Canadian Aboriginal Syllabics | 1400–167F |
| 42 | Ogham | 1680–169F |
| 43 | Runic | 16A0–16FF |
| 44 | Tagalog | 1700–171F |
| 45 | Hanunoo | 1720–173F |
| 46 | Buhid | 1740–175F |
| 47 | Tagbanwa | 1760–177F |
| 48 | Khmer | 1780–17FF |
| 49 | Mongolian | 1800–18AF |
| 50 | Unified Canadian Aboriginal Syllabics Extended | 18B0–18FF |
| 51 | Limbu | 1900–194F |
| 52 | Tai Le | 1950–197F |
| 53 | New Tai Lue | 1980–19DF |
| 54 | Khmer Symbols | 19E0–19FF |
| 55 | Buginese | 1A00–1A1F |
| 56 | Tai Tham | 1A20–1AAF |
| 57 | Combining Diacritical Marks Extended | 1AB0–1AFF |
| 58 | Balinese | 1B00–1B7F |
| 59 | Sundanese | 1B80–1BBF |
| 60 | Batak | 1BC0–1BFF |
| 61 | Lepcha | 1C00–1C4F |
| 62 | Ol Chiki | 1C50–1C7F |
| 63 | Cyrillic Extended-C | 1C80–1C8F |
| 64 | Georgian Extended | 1C90–1CBF |
| 65 | Sundanese Supplement | 1CC0–1CCF |
| 66 | Vedic Extensions | 1CD0–1CFF |
| 67 | Phonetic Extensions | 1D00–1D7F |
| 68 | Phonetic Extensions Supplement | 1D80–1DBF |
| 69 | Combining Diacritical Marks Supplement | 1DC0–1DFF |
| 70 | Latin Extended Additional | 1E00–1EFF |
| 71 | Greek Extended | 1F00–1FFF |
| 72 | General Punctuation | 2000–206F |
| 73 | Superscripts and Subscripts | 2070–209F |
| 74 | Currency Symbols | 20A0–20CF |
| 75 | Combining Diacritical Marks for Symbols | 20D0–20FF |
| 76 | Letterlike Symbols | 2100–214F |
| 77 | Number Forms | 2150–218F |
| 78 | Arrows | 2190–21FF |
| 79 | Mathematical Operators | 2200–22FF |
| 80 | Miscellaneous Technical | 2300–23FF |
| 81 | Control Pictures | 2400–243F |
| 82 | Optical Character Recognition | 2440–245F |
| 83 | Enclosed Alphanumerics | 2460–24FF |
| 84 | Box Drawing | 2500–257F |
| 85 | Block Elements | 2580–259F |
| 86 | Geometric Shapes | 25A0–25FF |
| 87 | Miscellaneous Symbols | 2600–26FF |
| 88 | Dingbats | 2700–27BF |
| 89 | Miscellaneous Mathematical Symbols-A | 27C0–27EF |
| 90 | Supplemental Arrows-A | 27F0–27FF |
| 91 | Braille Patterns | 2800–28FF |
| 92 | Supplemental Arrows-B | 2900–297F |
| 93 | Miscellaneous Mathematical Symbols-B | 2980–29FF |
| 94 | Supplemental Mathematical Operators | 2A00–2AFF |
| 95 | Miscellaneous Symbols and Arrows | 2B00–2BFF |
| 96 | Glagolitic | 2C00–2C5F |
| 97 | Latin Extended-C | 2C60–2C7F |
| 98 | Coptic | 2C80–2CFF |
| 99 | Georgian Supplement | 2D00–2D2F |
| 100 | Tifinagh | 2D30–2D7F |
| 101 | Ethiopic Extended | 2D80–2DDF |
| 102 | Cyrillic Extended-A | 2DE0–2DFF |
| 103 | Supplemental Punctuation | 2E00–2E7F |
| 104 | CJK Radicals Supplement | 2E80–2EFF |
| 105 | Kangxi Radicals | 2F00–2FDF |
| 106 | Ideographic Description Characters | 2FF0–2FFF |
| 107 | CJK Symbols and Punctuation | 3000–303F |
| 108 | Hiragana | 3040–309F |
| 109 | Katakana | 30A0–30FF |
| 110 | Bopomofo | 3100–312F |
| 111 | Hangul Compatibility Jamo | 3130–318F |
| 112 | Kanbun | 3190–319F |
| 113 | Bopomofo Extended | 31A0–31BF |
| 114 | CJK Strokes | 31C0–31EF |
| 115 | Katakana Phonetic Extensions | 31F0–31FF |
| 116 | Enclosed CJK Letters and Months | 3200–32FF |
| 117 | CJK Compatibility | 3300–33FF |
| 118 | CJK Unified Ideographs Extension A | 3400–4DBF |
| 119 | Yijing Hexagram Symbols | 4DC0–4DFF |
| 120 | CJK Unified Ideographs (basic Han characters) | 4E00–9FFC |
| 121 | Yi Syllables | A000–A48F |
| 122 | Yi Radicals | A490–A4CF |
| 123 | Lisu | A4D0–A4FF |
| 124 | Vai | A500–A63F |
| 125 | Cyrillic Extended-B | A640–A69F |
| 126 | Bamum | A6A0–A6FF |
| 127 | Modifier Tone Letters | A700–A71F |
| 128 | Latin Extended-D | A720–A7FF |
| 129 | Syloti Nagri | A800–A82F |
| 130 | Common Indic Number Forms | A830–A83F |
| 131 | Phags-pa | A840–A87F |
| 132 | Saurashtra | A880–A8DF |
| 133 | Devanagari Extended | A8E0–A8FF |
| 134 | Kayah Li | A900–A92F |
| 135 | Rejang | A930–A95F |
| 136 | Hangul Jamo Extended-A | A960–A97F |
| 137 | Javanese | A980–A9DF |
| 138 | Myanmar Extended-B | A9E0–A9FF |
| 139 | Cham | AA00–AA5F |
| 140 | Myanmar Extended-A | AA60–AA7F |
| 141 | Tai Viet | AA80–AADF |
| 142 | Meetei Mayek Extensions | AAE0–AAFF |
| 143 | Ethiopic Extended-A | AB00–AB2F |
| 144 | Latin Extended-E | AB30–AB6F |
| 145 | Cherokee Supplement | AB70–ABBF |
| 146 | Meetei Mayek | ABC0–ABFF |
| 147 | Hangul Syllables | AC00–D7AF |
| 148 | Hangul Jamo Extended-B | D7B0–D7FF |
| 149 | High Surrogate Area (UTF-16 high surrogates) | D800–DBFF |
| 150 | Low Surrogate Area (UTF-16 low surrogates) | DC00–DFFF |
| 151 | Private Use Area | E000–F8FF |
| 152 | CJK Compatibility Ideographs | F900–FAD9 |
| 153 | Alphabetic Presentation Forms | FB00–FB4F |
| 154 | Arabic Presentation Forms-A | FB50–FDFF |
| 155 | Variation Selectors | FE00–FE0F |
| 156 | Vertical Forms | FE10–FE1F |
| 157 | Combining Half Marks | FE20–FE2F |
| 158 | CJK Compatibility Forms | FE30–FE4F |
| 159 | Small Form Variants | FE50–FE6F |
| 160 | Arabic Presentation Forms-B | FE70–FEFF |
| 161 | Halfwidth and Fullwidth Forms | FF00–FFEF |
| 162 | Specials | FFF0–FFFF |
| 163 | Linear B Syllabary | 10000–1007F |
| 164 | Linear B Ideograms | 10080–100FF |
| 165 | Aegean Numbers | 10100–1013F |
| 166 | Ancient Greek Numbers | 10140–1018F |
| 167 | Ancient Symbols | 10190–101CF |
| 168 | Phaistos Disc | 101D0–101FF |
| 169 | Lycian | 10280–1029F |
| 170 | Carian | 102A0–102DF |
| 171 | Coptic Epact Numbers | 102E0–102FF |
| 172 | Old Italic | 10300–1032F |
| 173 | Gothic | 10330–1034F |
| 174 | Old Permic | 10350–1037F |
| 175 | Ugaritic | 10380–1039F |
| 176 | Old Persian | 103A0–103DF |
| 177 | Deseret | 10400–1044F |
| 178 | Shavian | 10450–1047F |
| 179 | Osmanya | 10480–104AF |
| 180 | Osage | 104B0–104FF |
| 181 | Elbasan | 10500–1052F |
| 182 | Caucasian Albanian | 10530–1056F |
| 183 | Linear A | 10600–1077F |
| 184 | Cypriot Syllabary | 10800–1083F |
| 185 | Imperial Aramaic | 10840–1085F |
| 186 | Palmyrene | 10860–1087F |
| 187 | Nabataean | 10880–108AF |
| 188 | Hatran | 108E0–108FF |
| 189 | Phoenician | 10900–1091F |
| 190 | Lydian | 10920–1093F |
| 191 | Meroitic Hieroglyphs | 10980–1099F |
| 192 | Meroitic Cursive | 109A0–109FF |
| 193 | Kharoshthi | 10A00–10A5F |
| 194 | Old South Arabian | 10A60–10A7F |
| 195 | Old North Arabian | 10A80–10A9F |
| 196 | Manichaean | 10AC0–10AFF |
| 197 | Avestan | 10B00–10B3F |
| 198 | Inscriptional Parthian | 10B40–10B5F |
| 199 | Inscriptional Pahlavi | 10B60–10B7F |
| 200 | Psalter Pahlavi | 10B80–10BAF |
| 201 | Old Turkic | 10C00–10C4F |
| 202 | Old Hungarian | 10C80–10CFF |
| 203 | Hanifi Rohingya | 10D00–10D3F |
| 204 | Rumi Numeral Symbols | 10E60–10E7F |
| 205 | Yezidi | 10E80–10EBF |
| 206 | Old Sogdian | 10F00–10F2F |
| 207 | Sogdian | 10F30–10F6F |
| 208 | Chorasmian | 10FB0–10FDF |
| 209 | Elymaic | 10FE0–10FFF |
| 210 | Brahmi | 11000–1107F |
| 211 | Kaithi | 11080–110CF |
| 212 | Sora Sompeng | 110D0–110FF |
| 213 | Chakma | 11100–1114F |
| 214 | Mahajani | 11150–1117F |
| 215 | Sharada | 11180–111DF |
| 216 | Sinhala Archaic Numbers | 111E0–111FF |
| 217 | Khojki | 11200–1124F |
| 218 | Multani | 11280–112AF |
| 219 | Khudawadi | 112B0–112FF |
| 220 | Grantha | 11300–1137F |
| 221 | Newa | 11400–1147F |
| 222 | Tirhuta | 11480–114DF |
| 223 | Siddham | 11580–115FF |
| 224 | Modi | 11600–1165F |
| 225 | Mongolian Supplement | 11660–1167F |
| 226 | Takri | 11680–116CF |
| 227 | Ahom | 11700–1173F |
| 228 | Dogra | 11800–1184F |
| 229 | Warang Citi | 118A0–118FF |
| 230 | Dives Akuru | 11900–1195F |
| 231 | Nandinagari | 119A0–119FF |
| 232 | Zanabazar Square | 11A00–11A4F |
| 233 | Soyombo | 11A50–11AAF |
| 234 | Pau Cin Hau | 11AC0–11AFF |
| 235 | Bhaiksuki | 11C00–11C6F |
| 236 | Marchen | 11C70–11CBF |
| 237 | Masaram Gondi | 11D00–11D5F |
| 238 | Gunjala Gondi | 11D60–11DAF |
| 239 | Makasar | 11EE0–11EFF |
| 240 | Lisu Supplement | 11FB0–11FBF |
| 241 | Tamil Supplement | 11FC0–11FFF |
| 242 | Cuneiform | 12000–123FF |
| 243 | Cuneiform Numbers and Punctuation | 12400–1247F |
| 244 | Early Dynastic Cuneiform | 12480–1254F |
| 245 | Egyptian Hieroglyphs | 13000–1342F |
| 246 | Egyptian Hieroglyph Format Controls | 13430–1343F |
| 247 | Anatolian Hieroglyphs | 14400–1467F |
| 248 | Bamum Supplement | 16800–16A3F |
| 249 | Mro | 16A40–16A6F |
| 250 | Bassa Vah | 16AD0–16AFF |
| 251 | Pahawh Hmong | 16B00–16B8F |
| 252 | Medefaidrin | 16E40–16E9F |
| 253 | Miao | 16F00–16F9F |
| 254 | Ideographic Symbols and Punctuation | 16FE0–16FFF |
| 255 | Tangut | 17000–187F7 |
| 256 | Tangut Components | 18800–18AFF |
| 257 | Khitan Small Script | 18B00–18CFF |
| 258 | Tangut Supplement | 18D00–18D08 |
| 259 | Kana Supplement | 1B000–1B0FF |
| 260 | Kana Extended-A | 1B100–1B12F |
| 261 | Small Kana Extension | 1B130–1B16F |
| 262 | Nushu | 1B170–1B2FF |
| 263 | Duployan | 1BC00–1BC9F |
| 264 | Shorthand Format Controls | 1BCA0–1BCAF |
| 265 | Byzantine Musical Symbols | 1D000–1D0FF |
| 266 | Musical Symbols | 1D100–1D1FF |
| 267 | Ancient Greek Musical Notation | 1D200–1D24F |
| 268 | Mayan Numerals | 1D2E0–1D2FF |
| 269 | Tai Xuan Jing Symbols | 1D300–1D35F |
| 270 | Counting Rod Numerals | 1D360–1D37F |
| 271 | Mathematical Alphanumeric Symbols | 1D400–1D7FF |
| 272 | Sutton SignWriting | 1D800–1DAAF |
| 273 | Glagolitic Supplement | 1E000–1E02F |
| 274 | Nyiakeng Puachue Hmong | 1E100–1E14F |
| 275 | Wancho | 1E2C0–1E2FF |
| 276 | Mende Kikakui | 1E800–1E8DF |
| 277 | Adlam | 1E900–1E95F |
| 278 | Indic Siyaq Numbers | 1EC70–1ECBF |
| 279 | Ottoman Siyaq Numbers | 1ED00–1ED4F |
| 280 | Arabic Mathematical Alphabetic Symbols | 1EE00–1EEFF |
| 281 | Mahjong Tiles | 1F000–1F02F |
| 282 | Domino Tiles | 1F030–1F09F |
| 283 | Playing Cards | 1F0A0–1F0FF |
| 284 | Enclosed Alphanumeric Supplement | 1F100–1F1FF |
| 285 | Enclosed Ideographic Supplement | 1F200–1F2FF |
| 286 | Miscellaneous Symbols and Pictographs | 1F300–1F5FF |
| 287 | Emoticons | 1F600–1F64F |
| 288 | Ornamental Dingbats | 1F650–1F67F |
| 289 | Transport and Map Symbols | 1F680–1F6FF |
| 290 | Alchemical Symbols | 1F700–1F77F |
| 291 | Geometric Shapes Extended | 1F780–1F7FF |
| 292 | Supplemental Arrows-C | 1F800–1F8FF |
| 293 | Supplemental Symbols and Pictographs | 1F900–1F9FF |
| 294 | Chess Symbols | 1FA00–1FA6F |
| 295 | Symbols and Pictographs Extended-A | 1FA70–1FAFF |
| 296 | Symbols for Legacy Computing | 1FB00–1FBFF |
| 297 | Unassigned | 1FF80–1FFFF |
| 298 | CJK Unified Ideographs Extension B | 20000–2A6DD |
| 299 | CJK Unified Ideographs Extension C | 2A700–2B734 |
| 300 | CJK Unified Ideographs Extension D | 2B740–2B81D |
| 301 | CJK Unified Ideographs Extension E | 2B820–2CEA1 |
| 302 | CJK Unified Ideographs Extension F | 2CEB0–2EBE0 |
| 303 | CJK Compatibility Ideographs Supplement | 2F800–2FA1D |
| 304 | Unassigned | 2FF80–2FFFF |
| 305 | CJK Unified Ideographs Extension G | 30000–3134A |
| 306 | Unassigned | 3FF80–3FFFF |
| 307 | Unassigned | 4FF80–4FFFF |
| 308 | Unassigned | 5FF80–5FFFF |
| 309 | Unassigned | 6FF80–6FFFF |
| 310 | Unassigned | 7FF80–7FFFF |
| 311 | Unassigned | 8FF80–8FFFF |
| 312 | Unassigned | 9FF80–9FFFF |
| 313 | Unassigned | AFF80–AFFFF |
| 314 | Unassigned | BFF80–BFFFF |
| 315 | Unassigned | CFF80–CFFFF |
| 316 | Unassigned | DFF80–DFFFF |
| 317 | Tags | E0000–E007F |
| 318 | Variation Selectors Supplement | E0100–E01EF |
| 319 | Unassigned | EFF80–EFFFF |
| 320 | Supplementary Private Use Area-A | FFF80–FFFFF |
| 321 | Supplementary Private Use Area-B | 10FF80–10FFFF |
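A table like the one above can be used to look up which block a character belongs to. The following Python sketch is one possible way to do that, assuming only a small excerpt of the table is loaded; the names `BLOCKS`, `STARTS`, and `block_of` are hypothetical, and a real program would load the full list instead.

```python
from bisect import bisect_right
from typing import Optional

# Excerpt of the table: (first code point, last code point, block name).
# The list must stay sorted by first code point, with no overlapping ranges.
BLOCKS = [
    (0x0000, 0x007F, "C0 Controls and Basic Latin"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x4E00, 0x9FFC, "CJK Unified Ideographs"),
    (0x1F600, 0x1F64F, "Emoticons"),
]
STARTS = [start for start, _, _ in BLOCKS]


def block_of(ch: str) -> Optional[str]:
    """Return the block name for a single character, or None if not listed."""
    cp = ord(ch)
    # Find the last range whose start is <= cp, then check its upper bound.
    i = bisect_right(STARTS, cp) - 1
    if i >= 0 and cp <= BLOCKS[i][1]:
        return BLOCKS[i][2]
    return None


if __name__ == "__main__":
    for ch in "AЖ中😀€":
        print(f"U+{ord(ch):04X}", block_of(ch))
```

Binary search is used because the ranges are sorted and do not overlap. Characters outside the excerpt, such as the euro sign at U+20AC, return `None` in this sketch.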
