The table below summarizes the format of these different octet types.
The letter x indicates bits available for encoding bits of the
character number.
Char. number range | UTF-8 octet sequence
(hexadecimal) | (binary)
--------------------+---------------------------------------------
0000 0000-0000 007F | 0xxxxxxx
0000 0080-0000 07FF | 110xxxxx 10xxxxxx
0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx //A/
0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
这是一个Unicode编码和utf-8编码之间的对应关系表。中文的Unicode编码范围在0000 0800-0000 FFFF 中。我在前面发的代码只作了这些转换
The letter x indicates bits available for encoding bits of the
character number.
Char. number range | UTF-8 octet sequence
(hexadecimal) | (binary)
--------------------+---------------------------------------------
0000 0000-0000 007F | 0xxxxxxx
0000 0080-0000 07FF | 110xxxxx 10xxxxxx
0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx //A/
0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
这是一个Unicode编码和utf-8编码之间的对应关系表。中文的Unicode编码范围在0000 0800-0000 FFFF 中。我在前面发的代码只作了这些转换