源:http://setting.iteye.com/blog/786887
评:
如何判断一个文件或一个bytes是 utf?
JSON text SHALL be encoded in Unicode. The default encoding is
UTF-8.
Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
at the pattern of nulls in the first four octets.
00 00 00 xx UTF-32BE
00 xx 00 xx UTF-16BE
xx 00 00 00 UTF-32LE
xx 00 xx 00 UTF-16LE
xx xx xx xx UTF-8
rfc4627即json标准规范中,给了一个简单的判断方法。
评:
如何判断一个文件或一个bytes是 utf?
JSON text SHALL be encoded in Unicode. The default encoding is
UTF-8.
Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
at the pattern of nulls in the first four octets.
00 00 00 xx UTF-32BE
00 xx 00 xx UTF-16BE
xx 00 00 00 UTF-32LE
xx 00 xx 00 UTF-16LE
xx xx xx xx UTF-8
rfc4627即json标准规范中,给了一个简单的判断方法。
本文介绍了一种简单的方法来判断一个文件或字节流是否使用了UTF-8、UTF-16或UTF-32编码。根据JSON标准规范(RFC4627),通过检查前四个字节中的空字节模式可以确定文本的编码方式。
630

被折叠的 条评论
为什么被折叠?



