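/**
 * The UTF-16 charset uses 16-bit code units, so it is sensitive to byte order.
 * The byte order of a stream can be indicated by an initial byte order mark (BOM),
 * which is the Unicode character U+FEFF.
 *
 * UTF-16BE: 16-bit UCS Transformation Format, big-endian byte order
 * (the high-order byte is stored at the lowest address).
 * UTF-16LE: 16-bit UCS Transformation Format, little-endian byte order
 * (the low-order byte is stored at the lowest address).
 *
 * In Java, when "UTF-16" is used for decoding and the input has no BOM,
 * big-endian is assumed, as if the data began with FE FF.
 *
 * @throws UnsupportedEncodingException if a charset name is not supported
 */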
void unicodeShow() throws UnsupportedEncodingException {
    String shz;
    byte[] hz;

    // Big-endian BOM (FE FF) followed by 55 4A: "utf-16" honors the BOM.
    hz = new byte[4];
    hz[0] = (byte) 0xfe;
    hz[1] = (byte) 0xff;
    hz[2] = 0x55;
    hz[3] = 0x4a;
    shz = new String(hz, "utf-16");
    System.out.println(shz);

    // No BOM: "utf-16" falls back to big-endian, so 55 4A is U+554A.
    hz = new byte[2];
    hz[0] = 0x55;
    hz[1] = 0x4a;
    shz = new String(hz, "utf-16");
    System.out.println(shz);

    // Explicit big-endian charset, no BOM expected.
    hz = new byte[2];
    hz[0] = 0x55;
    hz[1] = 0x4a;
    shz = new String(hz, "utf-16be");
    System.out.println(shz);

    // Little-endian BOM (FF FE) followed by 4A 55: "utf-16" honors the BOM.
    hz = new byte[4];
    hz[0] = (byte) 0xff;
    hz[1] = (byte) 0xfe;
    hz[2] = 0x4a;
    hz[3] = 0x55;
    shz = new String(hz, "utf-16");
    System.out.println(shz);

    // Explicit little-endian charset, no BOM expected.
    hz = new byte[2];
    hz[0] = 0x4a;
    hz[1] = 0x55;
    shz = new String(hz, "utf-16le");
    System.out.println(shz);

    // Print the high and low bytes of the char itself.
    System.out.println("啊 UNICODE:U+554A");
    System.out.print(Integer.toHexString(("啊".charAt(0) >> 8) & 0xff));
    System.out.print(" ");
    System.out.print(Integer.toHexString("啊".charAt(0) & 0xff));
    System.out.println();

    // Encoding with "utf-16" writes a big-endian BOM first: fe ff 55 4a.
    for (byte i : "啊".getBytes("utf-16")) {
        System.out.print(Integer.toHexString(i & 0xff) + " ");
    }
    System.out.println();

    // Encoding with "utf-16be" writes no BOM: 55 4a.
    for (byte i : "啊".getBytes("utf-16be")) {
        System.out.print(Integer.toHexString(i & 0xff) + " ");
    }
    System.out.println();

    // Encoding with "utf-16le" writes no BOM: 4a 55.
    for (byte i : "啊".getBytes("utf-16le")) {
        System.out.print(Integer.toHexString(i & 0xff) + " ");
    }
    System.out.println();
}
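This article looks at the UTF-16 encoding and its two byte orders, big-endian and little-endian. The example code shows how to handle UTF-16 encoded data in Java and how to correctly decode strings that carry a byte order mark.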
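
The BOM rule that the "utf-16" decoder applies can also be written out by hand, which may make the behavior easier to follow. The following is only a minimal sketch: the class `Utf16BomSketch` and its methods `hasBom`, `detectByteOrder`, and `decode` are hypothetical names introduced for illustration and are not part of the original code. It assumes the input is UTF-16 data with an optional 2-byte BOM and uses the `StandardCharsets` constants instead of charset name strings, so no `UnsupportedEncodingException` needs to be declared.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of the original post.
class Utf16BomSketch {

    // True when the data starts with FE FF or FF FE.
    static boolean hasBom(byte[] data) {
        if (data.length < 2) {
            return false;
        }
        int b0 = data[0] & 0xff;
        int b1 = data[1] & 0xff;
        return (b0 == 0xFE && b1 == 0xFF) || (b0 == 0xFF && b1 == 0xFE);
    }

    // FF FE selects little-endian; FE FF or no BOM selects big-endian,
    // mirroring the default of Java's "UTF-16" decoder.
    static Charset detectByteOrder(byte[] data) {
        if (data.length >= 2 && (data[0] & 0xff) == 0xFF && (data[1] & 0xff) == 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        return StandardCharsets.UTF_16BE;
    }

    // Skips the BOM manually, because the UTF-16BE and UTF-16LE decoders
    // treat a leading U+FEFF as an ordinary character.
    static String decode(byte[] data) {
        int offset = hasBom(data) ? 2 : 0;
        return new String(data, offset, data.length - offset, detectByteOrder(data));
    }

    public static void main(String[] args) {
        byte[][] samples = {
            {(byte) 0xFE, (byte) 0xFF, 0x55, 0x4A}, // big-endian BOM
            {(byte) 0xFF, (byte) 0xFE, 0x4A, 0x55}, // little-endian BOM
            {0x55, 0x4A}                            // no BOM, treated as big-endian
        };
        for (byte[] sample : samples) {
            String s = decode(sample);
            // Print the code unit in hex to avoid depending on the console encoding.
            System.out.println(detectByteOrder(sample) + " -> U+"
                    + Integer.toHexString(s.charAt(0)).toUpperCase());
        }
    }
}
```

All three samples should decode to the same character, U+554A, which matches what `new String(bytes, StandardCharsets.UTF_16)` produces for the same inputs. In practice the built-in decoder is the simpler choice; the manual version is mainly useful when the byte order needs to be known before decoding.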