I tested the function below myself and it worked.
The function that detects a txt file's encoding:
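// Needs java.io.BufferedInputStream, java.io.FileInputStream, java.io.IOException
private String getCharset(String fileName) throws IOException {
    // try-with-resources closes the stream once the first two bytes are read
    try (BufferedInputStream bin = new BufferedInputStream(new FileInputStream(fileName))) {
        // Pack the first two bytes into one int so the BOM can be matched in a switch
        int p = (bin.read() << 8) + bin.read();
        String code;
        switch (p) {
            case 0xefbb: // EF BB: first two bytes of the UTF-8 BOM (EF BB BF)
                code = "UTF-8";
                break;
            case 0xfffe: // FF FE: little-endian UTF-16 BOM; Java treats "Unicode" as an alias of UTF-16
                code = "Unicode";
                break;
            case 0xfeff: // FE FF: big-endian UTF-16 BOM
                code = "UTF-16BE";
                break;
            default: // no BOM found (or fewer than two bytes): assume GBK
                code = "GBK";
        }
        return code;
    }
}
Reading a file as a test: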
// Needs java.io.BufferedReader, java.io.FileInputStream, java.io.IOException, java.io.InputStreamReader
public String getTextFromText(String filePath) {
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(new FileInputStream(filePath), getCharset(filePath)))) {
        StringBuilder sb = new StringBuilder();
        String temp;
        // readLine() strips line terminators, so the lines are joined with no separator
        while ((temp = br.readLine()) != null) {
            sb.append(temp);
        }
        return sb.toString();
    } catch (IOException e) {
        // FileNotFoundException is a subclass of IOException, so one handler covers both
        e.printStackTrace();
    }
    return null;
}
Original post: http://blog.sina.com.cn/s/blog_68ed2a9b0100vqrn.html
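The two functions above match only the first two bytes and leave any BOM character in the decoded text. As a rough sketch of a stricter variant (not from the original post), the class below checks the full three-byte UTF-8 signature and drops a leading U+FEFF after decoding. The names `BomSniffer`, `detect`, `decode`, `readText` and `FALLBACK` are hypothetical, and the GBK fallback is still only a guess for files without a BOM.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class BomSniffer {
    // Assumption: files without a BOM are GBK, mirroring the default branch above.
    static final Charset FALLBACK = Charset.forName("GBK");

    /** Returns the charset named by the BOM at the start of data, or FALLBACK. */
    static Charset detect(byte[] data) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return StandardCharsets.UTF_8;
        if (startsWith(data, 0xFF, 0xFE)) return StandardCharsets.UTF_16LE;
        if (startsWith(data, 0xFE, 0xFF)) return StandardCharsets.UTF_16BE;
        return FALLBACK;
    }

    /** Decodes data with the detected charset and drops a leading U+FEFF, if any. */
    static String decode(byte[] data) {
        String text = new String(data, detect(data));
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    /** Reads the whole file into memory, so this suits small text files only. */
    static String readText(Path file) throws IOException {
        return decode(Files.readAllBytes(file));
    }

    // Compares unsigned byte values, since Java bytes are signed.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

Unlike `getTextFromText`, `readText` keeps the line terminators, because it decodes the raw bytes instead of joining the results of `readLine()`.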
Note:
In my tests, the method given in this article (http://tinyking.blog.51cto.com/3338571/667453) did not work.
InputStream inputStream = new FileInputStream("E:/1.txt");
byte[] head = new byte[3];
inputStream.read(head);
String code = "";
code = "gb2312";
if (head[0] == -1 && head[1] == -2)
    code = "UTF-16";
if (head[0] == -2 && head[1] == -1)
    code = "Unicode";
if (head[0] == -17 && head[1] == -69 && head[2] == -65)
    code = "UTF-8";
System.out.println(code);
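The negative literals in the snippet above are the BOM bytes seen through Java's signed `byte` type: -1 is 0xFF, -2 is 0xFE, and -17, -69, -65 are 0xEF, 0xBB, 0xBF. A small sketch of that mapping, using a hypothetical `SignedByteDemo` class:

```java
final class SignedByteDemo {
    /** Converts a signed Java byte to its unsigned value in the range 0..255. */
    static int unsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte[] utf8Bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
        for (byte b : utf8Bom) {
            // Prints: -17 -> 0xEF, then -69 -> 0xBB, then -65 -> 0xBF
            System.out.printf("%d -> 0x%02X%n", b, unsigned(b));
        }
    }
}
```

Masking with `& 0xFF` before comparing lets the checks be written with the familiar hexadecimal BOM values instead of negative numbers.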
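This article presents a Java method for detecting the encoding of a TXT file. The method automatically determines whether a file is encoded as UTF-8, UTF-16BE, Unicode, or GBK, and then reads its contents. The article compares two detection methods and confirms that one of them is more accurate and effective.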
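BOM-based detection cannot recognize a UTF-8 file saved without a BOM, which would fall through to the GBK default. One possible extension, which is not part of the original post, is to try a strict UTF-8 decode before falling back. The sketch below uses the JDK's `CharsetDecoder`; the class name `Utf8Check` and the method `isValidUtf8` are hypothetical. Pure ASCII input passes the check, which is harmless because ASCII decodes identically under UTF-8 and GBK.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Check {
    /** Returns true if data decodes as UTF-8 without any malformed sequence. */
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            // The decoder reported invalid bytes, so the data is not well-formed UTF-8
            return false;
        }
    }
}
```

This is a heuristic rather than a proof: some short GBK byte sequences can happen to be valid UTF-8 as well.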