Reading and Writing UTF-8 Files in Java

Latest recommended article published 2024-06-17 22:45:02

License: CC 4.0 BY-SA

Tags:

#java #string #html #file #null #storage

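This article shows how to handle UTF-8 encoded template files correctly: reading the file content into memory and saving the converted result as an HTML file. It highlights the special byte marker that can precede UTF-8 data and the correct way to read and write, so that the output is not garbled.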

Scenario:

Read a UTF-8 encoded template file into memory, transform it, and save the result as an HTML file.

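Notes:

A UTF-8 file saved with a byte order mark (BOM) starts with 3 extra bytes: 0xEF 0xBB 0xBF. The BOM is optional, so many UTF-8 files have none.

A UTF-16 file (often labeled "Unicode" in Windows editors) starts with 2 extra bytes: 0xFF 0xFE (little endian) or 0xFE 0xFF (big endian).

An ANSI-encoded file has no extra bytes.

If the file is read or written with the wrong encoding, you may not get the expected result (the HTML shows up as garbled text).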
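The byte patterns listed above can be checked directly on the first few bytes of a file. The following is a minimal sketch, not a full encoding detector: the class name `BomSniffer` and its `detect` method are hypothetical names chosen for illustration, and it only knows the three markers mentioned in the notes (it does not, for example, distinguish UTF-32).

```java
class BomSniffer {
    /** Returns the encoding implied by a leading BOM, or "none" if no known BOM is present. */
    static String detect(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        return "none"; // ANSI files and BOM-less UTF-8 files both end up here
    }

    /** Compares the first bytes of data against the given unsigned byte values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false; // mask: Java bytes are signed
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(detect(withBom));
    }
}
```

A result of "none" does not prove the file is ANSI: a UTF-8 file without a BOM looks the same at this level, so the encoding still has to be known or agreed on in advance.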

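Read example:

    import java.io.*;
    import java.nio.charset.StandardCharsets;

    String readFile(String utf8File) throws IOException {
        StringBuilder text = new StringBuilder();
        // Decode explicitly as UTF-8; try-with-resources closes the reader.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new FileInputStream(utf8File), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n'); // readLine() drops the line terminator
            }
        }
        return text.toString();
    }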
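One detail worth keeping in mind when reading: Java's UTF-8 decoder does not discard the BOM, so a file that starts with 0xEF 0xBB 0xBF yields a string whose first character is U+FEFF. The sketch below shows one way to drop it after decoding. `BomStripper`, `stripBom`, and `decodeUtf8` are hypothetical helper names, not part of the original example.

```java
import java.nio.charset.StandardCharsets;

class BomStripper {
    /** Removes a single leading U+FEFF (the decoded BOM), if present. */
    static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }

    /** Decodes bytes as UTF-8 and drops the BOM character when there is one. */
    static String decodeUtf8(byte[] bytes) {
        return stripBom(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'o', 'k'};
        System.out.println(decodeUtf8(withBom).length());
    }
}
```

Applying `stripBom` to the string returned by a reader-based method has the same effect, since the BOM always decodes to one character at index 0.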

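Write example:

    String template = readFile("c://utf8.template");
    String outputFile = "c://utf8.htm";
    // Fill in the placeholder before writing.
    template = template.replace("$title$", "UTF-8 read and write");
    // Encode explicitly as UTF-8; try-with-resources flushes and closes the writer.
    // BufferedWriter is used instead of PrintWriter, which silently swallows I/O errors.
    try (Writer writer = new BufferedWriter(new OutputStreamWriter(
            new FileOutputStream(outputFile), StandardCharsets.UTF_8))) {
        writer.write(template);
    }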
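As an alternative, the same read, replace, and write sequence can be expressed with `java.nio.file.Files`, which works on whole byte arrays and leaves no streams to close. This is a sketch under assumptions: the class `TemplateRoundTrip`, its `render` method, the temporary files, and the sample template text are all made up for illustration. The output is written without a BOM, which is also what `OutputStreamWriter` does with UTF-8.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class TemplateRoundTrip {
    /** Reads a UTF-8 template, replaces the $title$ placeholder, writes UTF-8 without a BOM. */
    static void render(Path template, Path output, String title) throws IOException {
        String text = new String(Files.readAllBytes(template), StandardCharsets.UTF_8);
        if (text.startsWith("\uFEFF")) {
            text = text.substring(1); // drop the decoded BOM so it is not copied into the HTML
        }
        String html = text.replace("$title$", title);
        Files.write(output, html.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the real template and output paths.
        Path template = Files.createTempFile("utf8", ".template");
        Path output = Files.createTempFile("utf8", ".htm");
        String sample = "<title>$title$</title>\n<p>\u4E2D\u6587</p>\n";
        Files.write(template, sample.getBytes(StandardCharsets.UTF_8));

        render(template, output, "UTF-8 read and write");
        System.out.print(new String(Files.readAllBytes(output), StandardCharsets.UTF_8));

        Files.delete(template);
        Files.delete(output);
    }
}
```

Reading the whole file into memory is fine for templates of modest size. For very large files, the streaming reader and writer approach is the safer choice.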