Garbled text when Node.js reads a local JSON file containing Chinese

SalmonellaVaccine

License: CC 4.0 BY-SA

Categories: Node.js, JSON

Article link: https://blog.youkuaiyun.com/SalmonellaVaccine/article/details/53732732

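This article describes the garbled-text problem that can occur when Node.js reads a UTF-8 JSON file containing Chinese, and how to fix it. First, confirm that the file is encoded as UTF-8 without a BOM, because Node.js does not strip the BOM automatically; you can remove it with an NPM module or handle it manually. Second, check the JSON for syntax errors such as a missing or extra comma. Also note that the Node.js source file itself should be saved as UTF-8, otherwise Chinese text may be garbled when it is written out. Once the file encodings are correct, simply read the file with the utf8 encoding.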

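1. Make sure the JSON file is encoded as UTF-8 without a BOM. If a BOM is present, garbled characters appear at the start of the first line when the file is read.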

Per "fs.readFileSync(filename, 'utf8') doesn't strip BOM markers #1918", fs.readFile is working as designed: the BOM, if present, is not stripped from the start of a UTF-8 file. It is at the discretion of the developer to handle this.
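To check whether a particular file actually starts with a BOM, one option is to look at its first three bytes, which are EF BB BF for UTF-8. The following is a minimal sketch; the helper names hasUtf8Bom and fileHasUtf8Bom are made up for illustration and are not part of Node.js.

```javascript
var fs = require('fs');

// Returns true when the buffer starts with the UTF-8 BOM bytes EF BB BF.
// (hasUtf8Bom is a hypothetical helper name, not a Node.js API.)
function hasUtf8Bom(buf) {
    return buf.length >= 3 &&
        buf[0] === 0xEF && buf[1] === 0xBB && buf[2] === 0xBF;
}

// Reads the raw bytes (no encoding argument, so a Buffer is returned)
// and reports whether the file begins with a BOM.
function fileHasUtf8Bom(path) {
    return hasUtf8Bom(fs.readFileSync(path));
}
```

For example, fileHasUtf8Bom('./myconfig.json') would tell you whether that file needs the BOM stripped before parsing.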

Possible workarounds:

data = data.replace(/^\uFEFF/, ''); per https://github.com/joyent/node/issues/1918#issuecomment-2480359
Transform the incoming stream to remove the BOM header with the NPM module bomstrip, per https://github.com/joyent/node/issues/1918#issuecomment-38491548
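The stream-based workaround can also be sketched without any third-party module. The example below is not the bomstrip module's implementation or API; it is a hand-written Transform stream (createBomStripper is a hypothetical name) that holds back the first three bytes, drops them if they are the UTF-8 BOM, and passes everything else through.

```javascript
var stream = require('stream');

// Creates a Transform that removes a leading UTF-8 BOM (EF BB BF)
// and forwards all other bytes unchanged.
// (createBomStripper is a hypothetical helper, not the bomstrip API.)
function createBomStripper() {
    var pending = Buffer.alloc(0); // bytes held back until at least 3 have arrived
    var done = false;              // true once the BOM check has been made

    return new stream.Transform({
        transform: function (chunk, encoding, callback) {
            if (done) {
                return callback(null, chunk);
            }
            pending = Buffer.concat([pending, chunk]);
            if (pending.length < 3) {
                return callback(); // not enough bytes yet to decide
            }
            done = true;
            var hasBom = pending[0] === 0xEF &&
                pending[1] === 0xBB && pending[2] === 0xBF;
            callback(null, hasBom ? pending.subarray(3) : pending);
        },
        flush: function (callback) {
            // The input ended before 3 bytes arrived: emit what was held back.
            if (!done && pending.length > 0) {
                this.push(pending);
            }
            callback();
        }
    });
}
```

A typical use would be fs.createReadStream('./myconfig.json').pipe(createBomStripper()), followed by whatever consumes the cleaned bytes.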

What you are getting is the byte order mark (BOM) header of the UTF-8 file. When JSON.parse sees this, it throws a syntax error (read: "unexpected character" error). You must strip the byte order mark from the file contents before passing them to JSON.parse:

var fs = require('fs');
var myconfig;

fs.readFile('./myconfig.json', 'utf8', function (err, data) {
    if (err) throw err;
    // Because an encoding is passed, data is already a string, not a Buffer.
    myconfig = JSON.parse(data.replace(/^\uFEFF/, ''));
});

http://stackoverflow.com/a/24376813

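2. Make sure the JSON has no syntax errors. Even after saving the file as UTF-8 and reading it with the utf8 encoding, I still got an error and could not work out why.

In the end I found two syntax errors in the JSON that my editor had not flagged: one array was missing a "," between two elements, and another array had an extra "," after its last element.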
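Both kinds of mistake make JSON.parse throw a SyntaxError, so wrapping the call in try/catch and printing the message is a quick way to tell a syntax problem apart from an encoding problem. The sketch below uses made-up sample strings and a hypothetical helper name, tryParseJson.

```javascript
// Strips a leading BOM, then parses. Returns the error instead of throwing
// so the caller can inspect what JSON.parse complained about.
// (tryParseJson is a hypothetical helper name.)
function tryParseJson(text) {
    try {
        return { value: JSON.parse(text.replace(/^\uFEFF/, '')), error: null };
    } catch (e) {
        return { value: null, error: e };
    }
}

// Made-up sample data illustrating the two mistakes described above.
var missingComma = '{"names": ["张三" "李四"]}';    // no "," between two elements
var trailingComma = '{"names": ["张三", "李四",]}'; // extra "," after the last element
var valid = '{"names": ["张三", "李四"]}';

[missingComma, trailingComma, valid].forEach(function (text) {
    var result = tryParseJson(text);
    if (result.error) {
        console.log('Invalid JSON: ' + result.error.message);
    } else {
        console.log('Parsed OK:', result.value);
    }
});
```

The exact wording of the error message differs between Node.js versions, but it usually points at the offending token or position.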

The following is from http://blog.youkuaiyun.com/youbl/article/details/29812669:

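Note 1: According to that post, Node's iconv module supports only Linux, not Windows, so use the pure-JavaScript iconv-lite instead. Its author also says that iconv-lite performs better; see the project's Git site: iconv-lite

Note 2: While testing file reads and writes, I could never write Chinese text to a file correctly; the output was always garbled even though reading worked fine. A colleague eventually spotted the cause: the .js file itself was saved in ANSI encoding. Node.js source files must be saved as UTF-8.

Note 3: If every file the program works with is saved as UTF-8, there is no need for the iconv module; just read the file with the utf8 encoding, for example: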

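var fs = require('fs');

// The file must be saved as UTF-8; otherwise the Chinese text in it will be garbled.
function readFile(file) {
    // The second argument is the encoding; if it is omitted, a Buffer of raw bytes is returned.
    fs.readFile(file, "utf8", function (err, data) {
        if (err) {
            console.log("Failed to read file: " + err);
        } else {
            // On success, data is already a decoded string.
            console.log(data);
        }
    });
}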
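The note above about the source file's own encoding can be checked with a small round trip: write a Chinese string with an explicit utf8 encoding, read it back, and inspect the bytes on disk. This is only a sketch. The temporary file name is made up, and the string is written with \u escapes so that the test does not depend on how the .js file itself was saved.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// '\u4e2d\u6587' is the escaped form of "中文". Escapes are plain ASCII, so
// the literal is unaffected by the encoding of this source file.
var text = '\u4e2d\u6587';

// Hypothetical temporary file used only for this demonstration.
var file = path.join(os.tmpdir(), 'utf8-roundtrip-demo.txt');

fs.writeFileSync(file, text, 'utf8');          // encode as UTF-8 when writing
var readBack = fs.readFileSync(file, 'utf8');  // decode as UTF-8 when reading
var rawBytes = fs.readFileSync(file);          // no encoding: raw Buffer

console.log(readBack === text);
console.log(rawBytes.toString('hex'));
```

If the round trip succeeds with escapes but fails when the same characters are typed directly into the source, the source file's encoding is the likely culprit.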

The following is from http://www.lai18.com/content/351104.html?from=cancel

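1. Encoding problems when Node.js reads files containing Chinese
Prepare a text file (a CSV file works too), for example test.txt or test.csv. The Node.js file test.js is as follows: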

var iconv = require('iconv-lite');
var fs = require('fs');

// Without an encoding argument, readFileSync returns the raw bytes as a Buffer.
var buf = fs.readFileSync('D:\\test.csv');

// Decode the GBK-encoded bytes into a JavaScript string.
var str = iconv.decode(buf, 'GBK');
console.log(str);