Python: request(url, headers=…).text returns garbled Chinese text (solved)
###This is the text of the response###
[“ext1”,“1”,“userBirthday”,“1998-06-24 00:00:00”,“lastLogin”,"",“userPassword”,"",“userEmail”,"…",“userQQ”,"…",“pcImageUrl”,"…",“userAddress”,“新疆出å
¥å¢ƒè¾¹é˜²æ£€æŸ¥æ€»ç«™”,“ipoint”,“0”,“todayapoint”,“0”,“todaydate”,"…",“updateTime”,“2020-08-01 11:39:01”,“ppoint”,"…",“userName”,“王泽”,“imageUrl”,"…",“todaytpoint”,"…",“operator”,"…",“roleId”,“7”,“rankId”,“38”,“userBindingType”,“2”,“apoint”,"…",“domainCode”,"…",“areaCode”,“654000”,“lockFlag”,“0”,“todayipoint”,“0”,“todaylpoint”,“4”,“todayspoint”,“0”,“spoint”,"…",“userType”,“1”,“userPost”,“勤务ä¿éšœå¤§é˜Ÿæ°‘覔,“userSex”,“1”,“tpoint”,"…",“userIdCard”,"…",“todayepoint”,“0”,“createTime”,“2020-08-01 11:39:01”,“id”,"…",“lpoint”,"…",“userPhone”,"…",“politicsCode”,“p02”,“epoint”,"…",“domainName”,“ä¸å›½äººæ°‘æ¦è£
è¦å¯Ÿéƒ¨é˜Ÿä¼ŠçŠå“ˆè¨å
‹è‡ªæ²»å·žè¾¹é˜²æ”¯é˜ŸåŽå‹¤å¤„",“rankName”,"普通å
¬åŠ¡å‘˜”,“areaName”,“伊çŠå“ˆè¨å
‹è‡ªæ²»å·ž”,“politicsName”,“å
±é’团员”]
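requests takes the character encoding from the Content-Type header of the server's response. If Content-Type carries a charset parameter, requests decodes the body with it and the text comes out right; if a text Content-Type has no charset, requests falls back to the default ISO-8859-1, which turns UTF-8 Chinese text into the garbage shown above.
You can check res.apparent_encoding to see the encoding that requests detects from the response body itself, which is usually the encoding the page really uses.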
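To make the cause concrete, here is a minimal sketch using only the standard library (no network access). The sample string and the variable names are assumptions chosen for illustration; it decodes UTF-8 bytes as ISO-8859-1, which is what happens to res.text when the charset is missing, and then reverses the mistake.

```python
# Hypothetical sample text; any non-ASCII UTF-8 string behaves the same way.
original = "新疆"

# The server sends the text as UTF-8 bytes.
raw = original.encode("utf-8")

# Decoding those bytes as ISO-8859-1 never fails, because every byte maps
# to some character, but each Chinese character becomes three wrong ones.
garbled = raw.decode("iso-8859-1")

# The damage is reversible: re-encode with the wrong codec, then decode
# with the right one.
repaired = garbled.encode("iso-8859-1").decode("utf-8")

print(repr(garbled))
print(repaired)
```

Because ISO-8859-1 maps every byte to a character, no exception is raised, which is why the problem only shows up as unreadable text.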

Once you know the page's character encoding, set res.encoding = 'utf-8' before reading res.text to get the correct result.
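The fix can be sketched as a small helper. This is an illustration under assumptions, not the original poster's code: fix_encoding and fetch_text are hypothetical names, requests.get is assumed as the request method, and the rule "override only when Content-Type has no charset" is one reasonable policy among several.

```python
def fix_encoding(res):
    """Return res.text, overriding the encoding if no charset was declared.

    `res` is expected to behave like requests.Response: it needs
    .headers, .encoding, .apparent_encoding and .text.
    """
    content_type = res.headers.get("Content-Type", "")
    if "charset" not in content_type.lower():
        # No declared charset: use the encoding guessed from the body,
        # falling back to UTF-8 if detection returns nothing.
        res.encoding = res.apparent_encoding or "utf-8"
    return res.text


def fetch_text(url, headers=None):
    """Fetch a page and return its text decoded with a sensible encoding."""
    import requests  # third-party package: pip install requests

    res = requests.get(url, headers=headers, timeout=10)
    return fix_encoding(res)
```

If you already know the site always serves UTF-8, setting res.encoding = 'utf-8' directly is simpler and avoids the cost of detection.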

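This post explains how to fix garbled Chinese text when fetching a web page with Python's requests library. When requests cannot determine the character encoding from Content-Type, use res.apparent_encoding to find the page's encoding and set res.encoding to the correct value (for example 'utf-8') so that the Chinese text displays properly.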



