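The garbled file names arise because zipfile, when decoding an entry name that is not ASCII, chooses between only two encodings: UTF-8 (when the entry's UTF-8 flag is set) and cp437 (otherwise). A name stored in GBK is therefore decoded as cp437 instead of GBK and comes out garbled.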
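The round trip can be reproduced without any archive. This is a minimal sketch; the file name is a made-up example, not one taken from a real ZIP file.

```python
# Hypothetical file name, chosen only for illustration.
original = '中文.txt'

# The bytes a GBK-based archiver would store as the entry name.
raw = original.encode('gbk')

# What zipfile hands back when the UTF-8 flag is unset:
# the same bytes interpreted as cp437.
garbled = raw.decode('cp437')

# Python's cp437 codec maps every byte value to a character,
# so the mis-decoding can be reversed without loss.
recovered = garbled.encode('cp437').decode('gbk')
```

Because the cp437 step loses no information, re-encoding to cp437 and decoding as GBK restores the original name.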
Detecting garbled names:
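def checkEncodeError(string):
    # A correctly decoded Chinese name encodes to gb2312 without error;
    # a name mis-decoded as cp437 usually contains characters gb2312 lacks.
    try:
        string.encode('gb2312')
    except UnicodeEncodeError:
        return True
    return False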
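The gb2312 test is a heuristic: a correctly decoded name containing a character outside gb2312 would also be flagged. As a possible alternative, the ZIP format records whether a name is UTF-8 in bit 11 (0x800) of the entry's general purpose flag, which zipfile exposes as `ZipInfo.flag_bits`. The sketch below relies on that flag; the helper name `name_was_decoded_as_cp437` is hypothetical, and `str.isascii` assumes Python 3.7 or later.

```python
import zipfile

UTF8_FLAG = 0x800  # bit 11 of the general purpose bit flag


def name_was_decoded_as_cp437(info: zipfile.ZipInfo) -> bool:
    """Return True if zipfile decoded this entry's non-ASCII name as cp437."""
    if info.flag_bits & UTF8_FLAG:
        # The archiver declared the name as UTF-8, so it was decoded correctly.
        return False
    # Pure ASCII names look the same in cp437 and GBK; nothing to repair.
    return not info.filename.isascii()
```

It would be called on the items of `ZipFile.infolist()` rather than on the strings from `namelist()`.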
Repairing garbled names:
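import os
import zipfile

zipPath = 'demo.zip'
if zipfile.is_zipfile(zipPath):
    outdir = zipPath.replace('.zip', '')
    # The with block closes the archive even if extraction fails
    with zipfile.ZipFile(zipPath) as f:
        for file in f.namelist():
            f.extract(file, outdir)
            # Undo the cp437 decoding and decode the raw bytes as gbk instead
            if checkEncodeError(file):
                fixed = file.encode('cp437').decode('gbk')
                os.rename(os.path.join(outdir, file), os.path.join(outdir, fixed))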
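Extracting first and renaming afterwards can leave empty, garbled directories behind when a directory name itself is mis-decoded. One way around this is to compute the corrected name first and write each entry straight to it. The following is a sketch only: the function name `extract_with_gbk_names` and its `encoding` parameter are hypothetical, and it assumes every entry without the UTF-8 flag was named in GBK.

```python
import os
import shutil
import zipfile


def extract_with_gbk_names(zip_path, out_dir, encoding='gbk'):
    """Extract zip_path into out_dir, re-decoding legacy names as `encoding`."""
    root = os.path.realpath(out_dir)
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            name = info.filename
            if not info.flag_bits & 0x800:  # UTF-8 flag unset: cp437 was used
                try:
                    name = name.encode('cp437').decode(encoding)
                except (UnicodeEncodeError, UnicodeDecodeError):
                    pass  # keep the name exactly as zipfile produced it
            target = os.path.realpath(os.path.join(root, name))
            # Refuse entries that would escape the output directory.
            if os.path.commonpath([root, target]) != root:
                raise ValueError('unsafe path in archive: %r' % name)
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            # Stream the entry's bytes to the corrected path.
            with zf.open(info) as src, open(target, 'wb') as dst:
                shutil.copyfileobj(src, dst)
```

A call such as `extract_with_gbk_names('demo.zip', 'demo')` would mirror the snippet above. On Python 3.11 and later, `zipfile.ZipFile` also accepts a `metadata_encoding` argument when reading, which may make the manual re-decoding unnecessary.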