How to Remove the BOM Header from UTF-8 Encoded Files

Original article · last modified 2025-05-06 15:09:47

Licensed under CC 4.0 BY-SA

Tags:

#python #frontend #programming-languages

First published 2018-01-18 18:22:57

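This article presents a Python script that walks every file under a specified directory and removes the UTF-8 BOM (byte order mark) that text files may carry at their start. Stripping the BOM keeps the files consistent with one another and avoids compatibility problems with tools that do not expect it.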
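To make concrete what is being removed, here is a minimal sketch of what the BOM looks like at the byte level. It is an illustration only, not part of the script itself; the sample string "hello" is an assumption chosen for the example.

```python
import codecs

# The UTF-8 BOM is the three bytes EF BB BF at the very start of a file.
print(codecs.BOM_UTF8)  # b'\xef\xbb\xbf'

# Sample data (made up for this example): a BOM followed by ASCII text.
data = codecs.BOM_UTF8 + "hello".encode("utf-8")

# Plain "utf-8" decoding keeps the BOM as the character U+FEFF.
plain = data.decode("utf-8")
# "utf-8-sig" decoding drops a leading BOM if there is one.
stripped = data.decode("utf-8-sig")

print(repr(plain))     # '\ufeffhello'
print(repr(stripped))  # 'hello'
```

That stray U+FEFF character is what trips up programs that read the file as plain UTF-8.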

# -*- coding: utf-8 -*-
import os
import codecs


def utf8(path):
    """Remove a leading UTF-8 BOM from the file at path, if present."""
    # Read raw bytes so the content can be compared with codecs.BOM_UTF8.
    with open(path, "rb") as f:
        s = f.read()

    if s.startswith(codecs.BOM_UTF8):
        s = s[len(codecs.BOM_UTF8):]
        # Rewrite the file only when a BOM was actually found.
        with open(path, "wb") as f:
            f.write(s)


def getListFiles(path):
    """Return the paths of all files under path, recursively."""
    assert os.path.isdir(path), '%s not exist.' % path
    ret = []
    for root, dirs, files in os.walk(path):
        # print('%s, %s, %s' % (root, dirs, files))
        for filespath in files:
            ret.append(os.path.join(root, filespath))
    return ret


ret = getListFiles(r'd:\src')  # directory containing the source code
print(len(ret))
for f in ret:
    print(f, "\n")
    utf8(f)
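The script above visits every file it finds, including binary ones. A cautious variation, sketched below, limits the pass to a whitelist of text extensions and tries it on a throwaway directory before touching real source code. The names `strip_bom`, `strip_bom_in_tree`, and `TEXT_EXTENSIONS` are hypothetical and not part of the original script, and the extension list is only an assumption about what counts as a text file.

```python
import codecs
import os
import tempfile

# Hypothetical whitelist; adjust to the file types in your project.
TEXT_EXTENSIONS = {".py", ".txt", ".js", ".css", ".html"}


def strip_bom(path):
    """Remove a leading UTF-8 BOM from path; return True if one was removed."""
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True


def strip_bom_in_tree(top, extensions=TEXT_EXTENSIONS):
    """Strip BOMs from files under top whose extension is in extensions."""
    changed = []
    for root, _dirs, files in os.walk(top):
        for name in files:
            if os.path.splitext(name)[1].lower() in extensions:
                full = os.path.join(root, name)
                if strip_bom(full):
                    changed.append(full)
    return changed


# Try it on a temporary directory instead of real source code.
with tempfile.TemporaryDirectory() as tmp:
    with_bom = os.path.join(tmp, "a.txt")     # text file with a BOM
    without_bom = os.path.join(tmp, "b.txt")  # text file without a BOM
    skipped = os.path.join(tmp, "c.bin")      # not in the whitelist
    for p in (with_bom, skipped):
        with open(p, "wb") as f:
            f.write(codecs.BOM_UTF8 + b"hello")
    with open(without_bom, "wb") as f:
        f.write(b"hello")

    changed_names = [os.path.basename(p) for p in strip_bom_in_tree(tmp)]
    with open(with_bom, "rb") as f:
        a_content = f.read()
    with open(skipped, "rb") as f:
        c_content = f.read()

print(changed_names)  # ['a.txt']
print(a_content)      # b'hello'
```

Returning the list of changed files makes it easy to review what was modified before committing the result.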