A Detailed Guide to Python's difflib Module

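This article introduces Python's difflib module: the Differ and HtmlDiff classes and the context_diff function for comparing text, plus the get_close_matches, ndiff, and restore functions. The module can report differences as plain text or as HTML, and a closing example compares the contents of two files.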


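I. The difflib module
The difflib module provides classes and functions for comparing sequences. It can compare files and report the differences either as plain text or as an HTML comparison page.
1) Differ: shows the result as plain text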

import difflib

text1 = '''  
    1. Beautiful is better than ugly.
    2. Explicit is better than implicit.
    3. Simple is better than complex.
    4. Complex is better than complicated.
'''.splitlines(keepends=True)


text2 = '''  
    1. Beautifu  is better than ugly.
    2. Explicit is better than implicit.
    3. Simple is better than complex.
    4. Complex is better than complicated.
'''.splitlines(keepends=True)



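# Show the differences between the two texts as plain text.
d = difflib.Differ()
result = list(d.compare(text1, text2))
# Every line already ends with a newline, so join without a separator.
result = "".join(result)
print(result)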

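In the output, lines starting with "- " appear only in text1, lines starting with "+ " appear only in text2, and lines starting with two spaces are common to both. Lines starting with "? " are guide lines that mark where the differences fall inside a changed line; they are not part of either input.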
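As a minimal sketch of how these markers can be used, the snippet below sorts a Differ delta into removed, added, unchanged, and guide lines. The two short input lists and the variable names are assumptions made for illustration; they are not part of the example above.

```python
import difflib

# Hypothetical two-line inputs; only the first line differs.
before = ["Beautiful is better than ugly.\n", "Explicit is better than implicit.\n"]
after = ["Beautifu  is better than ugly.\n", "Explicit is better than implicit.\n"]

delta = list(difflib.Differ().compare(before, after))

# The first two characters of every delta line are the marker.
removed = [line[2:] for line in delta if line.startswith("- ")]
added = [line[2:] for line in delta if line.startswith("+ ")]
unchanged = [line[2:] for line in delta if line.startswith("  ")]
hints = [line for line in delta if line.startswith("? ")]

print("removed:  ", removed)
print("added:    ", added)
print("unchanged:", unchanged)
```

Stripping the two-character marker gives back the original line, which is the same idea the restore function relies on.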
2) HtmlDiff: shows the result as HTML

import difflib

text1 = '''  
    1. Beautiful is better than ugly.
    2. Explicit is better than implicit.
    3. Simple is better than complex.
    4. Complex is better than complicated.
'''.splitlines(keepends=True)


text2 = '''  
    1. Beautifu  is better than ugly.
    2. Explicit is better than implicit.
    3. Simple is better than complex.
    4. Complex is better than complicated.
'''.splitlines(keepends=True)
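# Write the differences as an HTML page; open the file in a browser to view it.
d = difflib.HtmlDiff()
# make_file declares a utf-8 charset by default, so write the file as utf-8 too.
with open("passwd.html", 'w', encoding="utf-8") as f:
    f.write(d.make_file(text1, text2))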

The generated page highlights added, deleted, and changed text in color.
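HtmlDiff has a few options that are handy for longer inputs. The sketch below is illustrative only: the input lists, the column titles "before" and "after", and the chosen values for wrapcolumn and numlines are all assumptions. It builds a full page that shows only the changed lines with one line of context, and separately builds just the table for embedding in another page.

```python
from difflib import HtmlDiff

# Hypothetical inputs, shortened from the example above.
before = ["1. Beautiful is better than ugly.\n", "2. Explicit is better than implicit.\n"]
after = ["1. Beautifu  is better than ugly.\n", "2. Explicit is better than implicit.\n"]

differ = HtmlDiff(wrapcolumn=60)  # wrap long lines at 60 columns

# A complete standalone page with column titles, limited to the changed
# lines plus one line of context around them.
page = differ.make_file(before, after, fromdesc="before", todesc="after",
                        context=True, numlines=1)

# Only the comparison table, for embedding in an existing HTML page.
fragment = differ.make_table(before, after, fromdesc="before", todesc="after")

print(len(page), len(fragment))
```

make_table is the better fit when the surrounding page already supplies its own head and styling; make_file is enough when a single self-contained file is wanted.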
3) context_diff: returns a generator of delta lines in context diff format

from difflib import context_diff

import sys

s1 = ['bacon\n', 'eggs\n', 'ham\n', 'guido\n']
s2 = ['python\n', 'eggy\n', 'hamster\n', 'guido\n']
for line in context_diff(s1, s2, fromfile='before.py', tofile='after.py'):
    sys.stdout.write(line)  

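The comparison works on lists of strings. Here only the fourth element, 'guido\n', is common to both lists; the first three are reported as changed.
If s1 instead has only three elements, s1 = ['eggs\n', 'ham\n', 'guido\n'], then 'guido\n' is still matched with the 'guido\n' in s2 even though the two sit at different indexes.
Elements are aligned by matching content, not compared index by index.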
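This alignment can be inspected directly with SequenceMatcher, the class that difflib's diff functions are built on. The following is a minimal sketch that reuses the three-element s1; the variable names matcher and opcodes are chosen here for illustration.

```python
from difflib import SequenceMatcher

s1 = ['eggs\n', 'ham\n', 'guido\n']
s2 = ['python\n', 'eggy\n', 'hamster\n', 'guido\n']

# None means "no junk filter": every element takes part in the matching.
matcher = SequenceMatcher(None, s1, s2)
opcodes = matcher.get_opcodes()

# Each opcode says how to turn s1[i1:i2] into s2[j1:j2].
for tag, i1, i2, j1, j2 in opcodes:
    print(tag, s1[i1:i2], '->', s2[j1:j2])
# replace ['eggs\n', 'ham\n'] -> ['python\n', 'eggy\n', 'hamster\n']
# equal ['guido\n'] -> ['guido\n']
```

The 'equal' opcode pairs index 2 of s1 with index 3 of s2, which shows that the match is found by content rather than by position.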
4) get_close_matches: returns a list of the best close matches

from difflib import get_close_matches

d = get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
print(d)

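This prints ['apple', 'ape']. The function does not return every similar element: it returns at most n matches (3 by default) whose similarity score is at least cutoff (0.6 by default), ordered from most to least similar.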
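get_close_matches accepts two optional parameters: n, the maximum number of matches to return (default 3), and cutoff, the minimum similarity score between 0 and 1 (default 0.6). The sketch below shows their effect on the same word list; the variable names best_only and strict are chosen here for illustration.

```python
from difflib import get_close_matches

words = ['ape', 'apple', 'peach', 'puppy']

# Keep only the single best match.
best_only = get_close_matches('appel', words, n=1)

# Require a much higher similarity score; no candidate is that close.
strict = get_close_matches('appel', words, cutoff=0.9)

print(best_only)  # ['apple']
print(strict)     # []
```

Raising cutoff is a simple way to avoid weak suggestions, for example when proposing a correction for a mistyped command name.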
5) ndiff: returns a Differ-style delta as a generator of text lines

from difflib import ndiff

diff = ndiff('one\ntwo\nthree\n'.splitlines(keepends=True),
             'ore\ntree\nemu\n'.splitlines(keepends=True))
print(''.join(diff))

6) restore: rebuilds one of the two compared sequences from a delta

from difflib import ndiff, restore

diff = ndiff('one\ntwo\nthree\n'.splitlines(keepends=True),
             'ore\ntree\nemu\n'.splitlines(keepends=True))
diff = list(diff) # materialize the generated delta into a list
print(''.join(restore(diff, 1)))

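This prints the lines of the first sequence: one, two, three.
Calling print(''.join(restore(diff, 2))) instead prints the lines of the second sequence: ore, tree, emu.
The second argument selects which of the two compared sequences to rebuild, so it must be 1 or 2; any other value raises ValueError.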
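A quick round-trip check makes this concrete. This is a minimal sketch; the names a, b, delta, first, second, and error are chosen here for illustration.

```python
from difflib import ndiff, restore

a = 'one\ntwo\nthree\n'.splitlines(keepends=True)
b = 'ore\ntree\nemu\n'.splitlines(keepends=True)

# ndiff returns a generator, so store it in a list to reuse it.
delta = list(ndiff(a, b))

first = list(restore(delta, 1))   # rebuilds the first sequence
second = list(restore(delta, 2))  # rebuilds the second sequence

# restore is lazy too: the invalid selector is only reported once
# the result is iterated.
try:
    list(restore(delta, 3))
    error = None
except ValueError as exc:
    error = exc

print(first == a, second == b, type(error).__name__)
```

Because both inputs can be recovered from it, a single delta is enough to store or transmit a pair of texts.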
II. A simple application
Compare the contents of two files, /etc/passwd and /tmp/passwd.

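import difflib

file1 = '/etc/passwd'
file2 = '/tmp/passwd'

# Read both files as lists of lines.
with open(file1) as f1, open(file2) as f2:
    text1 = f1.readlines()
    text2 = f2.readlines()

# Write the HTML comparison page (make_file declares utf-8 by default).
d = difflib.HtmlDiff()
with open("passwd.html", 'w', encoding="utf-8") as f:
    f.write(d.make_file(text1, text2))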

Here difflib compares the two files and writes the differences as an HTML page with color highlighting.
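The same steps can be wrapped in a small reusable function. This is only a sketch: the function name write_html_diff, its parameters, and the choice to label the columns with the file paths are assumptions, not something the script above prescribes.

```python
import difflib
from pathlib import Path


def write_html_diff(path_a, path_b, out_path, encoding="utf-8"):
    """Write an HTML comparison of two text files and return the output path."""
    lines_a = Path(path_a).read_text(encoding=encoding).splitlines(keepends=True)
    lines_b = Path(path_b).read_text(encoding=encoding).splitlines(keepends=True)

    # Use the file paths as column titles and keep the declared charset
    # consistent with the encoding used to write the file.
    html = difflib.HtmlDiff().make_file(
        lines_a, lines_b,
        fromdesc=str(path_a), todesc=str(path_b),
        charset=encoding,
    )

    out = Path(out_path)
    out.write_text(html, encoding=encoding)
    return out
```

With this helper, write_html_diff('/etc/passwd', '/tmp/passwd', 'passwd.html') produces the same kind of page as the script above.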
