Comparing binary files in Python

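This post is about finding the differences between two binary files. It shows two example files and asks whether a library can do the comparison or whether the loops have to be written by hand. The solution uses itertools.groupby(), with code examples for both Python 3.x and Python 2.x.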

I've got two binary files. They look something like this, but the data is more random:

File A:

FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF ...

File B:

41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37 ...

What I'd like is to call something like:

>>> someDiffLib.diff(file_a_data, file_b_data)

And receive something like:

[Match(pos=4, length=4)]

This indicates that the two files hold identical bytes for a run of 4 bytes starting at position 4. The sequence 44 43 42 41 would not match, because it does not sit at the same position in each file.

Is there a library that will do the diff for me? Or should I just write the loops to do the comparison?

Solution

You can use itertools.groupby() for this. Here is an example:

from itertools import groupby

# This just sets up some byte strings to use; the Python 2.x version is below.
# Instead of this you would use f1 = open('some_file', 'rb').read()
f1 = bytes(int(b, 16) for b in 'FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF'.split())
f2 = bytes(int(b, 16) for b in '41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37'.split())

matches = []
# groupby() splits the positions into consecutive runs that share the same key;
# the key is True where the bytes at that position are equal in both strings.
for k, g in groupby(range(min(len(f1), len(f2))), key=lambda i: f1[i] == f2[i]):
    if k:
        pos = next(g)              # first position of the matching run
        length = len(list(g)) + 1  # remaining positions, plus the first one
        matches.append((pos, length))

Or the same thing as above using a list comprehension:

matches = [(next(g), len(list(g))+1)
           for k, g in groupby(range(min(len(f1), len(f2))), key=lambda i: f1[i] == f2[i])
           if k]
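To get output shaped like the [Match(pos=4, length=4)] asked for in the question, the same groupby() idea can be wrapped in a small function. This is only a minimal sketch for Python 3: the names Match and diff are made up for illustration and do not come from any library.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical result type, modeled on the output shown in the question.
Match = namedtuple('Match', ['pos', 'length'])


def diff(a, b):
    """Return the runs of positions where a and b hold the same byte."""
    n = min(len(a), len(b))  # bytes past the end of the shorter input are ignored
    matches = []
    for equal, group in groupby(range(n), key=lambda i: a[i] == b[i]):
        if equal:
            indices = list(group)  # consecutive positions with equal bytes
            matches.append(Match(pos=indices[0], length=len(indices)))
    return matches


f1 = bytes.fromhex('FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF')
f2 = bytes.fromhex('41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37')
print(diff(f1, f2))  # [Match(pos=4, length=4)]
```

Collecting each group into a list first avoids the next(g) plus len(list(g)) + 1 bookkeeping, at the cost of briefly holding one run of indices in memory.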

Here is the setup for the example if you are using Python 2.x:

f1 = ''.join(chr(int(b, 16)) for b in 'FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF'.split())
f2 = ''.join(chr(int(b, 16)) for b in '41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37'.split())
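The question also asks whether to "just write the loops". For comparison, here is a rough sketch of that alternative, reading the data from disk as the comment about open('some_file', 'rb').read() suggests. The function name matching_runs and its path parameters are assumptions made for this illustration, and the whole of each file is read into memory, so it is only suited to files that fit comfortably in RAM.

```python
def matching_runs(path_a, path_b):
    """Return (pos, length) tuples for runs of equal bytes at equal offsets."""
    # Binary mode ('rb') is required so the bytes are read unmodified.
    with open(path_a, 'rb') as fa, open(path_b, 'rb') as fb:
        a, b = fa.read(), fb.read()

    n = min(len(a), len(b))  # compare only the overlapping part
    runs = []
    start = None  # offset where the current matching run began, or None
    for i in range(n):
        if a[i] == b[i]:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            runs.append((start, i - start))  # the run ended just before i
            start = None
    if start is not None:
        # The last run reaches the end of the shorter file.
        runs.append((start, n - start))
    return runs
```

Calling matching_runs('a.bin', 'b.bin') on two files holding the example data (the file names are placeholders) should give the same (4, 4) result as the groupby() version.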
