Huffman Coding: A Python Implementation (with Code)

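1. Problem:

        Use Huffman coding to encode the given string so that the total length of the transmitted code is minimal and the result is easy to decode: "AABBCCDDEEABCDDCDBAEEAAA"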
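
Before building a tree, it helps to look at the symbol counts and at what a plain fixed-length code would cost, since that is the baseline Huffman coding has to beat. The following is a minimal sketch; the variable names `freq`, `bits_per_symbol`, and `fixed_length_total` are illustrative and do not appear in the original listing.

```python
import collections
import math

text = "AABBCCDDEEABCDDCDBAEEAAA"

# Count how often each symbol occurs in the message.
freq = collections.Counter(text)

# A fixed-length code needs ceil(log2(k)) bits per symbol for k distinct symbols.
bits_per_symbol = math.ceil(math.log2(len(freq)))
fixed_length_total = bits_per_symbol * len(text)

print(dict(freq))
print(fixed_length_total)
```

For this message the counts are A: 7, B: 4, C: 4, D: 5, E: 4, so a fixed-length code would use 3 bits per symbol and 72 bits in total.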

2. Procedure:

import heapq
import collections

class Node:
    def __init__(self, char, freq):
        self.char = char
        self.freq = freq
        self.left = None
        self.right = None

    def __lt__(self, other):
        return self.freq < other.freq

def build_frequency_table(text):
    return collections.Counter(text)

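def build_huffman_tree(frequencies):
    # One leaf node per symbol; heapq orders nodes via Node.__lt__ (by frequency).
    priority_queue = [Node(char, freq) for char, freq in frequencies.items()]
    heapq.heapify(priority_queue)

    # Empty input: there is no tree to build.
    if not priority_queue:
        return None

    # Repeatedly merge the two lowest-frequency nodes until one root remains.
    while len(priority_queue) > 1:
        left = heapq.heappop(priority_queue)
        right = heapq.heappop(priority_queue)
        merged = Node(None, left.freq + right.freq)
        merged.left = left
        merged.right = right
        heapq.heappush(priority_queue, merged)

    return priority_queue[0]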

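def build_huffman_codes(root, prefix="", codebook=None):
    # Create a fresh dict per top-level call; a mutable default ({}) would be
    # shared between calls and leak codes from one text into the next.
    if codebook is None:
        codebook = {}
    if root is None:
        return codebook

    if root.char is not None:
        # Leaf node. If the tree is a single leaf the prefix is empty,
        # so fall back to "0" to give the symbol a non-empty code.
        codebook[root.char] = prefix or "0"
        return codebook

    build_huffman_codes(root.left, prefix + "0", codebook)
    build_huffman_codes(root.right, prefix + "1", codebook)
    return codebook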

def huffman_encoding(text):
    frequencies = build_frequency_table(text)
    root = build_huffman_tree(frequencies)
    huffman_codes = build_huffman_codes(root)

    encoded_text = "".join([huffman_codes[char] for char in text])
    return encoded_text, huffman_codes

text = "AABBCCDDEEABCDDCDBAEEAAA"
encoded_text, huffman_codes = huffman_encoding(text)

print("Original text:", text)
print("Huffman codes:", huffman_codes)
print("Encoded text:", encoded_text)
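
The problem also asks that the code be easy to decode, which the listing above does not show. Below is a minimal, self-contained sketch of a round trip. To stay standalone it builds the code table with a compact tuple-based heap instead of the `Node` class above; the names `huffman_codebook` and `huffman_decoding` are illustrative assumptions, not part of the original post.

```python
import collections
import heapq
import itertools

def huffman_codebook(text):
    """Build a symbol -> bit-string table (compact variant, tuples on the heap)."""
    tie = itertools.count()  # unique tie-breaker so the heap never compares dicts
    heap = [(f, next(tie), {sym: ""}) for sym, f in collections.Counter(text).items()]
    heapq.heapify(heap)
    if not heap:
        return {}
    if len(heap) == 1:
        # Only one distinct symbol: give it a one-bit code.
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        f1, _, low = heapq.heappop(heap)
        f2, _, high = heapq.heappop(heap)
        # Merging adds one level above both subtrees, so prepend a bit.
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

def huffman_decoding(encoded_text, codebook):
    """Decode bit by bit; a prefix-free code makes the first match unambiguous."""
    reverse = {code: sym for sym, code in codebook.items()}
    decoded, buffer = [], ""
    for bit in encoded_text:
        buffer += bit
        if buffer in reverse:
            decoded.append(reverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not form a complete code word")
    return "".join(decoded)

text = "AABBCCDDEEABCDDCDBAEEAAA"
codebook = huffman_codebook(text)
encoded = "".join(codebook[ch] for ch in text)
decoded = huffman_decoding(encoded, codebook)
print(len(encoded), decoded == text)
```

The individual code words depend on how ties between equal frequencies are broken, but the total encoded length does not: for this message any Huffman code uses 56 bits.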

 

3. Results:

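Huffman coding is a binary prefix-code algorithm used for data compression. It encodes symbols by building an optimal binary tree in which frequent symbols sit closer to the root and therefore get shorter codes. In Python, it can be implemented with a dictionary and a heap. Here is a simple example:

```python
import heapq

# Define a node class
class Node:
    def __init__(self, char, freq):
        self.char = char
        self.freq = freq
        self.left = None
        self.right = None

    # Comparison used by heapq to order nodes by frequency
    def __lt__(self, other):
        return self.freq < other.freq

# Create an empty heap
heap = []

# Input characters and their frequencies
data = {'A': 50, 'B': 10, 'C': 20, 'D': 40}

# Push each character and its frequency onto the heap
for char, freq in data.items():
    node = Node(char, freq)
    heapq.heappush(heap, node)

while len(heap) > 1:
    # Pop the two nodes with the lowest frequency
    left = heapq.heappop(heap)
    right = heapq.heappop(heap)
    # Merge the two nodes; the new node's frequency is their sum
    merged = Node(None, left.freq + right.freq)
    merged.left = left
    merged.right = right
    # Push the merged node back onto the heap
    heapq.heappush(heap, merged)

# The last remaining node is the root of the Huffman tree
huff_tree_root = heap[0]
huffman_codes = {}

def traverse(node, code=''):
    if node.char is not None:
        huffman_codes[node.char] = code
    else:
        traverse(node.left, code + '0')
        traverse(node.right, code + '1')

traverse(huff_tree_root)
```

In this code, we first create an empty heap and push each character, together with its frequency, onto it as a node. We then repeatedly pop the two nodes with the lowest frequency and merge them until only one node is left. Finally, we traverse the resulting Huffman tree and record the code for each character.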
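
One way to check the result of such a construction is to compute the total encoded length directly: it equals the sum of the weights of all merged (internal) nodes. The sketch below assumes only the frequency table; the function name `huffman_total_bits` is hypothetical and not part of the original code.

```python
import heapq

def huffman_total_bits(frequencies):
    """Total encoded length = sum of the weights of all merged nodes."""
    heap = list(frequencies.values())
    heapq.heapify(heap)
    total = 0
    # With a single distinct symbol no merge happens and this returns 0.
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

data = {'A': 50, 'B': 10, 'C': 20, 'D': 40}
print(huffman_total_bits(data))
```

For the frequencies above the merges are 10 + 20 = 30, 30 + 40 = 70, and 50 + 70 = 120, giving 220 bits in total. The same function applied to the counts of the message in section 1 (A: 7, B: 4, C: 4, D: 5, E: 4) gives 56 bits.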