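The Huffman coding method
(1) Count how often each symbol occurs to obtain its probability.
(2) Sort the symbols by probability in ascending order.
(3) Each time, select the two symbols with the lowest probabilities as leaf nodes of a binary tree and create a parent node for them whose probability is the sum of the two children's probabilities. The two children no longer take part in the comparison; the new parent node takes their place.
(4) Repeat step (3) until a root node with probability 1 is obtained.
(5) Label each left branch 0 and each right branch 1; reading the labels from the root down to a leaf gives that leaf's code.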
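The steps above can be sketched in a few dozen lines of C. This is a minimal illustration, not the library's implementation: the names `sketch_node`, `pick_min`, `sketch_build_codes` and the limit `MAX_SYMBOLS` are hypothetical, integer frequencies stand in for probabilities, and the sketch assumes that the smaller of the two selected nodes becomes the "0" child (the text does not say which one goes left).

```c
#include <assert.h>

#define MAX_SYMBOLS 16                  /* illustrative limit, not from the original code */
#define MAX_NODES (2 * MAX_SYMBOLS - 1) /* n leaves plus n - 1 internal nodes */

typedef struct {
    unsigned long weight; /* frequency of the symbol or of the whole subtree */
    int parent;           /* index of the parent node, -1 while still a root */
    int is_one;           /* 1 if this node is the "1" (right) child */
} sketch_node;

/* Index of the lowest-weight root among nodes[0..count), ignoring `skip`. */
static int pick_min(const sketch_node *nodes, int count, int skip)
{
    int best = -1;
    for (int i = 0; i < count; ++i) {
        if (i == skip || nodes[i].parent != -1)
            continue;
        if (best == -1 || nodes[i].weight < nodes[best].weight)
            best = i;
    }
    return best;
}

/* Builds the tree for n symbols and writes each code as a "0"/"1" string. */
static void sketch_build_codes(const unsigned long *freq, int n,
                               char codes[][MAX_SYMBOLS])
{
    sketch_node nodes[MAX_NODES];
    int count = n;

    assert(n >= 2 && n <= MAX_SYMBOLS);
    for (int i = 0; i < n; ++i) {
        nodes[i].weight = freq[i];
        nodes[i].parent = -1;
        nodes[i].is_one = 0;
    }

    /* Steps (3) and (4): merge the two smallest roots until one root is left. */
    while (count < 2 * n - 1) {
        int a = pick_min(nodes, count, -1);
        int b = pick_min(nodes, count, a);
        nodes[count].weight = nodes[a].weight + nodes[b].weight;
        nodes[count].parent = -1;
        nodes[count].is_one = 0;
        nodes[a].parent = count;
        nodes[a].is_one = 0;
        nodes[b].parent = count;
        nodes[b].is_one = 1;
        ++count;
    }

    /* Step (5): collect the labels from leaf to root, then reverse them. */
    for (int i = 0; i < n; ++i) {
        char tmp[MAX_SYMBOLS];
        int len = 0;
        for (int j = i; nodes[j].parent != -1; j = nodes[j].parent)
            tmp[len++] = nodes[j].is_one ? '1' : '0';
        for (int k = 0; k < len; ++k)
            codes[i][k] = tmp[len - 1 - k];
        codes[i][len] = '\0';
    }
}
```

For the frequencies 1, 2, 4, 8 this sketch yields the codes 000, 001, 01 and 1: the most frequent symbol gets the shortest code. The linear search in `pick_min` keeps the example short; a priority queue or a sorted array would be the usual choice for larger alphabets.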
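Data structures for Huffman nodes and Huffman codewords
typedef struct huffman_node_tag
{
    unsigned char isLeaf;              // 1 if this is a leaf node, 0 otherwise
    unsigned long count;               // number of occurrences in the source
    struct huffman_node_tag *parent;   // pointer to the parent node
    union {
        struct {                       // internal node: pointers to the left and right children
            struct huffman_node_tag *zero, *one;
        };
        unsigned char symbol;          // leaf node: the source symbol
    };
} huffman_node;
typedef struct huffman_code_tag        // codeword data type
{
    unsigned long numbits;             // codeword length in bits
    /* Bits 1 to 8 of the codeword are stored in bits[0], from the
       low-order bit to the high-order bit; bits 9 to 16 are stored in bits[1]. */
    unsigned char *bits;
} huffman_code;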
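To make the bit layout described in the comment concrete, here is a small sketch of how bits could be appended to and read from such a buffer. The type `sketch_code` mirrors `huffman_code`, and the helpers `sketch_get_bit` and `sketch_append_bit` are hypothetical names used for illustration; the sketch assumes that bit i (counting from 0) lives in `bits[i / 8]` at position `i % 8`.

```c
#include <assert.h>
#include <stdlib.h>

/* Mirrors huffman_code: a bit count plus a byte buffer holding the bits. */
typedef struct {
    unsigned long numbits;
    unsigned char *bits;
} sketch_code;

/* Read bit i: byte i / 8, position i % 8, counting from the low-order bit. */
static unsigned char sketch_get_bit(const unsigned char *bits, unsigned long i)
{
    return (unsigned char)((bits[i / 8] >> (i % 8)) & 1u);
}

/* Append one bit to the code. Returns 0 on success, 1 if allocation fails. */
static int sketch_append_bit(sketch_code *code, unsigned char bit)
{
    assert(code != NULL);

    /* A new byte is needed whenever the current one is full (or none exists). */
    if (code->numbits % 8 == 0) {
        unsigned char *grown = realloc(code->bits, code->numbits / 8 + 1);
        if (grown == NULL)
            return 1;
        grown[code->numbits / 8] = 0; /* start the new byte with all bits clear */
        code->bits = grown;
    }

    if (bit)
        code->bits[code->numbits / 8] |= (unsigned char)(1u << (code->numbits % 8));
    ++code->numbits;
    return 0;
}
```

Appending the ten bits 1, 0, 1, 1, 0, 0, 0, 0, 1, 1 to an empty code (`numbits` 0, `bits` NULL) leaves `bits[0]` equal to 0x0D and `bits[1]` equal to 0x03. The caller is responsible for freeing `bits`.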
Static library
The program consists of two projects. "Huff_run" is the main project (Win32 Console Application) and contains the program's main function; "Huff_code" is the library project (Win32 Static Library).
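The Huffman encoding workflow
1. Read the input file.
2. First pass: scan the file and count how often each character occurs.
3. Build the Huffman tree.
4. Write the code table and other required information to the output file.
5. Second pass: encode the source file and write the result to the output.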
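The first pass of this workflow can be sketched as follows. This is an illustration under stated assumptions rather than the library's own routine: `sketch_count_frequencies` and `SKETCH_MAX_SYMBOLS` are hypothetical names, symbols are assumed to be single bytes, and the stream is assumed to be seekable so that it can be rewound for the second pass.

```c
#include <assert.h>
#include <stdio.h>

#define SKETCH_MAX_SYMBOLS 256 /* one counter per possible byte value */

/* First pass: count how often each byte occurs in `in`.
   Returns the total number of bytes read and rewinds the stream
   so that the second (encoding) pass can start from the beginning. */
static unsigned long sketch_count_frequencies(FILE *in,
                                              unsigned long freq[SKETCH_MAX_SYMBOLS])
{
    unsigned long total = 0;
    int c;

    assert(in != NULL);
    for (int i = 0; i < SKETCH_MAX_SYMBOLS; ++i)
        freq[i] = 0;

    /* fgetc returns each byte as an unsigned char converted to int, or EOF. */
    while ((c = fgetc(in)) != EOF) {
        ++freq[c];
        ++total;
    }

    rewind(in);
    return total;
}
```

The resulting table is what step 3 needs to build the tree. Only symbols with a nonzero count need a leaf node.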
Huff_code
Huffman.h
/*
* huffman_coder - Encode/Decode files using Huffman encoding.
* http://huffman.sourceforge.net
* Copyright (C) 2003 Douglas Ryan Richardson; Gauss Interprise, Inc
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef HUFFMAN_HUFFMAN_H
#define HUFFMAN_HUFFMAN_H
#include <stdio.h>
int huffman_encode_file(FILE *in, FILE *out, FILE *out_Table); // step 1: changed by yzhang for Huffman statistics
int huffman_decode_file(FILE *in, FILE *out);
int huffman_encode_memory(const unsigned char *bufin,
unsigned int bufinlen,
unsigned char **pbufout,
unsigned int *pbufoutlen);
int huffman_decode_memory(const unsigned char *bufin,
unsigned int bufinlen,
unsigned char **bufout,
unsigned int *pbufoutlen);
#endif
Huffman.c
1. Read data from the source file (in this