Huffman编码的实现

最新推荐文章于 2023-12-01 15:01:18 发布

原创最新推荐文章于 2023-12-01 15:01:18 发布 · 1.9k 阅读

2 ·

CC 4.0 BY-SA版权

文章标签：

#string #iostream #数据结构 #存储 #class #c++

C/C++ 专栏收录该内容

21 篇文章

订阅专栏

本文介绍了一种使用C++实现Huffman编码的方法，详细解释了Huffman树的数据结构设计及编码过程，并提供了完整的源代码示例。

这里的Huffman编码就是离散数学中学的最优二叉树，现在我们用C++来实现它。

首先考虑实现它使用的数据结构，当然这里肯定使用二叉树。其次就是存储方式，对于树一般情况是用间接存储也就是链表。但是二叉链表不方便寻找双亲节点，这里使用向量更好，对存储空间的利用率也较高。

建立数据结构：

Huffman树中每个节点看成数组中的一个元素，它的属性有：权(weight),字符(data) 。为了方便寻找双亲和孩子节点加入三个域parent ,lch ,rch。分别存储双亲和左右孩子节点在数组中位置下标。

然后再建立个类来操作Huffman树。

// Huffman.h
#include < string >
using std:: string ;

class HuffmanTree
{
public :
   HuffmanTree( int , int []);

    void BuildTree();
    void SetCode();
    void Display();
    string Send( char * );
    string Get( string );

private :
    void Select( int , int & , int & );

private :
     class Node
    {
         friend class HuffmanTree;
          int weight;
          char data;
          string code;
          int parent,lchild,rchild;
     } * nodes;
       int codenum;
       int nodenum;
};

这里，我在节点中又加了个code属性，这个是每个字符的最后编码。codenum是叶子节点数，也就是需要编码的字符节点数。nodenum是总节点数。要让外面那个类能操作Node类，则需要声明为它的友元函数。

Huffman数建立的过程：

1. 根据给定的n个权值构成n棵二叉树。

2. 从所有的二叉树中选取两个权值最小节点的构成一个新的树，这个树的权值是两个节点权值之和。

3. 重复2，直到只剩一棵树为止。

下面是实现这个类的代码：

// Huffman.cpp
#include " Huffman.h "
#include < iostream >
#include < string >
using namespace std;

HuffmanTree::HuffmanTree( int num, int w[]):codenum(num)
{
      nodenum = 2 * codenum - 1 ;
      nodes = new Node[nodenum];
       for ( int i = 0 ;i < codenum;i ++ )
      {
            nodes[i].weight = w[i];
            nodes[i].data = ' a ' + i;
            nodes[i].parent = nodes[i].lchild = nodes[i].rchild =- 1 ;
            nodes[i].code = "" ;
      }
       for ( int i = codenum;i < nodenum;i ++ )
      {
            nodes[i].weight = 0 ;
            nodes[i].data = 0 ;
            nodes[i].parent = nodes[i].lchild = nodes[i].rchild =- 1 ;
     }
}

void HuffmanTree::BuildTree()
{
        for ( int i = codenum;i < nodenum;i ++ )
      {
              int s1 = 0 ,s2 = 0 ;
             Select(i,s1,s2);
             nodes[s1].parent = nodes[s2].parent = i;
             nodes[i].lchild = s1;
             nodes[i].rchild = s2;
             nodes[i].weight = nodes[s1].weight + nodes[s2].weight;
       }
}

void HuffmanTree::Select( int size, int & s1, int & s2)
{
       int temp = 0 ;
       for ( int i = 0 ;i < size;i ++ )
      {
              if (nodes[i].parent !=- 1 )
     continue ;
              if (temp == 0 )
             {
    s2 = i;
    temp ++ ;
     continue ;
             }
              if (temp == 1 )
             {
     if (nodes[i].weight > nodes[s2].weight)
    {
            s1 = s2;
            s2 = i;
    }
     else
            s1 = i;
    temp ++ ;
     continue ;
             }
              if (nodes[i].weight < nodes[s2].weight)
             {
     if (nodes[i].weight < nodes[s1].weight)
    {
            s2 = s1;
            s1 = i;
    }
     else
            s2 = i;
             }
      }
}

void HuffmanTree::SetCode()
{
         for ( int i = 0 ;i < codenum;i ++ )
        {
                 int j = i;
                 int p = nodes[j].parent;
                 while (p !=- 1 )
                {
        if (nodes[p].lchild == j)
                 nodes[i].code = " 0 " + nodes[i].code;
       else
                nodes[i].code = " 1 " + nodes[i].code;
       j = p;
       p = nodes[j].parent;
                 }
        }
}

void HuffmanTree::Display()
{
        for ( int i = 0 ;i < codenum;i ++ )
            cout << nodes[i].data << " :   weight = " << nodes[i].weight << " : " << nodes[i].code << endl;
}

string HuffmanTree::Send( char * old)
{
         string newcode = "" ;
         for (; * old;old ++ )
       {
               for ( int i = 0 ;i < codenum;i ++ )
             {
     if (nodes[i].data ==* old)
    {
             newcode += nodes[i].code;
              break ;
    }
             }
        }
         return newcode;
}

string HuffmanTree::Get( string old)
{
         string newcode = "" ;
         string ::size_type index = 0 ;
         int treeindex = nodenum - 1 ;
         while (index <= old.size())
        {
               if (nodes[treeindex].lchild ==- 1 )
              {
    newcode += nodes[treeindex].data;
    treeindex = nodenum - 1 ;
              }
               else
             {
     if (old[index] == ' 0 ' )
            treeindex = nodes[treeindex].lchild;
     else
            treeindex = nodes[treeindex].rchild;
    index ++ ;
              }
          }
          return newcode;
}

    构造函数：用待编码字符数和每个字符的权值来初始化。
                       根据节点个数在堆区建立数组。
                       节点分两种叶子节点(为编码的字符)和非叶子节点，所以分开初始化。

Build 函数：这个是根据Huffman树规则建立二叉树。

Select 函数：从节点数组0到size的范围内选取两个权值最小的节点，下标放在s1,s2中。

SetCode 函数：求出每个字符的编码。

Display 函数：显示编码。

Send 函数：给定一个字符串，求出它的编码。

Get 函数：从得到的0、1串中翻译出它原编码。

编码的过程就是从树叶到树根的路径，而反编码则是从树根到树叶的路径。

最后给出一个测试代码：

// The test file
#include < iostream >
#include < string >
#include " Huffman.h "

using namespace std;

int main()
{
       int w[] = { 5 , 29 , 7 , 8 , 14 , 23 , 3 , 11 , 12 , 9 , 20 , 17 , 13 , 57 , 32 };
      HuffmanTree * t = new HuffmanTree( 15 ,w);
       t -> BuildTree();
       t -> SetCode();
       t -> Display();
        char s[] = " aefbcgbam " ;
        string p = t -> Send(s);
       cout << p << endl;
       cout << t -> Get(p) << endl;
}

最后说一点，操作字符串在C++中，使用标准库中的string类比自己来拨弄指针方便多了，大家以后多试着用下，可以发现它的很多好处。