Algorithm: The implementation of Trie Tree (C++)

Question:

There is a text file which includes many lines. And each line has only one word or phrase. Now, please implement a program to get the prefix of each word. (Please print each of prefix word behind the original word and separate by a blank space.) For example,

 

<Word.txt>

chock choc

choice choi

clamp cl

cant ca

cool coo

color col

 

Solution:

The basic idea of my solution for this issue is to take advantage of Trie Tree.

In this tree, there is a root node which doesn’t have parent node. And each of nodes has 26 child nodes at most. 26 child nodes will be used to save the letter, “a” to “z” respectively.

 

For inserting each of word, we should pick up each character of word and put it to the proper node. For example, if we want to insert word “chock” to the prefix word. We should pick up the first character ‘c’ and put it to the child nodes of root node. Then, we should pick up the second character ‘h’ and put it to the child nodes of ‘c’ node which is the child node for root node. And next, we will insert the rest characters of word to the tree in the same way.

So, if we insert above words one by one, we should able to get a prefix tree as following.

Since the prefix of word is a string which could be used to identify a word, we can find that the last character of each prefix word is the last node whose parent has more than 1 child. For example, the prefix word of chock is “choc”. And the end of “choc”, the fourth character, is “c” which has been marked as RED tag. You can see that this node is the last node whose parent “o” has 2 children, “c” and “I”. So, we can find all the end position of prefix for each word.

Design:

After our analysis, we can design 2 classes, Node and PrefixTree.

class Node
{
  public:
    Node* parent ;
    char letter ;
    Node* child[26] ;
    int childNum ;

In class Node, we have 4 members. Member “parent” will direct to the parent node of current node. Member “letter” is used to save the character of the each word. Array “child[26]” is used to save the child nodes for current node. The last member “childNum” will show you the number of children that current node has.

class PrefixTree
{
  private:
    Node* root ;
    char** words ;
  public:
    PrefixTree(char* path) ;
    void getPrefix() ;
    
  
  private:
    char** readWordsFile(char* path) ;
    void insertNode(Node* parent, char* word, int cur) ;
    void createPrefixTree() ;
    
};

In class PrefixTree, we just have 2 variable members. First one is “root” which direct to the root node of prefix tree. And second node is “words”, which is used to store all the words that will be analyzed.

Of course, we also design 4 class methods for loading words, creating prefix tree, inserting node and analyzing prefix word in respectively.

Coding:

Following is all my codes for implement. You can understand them by the comments.

/*

COPYRIGHT ANNOUNCEMENT
Copyright ©2010 Xianyi.Ye
Following codes only could be used to study and research for individual. You are not allowed to use these codes for any commercial purposes.
Sunday, May 09, 2010
*/
#include "stdafx.h"
#include <iostream>
#include <stdio.h> 
#include <stdlib.h> 
#include <malloc.h>
using namespace std ;
class Node
{
  public:
    Node* parent ;
    
    char letter ;
    
    Node* child[26] ;
    
    int childNum ;
  public:
    Node()
    {
      parent = NULL ;
      letter = NULL ;
      childNum = 0 ;
      for(int i=0 ; i<26 ; i++)
      {
        child[i] = NULL ;
      }
    }
};
class PrefixTree
{
  private:
    Node* root ;
    char** words ;
  public:
    PrefixTree(char* path) ;
    void getPrefix() ;
    
  
  private:
    char** readWordsFile(char* path) ;
    void insertNode(Node* parent, char* word, int cur) ;
    void createPrefixTree() ;
    
};
PrefixTree::PrefixTree(char* path)
{
  words = readWordsFile(path) ;
  createPrefixTree() ;
}
void PrefixTree::getPrefix()
{
  int word_Count = _msize(words)/sizeof(char*) ;
  for(int i=0 ; i<word_Count ; i++)  //Travel each word of the line
  {
    //cout<<words[i]<<" " ;  //Output each word
                //We should be able to print the word by this statement.
                //However, since the last character of the word is '/n' or EOF,
                //our words and prefix won't be on the same line if we print the word by "cout" directly.
                //And it's a little complicated to control our output format.
                //Therefore, we will output the word when we travel the tree.
    int word_length = strlen(words[i])-1 ;
    Node* node = root ;
    for(int j=0 ; j<word_length ; j++)  //Travel each letter of the word and
    {                  //And get the node of the last letter of word
      char letter = *((words[i])+j) ;
      cout<<letter ;          //Output the letter of the word.
                      //So, we can get the whole of word when the FOR loop is finished. 
      node = node->child[letter%26] ;
    }
                      //Now, node should direct to the last letter of word 
    cout<<" " ;
                      //Backtracking and find first matched node whose parent has more than 1 child.
                      //Then, this node is the end letter for prefix word.
    node = node->parent ;
    while(node->parent->childNum==1)
    {
      node = node->parent ;
    }
    Node* end = node ;          //Now, "end" is what we wanted.
    //cout<<"end="<<node->letter<<endl ;
    node = root ;
    for(j=0 ; j<word_length ; j++)    //Travel the tree to output each letter of prefix word 
    {                  //until we arrive at the End Node.
      char letter = *((words[i])+j) ;
      node = node->child[letter%26] ;
      cout<<letter ;
      if(node==end)
        break ;
    }
    cout<<endl ;
  }
}
char** PrefixTree::readWordsFile(char* path)
{
  FILE *fp; 
  char buf[256] ;
  if ((fp=fopen(path, "rt"))==NULL) 
  { 
    printf("open file error!!/n"); 
    return NULL; 
  }
  int lineCount = 0;
  while(fgets(buf, 256, fp) != NULL) 
  { 
    lineCount++ ;
  }
  //cout<<"lineCount="<<lineCount<<endl ;
  char** words = new char*[lineCount] ;
  //init words buffer
  for(int i=0 ; i<lineCount ; i++)
  {
    words[i] = new char[256] ;
  }
  fseek(fp,0,SEEK_SET);
  i = 0 ;
  while(fgets(words[i++], 256, fp) != NULL) ;
  return words ;
  
}//end readWordsFile
void PrefixTree::insertNode(Node* parent, char* word, int cur)
{
  int word_length = strlen(word)-1 ;
  if(cur==word_length)
  {
    return ;
  }
  int pos = word[cur]%26 ;
  if(parent->child[pos]==NULL)
  {
    //cout<<"inserting "<<word[cur]<<endl ;
    parent->child[pos] = new Node() ;
    parent->child[pos]->childNum++ ;
    parent->child[pos]->letter = word[cur] ;
    parent->child[pos]->parent = parent ;
    
   }
  else
  {
    parent->child[pos]->childNum++ ;
    //cout<<word[cur]<<" has been inserted."<<endl ;
  }
  
  insertNode(parent->child[pos],word,++cur) ;
}//end insertNode
void PrefixTree::createPrefixTree()
{
  root = new Node() ;
  root->parent = NULL ;
  
  int word_Count = _msize(words)/sizeof(char*) ;
  for(int i=0 ; i<word_Count ; i++)
  {
    //cout<<words[i] ;
    insertNode(root,words[i],0) ;
  }
  
}//end createPrefixTree
int main(int argc, char* argv[])
{
  PrefixTree* tree = new PrefixTree("c://words.txt") ;
  tree->getPrefix() ;
  delete tree ;
  return 0 ;
}

Test case:

<words.txt>

abandon

aboard

absolute

accident

accomplish

list

lively

load

local

objective

oblige

observe

obtain

opposite

optional

ore

organize

ornament

outdoors

outlet

 

 

Any question, please contact me.

yexianyi@hotmail.com

 

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

yexianyi

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值