[LintCode] Word Abbreviation Set

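This article presents an efficient algorithm for determining whether a word's abbreviation is unique in a dictionary. The dictionary is preprocessed and a hash map stores each abbreviation together with the set of words that produce it, which makes queries fast. This approach is particularly effective when the same dictionary is queried many times.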

An abbreviation of a word follows the form <first letter><number><last letter>. Below are some examples of word abbreviations:

a) it                      --> it    (no abbreviation)

     1
b) d|o|g                   --> d1g

              1    1  1
     1---5----0----5--8
c) i|nternationalizatio|n  --> i18n

              1
     1---5----0
d) l|ocalizatio|n          --> l10n

Assume you have a dictionary. Given a word, determine whether its abbreviation is unique in the dictionary. A word's abbreviation is unique if no other word from the dictionary has the same abbreviation.

Example

Given dictionary = [ "deer", "door", "cake", "card" ]
isUnique("dear") // return false
isUnique("cart") // return true
isUnique("cane") // return false
isUnique("make") // return true

 

Solution 1. A straightforward solution is to iterate over the given dictionary and compare the input word's abbreviation with the abbreviation of each dictionary word. If the abbreviations are the same and the words are different, return false; if no such word is found, return true. Assuming the average word length is k and there are n words in the dictionary, the runtime of this solution is O(n*k) for each isUnique operation.
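The brute-force scan described in Solution 1 could be sketched roughly as follows. This is only an illustration: the class name BruteForceWordAbbr and the helper abbreviate are hypothetical names chosen here, not part of the LintCode interface.

```java
// Rough sketch of Solution 1; class and method names are hypothetical.
class BruteForceWordAbbr {
    private final String[] dictionary;

    BruteForceWordAbbr(String[] dictionary) {
        this.dictionary = dictionary;
    }

    // Scans the whole dictionary on every call.
    boolean isUnique(String word) {
        String target = abbreviate(word);
        for (String entry : dictionary) {
            // A different word with the same abbreviation breaks uniqueness.
            if (abbreviate(entry).equals(target) && !entry.equals(word)) {
                return false;
            }
        }
        return true;
    }

    // First letter + number of interior letters + last letter.
    // Words of length 2 or less are returned unchanged.
    static String abbreviate(String word) {
        if (word.length() <= 2) {
            return word;
        }
        return word.charAt(0) + String.valueOf(word.length() - 2)
                + word.charAt(word.length() - 1);
    }
}
```

With the example dictionary [ "deer", "door", "cake", "card" ], this sketch should give the same answers as the example above: false for "dear" and "cane", true for "cart" and "make".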

 

Solution 2. If the given dictionary does not change and isUnique is called repeatedly, solution 1 is inefficient at O(n*k) per call, because it has to go over the whole dictionary every time. A better approach is to design a class around isUnique. The dictionary is preprocessed once, and each abbreviation and its corresponding words are stored in a hash map. The key is an abbreviation and the value is a hash set of all distinct words that share that abbreviation.

 

It takes O(n*k) time to construct this hash map. Once it is constructed, each isUnique check runs in O(k) time.

Compared with solution 1, this solution uses O(n*k) extra memory, so it is a typical example of the time-space tradeoff in algorithm design.

 

One Java detail to note: a HashSet has no index-based access, so isUnique retrieves the set's only element through an Iterator. An enhanced for loop or a contains() call would work as well.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;

public class ValidWordAbbr {
    // Maps each abbreviation to the set of distinct dictionary words that produce it.
    private HashMap<String, HashSet<String>> map;

    // @param dictionary a list of words
    public ValidWordAbbr(String[] dictionary) {
        map = new HashMap<String, HashSet<String>>();
        for (String s : dictionary) {
            String abbrKey = convertToAbbr(s);
            if (map.containsKey(abbrKey)) {
                map.get(abbrKey).add(s);
            } else {
                HashSet<String> set = new HashSet<String>();
                set.add(s);
                map.put(abbrKey, set);
            }
        }
    }

    /**
     * @param word a string
     * @return true if its abbreviation is unique or false
     */
    public boolean isUnique(String word) {
        String abbrKey = convertToAbbr(word);
        if (!map.containsKey(abbrKey)) {
            // No dictionary word has this abbreviation.
            return true;
        } else if (map.get(abbrKey).size() == 1) {
            // Exactly one dictionary word has this abbreviation:
            // unique only if that word is the query word itself.
            Iterator<String> iter = map.get(abbrKey).iterator();
            return word.equals(iter.next());
        }
        // Two or more distinct dictionary words share this abbreviation.
        return false;
    }

    // Words of length 2 or less are their own abbreviation.
    private String convertToAbbr(String word) {
        if (word.length() <= 2) {
            return word;
        }
        return String.valueOf(word.charAt(0)) + String.valueOf(word.length() - 2)
                + String.valueOf(word.charAt(word.length() - 1));
    }
}

/**
 * Your ValidWordAbbr object will be instantiated and called as such:
 * ValidWordAbbr obj = new ValidWordAbbr(dictionary);
 * boolean param = obj.isUnique(word);
 */
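The size-1 branch of isUnique reads the set's only element through an Iterator. As a small illustration, the same check could be written in a few equivalent ways. The class SingleElementCheck and its method names are hypothetical and exist only for this comparison.

```java
import java.util.Set;

// Hypothetical helper comparing three ways to test whether a
// one-element set holds exactly the given word.
class SingleElementCheck {
    // Mirrors the code above: take the only element through an Iterator.
    static boolean viaIterator(Set<String> words, String word) {
        return words.size() == 1 && word.equals(words.iterator().next());
    }

    // Membership test: a set of size 1 that contains the word holds nothing else.
    static boolean viaContains(Set<String> words, String word) {
        return words.size() == 1 && words.contains(word);
    }

    // Enhanced for loop, which uses the set's iterator implicitly.
    static boolean viaForEach(Set<String> words, String word) {
        if (words.size() != 1) {
            return false;
        }
        for (String only : words) {
            return word.equals(only);
        }
        return false; // not reached when the set has exactly one element
    }
}
```

The contains variant is probably the shortest to read, and all three run in expected O(k) time for a word of length k.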

 

Related Problems 

Two Sum - Data Structure Design

Reposted from: https://www.cnblogs.com/lz87/p/7198475.html
