[LintCode] Substring Anagrams

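This article presents an efficient algorithm for finding every substring of a string that is an anagram of a given pattern. It gives two implementations and explains the time and space complexity of each.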

Given a string s and a non-empty string p, find all the start indices of p's anagrams in s.

Both strings consist of lowercase English letters only, and neither s nor p is longer than 40,000 characters.

The order of output does not matter.

Example

Given s = "cbaebabacd" and p = "abc",

return [0, 6]

The substring with start index = 0 is "cba", which is an anagram of "abc".
The substring with start index = 6 is "bac", which is an anagram of "abc".
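
As a baseline for the definition used in this example, here is a minimal brute-force sketch. It is not one of the article's solutions: it assumes that sorting the letters of each window is acceptable, and the names BruteForceAnagrams and sortedLetters are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BruteForceAnagrams {
    // Returns the letters of t in sorted order; two strings are anagrams
    // exactly when their sorted forms are equal.
    static String sortedLetters(String t) {
        char[] chars = t.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Checks every window of length p.length() in s.
    // Time: O(n * m log m), far slower than the sliding window solutions.
    static List<Integer> findAnagrams(String s, String p) {
        List<Integer> indices = new ArrayList<>();
        String target = sortedLetters(p);
        for (int i = 0; i + p.length() <= s.length(); i++) {
            String window = s.substring(i, i + p.length());
            if (sortedLetters(window).equals(target)) {
                indices.add(i);
            }
        }
        return indices;
    }
}
```

Calling BruteForceAnagrams.findAnagrams("cbaebabacd", "abc") yields the indices 0 and 6 from the example. With both strings up to 40,000 characters long, this approach is useful mainly as a reference to test faster solutions against.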




Solution 1.
1. Store the frequency of each character of p in the array pMap.
2. As in other sliding window problems, scan s one character at a
time: add each character to a queue and update its frequency in the array sMap.
3. When the queue's size equals p's length, check whether pMap and
sMap hold the same values. If they do, the start of the current
window is a start index of one of p's anagrams.
4. Remove the head character from the queue and decrease its frequency
in sMap, then repeat the above steps until every character of s has been checked.

Runtime: O(n) * O(26) = O(n), where n is the length of s. Iterating through both s and p is O(n),
because the code returns early when p is longer than s; comparing the two arrays at each full window is O(26).
Space: O(26) * 2 + O(m) = O(m), where m is the length of string p.

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

public class Solution {
    public List<Integer> findAnagrams(String s, String p) {
        List<Integer> indices = new ArrayList<Integer>();
        if (s == null || p == null || s.length() < p.length()) {
            return indices;
        }
        int[] sMap = new int[26];
        int[] pMap = new int[26];
        // store the frequency of each character of p
        for (int i = 0; i < p.length(); i++) {
            pMap[p.charAt(i) - 'a']++;
        }
        // the queue holds the characters of the current window
        Queue<Character> queue = new LinkedList<Character>();
        int index = 0;
        while (index < s.length()) {
            queue.add(s.charAt(index));
            sMap[s.charAt(index) - 'a']++;
            if (queue.size() == p.length()) {
                if (isAnagrams(sMap, pMap)) {
                    indices.add(index - p.length() + 1);
                }
                // slide the window: drop its oldest character
                sMap[queue.poll() - 'a']--;
            }
            index++;
        }
        return indices;
    }

    // two frequency arrays describe anagrams when every count is equal
    private boolean isAnagrams(int[] sMap, int[] pMap) {
        for (int i = 0; i < sMap.length; i++) {
            if (sMap[i] != pMap[i]) {
                return false;
            }
        }
        return true;
    }
}
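
The queue in the code above only remembers which character leaves the window next. As a hedged variation (not part of the original post), that character can also be read directly from s at position index - m, which removes the queue while keeping both frequency arrays. The class name NoQueueAnagrams is made up for illustration, and Arrays.equals stands in for the isAnagrams helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class NoQueueAnagrams {
    static List<Integer> findAnagrams(String s, String p) {
        List<Integer> indices = new ArrayList<>();
        if (s == null || p == null || s.length() < p.length()) {
            return indices;
        }
        int m = p.length();
        int[] sMap = new int[26];
        int[] pMap = new int[26];
        for (int i = 0; i < m; i++) {
            pMap[p.charAt(i) - 'a']++;
        }
        for (int index = 0; index < s.length(); index++) {
            // The character entering the window.
            sMap[s.charAt(index) - 'a']++;
            // The character leaving the window sits m positions back.
            if (index >= m) {
                sMap[s.charAt(index - m) - 'a']--;
            }
            // The window [index - m + 1, index] is full once index >= m - 1.
            if (index >= m - 1 && Arrays.equals(sMap, pMap)) {
                indices.add(index - m + 1);
            }
        }
        return indices;
    }
}
```

This sketch still performs the O(26) comparison for every full window; only the O(m) queue is gone.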

 

Solution 2.

Can we do better?

For runtime, we've already achieved the best conceivable runtime (BCR) of O(n), so can we reduce the constant factor?

For space, Solution 1 used O(m) extra space. Is it possible to use only O(1) space?

We sure can.

Solution 1 used 2 arrays of size 26 and 1 queue of maximum size m (m is the length of p); we can reduce this to a single array of size 26.

The key idea is to initialize this array with the number of times each character appears in p, then modify its elements and restore them as we scan through s.

Each character that enters the window decreases its count by 1, and it is a match if its count was positive before the decrease. Each character that leaves the window increases its count by 1 again.

In place of the queue from Solution 1, the window is tracked by the two indices start and end, and a new variable matched keeps track of how many characters we've matched so far. The matched variable also saves us the O(26) anagram check that Solution 1 runs for every window of length m.

 

import java.util.ArrayList;
import java.util.List;

public class Solution {
    public List<Integer> findAnagrams(String s, String p) {
        List<Integer> ans = new ArrayList<Integer>();
        if (s == null || p == null) {
            return ans;
        }
        int[] sum = new int[26];
        int plength = p.length(), slength = s.length();
        // store the frequency of each character of p
        for (char c : p.toCharArray()) {
            sum[c - 'a']++;
        }
        int start = 0, end = 0, matched = 0;
        while (end < slength) {
            // a positive count means p still needs this character: a match
            if (sum[s.charAt(end) - 'a'] >= 1) {
                matched++;
            }
            sum[s.charAt(end) - 'a']--;
            end++;
            // all characters of p are matched: the window at start is an anagram
            if (matched == plength) {
                ans.add(start);
            }
            // sliding window principle: a full window drops s.charAt(start)
            if (end - start == plength) {
                // a non-negative count means the window holds no surplus copy of
                // s.charAt(start), so removing it loses a match
                if (sum[s.charAt(start) - 'a'] >= 0) {
                    matched--;
                }
                // restore the frequency of s.charAt(start) for later checks
                sum[s.charAt(start) - 'a']++;
                start++;
            }
        }
        return ans;
    }
}
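
To make the role of matched concrete, here is a small instrumented sketch of the same loop. It records the value of matched right after each character of s enters the window, which is the point where Solution 2 compares it to p's length. The class and method names (MatchedTrace, matchedTrace) are made up for illustration.

```java
final class MatchedTrace {
    // Returns matched as observed after each character of s enters the
    // window and before the oldest character is dropped.
    static int[] matchedTrace(String s, String p) {
        int[] sum = new int[26];
        for (char c : p.toCharArray()) {
            sum[c - 'a']++;
        }
        int[] trace = new int[s.length()];
        int start = 0, matched = 0;
        for (int end = 0; end < s.length(); end++) {
            int in = s.charAt(end) - 'a';
            if (sum[in] > 0) {
                matched++;
            }
            sum[in]--;
            trace[end] = matched;
            // Once the window is full, drop its oldest character.
            if (end - start + 1 == p.length()) {
                int out = s.charAt(start) - 'a';
                if (sum[out] >= 0) {
                    matched--;
                }
                sum[out]++;
                start++;
            }
        }
        return trace;
    }
}
```

For s = "cbaebabacd" and p = "abc" the trace is 1, 2, 3, 2, 2, 2, 2, 2, 3, 2. The value 3 appears at end = 2 and end = 8, which correspond to the start indices 0 and 6.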

 

A rewrite of Solution 2 that follows the same code structure as Minimum Window Substring.

import java.util.ArrayList;
import java.util.List;

public class Solution {
    public List<Integer> findAnagrams(String s, String p) {
        List<Integer> ans = new ArrayList<Integer>();
        if (s == null || p == null) {
            return ans;
        }
        int[] sum = new int[26];
        int plength = p.length(), slength = s.length();
        // store the frequency of each character of p
        for (char c : p.toCharArray()) {
            sum[c - 'a']++;
        }
        int start = 0, matched = 0;
        for (int end = 0; end < slength; end++) {
            // a positive count means p still needs this character: a match
            if (sum[s.charAt(end) - 'a'] > 0) {
                matched++;
            }
            sum[s.charAt(end) - 'a']--;
            // all characters of p are matched: the window at start is an anagram
            if (matched == plength) {
                ans.add(start);
            }
            // sliding window principle: a full window drops s.charAt(start)
            if (end - start + 1 == plength) {
                // restore the frequency of s.charAt(start) for later checks
                sum[s.charAt(start) - 'a']++;
                // a positive count after restoring means p needs this character
                // again, so the window loses a match
                if (sum[s.charAt(start) - 'a'] > 0) {
                    matched--;
                }
                start++;
            }
        }
        return ans;
    }
}
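
Because the bookkeeping around matched is easy to get subtly wrong, a randomized cross-check can help. The sketch below is an assumption-laden test harness, not part of the original post: it compares a copy of the single-array sliding window with a from-scratch reference that recounts every window. All names (AnagramCrossCheck, slidingWindow, reference, agreeOnRandomInputs) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

final class AnagramCrossCheck {
    // Single-array sliding window, same logic as the rewrite above.
    static List<Integer> slidingWindow(String s, String p) {
        List<Integer> ans = new ArrayList<>();
        int[] sum = new int[26];
        for (char c : p.toCharArray()) {
            sum[c - 'a']++;
        }
        int start = 0, matched = 0;
        for (int end = 0; end < s.length(); end++) {
            if (sum[s.charAt(end) - 'a']-- > 0) {
                matched++;
            }
            if (matched == p.length()) {
                ans.add(start);
            }
            if (end - start + 1 == p.length()) {
                if (++sum[s.charAt(start) - 'a'] > 0) {
                    matched--;
                }
                start++;
            }
        }
        return ans;
    }

    // Letter counts of a lowercase string.
    static int[] counts(String t) {
        int[] c = new int[26];
        for (char ch : t.toCharArray()) {
            c[ch - 'a']++;
        }
        return c;
    }

    // Reference: recount every window of length p.length() from scratch.
    static List<Integer> reference(String s, String p) {
        List<Integer> ans = new ArrayList<>();
        int[] target = counts(p);
        for (int i = 0; i + p.length() <= s.length(); i++) {
            if (Arrays.equals(counts(s.substring(i, i + p.length())), target)) {
                ans.add(i);
            }
        }
        return ans;
    }

    // Random string over the small alphabet {a, b, c} so anagrams are frequent.
    static String randomString(Random random, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + random.nextInt(3)));
        }
        return sb.toString();
    }

    // Returns true if both methods agree on `rounds` random inputs.
    static boolean agreeOnRandomInputs(long seed, int rounds) {
        Random random = new Random(seed);
        for (int r = 0; r < rounds; r++) {
            String s = randomString(random, random.nextInt(20));
            String p = randomString(random, 1 + random.nextInt(4));
            if (!slidingWindow(s, p).equals(reference(s, p))) {
                return false;
            }
        }
        return true;
    }
}
```

A three-letter alphabet and short strings are a deliberate choice here: they make anagram windows, repeated letters, and inputs where s is shorter than p all show up often within a few hundred rounds.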


Related Problem
Minimum Window Substring

Reposted from: https://www.cnblogs.com/lz87/p/6948738.html
