Find substring with K distinct characters

最新推荐文章于 2025-08-18 22:01:01 发布

转载最新推荐文章于 2025-08-18 22:01:01 发布 · 131 阅读

0 ·

CC 4.0 BY-SA版权

原文链接：http://www.cnblogs.com/pegasus923/p/8444653.html

文章标签：

#c/c++

本文介绍了一种使用滑动窗口算法结合哈希表的方法，解决在一个字符串中寻找长度为K且包含K个不同字符的所有子串的问题。通过移动窗口的方式遍历字符串，利用哈希表记录每个窗口内的字符频率，当窗口大小达到K时，检查是否所有字符均为唯一，如果是则将该子串加入结果集。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

Given a string and number K, find the substrings of size K with K distinct characters. If no, output empty list. Remember to emit the duplicate substrings, i.e. if the substring repeated twice, only output once.

字符串中等题。Sliding window algorithm + Hash。
使用移动窗口算法，两个指针标记window起点left/终点right，还有一个计数器count记录hit count，初始值为K。另外还有一个hash数组来记录当前window中所有char。
注意hit的条件是hash[i] == 0，代表当前window中没有重复char。如果hit，则-- count代表找到一个满足条件char。
++ hash[right]来标记已经在当前window中出现过，并扩展right。
如果window size为K，那么就可以判断如果count == 0，代表已经找到K个不重复的char，可以放入结果集。这里注意下需要用STL算法find()去重。
接着要把window往右移，同时对right char做过的操作进行恢复。
注意如果hash[left] == 1，代表以前满足过hash[right] == 0，所以需要-- count来恢复。而对于hash[left] > 1，因为重复char只会hit一次，只会对count + 1，所以不需要-- count，只要等到hash[left] == 1的时候再count - 1就行。同时因为left要移出window了，所以-- hash[left]来恢复，并右移left扩展到下一个window。
find - C++ Reference
- http://www.cplusplus.com/reference/algorithm/find/?kw=find

 1 //
 2 //  main.cpp
 3 //  LeetCode
 4 //
 5 //  Created by Hao on 2017/3/16.
 6 //  Copyright © 2017年 Hao. All rights reserved.
 7 //
 8 
 9 #include <iostream>
10 #include <vector>
11 #include <unordered_map>
12 using namespace std;
13 
14 class Solution {
15 public:
16     vector<string> subStringKDist(string S, int K) {
17         vector<string> vResult;
18         
19         // corner case
20         if (S.empty()) return vResult;
21         
22         unordered_map<char, int>    hash;
23         
24         // window start/end pointer, hit count
25         int left = 0, right = 0, count = K;
26         
27         while (right < S.size()) {
28             if (hash[S.at(right)] == 0) // hit the condition 1 dup char
29                 -- count;
30             
31             ++ hash[S.at(right)]; // increase hash value to mark that the char exists in the current window
32             
33             ++ right; // move window end pointer rightward
34             
35             // window size reaches K
36             if (right - left == K) {
37                 if (0 == count) { // find K distinct chars
38                     if (find(vResult.begin(), vResult.end(), S.substr(left, K)) == vResult.end()) // using STL find() to avoid dup
39                         vResult.push_back(S.substr(left, K));
40                 }
41 
42                 // be careful for the restore condition. Count is only increased when hash[i] == 0, so only hash[i] == 1 means that count was increased.
43                 if (hash[S.at(left)] == 1)
44                     ++ count;
45                 
46                 -- hash[S.at(left)]; // decrease to restore hash value
47                 
48                 ++ left; // move window start pointer rightward
49             }
50         }
51         
52         return vResult;
53     }
54 };
55 
56 int main(int argc, char* argv[])
57 {
58     Solution    testSolution;
59     
60     vector<string>  sInputs = {"awaglknagawunagwkwagl", "abccdef", "", "aaaaaaa"};
61     vector<int>     iInputs = {4, 2, 1, 2};
62     vector<string>  result;
63     
64     /*
65      {wagl aglk glkn lkna knag gawu awun wuna unag nagw agwk kwag }
66      {ab bc cd de ef }
67      {}
68      {}
69      */
70     for (auto i = 0; i < sInputs.size(); ++ i) {
71         result = testSolution.subStringKDist(sInputs[i], iInputs[i]);
72         
73         cout << "{";
74         for (auto it : result)
75             cout << it << " ";
76         cout << "}" << endl;
77     }
78 
79     return 0;
80 }