Find substring with K distinct characters

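This post describes a sliding window algorithm combined with a hash table for finding every substring of length K that contains K distinct characters. The window moves across the string while the hash table records the frequency of each character inside it; whenever the window size reaches K, the window is checked for all-unique characters and, if it passes, the substring is added to the result set.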


Given a string and a number K, find the substrings of size K that contain K distinct characters. If there are none, output an empty list. Remember to omit duplicate substrings, i.e. if a substring occurs twice, output it only once.
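To make the problem statement concrete, here is a minimal brute-force sketch that checks every window of length K directly. The function name `bruteForceKDistinct` is hypothetical and not part of the original post; it is only a reference for what the expected output looks like, not the approach discussed below.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical reference helper (not from the original post): examines every
// window of length k on its own, so it costs roughly O(N * K) time.
std::vector<std::string> bruteForceKDistinct(const std::string& s, int k) {
    std::vector<std::string> result;
    if (k <= 0 || s.size() < static_cast<std::size_t>(k)) return result;

    std::unordered_set<std::string> seen;  // substrings already emitted
    for (std::size_t i = 0; i + k <= s.size(); ++i) {
        std::string window = s.substr(i, k);
        std::unordered_set<char> chars(window.begin(), window.end());
        // k distinct characters means the set is as large as the window.
        if (chars.size() == window.size() && seen.insert(window).second)
            result.push_back(window);
    }
    return result;
}
```

For the input "abab" with K = 2, the windows are "ab", "ba", "ab"; the second "ab" is a repeat, so only "ab" and "ba" are reported.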

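  • Medium-difficulty string problem. Sliding window algorithm + hash.
  • Use a sliding window: two pointers, left and right, mark the start and end of the window, and a counter count, initialized to K, tracks how many distinct chars are still needed. A hash table records the frequency of every char in the current window.
  • The hit condition is hash[S[right]] == 0, meaning the incoming char is not yet in the current window. On a hit, -- count to record that one more qualifying char has been found.
  • ++ hash[S[right]] marks the char as present in the current window, then right is extended.
  • Once the window size is K, count == 0 means K distinct chars have been found, so the substring can be added to the result set. Note that the STL algorithm find() is used here to skip duplicates.
  • Next, slide the window to the right, undoing for the left char what was done to it when it entered as the right char.
  • If hash[S[left]] == 1, the char leaving is the last copy of that char in the window, so the hit recorded for it (hash[S[right]] == 0 when it first entered) must be undone with ++ count. If hash[S[left]] > 1, the char is duplicated in the window; a duplicated char only hits once, so count was decremented only once, and ++ count has to wait until hash[S[left]] has dropped to 1. In both cases left is leaving the window, so -- hash[S[left]] and move left rightward to form the next window.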
  • find - C++ Reference
    • http://www.cplusplus.com/reference/algorithm/find/?kw=find
//
//  main.cpp
//  LeetCode
//
//  Created by Hao on 2017/3/16.
//  Copyright © 2017 Hao. All rights reserved.
//

#include <algorithm>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> subStringKDist(string S, int K) {
        vector<string> vResult;

        // corner case: empty input or a non-positive window size
        if (S.empty() || K <= 0) return vResult;

        unordered_map<char, int> hash; // char frequencies inside the current window

        // window start/end pointer, number of distinct chars still needed
        int left = 0, right = 0, count = K;
        const int n = static_cast<int>(S.size());

        while (right < n) {
            if (hash[S.at(right)] == 0) // hit: this char is not yet in the window
                -- count;

            ++ hash[S.at(right)]; // mark that the char exists in the current window

            ++ right; // move window end pointer rightward

            // window size reaches K
            if (right - left == K) {
                if (0 == count) { // found K distinct chars
                    const string candidate = S.substr(left, K);

                    // using STL find() to avoid duplicates
                    if (find(vResult.begin(), vResult.end(), candidate) == vResult.end())
                        vResult.push_back(candidate);
                }

                // Be careful with the restore condition. count is only decreased
                // when hash[c] == 0, so only hash[c] == 1 means the hit for this
                // char has to be undone.
                if (hash[S.at(left)] == 1)
                    ++ count;

                -- hash[S.at(left)]; // decrease to restore hash value

                ++ left; // move window start pointer rightward
            }
        }

        return vResult;
    }
};

int main(int argc, char* argv[])
{
    Solution testSolution;

    vector<string> sInputs = {"awaglknagawunagwkwagl", "abccdef", "", "aaaaaaa"};
    vector<int>    iInputs = {4, 2, 1, 2};
    vector<string> result;

    /*
     Expected output:
     {wagl aglk glkn lkna knag gawu awun wuna unag nagw agwk kwag }
     {ab bc cd de ef }
     {}
     {}
     */
    for (size_t i = 0; i < sInputs.size(); ++ i) {
        result = testSolution.subStringKDist(sInputs[i], iInputs[i]);

        cout << "{";
        for (const auto& it : result)
            cout << it << " ";
        cout << "}" << endl;
    }

    return 0;
}
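The find() call above scans the whole result vector for every candidate, so deduplication can get slow when there are many matches. One possible variation, sketched here under my own assumptions rather than taken from the original post, keeps a std::unordered_set of substrings already emitted and uses a fixed 256-entry array for the counts. The name `subStringKDistSet` and the variable `missing` (which plays the same role as count) are hypothetical.

```cpp
#include <array>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical variant (not the original author's code): same sliding window,
// but duplicates are filtered with a hash set instead of a linear find().
std::vector<std::string> subStringKDistSet(const std::string& s, int k) {
    std::vector<std::string> result;
    if (k <= 0 || s.size() < static_cast<std::size_t>(k)) return result;
    const std::size_t K = static_cast<std::size_t>(k);

    std::array<int, 256> freq{};            // per-byte counts in the window
    std::unordered_set<std::string> seen;   // substrings already emitted
    int missing = k;                        // distinct chars still needed

    for (std::size_t right = 0; right < s.size(); ++right) {
        // Hit: the incoming char was not in the window yet.
        if (freq[static_cast<unsigned char>(s[right])]++ == 0) --missing;

        if (right + 1 >= K) {               // window [left, right] has size K
            const std::size_t left = right + 1 - K;
            if (missing == 0) {
                std::string window = s.substr(left, K);
                // insert().second is true only for a first occurrence.
                if (seen.insert(window).second) result.push_back(window);
            }
            // The last copy of a char leaving the window undoes its hit.
            if (--freq[static_cast<unsigned char>(s[left])] == 0) ++missing;
        }
    }
    return result;
}
```

The result vector still preserves first-occurrence order, so for the test inputs used in main this sketch should produce the same lists as subStringKDist. The trade-off is extra memory for the set in exchange for average constant-time duplicate checks.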

 

Reposted from: https://www.cnblogs.com/pegasus923/p/8444653.html
