【每日一题Day30】LC792匹配子序列的单词数 | 哈希表+ 二分多指针+队列_给定字符串s和字符串数组words,返回words中是s的子序列的字符串个数。-优快云博客

本文链接：https://blog.youkuaiyun.com/Tikitian/article/details/127902000

本文探讨了LeetCode题目792的解决方案，介绍了两种高效算法：哈希表+二分查找及多指针+队列的方法来判断字符串数组中的单词是否为给定字符串的子序列。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

匹配子序列的单词数【LC792】

给定字符串 s 和字符串数组 words, 返回 words[i] 中是s的子序列的单词个数 。

字符串的 子序列 是从原始字符串中生成的新字符串，可以从中删去一些字符(可以是none)，而不改变其余字符的相对顺序。

例如， “ace” 是 “abcde” 的子序列。

Given a string s and an array of strings words, return the number of words[i] that is a subsequence of s.

A subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.

For example, "ace" is a subsequence of "abcde".

2022/11/17

思路：dp或者双指针求解每个单词是否是s的子序列–>超时
- 双指针复杂度
  - 时间复杂度： $O ((n + m) * s i z e)$ ，n为字符串长度，m为单词平均长度，size为字符串数量
  - 空间复杂度： $O (1)$
- 怎么优化单个单词的判定操作？
  - 使用双指针扫描两个字符串时，对于原串的扫描，会有大量的字符串会被跳过–>如何快速定位到下一个字符的位置？
    
    –>使用哈希表存储每个字母在原串中的位置，即哈希表的key为字母，value为对应的下标[递增顺序]，那么可以通过二分法查找下一个字符对应的位置
  - 将所有单词和字符串s同时进行匹配，存储words，然后遍历s，每个位置字母把对应的word往后移动，如果已经到word长度了，那就证明是word是s的子序列，遍历结束没匹配完的就不是子序列

哈希表+二分

实现

class Solution {
    public int numMatchingSubseq(String s, String[] words) {
        int res = 0;
        int lenS = s.length();
        Map<Character,List<Integer>> charToIndex = new HashMap<>();
        for (int i = 0; i < lenS; i++){
            List<Integer> list = charToIndex.getOrDefault(s.charAt(i),new ArrayList<>());
            list.add(i);
            charToIndex.put(s.charAt(i),list);
        }
        for (String word : words){
            int wordLen = word.length();
            int index = -1;// 记录下一个下标
            boolean  isFound = true;
            for (int i = 0; i < wordLen && isFound; i++){
                // 二分查找下一个下标
                List<Integer> list = charToIndex.getOrDefault(word.charAt(i),new ArrayList<>());
                int l = 0, r = list.size() - 1;
                while (l < r){
                    int mid = l + r >> 1;
                    if (list.get(mid) > index){
                        r = mid;
                    }else{
                        l = mid + 1;
                    }
                }
                if (r < 0 || list.get(r) <= index){
                    isFound = false; // 没有合法的下一个下标
                }else{
                    index = list.get(r);
                }             
            }
            if (isFound){
                res++;
            }
        }
        return res;
    }
}

复杂度
- 时间复杂度： $O (n + m * s i z e * l o g n)$ ，n为字符串长度，m为单词平均长度，size为字符串数量
- 空间复杂度： $O (n)$

多指针+队列

实现：
- 使用栈或者队列存储每个字符对应的单词二元组[单词下标，遍历到的下标]
- 每遍历一个字符，将其对应的栈中的元素出栈，如果二元组遍历到的下标已经单词末尾，那么证明该word是s的子序列，res++

代码

class Solution {
    public int numMatchingSubseq(String s, String[] words) {
        int res = 0;
        int lenS = s.length();
        Deque<int[]>[] deque = new LinkedList[26];
        for (int i = 0; i < deque.length; i++){
            deque[i] = new LinkedList<>();
        }
        for (int i = 0; i < words.length; i++){
            deque[words[i].charAt(0) - 'a'].addLast(new int[]{i,0});
        }
        for (int i = 0; i < lenS; i++){
            char c = s.charAt(i);
            Deque<int[]> wordList = deque[c - 'a'];
            int size = wordList.size();
            for (int j = 0; j < size; j++){
                int[] word = wordList.pollFirst();
                int index = word[0];
                if (word[1] == words[index].length() - 1){
                    res++;
                }else{
                    word[1]++;
                    deque[words[index].charAt(word[1]) - 'a'].addLast(word);
                }
            }
        }
        return res;
    }
}