LeetCode 187. Repeated DNA Sequences

Latest recommended article published on 2023-03-18 22:02:57

License: CC 4.0 BY-SA

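This post describes an algorithm for finding the 10-letter-long substrings that occur more than once in a DNA molecule. It records how often each substring occurs in a hash table, which makes it straightforward to pick out the repeated DNA fragments.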

187. Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

Example:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC", "CCCCCAAAAA"]

Approach

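Problem summary: find every contiguous substring of length 10 that occurs more than once.
Method: brute force. Use a map to count the occurrences of every length-10 substring, and put each substring whose count exceeds 1 (that is, one seen at least twice) into a set, so that every repeated substring is reported only once.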

Code

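#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        unordered_map<string, int> unmp;  // occurrence count of each 10-letter window
        unordered_set<string> unst;       // windows seen at least twice, deduplicated
        const int n = static_cast<int>(s.size());
        // Window starts run from 0 to n - 10 inclusive. The condition i + 10 <= n
        // keeps the last window and skips the loop when s has fewer than 10 letters.
        for (int i = 0; i + 10 <= n; i++) {
            string substr = s.substr(i, 10);
            if (++unmp[substr] > 1)
                unst.insert(substr);
        }
        return vector<string>(unst.begin(), unst.end());
    }
};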
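
As a possible variation on the counting approach above (not part of the original solution), each nucleotide can be packed into 2 bits so that a 10-letter window fits in a 20-bit integer, and the hash set then stores integers instead of strings. The sketch below assumes the input contains only A, C, G, and T. The names `encodeBase` and `findRepeatedEncoded` are hypothetical, and the result is sorted only to make the output order deterministic.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: map a nucleotide to a 2-bit code.
// Assumption: the input holds only 'A', 'C', 'G', 'T'; any other
// character is treated like 'T' in this sketch.
static std::uint32_t encodeBase(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  return 3;
    }
}

// Returns every 10-letter substring that occurs more than once, sorted.
std::vector<std::string> findRepeatedEncoded(const std::string& s) {
    constexpr std::size_t kLen = 10;
    constexpr std::uint32_t kMask = (1u << 20) - 1;  // low 20 bits = 10 bases
    std::unordered_set<std::uint32_t> seen;          // codes of windows seen so far
    std::unordered_set<std::string> repeated;        // windows seen at least twice
    std::uint32_t code = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Slide the window: shift in the new base, drop the oldest one.
        code = ((code << 2) | encodeBase(s[i])) & kMask;
        if (i + 1 < kLen) continue;  // window is not full yet
        // insert().second is false when this code was already present.
        if (!seen.insert(code).second)
            repeated.insert(s.substr(i + 1 - kLen, kLen));
    }
    std::vector<std::string> result(repeated.begin(), repeated.end());
    std::sort(result.begin(), result.end());
    return result;
}
```

With the example input from the problem statement, this sketch returns "AAAAACCCCC" and "CCCCCAAAAA". An 11-letter input such as "AAAAAAAAAAA" is a useful edge case, because its two windows are the first and the last one, so it checks that the final window is not skipped.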