All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example, given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", return ["AAAAACCCCC", "CCCCCAAAAA"].
class Solution(object):
    def findRepeatedDnaSequences(self, s):
        """
        :type s: str
        :rtype: List[str]
        """
        # Count every 10-letter window of s.
        dic = {}
        # range() is empty when len(s) < 10, so short inputs return [].
        for i in range(len(s) - 9):
            t = s[i:i+10]
            dic[t] = dic.get(t, 0) + 1
        # Keep the sequences that occur more than once.
        res = []
        for k, v in dic.items():
            if v > 1:
                res.append(k)
        return res

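This post presents a Python solution that finds every 10-letter substring occurring more than once in a DNA sequence. It builds a dictionary that records each 10-letter substring together with its number of occurrences, then keeps the substrings whose count is greater than one.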
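The counting idea described above can also be written as a standalone function. The following is a minimal sketch, not the original solution: the function name find_repeated_dna_sequences and the length parameter are assumptions introduced here for illustration, and collections.Counter stands in for the hand-built dictionary. It assumes Python 3.7 or later, where dictionaries keep insertion order.

```python
from collections import Counter


def find_repeated_dna_sequences(s, length=10):
    """Return every substring of `length` letters that occurs more than once in s.

    Hypothetical helper for illustration; `length` defaults to 10 as in the problem.
    """
    # Count each window s[i:i+length]; the range is empty when s is shorter than length.
    counts = Counter(s[i:i + length] for i in range(len(s) - length + 1))
    # Keep only the windows seen at least twice, in order of first appearance.
    return [seq for seq, n in counts.items() if n > 1]


print(find_repeated_dna_sequences("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"))
# prints ['AAAAACCCCC', 'CCCCCAAAAA']
```

Both versions store up to len(s) - 9 distinct windows, so time and memory grow linearly with the length of the input.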