Leetcode P0187"Repeated DNA Sequences" 题解

博文中会简要介绍Leetcode P0187题目分析及解题思路。

“Repeated DNA Sequences”这道题比较基础。利用HashTable可以比较容易地解决这个题目。

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

Example:
Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC", "CCCCCAAAAA"]

根据题目需求我们其实是在给定的字符串s里寻找长度为10并且重复出现多次的子串。最直接的想法就是我们记录下所有长度为10的子串，利用HashTable来记录每个子串出现的次数，然后次数大于1的子串便是我们想要找的子串。

以下是Java的题解代码实现。

class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        Map<String, Integer> str2CountMap = new HashMap<>();
        
        int left = 0, right = left+9, L = s.length();
        while (right < L) {
            String substr = s.substring(left, right+1);
            str2CountMap.put(substr, str2CountMap.getOrDefault(substr, 0)+1);
            
            ++left;
            ++right;
        }
        
        List<String> repeated = new ArrayList<>();
        for (Map.Entry<String, Integer> entry: str2CountMap.entrySet()) {
            if (entry.getValue() > 1)
                repeated.add(entry.getKey());
        }
        
        return repeated;
    }
}

以下是C++的题解代码实现。

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        unordered_map<string, int> str2count;
        
        int left = 0, right = left+9, L = s.length();
        while (right < L) {
            ++str2count[s.substr(left, 10)];
            
            ++left;
            ++right;
        }
        
        vector<string> repeated;
        for (auto entry: str2count) {
            if (entry.second > 1)
                repeated.push_back(entry.first);
        }
        
        return repeated;
    }
};

聖地亞歌

Leetcode P0187"Repeated DNA Sequences" 题解

Search

Table of Contents