
[LeetCode] Repeated DNA Sequences

Date: 2015-03-07 22:29:54


All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

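Using a Map exceeded the memory limit, so I switched to a bitmap-style table. There are only 4 letters, so two bits are enough to encode each letter, and 10 letters take 20 bits. An array with 2^20 (1,048,576) entries, indexed by that 20-bit code, is therefore enough to solve the problem.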
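To make the encoding concrete, here is a minimal sketch of packing one 10-letter window into a 20-bit key. The names `nucleotideCode` and `packWindow` are hypothetical helpers introduced only for illustration, and returning -1 for an unknown character is an assumption; the problem statement only promises A, C, G, and T.

```cpp
#include <cstdint>
#include <string>

// Map a nucleotide to its 2-bit code: A=0, C=1, G=2, T=3.
// Assumption: any other character is reported as -1.
int nucleotideCode(char ch) {
    switch (ch) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return -1;
    }
}

// Pack the first 10 letters of `window` into a 20-bit key,
// with the first letter ending up in the highest two bits.
// Precondition: `window` has at least 10 characters, all in {A, C, G, T}.
std::uint32_t packWindow(const std::string& window) {
    std::uint32_t key = 0;
    for (int i = 0; i < 10; ++i) {
        key = (key << 2) | static_cast<std::uint32_t>(nucleotideCode(window[i]));
    }
    return key;
}
```

Under this encoding "AAAAAAAAAA" maps to 0 and "TTTTTTTTTT" maps to 2^20 - 1, so every possible window gets a distinct index into the table.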

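#include <set>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    // Encode a nucleotide in two bits: A=0, C=1, G=2, T=3.
    int getVal(char ch) {
        if (ch == 'A') return 0;
        if (ch == 'C') return 1;
        if (ch == 'G') return 2;
        if (ch == 'T') return 3;
        return 0;  // not reached for valid DNA input
    }

    vector<string> findRepeatedDnaSequences(string s) {
        set<string> st;  // repeated windows, deduplicated and sorted
        vector<string> res;
        if (s.length() < 10) return res;
        // One counter per possible 20-bit window code. Allocated on the
        // heap, because a 4 MB local array can overflow the stack.
        vector<int> mp(1024 * 1024, 0);
        unsigned int val = 0;
        // Preload the codes of the first nine letters.
        for (int i = 0; i < 9; ++i) {
            val <<= 2;
            val |= getVal(s[i]);
        }
        for (size_t i = 9; i < s.length(); ++i) {
            // Keep the newest nine letters (low 18 bits), then append
            // the incoming letter so val holds exactly 20 bits.
            val = (val & 0x3FFFF) << 2;
            val |= getVal(s[i]);
            ++mp[val];
            if (mp[val] > 1) {
                st.insert(s.substr(i - 9, 10));
            }
        }
        for (set<string>::iterator it = st.begin(); it != st.end(); ++it) {
            res.push_back(*it);
        }
        return res;
    }
};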
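As a possible variation on the same idea, the sketch below replaces the counter array and the `set<string>` with a single one-byte state per window code (unseen, seen once, already reported), so each repeated window is emitted exactly once. This is not the author's code: `findRepeatedWindows` is a hypothetical name, throwing on characters outside A, C, G, T is an assumption, and the results come out in order of second occurrence rather than sorted.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::string> findRepeatedWindows(const std::string& s) {
    const std::size_t kLen = 10;                  // window length in letters
    const std::uint32_t kMask = (1u << 20) - 1;   // keeps the low 20 bits
    std::vector<std::string> result;
    if (s.size() < kLen) return result;

    // state[key]: 0 = unseen, 1 = seen once, 2 = already reported.
    std::vector<std::uint8_t> state(1u << 20, 0);
    std::uint32_t key = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        std::uint32_t code = 0;
        switch (s[i]) {
            case 'A': code = 0; break;
            case 'C': code = 1; break;
            case 'G': code = 2; break;
            case 'T': code = 3; break;
            default:
                // Assumption: reject input that is not a DNA string.
                throw std::invalid_argument("unexpected nucleotide");
        }
        // Slide the window: shift in the new letter, drop the oldest one.
        key = ((key << 2) | code) & kMask;
        if (i + 1 < kLen) continue;  // window not full yet

        if (state[key] == 1) {
            // Second sighting: report this window once.
            result.push_back(s.substr(i + 1 - kLen, kLen));
        }
        if (state[key] < 2) ++state[key];
    }
    return result;
}
```

The table here takes about 1 MB instead of 4 MB, at the cost of not knowing how many times each window occurred.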

 


Original article: http://www.cnblogs.com/easonliu/p/4320919.html
