All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
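Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", return: ["AAAAACCCCC", "CCCCCAAAAA"].
Approach: store each sequence as bits. A Java char takes two bytes, whereas each of the four nucleotides needs only two bits, so a 10-letter sequence fits in 20 bits of a single int. Using that int as the map key instead of the substring saves a lot of space.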
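To make the bit-storage idea concrete, here is a minimal sketch of the encoding on its own. The class name DnaEncoding and the helpers code and encode are made up for illustration, and the sketch assumes the input contains only the letters A, C, G and T.

```java
final class DnaEncoding {
    // Maps one nucleotide to its 2-bit code: A=0, C=1, G=2, T=3.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    // Packs the 10 letters starting at index start into the low 20 bits of an int.
    static int encode(String s, int start) {
        int bits = 0;
        for (int j = start; j < start + 10; j++) {
            bits = (bits << 2) | code(s.charAt(j));
        }
        return bits;
    }
}
```

Two different 10-letter sequences always produce different ints, so the int can stand in for the substring as a map key.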
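import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> list = new ArrayList<>();
        // Encoded 10-letter sequence -> times seen so far (capped at 2).
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < s.length() - 9; i++) {
            // Pack the 10 letters starting at i into an int, two bits per letter.
            int bit = 0;
            for (int j = i; j < i + 10; j++) {
                bit <<= 2;
                switch (s.charAt(j)) {
                    case 'A': bit += 0; break;
                    case 'C': bit += 1; break;
                    case 'G': bit += 2; break;
                    case 'T': bit += 3; break;
                }
            }
            Integer value = map.get(bit);
            if (value == null) {
                map.put(bit, 1);
            } else if (value == 1) {
                // Second occurrence: report the sequence exactly once.
                list.add(s.substring(i, i + 10));
                map.put(bit, 2);
            }
        }
        return list;
    }
}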
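The solution above re-encodes all 10 letters for every start position. A possible variation, not part of the original post, is to slide the window: shift the previous key left by two bits, append the new letter, and mask off everything above 20 bits. The sketch below assumes the input contains only A, C, G and T; the names RollingDna and findRepeated are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RollingDna {
    private static final int LEN = 10;
    // Keeps only the low 20 bits, i.e. the last 10 letters.
    private static final int MASK = (1 << (2 * LEN)) - 1;

    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    static List<String> findRepeated(String s) {
        Set<Integer> seen = new HashSet<>();      // keys seen at least once
        Set<Integer> reported = new HashSet<>();  // keys already added to the result
        List<String> result = new ArrayList<>();
        int bits = 0;
        for (int i = 0; i < s.length(); i++) {
            // Drop the oldest letter and append the newest one.
            bits = ((bits << 2) | code(s.charAt(i))) & MASK;
            if (i < LEN - 1) {
                continue; // the window is not full yet
            }
            // add() returns false when the key was already present.
            if (!seen.add(bits) && reported.add(bits)) {
                result.add(s.substring(i - LEN + 1, i + 1));
            }
        }
        return result;
    }
}
```

Each letter is then processed once instead of ten times, at the cost of a second set to avoid reporting the same sequence twice.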
[LeetCode]Repeated DNA Sequences
Original article: http://blog.csdn.net/guorudi/article/details/44238379