All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",
Return:
["AAAAACCCCC", "CCCCCAAAAA"].
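[Analysis]
The straightforward HashMap approach exceeds the memory limit.
Since there are only 4 letters, we can build our own hash key: each incoming character takes two bits. Once the key grows beyond 20 bits (that is, 10 characters), keep only the low 20 bits.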
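The two-bit encoding can be sketched on its own as follows. This is only an illustration: the class name DnaKey and the helpers code, encode and roll are hypothetical names that do not appear in the original solution.

```java
// Illustrative sketch only: DnaKey, code, encode and roll are hypothetical names.
class DnaKey {
    static final int WINDOW = 10;                      // letters per sequence
    static final int MASK = (1 << (2 * WINDOW)) - 1;   // low 20 bits set

    // Maps a nucleotide to its 2-bit code: A=0, C=1, G=2, T=3.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    // Packs a string into an int, two bits per letter.
    // Only the last 10 letters survive the mask.
    static int encode(String window) {
        int key = 0;
        for (int i = 0; i < window.length(); i++) {
            key = (key << 2) | code(window.charAt(i));
        }
        return key & MASK;
    }

    // Slides the window one letter to the right: shift in the new
    // letter, then drop the bits of the letter that fell out.
    static int roll(int key, char next) {
        return ((key << 2) | code(next)) & MASK;
    }
}
```

For instance, encode("AAAAACCCCC") gives 341 (binary 0101010101), and roll(341, 'A') gives 1364, which is the key for "AAAACCCCCA". Each 10-letter window therefore costs one int instead of one String.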
[Notes]
1. Operator precedence in (hash<<2) + map.get(c): the << must be parenthesized, because + binds tighter than << in Java.
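A small sketch of why the parentheses matter. The class and method names here are hypothetical and only serve the demonstration.

```java
// Illustrative sketch only: ShiftPrecedence and its methods are hypothetical names.
class ShiftPrecedence {
    // Without parentheses, + is applied first:
    // this is parsed as hash << (2 + code).
    static int withoutParens(int hash, int code) {
        return hash << 2 + code;
    }

    // With parentheses, the shift happens first, then the code is added.
    static int withParens(int hash, int code) {
        return (hash << 2) + code;
    }
}
```

With hash = 1 and code = 3, withoutParens returns 32 (1 << 5), while withParens returns the intended 7. Writing (hash << 2) | code is an equivalent alternative here, since the two low bits are zero after the shift.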
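import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> res = new ArrayList<String>();
        // A repeat needs at least two 10-letter windows, i.e. 11 letters.
        if (s == null || s.length() < 11) return res;

        int hash = 0;
        // Two-bit code for each nucleotide.
        Map<Character, Integer> map = new HashMap<Character, Integer>();
        map.put('A', 0);
        map.put('C', 1);
        map.put('G', 2);
        map.put('T', 3);

        Set<Integer> set = new HashSet<Integer>();     // keys seen so far
        Set<Integer> unique = new HashSet<Integer>();  // keys already reported

        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (i < 9) {
                // Still filling the first window.
                hash = (hash << 2) + map.get(c);
            } else {
                hash = (hash << 2) + map.get(c);
                hash &= (1 << 20) - 1;  // keep only the low 20 bits (10 letters)
                if (set.contains(hash) && !unique.contains(hash)) {
                    res.add(s.substring(i - 9, i + 1));
                    unique.add(hash);
                } else {
                    set.add(hash);
                }
            }
        }
        return res;
    }
}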
LeetCode 187: Repeated DNA Sequences
Original article: http://blog.csdn.net/xudli/article/details/43666725