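1. Generally speaking, a regular expression is a way of describing strings.
In other languages, \\ means "I want to insert a plain (literal) backslash into the regular expression; please give it no special meaning." In Java, \\ means "I am inserting a regular-expression backslash, so the character that follows has a special meaning." For example, to match a single digit, the regular expression string is \\d. To match a literal backslash, write \\\\. Newlines, tabs, and the like need only a single backslash: \n\t.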
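
The escaping rules above are easy to mix up, so here is a minimal sketch of the three cases (regex escape, literal backslash, ordinary string escape). The class name EscapeDemo and its method names are hypothetical and exist only for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; EscapeDemo is not part of the original notes.
class EscapeDemo {
    // "\\d" in Java source is the two-character regex \d (one digit).
    static boolean isSingleDigit(String s) {
        return s.matches("\\d");
    }

    // "\\\\" in Java source is the regex \\, which matches one literal backslash.
    static boolean containsBackslash(String s) {
        return Pattern.compile("\\\\").matcher(s).find();
    }

    // "\n" and "\t" are ordinary Java string escapes, so one backslash is enough.
    static boolean isNewlineThenTab(String s) {
        return s.matches("\n\t");
    }

    public static void main(String[] args) {
        System.out.println(isSingleDigit("7"));            // true
        System.out.println(containsBackslash("C:\\temp")); // true
        System.out.println(isNewlineThenTab("\n\t"));      // true
    }
}
```

Note that the Java compiler processes the string literal first, and the regex engine then sees what remains, which is why a literal backslash needs four characters in source.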
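? means the preceding character is optional. For example, -? means the string may begin with a minus sign.
+ means one or more of the preceding expression.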
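
To see how ? and + combine, the following sketch tests a few inputs against -?\\d+. The class QuantifierDemo and the method isInteger are hypothetical names chosen for this example.

```java
// Hypothetical demo class; QuantifierDemo and isInteger are illustrative names.
class QuantifierDemo {
    // -?   : an optional minus sign
    // \\d+ : one or more digits
    static boolean isInteger(String s) {
        return s.matches("-?\\d+");
    }

    public static void main(String[] args) {
        System.out.println(isInteger("1234"));   // true: the minus sign may be absent
        System.out.println(isInteger("-1234"));  // true: one minus sign is allowed
        System.out.println(isInteger("--1234")); // false: ? permits at most one
        System.out.println(isInteger("-"));      // false: + requires at least one digit
    }
}
```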
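2. The String class has built-in regular expression tools:
1) matches: checks whether the entire string matches the regular expression
"-1234".matches("-?\\d+"); // true
2) split: splits the string wherever the regular expression matches (the matched parts are removed)
"you must do it".split("\\W+"); // [you, must, do, it]
3) replaceFirst, replaceAll: replace the first match or every match
"you found it".replaceFirst("f\\w+", "located"); // "you located it"
"you found it".replaceAll("f\\w+", "located"); // "you located it"
3. Creating regular expressions (import java.util.regex)

| Construct | Matches |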
|---|---|
| Characters | |
| x | The character x |
| \\ | The backslash character |
| \0n | The character with octal value 0n (0 <= n <= 7) |
| \0nn | The character with octal value 0nn (0 <= n <= 7) |
| \0mnn | The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7) |
| \xhh | The character with hexadecimal value 0xhh |
| \uhhhh | The character with hexadecimal value 0xhhhh |
| \x{h...h} | The character with hexadecimal value 0xh...h (Character.MIN_CODE_POINT <= 0xh...h <= Character.MAX_CODE_POINT) |
| \t | The tab character ('\u0009') |
| \n | The newline (line feed) character ('\u000A') |
| \r | The carriage-return character ('\u000D') |
| \f | The form-feed character ('\u000C') |
| \a | The alert (bell) character ('\u0007') |
| \e | The escape character ('\u001B') |
| \cx | The control character corresponding to x |
| Character classes | |
| [abc] | a, b, or c (simple class) |
| [^abc] | Any character except a, b, or c (negation) |
| [a-zA-Z] | a through z or A through Z, inclusive (range) |
| [a-d[m-p]] | a through d, or m through p: [a-dm-p] (union) |
| [a-z&&[def]] | d, e, or f (intersection) |
| [a-z&&[^bc]] | a through z, except for b and c: [ad-z] (subtraction) |
| [a-z&&[^m-p]] | a through z, and not m through p: [a-lq-z] (subtraction) |
| Predefined character classes | |
| . | Any character (may or may not match line terminators) |
| \d | A digit: [0-9] |
| \D | A non-digit: [^0-9] |
| \h | A horizontal whitespace character: [ \t\xA0\u1680\u180e\u2000-\u200a\u202f\u205f\u3000] |
| \H | A non-horizontal whitespace character: [^\h] |
| \s | A whitespace character: [ \t\n\x0B\f\r] |
| \S | A non-whitespace character: [^\s] |
| \v | A vertical whitespace character: [\n\x0B\f\r\x85\u2028\u2029] |
| \V | A non-vertical whitespace character: [^\v] |
| \w | A word character: [a-zA-Z_0-9] |
| \W | A non-word character: [^\w] |
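
A few rows of the table can be checked directly. The sketch below tries the union, intersection, and subtraction classes plus some predefined classes; CharClassDemo and fullMatch are hypothetical names, and fullMatch simply wraps String.matches.

```java
// Hypothetical demo class; CharClassDemo and fullMatch are illustrative names.
class CharClassDemo {
    // True only if the whole input matches the regex (String.matches semantics).
    static boolean fullMatch(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("[a-d[m-p]]", "n"));      // true: union covers m through p
        System.out.println(fullMatch("[a-d[m-p]]", "e"));      // false: e is in neither range
        System.out.println(fullMatch("[a-z&&[def]]", "e"));    // true: intersection is d, e, f
        System.out.println(fullMatch("[a-z&&[^bc]]", "b"));    // false: b is subtracted
        System.out.println(fullMatch("\\d", "5"));             // true: a digit
        System.out.println(fullMatch("\\w+", "snake_case9"));  // true: letters, underscore, digit
        System.out.println(fullMatch("\\s", "\t"));            // true: tab is whitespace
        System.out.println(fullMatch("\\W", "!"));             // true: not a word character
    }
}
```
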
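
interface CharSequence {
    char charAt(int i);
    int length();
    CharSequence subSequence(int start, int end);
    String toString();
}
6. Pattern and Matcher
import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("abc+");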
Matcher m = p.matcher("abcabcac");
while (m.find()) {
    System.out.println("Match \"" + m.group() + "\" at positions " + m.start() + "-" + (m.end() - 1));
    // Match "abc" at positions 0-2
    // Match "abc" at positions 3-5
}
Matcher also has a matches method.
Original article: http://blog.csdn.net/libinjlu/article/details/23875201
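
As a rough illustration of how the find loop above differs from Matcher's matches method, the sketch below wraps the loop in a helper and also passes a StringBuilder to show that matcher accepts any CharSequence. MatcherDemo, findAll, and the "text@start-end" result format are assumptions made for this example, not part of the original notes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; MatcherDemo and findAll are illustrative names.
class MatcherDemo {
    // Collect every substring that find() locates, scanning left to right.
    // Each hit is recorded as "text@start-end" with an inclusive end index.
    static List<String> findAll(Pattern p, CharSequence input) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            hits.add(m.group() + "@" + m.start() + "-" + (m.end() - 1));
        }
        return hits;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("abc+");
        System.out.println(findAll(p, "abcabcac"));                  // [abc@0-2, abc@3-5]
        // matcher() takes a CharSequence, so a StringBuilder works as well as a String.
        System.out.println(findAll(p, new StringBuilder("xabccc"))); // [abccc@1-5]
        // matches() succeeds only when the whole input matches the pattern.
        System.out.println(p.matcher("abcabcac").matches());         // false
        System.out.println(p.matcher("abccc").matches());            // true
    }
}
```

In short, find searches for the next occurrence anywhere in the input, while matches requires the pattern to cover the input from start to end.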