The metacharacters:
. ^ $ * + ? { } [ ] \ | ( )
[ ] specifies a character class, a set of characters to match. The characters can be listed individually or given as a range. For example,
[abc] will match any of the characters a, b, or c; this is the same as [a-c], which uses a range to express the same set of characters. If you wanted to match only lowercase letters, your RE would be [a-z].
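Inside [ ], most metacharacters lose their special meaning. For example, [akm$] will match any of the characters 'a', 'k', 'm', or '$'; '$' is usually a metacharacter, but inside a character class it's stripped of its special nature. The exceptions are '^' as the first character, '-' between two characters, ']' and '\', which keep a special role inside a class.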
A '^' as the first character inside [ ] complements the set. For example, [^5] will match any character except '5'.
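The character class rules above can be checked with a short sketch. It assumes Python 3 and the standard re module; the sample strings are made up for illustration.

```python
import re

text = "cab $5 km"

# [abc] and [a-c] describe the same set, so they find the same matches.
listed = re.findall(r"[abc]", text)
ranged = re.findall(r"[a-c]", text)

# Inside a class, '$' is an ordinary character.
literal_dollar = re.findall(r"[akm$]", text)

# A leading '^' negates the class: every character except '5'.
not_five = re.findall(r"[^5]", "1525")

print(listed)          # ['c', 'a', 'b']
print(ranged)          # ['c', 'a', 'b']
print(literal_dollar)  # ['a', '$', 'k', 'm']
print(not_five)        # ['1', '2']
```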
Use \ (backslash) to escape a metacharacter: if you need to match a [ or \, you can precede them with a backslash to remove their special meaning: \[ or \\.
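As a small illustration of escaping, the sketch below matches a literal [ and a literal backslash. It assumes Python 3; the sample text is invented, and the r prefix keeps Python itself from interpreting the backslashes before the re module sees them.

```python
import re

# The sample text contains one '[' and one backslash.
text = r"path[0]\n"

brackets = re.findall(r"\[", text)     # \[ matches a literal '['
backslashes = re.findall(r"\\", text)  # \\ matches a literal backslash

print(brackets)          # ['[']
print(len(backslashes))  # 1
```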
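Predefined sets of characters:
\d == [0-9]
\D == [^0-9]
\s == [ \t\n\r\f\v] # any whitespace character
\S == [^ \t\n\r\f\v] # any non-whitespace character
\w == [a-zA-Z0-9_]
\W == [^a-zA-Z0-9_]
In Python 3 these equivalences hold exactly only for bytes patterns or when the re.ASCII flag is set; for str patterns, \d, \s and \w also match the corresponding Unicode characters.
These sequences can be included inside a character class. For example, [\s,.] is a character class that will match any whitespace character, or ',' or '.'.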
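The predefined sets can be tried out as follows. This is a minimal sketch assuming Python 3; the sample sentence is made up and contains only ASCII characters, so the Unicode behaviour of \d and \w does not affect the result.

```python
import re

text = "Room 42, floor 3.\tOpen_now"

digits = re.findall(r"\d", text)    # single digits
words = re.findall(r"\w+", text)    # runs of letters, digits and '_'

# [\s,.] combines a predefined set with ',' and '.' in one class.
parts = re.split(r"[\s,.]+", text)

print(digits)  # ['4', '2', '3']
print(words)   # ['Room', '42', 'floor', '3', 'Open_now']
print(parts)   # ['Room', '42', 'floor', '3', 'Open_now']
```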
'.' matches anything except a newline character.
'*' specifies that the previous character can be matched zero or more times.
'+' matches one or more times.
'?' matches either once or zero times. For example, home-?brew matches either homebrew or home-brew.
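The following sketch compares the metacharacters above on a few invented inputs. It assumes Python 3.4 or later, since it uses re.fullmatch to test whether a whole string matches.

```python
import re

def matches(pattern, candidates):
    """Return, for each candidate, whether the whole string matches."""
    return [re.fullmatch(pattern, s) is not None for s in candidates]

star = matches(r"ab*", ["a", "ab", "abbb", "b"])   # zero or more 'b'
plus = matches(r"ab+", ["a", "ab", "abbb"])        # one or more 'b'
optional = matches(r"home-?brew", ["homebrew", "home-brew", "home--brew"])

# '.' skips the newline in the middle of the string.
dot = re.findall(r".", "a\nb")

print(star)      # [True, True, True, False]
print(plus)      # [False, True, True]
print(optional)  # [True, True, False]
print(dot)       # ['a', 'b']
```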
Regular expressions are compiled into pattern objects, which provide a variety of methods for matching and other operations.
>>> import re
>>> p = re.compile('ab*')
>>> p
<_sre.SRE_Pattern object at 0x...>
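A few of the methods a compiled pattern object offers are sketched below, using the same 'ab*' pattern. The input strings are made up for illustration, and the sketch assumes Python 3.

```python
import re

p = re.compile(r"ab*")

# match() only looks at the beginning of the string.
m = p.match("abbbc")
matched_text = m.group()
matched_span = m.span()
no_match = p.match("cab")

# search() scans the whole string for the first match.
found = p.search("cab").group()

# findall() returns every non-overlapping match.
all_found = p.findall("a ab abb")

print(matched_text, matched_span)  # abbb (0, 4)
print(no_match)                    # None
print(found)                       # ab
print(all_found)                   # ['a', 'ab', 'abb']
```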
Use raw string notation (the r prefix) so that backslashes in a pattern do not have to be doubled:
Regular String | Raw string |
"ab*" | r"ab*" |
"\\\\section" | r"\\section" |
"\\w+\\s+\\1" | r"\w+\s+\1" |
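The equivalence between regular and raw strings in the table can be verified directly. This is a minimal sketch assuming Python 3; the LaTeX-like sample text is invented for illustration.

```python
import re

# Each pair holds the same characters; only the literal syntax differs.
same_ab = "ab*" == r"ab*"
same_section = "\\\\section" == r"\\section"
same_backref = "\\w+\\s+\\1" == r"\w+\s+\1"

# The pattern itself is 9 characters long: two backslashes plus 'section'.
pattern_length = len(r"\\section")

# As a regex, \\section matches one literal backslash followed by 'section'.
m = re.search(r"\\section", r"\section{intro}")
matched = m.group()

print(same_ab, same_section, same_backref)  # True True True
print(pattern_length)                       # 9
print(matched)                              # \section
```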
Original article: http://www.cnblogs.com/garyang/p/5591792.html