# Regular Expressions

The `re.` prefix means that a function from the `re` module is being called.

1. `"abcde".find("b")` returns `1`.

2. `"abcde".find("bc")` returns `1`. Here `bc` is searched for as a single unit, whereas the first example treats `b` on its own as the unit.
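As a quick check of the behavior described above, the sketch below also shows what `find` returns when the substring is absent. The variable names are only illustrative.

```python
text = "abcde"

# find returns the index at which the substring starts.
idx_b = text.find("b")       # 1
idx_bc = text.find("bc")     # 1: "bc" is located as a unit, starting at the same index

# When the substring does not occur, find returns -1 instead of raising an error.
idx_missing = text.find("xy")  # -1
```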

3. Splitting a string with `str.split`

```python
"abcd".split("b")
# ['a', 'cd']
```

4. `split`: splitting on a separator

```python
"abcd".split("b")
# ['a', 'cd']
```

```python
import re

p = re.compile(r"\d+")

# Split on runs of digits; a separator at the end leaves an empty string there.
print(p.split("one1two2three3four4"))
# ['one', 'two', 'three', 'four', '']

# A separator at the start leaves an empty string at the front as well.
print(p.split("4one1two2three3four4"))
# ['', 'one', 'two', 'three', 'four', '']
```
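The empty strings at the edges are often unwanted. A minimal sketch of two common follow-ups, filtering them out and limiting the number of splits with `maxsplit`, might look like this (variable names are illustrative):

```python
import re

digits = re.compile(r"\d+")
parts = digits.split("4one1two2three3four4")

# Drop the empty strings produced by separators at the edges.
words = [p for p in parts if p]

# maxsplit limits how many splits are made; the remainder is kept intact.
first_two = digits.split("one1two2three3four4", maxsplit=2)
```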

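```python
re.split("[bc]", "abcd")
# ['a', '', 'd']
```

Splitting on `b` first gives `a` and `cd`; splitting `cd` on `c` then gives an empty string and `d`, so the result is `['a', '', 'd']`.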

5. `replace`: substitution

```python
"abcd".replace("ab", "ee")
# 'eecd'
```

```python
re.sub("g.t", "have", "I get A,  I got B , I gut C")
# 'I have A,  I have B , I have C'
re.sub("g.t", "have", "I get A,  I got B , I gut C", count=2)
# 'I have A,  I have B , I gut C'
re.subn("g.t", "have", "I get A,  I got B , I gut C")
# ('I have A,  I have B , I have C', 3)
```

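In `sub`, `.` matches any single character. The count argument (`2` above) is the maximum number of replacements; it can be set to any number, such as 1, 2, or 3.

`subn` returns a tuple whose second element is the number of replacements actually made: `3` means three replacements were made, and `2` would mean two.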
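The sketch below repeats the idea with the results bound to names, so the tuple returned by `subn` can be unpacked. The variable names are assumptions made for illustration.

```python
import re

sentence = "I get A,  I got B , I gut C"

# count caps the number of replacements; 0 (the default) means replace all.
once = re.sub(r"g.t", "have", sentence, count=1)

# subn returns (new_string, number_of_replacements).
replaced, n = re.subn(r"g.t", "have", sentence)
```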

6. `compile`: finding every word that contains "oo". (As an English word, "compile" means to assemble or put together.)

```python
import re

text = "JGood is a handsome boy, he is cool,clever,and so"
regex = re.compile(r"\w*oo\w*")
print(regex.findall(text))
# ['JGood', 'cool']
```
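One reason to compile a pattern is to reuse it. A small sketch, assuming a hypothetical list of input strings, could look like this:

```python
import re

regex = re.compile(r"\w*oo\w*")

# Hypothetical inputs, used only for illustration.
texts = ["JGood is a handsome boy", "he is cool,clever,and so", "no match here"]

# The same compiled pattern can be applied to many strings.
results = [regex.findall(t) for t in texts]

# search returns a match object (or None), which is handy for a yes/no test.
has_oo = [regex.search(t) is not None for t in texts]
```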

1. Ordinary characters: most letters and other characters simply match themselves.

```python
re.findall("alex", "wqjdalexjh")
# ['alex']
re.findall("alex", "wqjdalexjhalex")
# ['alex', 'alex']
```

2. Metacharacters: `.` `^` `$` `*` `+` `?` `{}` `[]` `|` `()` `\`

`.` is a wildcard that matches exactly one character: any single character except a newline.

```python
re.findall("al.x", "wqjdalexjhalex")
# ['alex', 'alex']
re.findall("alex.w", "wqjdalexswjh")
# ['alexsw']
```
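The newline exception, and how to match a literal dot, can be sketched as follows. The sample strings are made up for illustration.

```python
import re

# By default "." does not match a newline character.
no_match = re.findall(r"a.b", "a\nb")

# With re.S (DOTALL) the newline is matched too.
with_dotall = re.findall(r"a.b", "a\nb", re.S)

# To match a literal dot, escape it.
literal = re.findall(r"a\.b", "a.b axb")
```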

`^` (caret): the match must begin at the start of the string. The `^` must be placed at the start of the pattern.

```python
re.findall("^alex", "wqjdalexjh")
# []
re.findall("^alex", "alexjh")
# ['alex']
re.findall("^alex", "ssw^alexjh")
# []
```
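By default `^` anchors only at the very start of the string. A short sketch of how the `re.M` flag changes that, using a made-up multi-line string:

```python
import re

lines = "alex one\nbob two\nalex three"

# Without re.M, ^ anchors only at the very start of the string.
start_only = re.findall(r"^alex", lines)

# With re.M, ^ also anchors right after each newline.
each_line = re.findall(r"^alex", lines, re.M)
```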

`$` (dollar sign): the match must finish at the end of the string. The `$` must be placed at the end of the pattern.

```python
re.findall("alex$", "sadsaqalex")
# ['alex']
```
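A minimal sketch of `$` written after the text it anchors, plus the combination of `^` and `$` to require that the whole string matches (sample strings are illustrative):

```python
import re

# "$" after the pattern: "alex" must be the last thing in the string.
at_end = re.findall(r"alex$", "sadsaqalex")
not_at_end = re.findall(r"alex$", "alexsadsaq")

# Combining ^ and $ requires the whole string to match.
whole = re.findall(r"^alex$", "alex")
```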

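`*` matches zero or more repetitions of the preceding element. It is greedy: it consumes as many repetitions as are available.

```python
re.findall("alex*", "sadsaqalex")
# ['alex']
re.findall("alex*", "sadsaqalexxxx")
# ['alexxxx']
re.findall("alex*", "sadsaqale")
# ['ale']
```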

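`+` matches one or more repetitions of the preceding element. It is also greedy: it consumes as many repetitions as are available.

```python
re.findall("alex+", "sadsaqalexxxx")
# ['alexxxx']
re.findall("alex+", "sadsaqale")
# []
```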
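To see `*` and `+` side by side, the sketch below runs both over the same three made-up inputs:

```python
import re

samples = ["ale", "alex", "alexxxx"]

# x* accepts zero or more "x"; x+ needs at least one.
star = [re.findall(r"alex*", s) for s in samples]
plus = [re.findall(r"alex+", s) for s in samples]
```

The only difference shows up on `"ale"`, where `x*` is satisfied by zero repetitions and `x+` is not.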

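`?` matches zero or one repetition of the preceding element. Used on its own it still prefers one repetition when available; placed after another quantifier, as in `*?` or `+?`, it makes that quantifier non-greedy.

```python
re.findall("alex?", "sadsaqalexxxx")
# ['alex']
re.findall("alex?", "sadsaqale")
# ['ale']
re.findall("alex?", "sadsaqalex")
# ['alex']
```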
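The two roles of `?` can be compared directly. This is a small sketch on one illustrative string:

```python
import re

text = "alexxxx"

optional = re.findall(r"alex?", text)     # x? takes one "x" when it is there
greedy = re.findall(r"alex+", text)       # + takes as many as it can
lazy = re.findall(r"alex+?", text)        # +? stops as early as it can
lazy_star = re.findall(r"alex*?", text)   # *? may match zero "x"
```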

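`{ }` matches a chosen number of repetitions: `{n}` means exactly n, and `{3,5}` means the closed interval [3, 5], that is 3, 4, or 5 repetitions. Any counts can be used. Like `*` and `+`, it is greedy by default.

```python
re.findall("alex{3}", "sadsaqalexxxxx")
# ['alexxx']
re.findall("alex{3,5}", "sadsaqalexxx")
# ['alexxx']
re.findall("alex{3,5}", "sadsaqalexxxx")
# ['alexxxx']
re.findall("alex{3,5}", "sadsaqalexxxxx")
# ['alexxxxx']
```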
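A sketch of how a bounded quantifier behaves at its limits follows. It uses `re.fullmatch`, which is not mentioned in the text above, to test whole strings; the sample list is made up.

```python
import re

samples = ["alexx", "alexxx", "alexxxx", "alexxxxxx"]

# fullmatch requires the entire string to fit the pattern,
# so only strings with 3, 4, or 5 trailing "x" pass.
three_to_five = [s for s in samples if re.fullmatch(r"alex{3,5}", s)]

# With findall the quantifier is greedy: given six "x", it takes five.
greedy = re.findall(r"alex{3,5}", "alexxxxxx")

# Appending ? makes it stop at the minimum of three.
lazy = re.findall(r"alex{3,5}?", "alexxxxxx")
```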

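`[ ]` defines a character set: for example, `[bc]` matches a single character that is either `b` or `c`.

```python
re.findall("a[bc]d", "wwwabd")
# ['abd']
re.findall("a[bc]d", "wwwacd")
# ['acd']
re.findall("a[bc]d", "wwwabcd")
# []
```

Most metacharacters lose their special meaning inside a character set:

```python
re.findall("a[.]d", "wwwaqd")
# []
re.findall("a[.]d", "wwwaod")
# []
re.findall("a[.]d", "wwwa.d")
# ['a.d']
```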

```python
re.findall("[a-z]", "wwwa.d")
# ['w', 'w', 'w', 'a', 'd']
re.findall("[1-9]", "w3wwa8.d")
# ['3', '8']
# A leading ^ inside the brackets negates the set: everything except 1 to 9.
re.findall("[^1-9]", "w3wwa8.d")
# ['w', 'w', 'w', 'a', '.', 'd']
re.findall("[^1-9]", "w3wwa8.d0")
# ['w', 'w', 'w', 'a', '.', 'd', '0']
```

```python
re.findall(r"[\d]", "ww3wa8.d0")
# ['3', '8', '0']
```
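The character-set rules above can be collected into one sketch. The results are bound to illustrative names so they are easy to inspect.

```python
import re

text = "w3wwa8.d0"

lower = re.findall(r"[a-z]", text)          # any lowercase letter
digits = re.findall(r"[0-9]", text)         # any digit
not_digits = re.findall(r"[^0-9]", text)    # ^ first inside [] negates the set
dot_or_digit = re.findall(r"[.0-9]", text)  # "." is literal inside []
```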

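`\` has three uses:

1. A backslash before a metacharacter removes its special meaning.

2. A backslash before certain ordinary characters gives them a special meaning.

3. A backslash followed by a number refers to the string matched by the group with that number.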
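Each of the three uses can be illustrated with one line. This is a minimal sketch with made-up sample strings:

```python
import re

# 1. Escaping a metacharacter makes it literal.
literal_star = re.findall(r"a\*", "a* aa")

# 2. A backslash before certain ordinary letters gives a special sequence.
digits = re.findall(r"\d+", "room 101, floor 7")

# 3. \1 refers back to whatever group 1 matched (here: a repeated word).
doubled = re.findall(r"(\w+) \1", "it is is a test test")
```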

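`\d` matches any decimal digit; for ASCII text it is equivalent to the class `[0-9]`. It matches a single digit.

`\D` matches any non-digit character; it is equivalent to the class `[^0-9]`.

`\s` matches any whitespace character; it is equivalent to the class `[ \t\n\r\f\v]`.

`\S` matches any non-whitespace character; it is equivalent to the class `[^ \t\n\r\f\v]`.

`\w` matches any alphanumeric character or underscore; it is equivalent to the class `[a-zA-Z0-9_]`.

`\W` matches any character that is not alphanumeric or an underscore; it is equivalent to the class `[^a-zA-Z0-9_]`.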
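A short sketch of these shorthand classes on one illustrative string that contains digits, an underscore, spaces, and a tab:

```python
import re

text = "id_7 = 42\tok"

digits = re.findall(r"\d", text)     # single digits
non_word = re.findall(r"\W", text)   # the spaces, "=" and the tab
spaces = re.findall(r"\s", text)     # whitespace only
words = re.findall(r"\w+", text)     # "_" counts as a word character
```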

`\b` matches a word boundary, for example the position between a word and a space.

```python
re.findall("abc\b", "asdas abc ")    # without r, "\b" is a backspace character
# []
re.findall(r"abc\b", "asdas abc ")
# ['abc']
re.findall(r"abc\b", "asdas abc*")
# ['abc']
re.findall(r"I\b", "I MISS IOU")
# ['I']
re.findall(r"I\b", " MISS IOU")
# []
re.findall(r"\bI", "I MISS IOU")
# ['I', 'I']

# Doubling the backslash is equivalent to using a raw string.
re.findall("abc\\b", "asdas abc ")
# ['abc']

re.findall(r"abc\b", "asdas abc ")
# ['abc']
```
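The raw-string point is easy to get wrong, so here is a compact sketch of the three spellings on one made-up sentence:

```python
import re

text = "cat concat cats cat."

# r"\bcat\b" matches "cat" only as a whole word.
whole_words = re.findall(r"\bcat\b", text)

# Without the r prefix, "\b" is a backspace character, so nothing matches.
backspace = re.findall("\bcat\b", text)

# Doubling the backslash in a normal string is equivalent to the raw string.
doubled = re.findall("\\bcat\\b", text)
```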

```python
re.findall(r"\d", "ww3wa8.d0")
# ['3', '8', '0']
re.findall(r"\w", "ww3wa8.d0")
# ['w', 'w', '3', 'w', 'a', '8', 'd', '0']
re.findall(r"\s", "ww3wa8.d0")
# [] (this string contains no whitespace)
re.findall(r"[\d]", "ww3wa8.d0")
# ['3', '8', '0']
```

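The difference between `search` and `match`: `search` scans the whole string for the first match, while `match` only succeeds if the pattern matches at the very start of the string.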

```python
re.search("(ab)*", "aba").group()
# 'ab'
re.search("alex*", "abalex").group()
# 'alex'
re.search("alex*", "abalexdsfaalex").group()
# 'alex'
re.match("alex*", "abalexdsfaalex")
# None
```
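Because a failed `match` or `search` returns `None`, calling `.group()` on the result directly can raise an error. A minimal sketch of a safer pattern, with illustrative variable names:

```python
import re

text = "abalexdsfaalex"

m1 = re.match(r"alex", text)    # None: the string does not start with "alex"
m2 = re.search(r"alex", text)   # finds the first occurrence anywhere
m3 = re.match(r"ab", text)      # matches, because "ab" is at position 0

# Check for None before calling .group() or .start().
found = m2.group() if m2 else None
start = m2.start() if m2 else -1
```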
```python
import re

a = "123abc456"
re.search("([0-9]*)([a-z]*)([0-9]*)", a).group(0)
# '123abc456'
re.search("([0-9]*)([a-z]*)([0-9]*)", a).group(1)
# '123'
re.search("([0-9]*)([a-z]*)([0-9]*)", a).group(2)
# 'abc'
re.search("([0-9]*)([a-z]*)([0-9]*)", a).group(3)
# '456'
```
The argument to `group` selects what is returned: `0` returns the entire match, `1` the first parenthesized group, `2` the second, and `3` the third.
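The same match object can hand back all of its groups at once. A small sketch using the pattern from above:

```python
import re

m = re.search(r"([0-9]*)([a-z]*)([0-9]*)", "123abc456")

whole = m.group(0)                  # the entire match
first, second, third = m.groups()   # all groups as a tuple
pair = m.group(1, 3)                # several groups at once
```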

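`( )` groups part of the pattern so that what is inside the parentheses is handled first, as a unit. When a pattern contains a group, `findall` returns the group's content rather than the whole match.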

```python
re.findall(r"a(\d+)", "a23b")
# ['23']
re.search(r"a(\d+)", "a23b").group()
# 'a23'
re.search(r"a(\d+)", "a2355555888b").group()
# 'a2355555888'
re.search(r"a(\d+?)", "a2366666b").group()
# 'a2'
re.search(r"a(\d*?)", "a2366666b").group()
# 'a'
```

```python
re.findall(r"a(\d+)b", "a23b")
# ['23']
```
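What `findall` returns depends on how many groups the pattern has. The sketch below shows the three cases on a made-up string:

```python
import re

text = "a23b a7b"

one_group = re.findall(r"a(\d+)b", text)      # list of group contents
two_groups = re.findall(r"(a)(\d+)b", text)   # list of tuples, one item per group
no_groups = re.findall(r"a\d+b", text)        # list of whole matches
```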

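`re.I`: makes matching case-insensitive.

`re.L`: performs locale-aware matching.

`re.M`: multi-line matching; affects `^` and `$`.

`re.S`: makes `.` match every character, including a newline.

`re.U`: interprets characters according to the Unicode character set; this flag affects `\w`, `\W`, and `\b`.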
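Flags are passed as an extra argument and can be combined with `|`. A minimal sketch on an illustrative three-line string:

```python
import re

text = "Alex\nalex\nALEX"

# re.I ignores case.
ignore_case = re.findall(r"alex", text, re.I)

# re.I | re.M: ignore case, and let ^ and $ anchor at every line.
multi_line = re.findall(r"^alex$", text, re.I | re.M)

# re.S lets "." match the newline between the first two lines.
dot_all = re.search(r"Alex.alex", text, re.S)
no_dot_all = re.search(r"Alex.alex", text)
```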

Additional notes:

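```python
re.findall("www.(baidu|laonanhai).com", "sadd www.baidu.com")
# ['baidu']  findall gives priority to the group and returns only its content
re.findall("www.(baidu|laonanhai).com", "sadd www.laonanhai.com")
# ['laonanhai']
# Writing ?: at the start of the group turns off this group-first behavior.
```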
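The effect of `?:` can be sketched by running the capturing and non-capturing forms on the same string. The dots are escaped here so that they match only a literal `.`, which is a small change from the pattern above.

```python
import re

text = "sadd www.baidu.com"

# Capturing group: findall returns only what the group matched.
captured = re.findall(r"www\.(baidu|laonanhai)\.com", text)

# Non-capturing group (?:...): findall returns the whole match.
whole = re.findall(r"www\.(?:baidu|laonanhai)\.com", text)
```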
