Tags: quality saas one container ++ containe ali index alt
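import requests
import re

# Page to fetch; swap in the commented-out URL to try the second site.
url = 'http://www.jd.com/'
# url = 'http://www.eastmoney.com/'

r = requests.get(url)
r.encoding = 'utf-8'  # decode the response body as UTF-8

# re.S lets "." match newlines, so a <title> that spans several lines is still captured.
data = re.findall('<title>(.*?)</title>', r.text, re.S)
print(data)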
Output for http://www.jd.com/:
['京东(JD.COM)-正品低价、品质保障、配送及时、轻松购物!']

Output for http://www.eastmoney.com/:
['东方财富网:财经门户,提供专业的财经、股票、行情、证券、基金、理财、银行、保险、信托、期货、黄金、股吧、博客等各类财经资讯及数据']
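The same title extraction can be tried without a network connection. The sketch below is only an illustration: `SAMPLE_HTML` and `extract_title` are made-up names that do not appear in the original post, and the sample page stands in for `r.text`. It uses `re.search` instead of `re.findall` because a page normally has a single title.

```python
import re

# Hypothetical sample page, standing in for r.text so no request is needed.
SAMPLE_HTML = """<html>
<head>
<title>
  Example Shop - Home
</title>
</head>
<body><h1>Welcome</h1></body>
</html>"""

# Compile once; re.S lets "." match the newlines inside <title>...</title>.
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.S)


def extract_title(html):
    """Return the stripped text of the first <title> element, or None."""
    match = TITLE_RE.search(html)
    if match is None:
        return None
    # group(1) is the text captured by the parentheses in the pattern.
    return match.group(1).strip()


print(extract_title(SAMPLE_HTML))             # Example Shop - Home
print(extract_title("<p>no title here</p>"))  # None
```

Returning `None` when nothing matches avoids the `IndexError` that indexing an empty `findall` result with `[0]` would raise.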
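import re

# Extract "python"
key = "javapythonc++php"
re.findall("python", key)[0]
# Result: 'python'

# Extract "hello world"
key = "<html><h1>hello world</h1></html>"
re.findall('<h1>hello world</h1>', key)
# Result: ['<h1>hello world</h1>']

# Extract 170 (the sentence means "I like girls with a height of 170")
string = "我喜欢身高为170的女孩"
# re.findall("170", string)[0]
re.findall(r'\d+', string)
# Result: ['170']

# Extract "http" and "https"
key = 'http://www.baidu.com and https://boob.com'
re.findall('https{0,1}', key)  # the character before {0,1} may appear 0 or 1 times
# Result: ['http', 'https']

# Extract "hit."
key = "bobo@hit.edu.com"
re.findall(r"h.*\.", key)  # . matches any character except \n; * means 0 or more; \. is a literal dot
# Result: ['hit.edu.']

# Greedy mode: the pattern matches as much text as possible.
# Append "?" to the quantifier to switch to non-greedy mode.
re.findall(r"h.*?\.", key)
# Result: ['hit.']

# Match "sas" and "saas"
key = "saas and sas and saaas"
re.findall('sa{1,2}s', key)  # the preceding "a" must appear 1 to 2 times
# Result: ['saas', 'sas']

# Match the lines that start with "i"
# re.S: treat the input as a single line ("." also matches \n)
# re.M: treat the input as multiple lines (^ and $ match at every line)
string = '''fall in love with you
i love you very much
i love she
i love her'''
re.findall("^i.*", string, re.M)
# Result: ['i love you very much', 'i love she', 'i love her']

# Match across all the lines (the text is the poem "Quiet Night Thought")
string = """<div>静夜思
床前明月光
疑是地上霜
举头望明月
低头思故乡
</div>"""
re.findall('<div>.*</div>', string, re.S)
# Result: ['<div>静夜思\n床前明月光\n疑是地上霜\n举头望明月\n低头思故乡\n</div>']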
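Two points from the examples are worth a closer look: the "hello world" pattern returns the whole tag rather than the text inside it, and the difference between re.S and re.M is easy to mix up. The sketch below is a supplementary illustration, not part of the original post; the sample string `text` and the variable names are assumptions made for the demonstration.

```python
import re

key = "<html><h1>hello world</h1></html>"

# Parentheses form a capture group: findall returns the group, not the whole match.
inner = re.findall(r"<h1>(.*?)</h1>", key)
print(inner)    # ['hello world']

# Hypothetical three-line sample used to compare the flags.
text = "first line\nindex page\nin between"

# Without re.M, ^ only anchors at the very start of the string,
# and the string starts with "f", so nothing matches.
no_flag = re.findall(r"^i.*", text)
print(no_flag)  # []

# With re.M, ^ also anchors right after each newline.
multi = re.findall(r"^i.*", text, re.M)
print(multi)    # ['index page', 'in between']

# With re.S, "." crosses newlines, so one match swallows the whole string.
dotall = re.findall(r"^f.*", text, re.S)
print(dotall)   # ['first line\nindex page\nin between']
```

In short, re.M changes what `^` and `$` mean, while re.S changes what `.` means; the two flags can be combined as `re.M | re.S` when both behaviors are needed.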
Original article: https://www.cnblogs.com/adam012019/p/12327129.html