Tags: standard library, .com, odi, com, man, 人民币, webkit, HERE, python
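###Page Fetching###
1. urllib3
urllib3 is a powerful, easy-to-use HTTP client that fills gaps in the Python standard library's HTTP support, such as connection pooling, retries, and thread safety.
Install: pip install urllib3
Usage: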
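    import urllib3

    # A PoolManager reuses connections across requests
    http = urllib3.PoolManager()
    response = http.request('GET', 'http://news.qq.com')
    print(response.headers)
    # response.data is raw bytes; this page is assumed to be GBK-encoded
    result = response.data.decode('gbk')
    print(result)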
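The example above hard-codes 'gbk' as the page encoding. As a minimal sketch using only the standard library, the charset can instead be read from the Content-Type header with a fallback. The helper names `charset_from_content_type` and `decode_body` are hypothetical, introduced here for illustration only.

```python
import codecs
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset named in a Content-Type value, or `default`."""
    if not content_type:
        return default
    msg = Message()
    msg["Content-Type"] = content_type
    # get_param() parses parameters such as "text/html; charset=GBK"
    charset = msg.get_param("charset")
    if not isinstance(charset, str):
        return default
    try:
        # Reject charset names that Python has no codec for
        codecs.lookup(charset)
    except LookupError:
        return default
    return charset.lower()


def decode_body(body, content_type, default="utf-8"):
    """Decode raw bytes, replacing undecodable bytes instead of raising."""
    encoding = charset_from_content_type(content_type, default)
    return body.decode(encoding, errors="replace")


# Simulated response body, so the sketch runs without network access
sample = "腾讯新闻".encode("gbk")
print(decode_body(sample, "text/html; charset=GBK"))
```

With urllib3 this would be called as `decode_body(response.data, response.headers.get('Content-Type'))`. Pages that declare their encoding only in an HTML meta tag are not covered by this sketch.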
Sending HTTPS requests
Install the dependency: pip install certifi
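    import certifi
    import urllib3

    # Verify server certificates against the CA bundle shipped with certifi
    http = urllib3.PoolManager(cert_reqs='CERT_REQUIRED', ca_certs=certifi.where())
    # Certificate verification only applies to https:// URLs
    resp = http.request('GET', 'https://news.baidu.com/')
    print(resp.data.decode('utf-8'))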
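The 'CERT_REQUIRED' setting takes its name from the `ssl.CERT_REQUIRED` verify mode in the standard library. The sketch below uses only the `ssl` module, not urllib3 itself, to show what that mode means on a default client context. It assumes the system CA store rather than certifi's bundle.

```python
import ssl

# create_default_context() builds a client-side context with the
# system CA certificates loaded and verification switched on
context = ssl.create_default_context()

# The server must present a certificate that chains to a trusted CA
print(context.verify_mode == ssl.CERT_REQUIRED)
# The certificate must also match the hostname being requested
print(context.check_hostname)
```

Both checks print True. A request to a server with an untrusted or mismatched certificate fails the TLS handshake instead of returning data.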
####Passing parameters
    import urllib3
    from urllib.parse import urlencode

    http = urllib3.PoolManager()
    args = {'wd': '人民币'}

    # Wrong: formatting the dict directly does not produce a query string
    # url = 'http://www.baidu.com/s?%s' % (args)
    # Right: urlencode() percent-encodes the parameters
    url = 'http://www.baidu.com/s?%s' % urlencode(args)
    print(url)
    # resp = http.request('GET', url)
    # print(resp.data.decode('utf-8'))

    headers = {
        'Accept': 'text/javascript, application/javascript, application/ecmascript, application/x-ecmascript, */*; q=0.01',
        # 'br' is left out: urllib3 decodes Brotli only if a brotli package is installed
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'zh-CN,zh;q=0.9',
        'Connection': 'keep-alive',
        'Host': 'www.baidu.com',
        # Header values must be Latin-1 encodable, so percent-encode the query
        'Referer': 'https://www.baidu.com/s?' + urlencode(args),
        'User-Agent': "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36"
    }
    # For GET requests, urllib3 encodes `fields` into the query string
    resp = http.request('GET', 'http://www.baidu.com/s', fields=args, headers=headers)
    print(resp.data.decode('utf-8'))
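To make the difference between the commented-out URL and the urlencode() version concrete, here is a small standard-library sketch that builds both and round-trips the query with parse_qs(). The variable name `broken` is introduced here for illustration only.

```python
from urllib.parse import parse_qs, urlencode

args = {"wd": "人民币"}

# Formatting the dict directly embeds its repr, which is not a valid query
broken = "http://www.baidu.com/s?%s" % (args,)

# urlencode() percent-encodes each key and value, using UTF-8 by default
query = urlencode(args)
url = "http://www.baidu.com/s?" + query

print(broken)
print(url)
# parse_qs() reverses the encoding, a quick round-trip check
print(parse_qs(query))
```

The encoded query is `wd=%E4%BA%BA%E6%B0%91%E5%B8%81`, the UTF-8 bytes of the search term written as percent escapes.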
Original article: https://www.cnblogs.com/Albert-w/p/9013194.html