Python Web Scraping Notes (2): The requests Library

Posted: 2020-05-05 12:38:20

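I. The urllib library

1. Overview of urllib

urllib is Python's built-in HTTP request library.

It includes:

urllib.request: request module

urllib.error: exception-handling module

urllib.parse: URL-parsing module

urllib.robotparser: robots.txt parsing module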
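As a small illustration of one of the modules listed above, here is a minimal sketch of urllib.parse. It runs offline; the URL is only an example string and is never requested.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Split an example URL into its components (no network access involved).
parts = urlparse("http://httpbin.org/get?name=zhaofan&age=23")
print(parts.scheme)   # http
print(parts.netloc)   # httpbin.org
print(parts.path)     # /get

# Turn the query string into a dict; each value is a list of strings.
query = parse_qs(parts.query)
print(query)          # {'name': ['zhaofan'], 'age': ['23']}

# Go the other way: build a query string from a dict.
encoded = urlencode({"name": "zhaofan", "age": 23})
print(encoded)        # name=zhaofan&age=23
```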

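II. The requests library

1. Basic usage

import requests

# Example target (an assumption for illustration); httpbin.org echoes the request back
url = "http://httpbin.org/get"
response = requests.get(url)

print(type(response))          # <class 'requests.models.Response'>
print(response.status_code)    # HTTP status code
print(response.cookies)        # cookies set by the server
print(response.text)           # body decoded to str
print(response.content)        # raw body as bytes
print(response.content.decode("utf-8"))  # bytes decoded manually as UTF-8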

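Note:

response.text often comes out garbled, because requests may guess the wrong encoding for the page. A common workaround is to read response.content, which returns the raw bytes, and decode it as UTF-8 with decode().

Alternatively, if the page is known to be UTF-8, set the encoding explicitly before reading response.text:

response = requests.get(url)
response.encoding = "utf-8"
print(response.text)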
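To see why the wrong encoding produces garbled text, the sketch below decodes the same bytes two ways. The sample string is a stand-in for a response body, not real output from any site. As far as I know, one common cause of the problem is that requests falls back to ISO-8859-1 (latin-1) when a text response declares no charset.

```python
# Stand-in for response.content: UTF-8 bytes of a short Chinese string.
raw = "爬虫".encode("utf-8")

# Decoding with the wrong charset never fails for latin-1, it just yields mojibake.
wrong = raw.decode("latin-1")

# Decoding with the charset the bytes were written in restores the text.
right = raw.decode("utf-8")

print(repr(wrong))
print(right)
```

Setting response.encoding = "utf-8" has the same effect as the second decode: it tells requests which charset to use when building response.text.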

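2. Requests

GET requests

(1) Basic GET request

(2) GET request with parameters

Parameters can be written directly into the URL, in the form get?key=val:

response = requests.get("http://httpbin.org/get?name=zhaofan&age=23")
print(response.text)

Or they can be passed through the params keyword argument:

data = {
    "name": "zhaofan",
    "age": 22
}

response = requests.get("http://httpbin.org/get", params=data)
print(response.url)
print(response.text)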
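The effect of params can be inspected without sending anything. This minimal sketch builds a request with requests.Request and calls prepare(), which produces the final URL that would go on the wire.

```python
import requests

data = {"name": "zhaofan", "age": 22}

# Build the request but do not send it; prepare() encodes params into the URL.
prepared = requests.Request("GET", "http://httpbin.org/get", params=data).prepare()

print(prepared.url)  # http://httpbin.org/get?name=zhaofan&age=22
```

The result has the same get?key=val form as the hand-written URL, with the advantage that requests takes care of URL-encoding the values.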

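Parsing JSON: response.json() in effect runs json.loads() on the response body, so the two give the same result.

import json
import requests

response = requests.get("http://httpbin.org/get")

print(response.json())            # parsed by requests
print(json.loads(response.text))  # parsed by hand, same result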
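The parsing step can be tried offline as well. In this sketch the body string is hypothetical, shaped loosely like what httpbin.org/get returns; it also shows what happens when the body is not JSON at all, which is worth guarding against when scraping.

```python
import json

# Hypothetical response body (not captured from a real request).
body = '{"args": {}, "url": "http://httpbin.org/get"}'

parsed = json.loads(body)
print(type(parsed))   # <class 'dict'>
print(parsed["url"])  # http://httpbin.org/get

# A non-JSON body (an HTML error page, say) makes json.loads raise.
try:
    json.loads("<html>not json</html>")
    is_json = True
except json.JSONDecodeError:
    is_json = False
print(is_json)        # False
```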

Adding headers: some sites (Zhihu, for example) refuse requests sent by requests with its default headers.

Open chrome://version in Google Chrome to see the browser's user agent, then add that user agent to the request headers:

import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
}

response = requests.get("https://www.zhihu.com", headers=headers)

print(response.text)
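A likely reason such sites refuse the request is that the default User-Agent openly identifies the client as a script. The sketch below prints that default and then checks, without sending anything, that a custom header ends up on the request. The short user-agent string is a placeholder, not a real browser value.

```python
import requests

# The User-Agent that requests sends when none is supplied.
default_ua = requests.utils.default_user_agent()
print(default_ua)  # begins with "python-requests/"

# Placeholder value for illustration; use your browser's real string in practice.
headers = {"User-Agent": "Mozilla/5.0 (example)"}

# Prepare the request without sending it and inspect the headers.
prepared = requests.Request("GET", "https://www.zhihu.com", headers=headers).prepare()
print(prepared.headers["User-Agent"])  # Mozilla/5.0 (example)
```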

POST requests

Pass the form data through the data argument:

import requests

data = {
    "name": "zhaofan",
    "age": 23
}

response = requests.post("http://httpbin.org/post", data=data)

print(response.text)
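To see what the data argument actually puts on the wire, this sketch prepares the same POST without sending it. With a dict, requests form-encodes the body and sets the matching Content-Type header.

```python
import requests

data = {"name": "zhaofan", "age": 23}

# Build the POST but do not send it, so the encoded body can be inspected.
prepared = requests.Request("POST", "http://httpbin.org/post", data=data).prepare()

print(prepared.body)                     # name=zhaofan&age=23
print(prepared.headers["Content-Type"])  # application/x-www-form-urlencoded
```

Unlike params in a GET request, the values travel in the request body and the URL stays unchanged.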

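Responses

The response object exposes many attributes:

import requests

response = requests.get("http://www.baidu.com")

print(response.status_code)  # HTTP status code
print(response.headers)      # response headers
print(response.cookies)      # cookies set by the server
print(response.url)          # final URL after any redirects
print(response.history)      # list of redirect responses, if any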

Checking the status code

Status codes have named aliases in requests.codes, for example:

202: accepted

404: not_found
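A minimal sketch of using these names in a check follows. The describe function is a hypothetical helper written for this example, not part of requests.

```python
import requests

# Named aliases compare equal to the plain integers.
print(requests.codes.accepted)   # 202
print(requests.codes.not_found)  # 404
print(requests.codes.ok)         # 200


def describe(status_code):
    """Hypothetical helper: map a status code to a short label."""
    if status_code == requests.codes.ok:
        return "success"
    if status_code == requests.codes.not_found:
        return "page not found"
    return "other"


print(describe(200))  # success
print(describe(404))  # page not found
```

In a scraper, the argument would be response.status_code. Comparing against a name such as requests.codes.not_found reads more clearly than a bare 404.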

Original article: https://www.cnblogs.com/cola-1998/p/12827430.html
