Install the requests module
pip3 install requests
Install the beautifulsoup4 module
[More references] https://blog.csdn.net/sunhuaqiang1/article/details/65936616
pip install beautifulsoup4
[More references] http://www.cnblogs.com/wupeiqi/articles/6283017.html
requests.post(url="", data="data", json="json", **kwargs)
requests.get(url="", params="", **kwargs)
requests.options(url="", **kwargs)
requests.put(url="", data="data", **kwargs)
requests.delete(url="", **kwargs)
requests.head(url="", **kwargs)
requests.get example
import requests
from bs4 import BeautifulSoup

response = requests.get(url="https://www.sogou.com/sgo?query=小猪佩奇")
# print("GET response:", response.text)
soup = BeautifulSoup(response.text, "html.parser")
# class is a Python keyword, so BeautifulSoup takes class_ (with a trailing underscore)
results = soup.find_all(name="div", class_="rt-news151127")
print("Content parsed by BeautifulSoup:", results)
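The class_ filter can be tried without touching the network. This is a minimal sketch on a made-up HTML snippet; the markup and the class name news-item are illustrative and are not taken from the Sogou page.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page
html = """
<div class="news-item">first</div>
<div class="other">skip me</div>
<div class="news-item extra">second</div>
"""

soup = BeautifulSoup(html, "html.parser")
# class_ matches any tag whose class list contains "news-item",
# so the div with two classes is matched as well
items = soup.find_all(name="div", class_="news-item")
texts = [tag.get_text() for tag in items]
print(texts)
```

find_all returns a list of tags, so get_text() is called on each element to extract the visible text.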
requests.post example
import requests

form_data = {
    'phone': '13235',
    'password': 'asdf',
    'oneMonth': 1
}
response_post = requests.post(
    url='http://dig.chouti.com/login',
    data=form_data
)
print(response_post.text)
[More references] http://www.cnblogs.com/wupeiqi/articles/6283017.html
- requests module
a. Basic parameters: method, url, params, data, json, headers, cookies
b. Other parameters: files, auth, proxies, ...
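To see how the basic parameters end up in an actual request, one option is to build a request object and prepare it without sending anything. The sketch below assumes the local test URL used elsewhere in this post; the Referer value is a placeholder.

```python
import requests

# Build a request without sending it; the values are placeholders
req = requests.Request(
    method="GET",
    url="http://127.0.0.1:8000/test/",
    params={"username": "hhh"},
    headers={"Referer": "http://127.0.0.1:8000/"},
)
prepared = req.prepare()

print(prepared.method)   # HTTP verb
print(prepared.url)      # URL with the query string appended
print(prepared.headers)  # headers that would be sent
```

Preparing a request is handy for debugging, because the final URL and headers can be inspected before any traffic leaves the machine.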
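Example: POST/GET request parameters
settings.py
INSTALLED_APPS = [
    ...
    'app01',  # register the app
]

MIDDLEWARE = [
    ...
    # Commented out so the scripted POST below is not rejected by the CSRF check
    # 'django.middleware.csrf.CsrfViewMiddleware',
    ...
]

# Newly added setting; it is a tuple, so mind the trailing comma
STATICFILES_DIRS = (os.path.join(BASE_DIR, "statics"),)

TEMPLATES = [
    {
        ...
        'DIRS': [os.path.join(BASE_DIR, 'templates')],
        ...
    },
]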
urls.py
from django.contrib import admin
from django.urls import path
from app01 import views

# The original used url() from django.conf.urls, which was removed in Django 4.0;
# path() is the current equivalent for a fixed route
urlpatterns = [
    path('test/', views.Test),
]
views.py
from django.shortcuts import render, redirect, HttpResponse
from app01 import models


def Test(request):
    # Print what the server received so each request type can be compared
    print("request.method:", request.method)
    print("request.GET:", request.GET)
    print("request.POST:", request.POST)
    print("request.body:", request.body)
    return HttpResponse("OK")
test.py (run this file after the Django server has started; send the GET and POST requests separately)
import requests

# In a GET request the data and json parameters serve no purpose
requests.request(
    method='get',  # GET parameters are visible in the URL
    url='http://127.0.0.1:8000/test/',
    # A dict is encoded by requests into the query string:
    # test/?username=hhh&passwd=hhh800%40  (@ is percent-encoded)
    params={'username': 'hhh', 'passwd': 'hhh800@'},
    # A ready-made query string also works:
    # params="username=hhh&passwd=hhh800@"
)

# A POST request can carry params, data and json
requests.request(
    method='post',
    url='http://127.0.0.1:8000/test/',
    # params end up in the query string; read them with request.GET.get('username')
    # A ready-made query string also works: params="username=hhh&passwd=hhh800@"
    params={'username': 'hhh', 'passwd': 'hhh800@'},
    # data is read with request.POST.get('age')
    # A dict passed as data is sent form-encoded in the request body, not in the URL,
    # with the header content-type: application/x-www-form-urlencoded
    # data also accepts a pre-encoded string with pairs joined by & (not ;):
    # data="username=hhh&passwd=hhh800@"
    # requests sends such a string as-is and does not set the form content-type for it
    data={'age': 24, 'school': 'peking'},
    # json sets content-type: application/json, so request.body has content
    # and request.POST is empty. Pass the dict itself; requests serializes it,
    # so calling json.dumps first is not needed:
    # json={'age': 24, 'school': 'peking'}
)
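The three encodings can be reproduced with the standard library alone, which makes it easier to see what travels over the wire. This is only an approximation of what requests does internally, using the same sample values as test.py.

```python
import json
from urllib.parse import urlencode

params = {"username": "hhh", "passwd": "hhh800@"}
form = {"age": 24, "school": "peking"}

# params -> query string appended to the URL (read via request.GET)
query_string = urlencode(params)
url = "http://127.0.0.1:8000/test/?" + query_string

# data -> form-encoded text in the body (read via request.POST)
form_body = urlencode(form)

# json -> JSON text in the body (read via request.body)
json_body = json.dumps(form)

print(url)
print(form_body)
print(json_body)
```

The query string and the form body share the same key=value&key=value encoding; only their position in the request differs.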
Server-side output for a POST sent with data:
Server-side output for a POST sent with json:
Server-side output for a GET:
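Because request.POST stays empty for a JSON post, a view has to decode request.body itself in that case. The sketch below mimics that decision outside Django; decode_body is a hypothetical helper written for illustration and is not part of Django or requests.

```python
import json
from urllib.parse import parse_qs


def decode_body(content_type: str, body: bytes) -> dict:
    """Decode a request body according to its Content-Type header."""
    text = body.decode("utf-8")
    if content_type.startswith("application/json"):
        return json.loads(text)
    if content_type.startswith("application/x-www-form-urlencoded"):
        # parse_qs returns a list per key, because a key may repeat
        return parse_qs(text)
    return {}


json_result = decode_body("application/json", b'{"age": 24, "school": "peking"}')
form_result = decode_body("application/x-www-form-urlencoded", b"age=24&school=peking")
print(json_result)
print(form_result)
```

Inside a real view the same idea would read request.body and the request's content type instead of taking them as arguments.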
If the app has to be created manually, the command is:
python manage.py startapp app01
Example: request headers
We usually set two fields in the headers of a POST request:
'User-Agent': tells the server the request comes from a normal browser [Chrome/64.0.3282.186 Safari/537.36]
'Referer': tells the server this is not a direct login; the site was visited earlier and the login follows from that visit
import requests

response = requests.post(
    url="https://www.zhihu.com/",
    headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36',
        'Referer': 'https://www.zhihu.com',  # tells the site the request came from its own pages
    }
)
print("Request with headers:\n", response.text)
Response without request headers:
Response with request headers:
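One likely reason the two responses differ is that requests identifies itself as a script unless told otherwise. The sketch below prints the default User-Agent and then overrides it on a session; nothing is sent over the network, and the header values are copied from the example above.

```python
import requests

# What requests sends as User-Agent when none is supplied
default_ua = requests.utils.default_user_agent()
print(default_ua)

browser_headers = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"
    ),
    "Referer": "https://www.zhihu.com",
}

# Headers set on a session apply to every request made through it
session = requests.Session()
session.headers.update(browser_headers)
print(session.headers["User-Agent"])
```

Setting the headers once on a session avoids repeating the same dict in every call.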
Example: requests with cookies. Sessions and cookies are both used to keep state across requests to the server.
The cookies passed in a POST request are usually the ones obtained from the browser after logging in.
import requests

response = requests.post(
    url="https://home.cnblogs.com/set/",  # open the settings page
    cookies={
        '.Cnblogs.AspNetCore.Cookies': 'CfDJ8Gf34cttDnEy2UYRcGZ0x3iHRU51QX',
        '.CNBlogsCookie': '4BB40C02AC6BB1861B8A9835F7FC06D'  # example values only, not real cookies
    }
)
print("Request with cookies:\n", response.text)
Cookies shown in the browser after a successful login:
Result of requesting the settings page from the script:
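A cookies dict is turned into a single Cookie header before the request goes out. The sketch below prepares a request without sending it so that header can be inspected; the cookie names and values are placeholders, not real credentials.

```python
import requests

# Placeholder cookie values, not real credentials
cookies = {"sessionid": "abc123", "theme": "dark"}

prepared = requests.Request(
    method="GET",
    url="https://home.cnblogs.com/set/",
    cookies=cookies,
).prepare()

# All cookies are joined into one Cookie header
cookie_header = prepared.headers["Cookie"]
print(cookie_header)
```

For a login flow with several steps, a requests.Session keeps the cookies returned by the server and sends them back on later calls, so they do not have to be copied from the browser by hand.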
Python Learning --- Web Scraping [requests module] 180411
Original article: https://www.cnblogs.com/ftl1012/p/9419282.html