Python Web Scraping: Using a Logged-In Cookie to Access Pages That Require Login

Date: 2017-04-08 11:34:08


Reusing the cookie from an already logged-in session to access pages that require login is a common technique in web scraping.

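1. Log in to the site manually in a browser, then open the page you want to scrape. In this example the resource address is

http://27.24.159.151:8005/student/GradeQueryPersonal.aspx

Once the page has loaded, press F12 to open the browser's developer tools.

The request headers for that address show the Cookie the server expects.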
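The Cookie header copied from the developer tools is a single string of `name=value` pairs separated by semicolons. As a minimal sketch, the helper below splits such a string into a dictionary so each value can be checked before it is pasted into a script. It is written for Python 3, and the function name `parse_cookie_header` and the sample cookie values are hypothetical, not part of the original post.

```python
def parse_cookie_header(raw):
    """Split a raw Cookie header string into a {name: value} dict.

    Hypothetical helper for inspection only. Segments that have no
    "=" (and therefore no name) are skipped.
    """
    pairs = {}
    for part in raw.split(";"):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        if sep:
            pairs[name.strip()] = value.strip()
    return pairs


# Made-up sample values, shaped like a header copied from the browser
sample = "ASP.NET_SessionId=abc123; theme=dark"
cookies = parse_cookie_header(sample)
print(cookies)
```

The raw string can still be sent as-is in the request header; the parsed form is only a convenience for spotting a missing or truncated value.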

2. Write the code. Use the urllib2 module that ships with Python 2.7 to send a request whose headers carry that cookie to the resource address.

# -*- coding: utf-8 -*-
'''
Log in to the academic affairs system
'''
import urllib2
from bs4 import BeautifulSoup

# Paste the Cookie your browser sent for the resource address
cook = 'ASP.NET_SessionId=dgrfqd55ugkjc545efy41c45;0090541E6504B0D2010090AC655D12B0D20100002F000000'
HEADERS = {"cookie": cook}
url = 'http://27.24.159.151:8005/student/GradeQueryPersonal.aspx'
# Attach the cookie header so the server treats the request as logged in
request = urllib2.Request(url, headers=HEADERS)
html = urllib2.urlopen(request).read()

# Parse the response and print the page title
soup = BeautifulSoup(html, 'lxml')
title = soup.find('title').get_text()
print title

Running the script prints the title of the page ("学生个人成绩查询", the personal grade query page for students):

python stucookie.py

        学生个人成绩查询
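The script above targets Python 2.7, where urllib2 is available. On Python 3 that module was split into urllib.request and urllib.error, so the same idea might look like the sketch below. It uses only the standard library; the names `build_request`, `fetch`, `TitleParser`, and `extract_title` are hypothetical, and the title is extracted with `html.parser` instead of BeautifulSoup and lxml, which is a swapped-in approach rather than the one the post uses.

```python
import urllib.request
from html.parser import HTMLParser


def build_request(url, cookie):
    """Return a Request that carries the given Cookie header."""
    return urllib.request.Request(url, headers={"Cookie": cookie})


def fetch(url, cookie, timeout=10):
    """Fetch the page body as bytes, sending the logged-in cookie."""
    with urllib.request.urlopen(build_request(url, cookie), timeout=timeout) as resp:
        return resp.read()


class TitleParser(HTMLParser):
    """Collect the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title_parts.append(data)


def extract_title(html_text):
    """Return the stripped page title, or an empty string if none is found."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.title_parts).strip()
```

With the `url` and `cook` values from the original script, `extract_title(fetch(url, cook).decode("utf-8"))` would be expected to return the page title while the session cookie is still valid. The UTF-8 decoding is an assumption; the page's actual encoding should be checked in its response headers or meta tag.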


Original post: http://www.cnblogs.com/shootercheng/p/6680921.html
