Example: Simulating a Douban Login

Posted: 2018-06-09 13:14:17


# -*- coding: utf-8 -*-
import urllib.request

import scrapy

# Login page: https://accounts.douban.com/login


class DoubanSpider(scrapy.Spider):
    name = 'douban'
    allowed_domains = ['www.douban.com', 'accounts.douban.com']
    start_urls = ['https://accounts.douban.com/login']

    def parse(self, response):
        # Look for the captcha image to see whether this login needs a captcha.
        image = response.xpath('//img[@id="captcha_image"]/@src')
        # An empty selector list means the page has no captcha.
        if not image:
            print('no captcha ' * 10)
            # Form fields for a login without a captcha.
            formdata = {
                'source': 'index_nav',
                'form_email': 'your_email@example.com',  # placeholder: use your own account
                'form_password': 'your_password',  # placeholder: use your own password
            }
        else:
            print('captcha required ' * 10)
            # Read the captcha id from the hidden input with an attribute selector.
            captchaid = response.css('input[name="captcha-id"]::attr(value)').extract_first()
            # Get the captcha image URL (resolved against the page URL in case it is relative).
            image_url = response.urljoin(image.extract_first())
            # print('*' * 50)
            # print(captchaid)
            # print(image_url)
            # print('*' * 50)
            # Download the image so it can be read and typed in by hand.
            urllib.request.urlretrieve(image_url, 'code.png')
            code = input('Enter the captcha: ')
            # Form fields for a login with a captcha.
            formdata = {
                'source': 'None',
                'redir': 'https://www.douban.com/',
                'form_email': 'your_email@example.com',  # placeholder: use your own account
                'form_password': 'your_password',  # placeholder: use your own password
                'captcha-solution': code,
                'captcha-id': captchaid,
                'login': '登录',
            }

        post_url = 'https://accounts.douban.com/login'
        # Send the POST request in both cases, with or without a captcha.
        yield scrapy.FormRequest(url=post_url, formdata=formdata, callback=self.lala)

    # Save the response to a file to check whether the login succeeded.
    def lala(self, response):
        print('*' * 50)
        with open('douban.html', 'wb') as fp:
            fp.write(response.body)
        print('*' * 50)
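
The two `formdata` dictionaries in the spider above share most of their fields, and the account details are written directly into the source. Below is a minimal sketch of one way to build the form in a single place. It assumes a hypothetical helper named `build_login_form` and hypothetical environment variables `DOUBAN_EMAIL` and `DOUBAN_PASSWORD`; none of these appear in the original post. The form field names are the ones the spider already uses.

```python
import os


def build_login_form(email, password, captcha_id=None, captcha_solution=None):
    """Return the login form fields used by the spider (hypothetical helper)."""
    form = {
        'form_email': email,
        'form_password': password,
    }
    if captcha_id is None:
        # No captcha on the page: same fields as the spider's first branch.
        form['source'] = 'index_nav'
    else:
        # Captcha present: same extra fields as the spider's second branch.
        form.update({
            'source': 'None',
            'redir': 'https://www.douban.com/',
            'captcha-solution': captcha_solution,
            'captcha-id': captcha_id,
            'login': '登录',
        })
    return form


def credentials_from_env():
    """Read credentials from environment variables (the names are assumptions)."""
    return os.environ.get('DOUBAN_EMAIL', ''), os.environ.get('DOUBAN_PASSWORD', '')
```

Inside `parse`, the captcha branch could then call `build_login_form(email, password, captchaid, code)` and the other branch `build_login_form(email, password)`, passing the result to `scrapy.FormRequest` as `formdata`.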

Original source: https://www.cnblogs.com/airapple/p/9158846.html
