scrapy

Date: 2019-10-30 16:29:44

Tags: encoding code div auth utf-8 fir color mini coding

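# -*- coding: utf-8 -*-
__author__ = 'Administrator'

import scrapy


class QuoteSpider(scrapy.Spider):
    name = 'poxiao'
    start_urls = ['https://www.poxiao.com/type/movie/']

    def parse(self, response):  # default callback Scrapy uses for each start URL
        quotes = response.xpath('//li/h3')  # one <h3> per movie entry
        for quote in quotes:
            href = quote.xpath('./a/@href').extract_first()
            if href is None:
                continue  # skip headings that contain no link
            yield {
                'name': quote.xpath('./a/text()').extract_first(),
                # Despite the key name, this field holds the absolute URL of the movie page.
                'author': response.urljoin(href),
            }
        # Follow the pager once per page (outside the loop), not once per item.
        next_page = response.xpath('//div[@class="list-pager"]/a[last()-1]/@href').extract_first()
        if next_page:
            yield response.follow(next_page, self.parse)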

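The spider above uses Scrapy to collect each movie's title and link from a listing page, following the pager from one page to the next.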
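The selection logic can be tried offline before running a crawl. The sketch below applies the same XPath ideas to a small, hypothetical page using only the standard library: `xml.etree.ElementTree` stands in for Scrapy's selectors, and `urllib.parse.urljoin` stands in for `response.urljoin`. The markup in `SAMPLE_PAGE`, the helper name `extract`, and the assumption that the second-to-last pager link is the "next page" link are all illustrative; the real site's structure may differ, and real-world HTML is often not well-formed enough for `ElementTree` to parse.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical, simplified markup (an assumption, not the real page).
SAMPLE_PAGE = """
<html><body>
  <ul>
    <li><h3><a href="/movie/1.html">Movie A</a></h3></li>
    <li><h3><a href="/movie/2.html">Movie B</a></h3></li>
  </ul>
  <div class="list-pager">
    <a href="index_1.html">1</a>
    <a href="index_2.html">2</a>
    <a href="index_2.html">Next</a>
    <a href="index_9.html">Last</a>
  </div>
</body></html>
""".strip()

BASE_URL = "https://www.poxiao.com/type/movie/"


def extract(page_source, base_url):
    """Return (items, next_url) for one listing page."""
    root = ET.fromstring(page_source)
    items = []
    for h3 in root.findall(".//li/h3"):
        link = h3.find("./a")
        if link is None:
            continue  # heading without a link
        items.append({
            "name": link.text,
            # Same key as the spider; the value is an absolute URL.
            "author": urljoin(base_url, link.get("href")),
        })
    # a[last()-1] selects the second-to-last <a> inside the pager.
    next_link = root.find(".//div[@class='list-pager']/a[last()-1]")
    next_url = None
    if next_link is not None:
        next_url = urljoin(base_url, next_link.get("href"))
    return items, next_url


if __name__ == "__main__":
    found_items, found_next = extract(SAMPLE_PAGE, BASE_URL)
    for entry in found_items:
        print(entry)
    print("next page:", found_next)
```

Note how the two kinds of `href` resolve differently: a path starting with `/` is joined to the site root, while a bare file name such as `index_2.html` is joined to the current directory. This is why resolving against the response URL is safer than prepending a fixed host string.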

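scrapy runspider ***.py    runs the spider file directly; no Scrapy project is required

scrapy runspider ***.py -o aa.json    saves the scraped items to a JSON file

scrapy runspider ***.py -o aa.csv    saves the scraped items to a CSV file, which Excel can open; Scrapy infers the format from the .csv extension, so the -t csv flag is not needed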
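A CSV file containing Chinese titles may show garbled characters when opened in Excel if it lacks a byte order mark. As a minimal sketch, assuming the JSON export is a flat array of items like those the spider yields, the standard-library code below converts that JSON to CSV and writes it with the `utf-8-sig` encoding. The function names `items_json_to_csv` and `save_for_excel` and the sample data are hypothetical. Scrapy's `FEED_EXPORT_ENCODING` setting may make this step unnecessary, so treat this as one option rather than the required approach.

```python
import csv
import io
import json

# Hypothetical sample in the shape produced by "-o aa.json".
SAMPLE_JSON = """
[
  {"name": "Movie A", "author": "https://www.poxiao.com/movie/1.html"},
  {"name": "Movie B", "author": "https://www.poxiao.com/movie/2.html"}
]
"""


def items_json_to_csv(json_text):
    """Convert a JSON array of flat items into CSV text."""
    items = json.loads(json_text)
    # Collect column names in first-seen order across all items.
    fieldnames = []
    for item in items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


def save_for_excel(csv_text, path):
    """Write CSV text with a UTF-8 byte order mark so Excel detects the encoding."""
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        handle.write(csv_text)


if __name__ == "__main__":
    print(items_json_to_csv(SAMPLE_JSON))
```

Calling `save_for_excel(items_json_to_csv(text), "aa.csv")` on the contents of an exported JSON file would produce a CSV file that starts with the UTF-8 byte order mark.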

Original article: https://www.cnblogs.com/xupanfeng/p/11765545.html
