Learning the Scrapy Framework (6): Log Settings and Data Storage

Date: 2021-02-01 12:18:26


Log settings

CRITICAL: severe errors
ERROR: ordinary errors
WARNING: warnings
INFO: general information
DEBUG: debugging information
The default log level is DEBUG.

# Set the log level
LOG_LEVEL = 'DEBUG'
# Write log messages to a file instead of showing them on screen
LOG_FILE = 'log.txt'
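Scrapy's logging builds on Python's standard logging module, so the effect of a level threshold can be tried out without running a spider. The following is only a sketch of that filtering behavior, not Scrapy's own code: the logger name log_level_demo and the in-memory buffer are assumptions made for illustration, and setLevel stands in for what LOG_LEVEL = 'WARNING' would do in settings.py.

```python
import io
import logging

# Collect records in memory so the effect of the threshold is easy to inspect.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

logger = logging.getLogger("log_level_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False          # keep the demo output out of the root logger
logger.setLevel(logging.WARNING)  # plays the role of LOG_LEVEL = 'WARNING'

# One message per level, from least to most severe.
logger.debug("debug detail")
logger.info("general information")
logger.warning("something looks off")
logger.error("an ordinary error")
logger.critical("a severe error")

captured = buffer.getvalue().splitlines()
print(captured)
```

Only the WARNING, ERROR and CRITICAL messages reach the buffer. With the default DEBUG level all five would appear, which is why raising the level is a common way to quiet a noisy crawl.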


Data storage

Define three methods in pipelines.py:

def open_spider(self, spider)    called when the spider starts

def close_spider(self, spider)   called when the spider finishes

def download(self, item)         a custom method that downloads the file
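The order in which these hooks run can be sketched without Scrapy: open_spider once, then process_item for every item, then close_spider once. The stand-in below is an illustration under assumptions, not how Scrapy drives pipelines internally. SketchPipeline, run_once and the sample items are hypothetical names, and an in-memory buffer replaces the output file.

```python
import io
import json


class SketchPipeline:
    """Stand-in exposing the same hooks as an item pipeline (illustrative only)."""

    def open_spider(self, spider):
        # Acquire the resource once, before any item arrives.
        self.fp = io.StringIO()

    def process_item(self, item, spider):
        # Called once per item: serialize it as one JSON line.
        self.fp.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        return item  # hand the item on to any later pipeline

    def close_spider(self, spider):
        # Release the resource once, after the last item.
        self.output = self.fp.getvalue()
        self.fp.close()


def run_once(pipeline, items, spider=None):
    """Hypothetical driver that mimics the call order described above."""
    pipeline.open_spider(spider)
    for item in items:
        pipeline.process_item(item, spider)
    pipeline.close_spider(spider)
    return pipeline.output


demo_items = [{"name": "user_a"}, {"name": "user_b"}]  # made-up sample data
result = run_once(SketchPipeline(), demo_items)
print(result)
```

Opening the file in open_spider and closing it in close_spider means the file is opened once per crawl rather than once per item.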

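# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://docs.scrapy.org/en/latest/topics/item-pipeline.html


# useful for handling different item types with a single interface
from itemadapter import ItemAdapter

import json
import os
import urllib.request

class MyfirstScrapydemoPipeline:

    # Called when the spider starts
    def open_spider(self, spider):
        print('Spider started')
        self.fp = open('qiushibaike.txt', 'w', encoding='utf8')

    def process_item(self, item, spider):
        # Download the avatar image
        self.download(item)
        # Convert the item to a dict
        obj = dict(item)
        # Serialize the dict to a JSON string
        string = json.dumps(obj, ensure_ascii=False)
        # Write one JSON object per line
        self.fp.write(string + '\n')
        # Author's note: with "return item" left uncommented, only the
        # first page's avatars were downloaded. Scrapy's documentation expects
        # process_item to return the item (or raise DropItem) so that later
        # pipelines receive it; it stays commented out here as in the original.
        # return item

    def download(self, item):
        # Directory where the avatar images are stored
        dirpath = r'F:\python_project\爬虫\myfirst_scrapyDemo\myfirst_scrapyDemo\spiders\头像'
        # Build the file name for this image
        name = item['name'] + '.jpg'
        # Join directory and file name into the full storage path
        filepath = os.path.join(dirpath, name)
        # Download the avatar image
        urllib.request.urlretrieve(item['face_src'], filepath)


    # Called when the spider finishes
    def close_spider(self, spider):
        print('Spider finished')
        self.fp.close()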
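As the header comment in pipelines.py says, the pipeline only runs once it is listed in the ITEM_PIPELINES setting. A minimal settings.py fragment might look like the following. The dotted module path is an assumption inferred from the project directory name in the code above, and 300 is just a conventional priority (values are customarily chosen in the 0 to 1000 range, with lower numbers running first).

```python
# settings.py (fragment)
ITEM_PIPELINES = {
    # Assumed dotted path: <project package>.pipelines.<pipeline class>
    "myfirst_scrapyDemo.pipelines.MyfirstScrapydemoPipeline": 300,
}
```

If the project package has a different name, the key has to be adjusted to match it.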


Original article: https://www.cnblogs.com/gostClimbers/p/14350367.html
