
Scrapy: Exporting Fields in a Specified Order -《狗嗨默示录》-

Date: 2017-08-20 00:48:06

items.py

import scrapy

class CollectipsItem(scrapy.Item):

    IP = scrapy.Field()
    PORT = scrapy.Field()
    POSITION = scrapy.Field()
    TYPE = scrapy.Field()
    SPEED = scrapy.Field()
    CONNECT_TIME = scrapy.Field()
    SURVIVE_TIME = scrapy.Field()
    LAST_CHECK_TIME = scrapy.Field()

(1) Add a file named csv_item_exporter.py to the spiders directory

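from scrapy.exporters import CsvItemExporter
from scrapy.utils.project import get_project_settings

# The original imports (scrapy.conf, scrapy.contrib.exporter) were deprecated
# and later removed from Scrapy; these are their current equivalents.
settings = get_project_settings()

class MyProjectCsvItemExporter(CsvItemExporter):

    def __init__(self, *args, **kwargs):
        # Extra keyword arguments are passed through to csv.writer.
        delimiter = settings.get('CSV_DELIMITER', ',')
        kwargs['delimiter'] = delimiter

        # The columns are written in the order given by this list.
        fields_to_export = settings.get('FIELDS_TO_EXPORT', [])
        if fields_to_export:
            kwargs['fields_to_export'] = fields_to_export

        super(MyProjectCsvItemExporter, self).__init__(*args, **kwargs)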

(2) Configure settings.py

FEED_EXPORTERS = {
    'csv': 'CollectIPs.spiders.csv_item_exporter.MyProjectCsvItemExporter',
}  # CollectIPs is the project name

FIELDS_TO_EXPORT = [
    'IP',
    'PORT',
    'POSITION',
    'TYPE',
    'SPEED',
    'CONNECT_TIME',
    'SURVIVE_TIME',
    'LAST_CHECK_TIME',
]

The delimiter used in the CSV file can also be set in settings.py:

CSV_DELIMITER = "\t"
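
To see what these two settings amount to in the output file, here is a minimal sketch that uses only the standard-library csv module. It is not Scrapy's implementation; the function name export_rows, the shortened field list, and the sample row are all made up for illustration.

```python
import csv
import io

# Hypothetical stand-ins for the two settings above (field list shortened).
FIELDS_TO_EXPORT = ["IP", "PORT", "TYPE"]
CSV_DELIMITER = "\t"

def export_rows(items, fields=FIELDS_TO_EXPORT, delimiter=CSV_DELIMITER):
    """Write dict-like items as CSV with columns in the order given by `fields`."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(fields)  # header row follows the configured order
    for item in items:
        # Missing fields become empty cells; keys not in `fields` are dropped.
        writer.writerow([item.get(name, "") for name in fields])
    return buffer.getvalue()

# Illustrative data only: the keys arrive in a different order than configured.
sample = [{"TYPE": "HTTP", "PORT": "8080", "IP": "10.0.0.1"}]
output = export_rows(sample)
```

Whatever order the keys have in each item, the columns come out in the configured order and are separated by tabs, which is the effect the custom exporter is meant to achieve.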

 


Original article: http://www.cnblogs.com/LiGoHi/p/7398320.html
