
Scrapy--1: Installation and Running



1. Scrapy installation problems

At first I installed Scrapy with pip, exactly as the official docs describe. Creating a project went fine,

but running scrapy crawl dmoz threw error after error /(ㄒoㄒ)/~~ for example:

ImportError: No module named _cffi_backend

Unhandled error in Deferred

and so on. It turned out that a lot of dependency packages had never been installed, so I went off searching for how to install each one.
Plenty of people have already written good summaries. Respect! ^_^

 http://blog.csdn.net/niying/article/details/27103081

http://blog.csdn.net/pleasecallmewhy/article/details/19354723
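In my case (a guess from the two errors above, not something the linked posts spell out), the ImportError meant the cffi package was missing, and the Deferred error cleared up once the TLS dependencies were in place. On Windows that amounted to something like:

E:\tutorial>pip install cffi
E:\tutorial>pip install cryptography
E:\tutorial>pip install pyOpenSSL
E:\tutorial>pip install lxml

pyOpenSSL pulls in cryptography, which is the package that actually needs _cffi_backend at runtime.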

2. No data scraped: it turned out to be a spelling mistake.

E:\tutorial>scrapy crawl dmoz
2015-10-30 13:44:02 [scrapy] INFO: Scrapy 1.0.3 started (bot: tutorial)
2015-10-30 13:44:02 [scrapy] INFO: Optional features available: ssl, http11
2015-10-30 13:44:02 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'tutorial.spiders', 'SPIDER_MODULES': ['tutorial.spiders'], 'BOT_NAME': 'tutorial'}
2015-10-30 13:44:02 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState
2015-10-30 13:44:03 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2015-10-30 13:44:03 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2015-10-30 13:44:03 [scrapy] INFO: Enabled item pipelines:
2015-10-30 13:44:03 [scrapy] INFO: Spider opened
2015-10-30 13:44:03 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-10-30 13:44:03 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2015-10-30 13:44:03 [scrapy] INFO: Closing spider (finished)
2015-10-30 13:44:03 [scrapy] INFO: Dumping Scrapy stats:
{'finish_reason': 'finished',
 'finish_time': datetime.datetime(2015, 10, 30, 5, 44, 3, 292000),
 'log_count/DEBUG': 1,
 'log_count/INFO': 7,
 'start_time': datetime.datetime(2015, 10, 30, 5, 44, 3, 282000)}
2015-10-30 13:44:03 [scrapy] INFO: Spider closed (finished)

In dmoz_spiders.py under the spiders directory I had written start_url instead of start_urls. Scrapy just treats the misspelled name as an unused class attribute, so the spider had no start URLs at all, which is exactly why the log above shows "Crawled 0 pages ... scraped 0 items" and the spider closes immediately. ╮(╯▽╰)╭ The corrected lines:

start_urls = [
    "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
    "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
]
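For completeness, here is a minimal version of the whole spider with the attribute spelled correctly, as it would look in the Scrapy 1.0 tutorial layout. The parse body is my own sketch (the original post does not show it), so treat the XPath as an illustration:

import scrapy

class DmozSpider(scrapy.Spider):
    name = "dmoz"                      # the name used in "scrapy crawl dmoz"
    allowed_domains = ["dmoz.org"]
    # Must be start_urls (plural); a misspelled attribute is silently ignored
    # and the spider starts with nothing to crawl.
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        # Sketch: log each page title; the real tutorial extracts
        # links and descriptions into Items.
        title = response.xpath("//title/text()").extract_first()
        self.log("Visited %s (title: %s)" % (response.url, title))

With this in place the same scrapy crawl dmoz run reports "Crawled 2 pages" instead of 0.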

 



Original post: http://www.cnblogs.com/RoundGirl/p/4920426.html
