
31. Dangdang Book Bestseller List Crawler

Date: 2019-05-12 01:54:05


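Exercise overview
Requirements:
    Use Scrapy to crawl the first 3 pages of Dangdang's 2018 book bestseller list and extract each book's title, author, and price.

    Link to the Dangdang 2018 book bestseller list (page 1, as used by the spider in step 2): http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1

Objectives:
    Practice defining an item
    Practice writing the spider file
    Practice modifying the settings file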
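
The "first 3 pages" requirement maps onto the last number of the ranking URL. The sketch below shows one way to generate those URLs; the base URL is taken from the spider in step 2, while the helper name `build_page_urls` and its parameters are hypothetical and used only for illustration.

```python
# Base URL taken from the spider in step 2; the page number is the final suffix.
BASE_URL = "http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-"


def build_page_urls(first_page=1, last_page=3):
    """Return the ranking-page URLs for first_page..last_page (inclusive)."""
    return [BASE_URL + str(page) for page in range(first_page, last_page + 1)]


if __name__ == "__main__":
    # Print the three URLs the exercise asks for.
    for url in build_page_urls():
        print(url)
```

The spider in step 2 builds the same list with a plain `for` loop in the class body, which works equally well.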
 
1. Create the Dangdang crawler project
 
D:\USERDATA\python>scrapy startproject dangdang
New Scrapy project dangdang, using template directory c:\users\www1707\appdata\local\programs\python\python37\lib\site-packages\scrapy\templates\project, created in:
    D:\USERDATA\python\dangdang

You can start your first spider with:
    cd dangdang
    scrapy genspider example example.com

D:\USERDATA\python>
 
2. Create the spider file D:\USERDATA\python\dangdang\dangdang\spiders\dangdang.py
 
import scrapy
import bs4
from ..items import DangdangItem


class DangdangSpider(scrapy.Spider):
    name = 'dangdang'
    # Domain only: a full URL here (as in the run logged in step 5) triggers a URLWarning and is ignored.
    allowed_domains = ['bang.dangdang.com']
    # Build the URLs of ranking pages 1 to 3.
    start_urls = []
    for x in range(1, 4):
        url = 'http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-' + str(x)
        start_urls.append(url)

    def parse(self, response):
        bs = bs4.BeautifulSoup(response.text, 'html.parser')
        # Each <li> inside the "bang_list_mode" list is one book entry.
        datas = bs.find('ul', class_='bang_list_mode').find_all('li')
        for data in datas:
            item = DangdangItem()
            item['bang_num'] = data.find('div', class_='list_num').text
            item['book_name'] = data.find('div', class_='name').text
            item['book_author'] = data.find('div', class_='publisher_info').text
            item['price'] = data.find('span', class_='price_n').text
            yield item
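
In `parse`, every `data.find(...)` call is followed directly by `.text`. BeautifulSoup's `find()` returns `None` when nothing matches, so a list entry that lacks one of these elements would raise an `AttributeError`. The following is a minimal sketch of a guard against that; the helper name `text_or_default` is hypothetical, and the `SimpleNamespace` object merely stands in for a parsed tag so the example runs without the page.

```python
from types import SimpleNamespace


def text_or_default(node, default=""):
    """Return the stripped text of a parsed element, or `default` if it is missing."""
    if node is None:  # find() matched nothing
        return default
    return node.text.strip()


# Stand-ins for what data.find(...) may return: an object with .text, or None.
found = SimpleNamespace(text="  ¥41.10 \n")
missing = None

print(text_or_default(found))
print(repr(text_or_default(missing)))
print(text_or_default(missing, default="n/a"))
```

In the spider this could be used as `item['price'] = text_or_default(data.find('span', class_='price_n'))`, assuming an empty string is an acceptable placeholder for a missing field.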

 

3. Edit D:\USERDATA\python\dangdang\dangdang\items.py
 
import scrapy


class DangdangItem(scrapy.Item):
    bang_num = scrapy.Field()     # rank on the bestseller list
    book_name = scrapy.Field()    # book title
    book_author = scrapy.Field()  # author (and publisher) text
    price = scrapy.Field()        # price

 

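4. Edit D:\USERDATA\python\dangdang\dangdang\settings.py
 
BOT_NAME = 'dangdang'
SPIDER_MODULES = ['dangdang.spiders']
NEWSPIDER_MODULE = 'dangdang.spiders'
# Send a browser-like User-Agent instead of Scrapy's default.
USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'
# Note: the log in step 5 does not list ROBOTSTXT_OBEY among the overridden settings and shows no robots.txt request, which suggests that run used Scrapy's default (False).
ROBOTSTXT_OBEY = True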
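
Since the exercise is about modifying settings, here is a hedged sketch of two further options that could be added to settings.py to make the crawl gentler on the site. Both `DOWNLOAD_DELAY` and `CONCURRENT_REQUESTS_PER_DOMAIN` are standard Scrapy settings, but they are not part of the original exercise and the values shown are assumptions.

```python
# Optional additions to settings.py (not part of the original exercise).
DOWNLOAD_DELAY = 1                  # wait roughly 1 second between requests to the same site
CONCURRENT_REQUESTS_PER_DOMAIN = 1  # fetch one page at a time from bang.dangdang.com
```

With only three pages to fetch, the effect is small; the settings matter more if the page range is extended.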

 

5. Run the command scrapy crawl dangdang in D:\USERDATA\python\dangdang
 
D:\USERDATA\python\dangdang>scrapy crawl dangdang
2019-05-08 17:00:28 [scrapy.utils.log] INFO: Scrapy 1.6.0 started (bot: dangdang)
2019-05-08 17:00:28 [scrapy.utils.log] INFO: Versions: lxml 4.3.3.0, libxml2 2.9.5, cssselect 1.0.3, parsel 1.5.1, w3lib 1.20.0, Twisted 19.2.0, Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 19.0.0 (OpenSSL 1.1.1b  26 Feb 2019), cryptography 2.6.1, Platform Windows-10-10.0.17134-SP0
2019-05-08 17:00:28 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'dangdang', 'NEWSPIDER_MODULE': 'dangdang.spiders', 'SPIDER_MODULES': ['dangdang.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'}
2019-05-08 17:00:28 [scrapy.extensions.telnet] INFO: Telnet Password: f05741387a33a05e
2019-05-08 17:00:28 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.logstats.LogStats']
2019-05-08 17:00:28 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2019-05-08 17:00:28 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2019-05-08 17:00:28 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2019-05-08 17:00:28 [scrapy.core.engine] INFO: Spider opened
2019-05-08 17:00:28 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2019-05-08 17:00:28 [py.warnings] WARNING: c:\users\www1707\appdata\local\programs\python\python37\lib\site-packages\scrapy\spidermiddlewares\offsite.py:61: URLWarning: allowed_domains accepts only domains, not URLs. Ignoring URL entry http://bang.dangdang.com in allowed_domains.
  warnings.warn(message, URLWarning)

2019-05-08 17:00:28 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2019-05-08 17:00:28 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2> (referer: None)
2019-05-08 17:00:28 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3> (referer: None)
2019-05-08 17:00:28 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1> (referer: None)
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '21.',
 'book_author': '[日]东野圭吾 著,新经典 出品',
 'book_name': '东野圭吾:白夜行(2017版,易烊千玺、韩雪推荐,东野圭吾无冕之...',
 'price': '¥41.10'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '22.',
 'book_author': '(法)安托万·德·圣埃克苏佩里 著,李继宏 译,果麦文化 出品',
 'book_name': '小王子(畅销300万册,作者基金会官方认证简体中文版)【果麦经典...',
 'price': '¥17.60'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '23.',
 'book_author': '刘慈欣',
 'book_name': '三体:全三册 刘慈欣代表作,亚洲首部“雨果奖”获奖作品!',
 'price': '¥55.80'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '24.',
 'book_author': '钱钟书\u3000著',
 'book_name': '围城',
 'price': '¥24.90'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '25.',
 'book_author': '黄仁宇',
 'book_name': '万历十五年 一本好书 腾讯视频栏目推荐',
 'price': '¥24.70'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '41.',
 'book_author': '海明威 罗曼·罗兰 塞尔玛·拉格洛夫 等,张荣梅 策划,小当当童书馆 出品',
 'book_name': '诺奖少年版(全套30册)2018当当童书畅销书,日销售ZUI高达50000...',
 'price': '¥248.10'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '1.',
 'book_author': '余华',
 'book_name': '活着(2017年新版)',
 'price': '¥28.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '26.',
 'book_author': '(美)加·泽文 Gabrielle Zevin 著;孙仲旭、李玉瑶 译;读客文化 出品',
 'book_name': '岛上书店(每个人的生命中,都有无比艰难的那一年,将人生变得美...',
 'price': '¥29.80'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '27.',
 'book_author': '贾平凹 著 时代华语 出品',
 'book_name': '自在独行     贾平凹的独行世界',
 'price': '¥28.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '28.',
 'book_author': '姜自霞',
 'book_name': '魔法拼音国(套装 共7册)',
 'price': '¥49.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '29.',
 'book_author': '张嘉佳 著,博集天卷 出品',
 'book_name': '云边有个小卖部',
 'price': '¥21.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '30.',
 'book_author': '路遥 著,新经典 出品',
 'book_name': '平凡的世界:全三册(朱一龙推荐,八年级下册自主阅读推荐)',
 'price': '¥74.50'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '31.',
 'book_author': '戴尔·卡耐基 著,陶曚 译,果麦文化 出品',
 'book_name': '人性的弱点(薛之谦推荐,畅销100万册)【果麦经典】',
 'price': '¥28.50'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '32.',
 'book_author': '大冰 著,博集天卷 出品',
 'book_name': '我不(大冰作品。十个月狂销200万册,不容错过的奇书!)',
 'price': '¥21.50'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '33.',
 'book_author': '〔英〕毛姆 著  苏福忠 译',
 'book_name': '月亮和六便士(全新导读无删节详注版! 半年创当当110000名读者五...',
 'price': '¥24.30'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '34.',
 'book_author': '陈磊(二混子) 著;读客文化 出品',
 'book_name': '半小时漫画中国史(修订版)(看半小时漫画,通五千年历史!《半...',
 'price': '¥35.90'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '35.',
 'book_author': '(美)怀特\u3000著,任溶溶\u3000译',
 'book_name': '夏洛的网(新)',
 'price': '¥19.50'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '36.',
 'book_author': '克莱儿·麦克福尔,白马时光 出品',
 'book_name': '摆渡人2:重返荒原(系列畅销千万册。每一个镌刻着爱与善意的灵魂...',
 'price': '¥38.80'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '37.',
 'book_author': '东野圭吾 著,娄美莲 译,新经典 出品',
 'book_name': '东野圭吾:恶意(2016版,东野圭吾四大杰作之一)',
 'price': '¥27.30'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '38.',
 'book_author': '(哥伦)马尔克斯\u3000 著,杨玲 译,新经典 出品',
 'book_name': '霍乱时期的爱情(2015版)  一本好书 腾讯视频栏目推荐',
 'price': '¥34.20'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '39.',
 'book_author': '埃德加·斯诺 著;董乐山 译',
 'book_name': '红星照耀中国(青少版)人民文学出版社',
 'price': '¥20.50'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-2>
{'bang_num': '40.',
 'book_author': '蔡崇达 著,果麦文化 出品',
 'book_name': '皮囊(畅销300万册的国民读本,刘德华、李敬泽作序。繁体版面世即...',
 'price': '¥21.80'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '42.',
 'book_author': '(日)山下英子',
 'book_name': '断舍离',
 'price': '¥27.10'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '43.', 'book_author': '杨绛', 'book_name': '我们仨', 'price': '¥23.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '44.',
 'book_author': '老杨的猫头鹰',
 'book_name': '好看的皮囊千篇一律,有趣的灵魂万里挑一(老杨的猫头鹰最新作品...',
 'price': '¥31.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '45.',
 'book_author': '(美)玛兹丽施\u3000著,安燕玲\u3000译',
 'book_name': '如何说孩子才会听 怎么听孩子才肯说(全新修订版)',
 'price': '¥36.80'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '46.',
 'book_author': '王小波 著,新经典 出品',
 'book_name': '一只特立独行的猪',
 'price': '¥22.80'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '47.',
 'book_author': '(美)莱曼·弗兰克·鲍姆,(德)格林兄弟,(丹)安徒生等著,张荣梅 策划,小当当童书馆 出品',
 'book_name': '百年童话绘本·典藏版(全套30册)当当2018年度常青藤畅销书奖,...',
 'price': '¥208.60'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '48.',
 'book_author': '杨绛',
 'book_name': '我们仨(新版)',
 'price': '¥15.80'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '49.',
 'book_author': '著 (日)莳田晋至,译 吴佳芬,绘 (日)长谷川知子',
 'book_name': '在教室说错了没关系',
 'price': '¥18.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '50.',
 'book_author': '高春香,邵敏 著,许明振,李婧 绘',
 'book_name': '这就是二十四节气(中国二十四节气彩绘版,文津图书奖获奖绘本,...',
 'price': '¥50.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '51.',
 'book_author': '慕颜歌  著,文通天下  出品',
 'book_name': '你的善良必须有点锋芒',
 'price': '¥29.20'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '52.',
 'book_author': '余华',
 'book_name': '许三观卖血记(新版)',
 'price': '¥32.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '53.',
 'book_author': '陈磊(笔名:二混子) 著;读客文化 出品',
 'book_name': '半小时漫画世界史(看半小时漫画,通五千年历史!其实是一本严谨...',
 'price': '¥35.90'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '54.',
 'book_author': '高铭 著,磨铁图书 出品',
 'book_name': '天才在左 疯子在右(2018全新完整版)',
 'price': '¥44.10'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '55.',
 'book_author': '[日]稻盛和夫 著,曹岫云 译',
 'book_name': '阿米巴经营——畅销十周年纪念版,当当全国独家(团购,请致电40...',
 'price': '¥27.30'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '56.',
 'book_author': '东野圭吾 著,刘子倩 译,新经典 出品',
 'book_name': '东野圭吾:嫌疑人X的献身(王凯、张鲁一推荐,至为纯粹的爱情,绝...',
 'price': '¥26.30'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '57.',
 'book_author': '曹文轩   著',
 'book_name': '曹文轩文集典藏版(全7册)',
 'price': '¥84.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '58.',
 'book_author': '史蒂芬·霍金',
 'book_name': '时间简史(插图本)(央视《朗读者》推荐)',
 'price': '¥32.60'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '2.',
 'book_author': '周国平',
 'book_name': '我喜欢生命本来的样子(周国平经典散文作品集)',
 'price': '¥40.50'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '3.',
 'book_author': '乔安娜柯尔\u3000著 布鲁斯迪根 图\u3000施芳\u3000译',
 'book_name': '神奇校车·桥梁书版(全20册)',
 'price': '¥75.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '4.',
 'book_author': '郑利强 段虹(绘) 步印童书 出品',
 'book_name': '我的第一本地理启蒙书',
 'price': '¥24.90'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '59.',
 'book_author': '〔英〕安东尼·布朗',
 'book_name': '我爸爸+我妈妈(全2册)',
 'price': '¥46.50'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-3>
{'bang_num': '60.',
 'book_author': '陈卫平、陈雨岚等 步印童书 出品',
 'book_name': '写给儿童的中国地理(全14册)',
 'price': '¥196.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '5.',
 'book_author': '(日)太宰治\u3000著,杨伟\u3000译',
 'book_name': '人间失格(日本小说家太宰治的自传体小说)',
 'price': '¥22.50'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '6.',
 'book_author': '(荷)丹姆  著,漆仰平,爱桐  译',
 'book_name': '小熊和最好的爸爸(全7册)',
 'price': '¥17.50'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '7.',
 'book_author': '戴维·伽特森',
 'book_name': '雪落香杉树 (福克纳奖得主,全球畅销500万册)',
 'price': '¥46.80'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '8.',
 'book_author': '张嘉骅',
 'book_name': '少年读史记(套装全5册)',
 'price': '¥50.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '9.',
 'book_author': '(美)乔安娜柯尔 著 ,(美)布鲁斯·迪根 图',
 'book_name': '神奇校车·图画书版(全12册,新增《科学博览会》1册)',
 'price': '¥99.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '10.',
 'book_author': '大冰 著,博集天卷 出品',
 'book_name': '你坏(大冰2018作品!预售10分钟8.6万册+,30分钟突破11.8万册,...',
 'price': '¥27.30'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '11.',
 'book_author': '(日)东野圭吾 著,新经典 出品',
 'book_name': '东野圭吾:解忧杂货店(王俊凯、迪丽热巴主演,这家店帮你找回内...',
 'price': '¥27.30'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '12.',
 'book_author': '沈复 著 , 张佳玮 译,果麦文化 出品',
 'book_name': '浮生六记(汪涵、胡歌推荐,畅销250万册。沈复给芸娘的绝美情书)...',
 'price': '¥15.20'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '13.',
 'book_author': '[美] 简·尼尔森',
 'book_name': '《正面管教》修订版',
 'price': '¥18.90'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '14.',
 'book_author': '陈卫平著 步印童书 出品',
 'book_name': '写给儿童的中国历史(全14册)',
 'price': '¥177.50'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '15.',
 'book_author': '毛姆 著,徐淳刚 译,大星文化 出品,作家榜经典文库,高更 绘',
 'book_name': '月亮与六便士(新版未删节!当当名著销量桂冠!豆瓣阅读桂冠!上...',
 'price': '¥29.90'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '16.',
 'book_author': '[英]克莱儿·麦克福尔,白马时光 出品',
 'book_name': '摆渡人(系列畅销千万册。如果命运是一条孤独的河流,谁会是你灵...',
 'price': '¥32.60'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '17.',
 'book_author': '加西亚·马尔克斯 著,新经典 出品',
 'book_name': '马尔克斯:百年孤独(50周年纪念版)',
 'price': '¥41.30'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '18.',
 'book_author': '[美]卡勒德·胡赛尼(Khaled Hosseini) 著,李继宏 译',
 'book_name': '追风筝的人(2018年新版)',
 'price': '¥18.00'}
2019-05-08 17:00:28 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '19.',
 'book_author': '佐佐木圭一 著 程亮 译 时代华语 出品',
 'book_name': '所谓情商高,就是会说话',
 'price': '¥23.00'}
2019-05-08 17:00:29 [scrapy.core.scraper] DEBUG: Scraped from <200 http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-year-2018-0-1-1>
{'bang_num': '20.',
 'book_author': '李思圆 著,文通天下 出品',
 'book_name': '生活需要仪式感 (把温暖和感动带给你在乎的人)',
 'price': '¥32.10'}
2019-05-08 17:00:29 [scrapy.core.engine] INFO: Closing spider (finished)
2019-05-08 17:00:29 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 1044,
 'downloader/request_count': 3,
 'downloader/request_method_count/GET': 3,
 'downloader/response_bytes': 354134,
 'downloader/response_count': 3,
 'downloader/response_status_count/200': 3,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2019, 5, 8, 9, 0, 29, 1579),
 'item_scraped_count': 60,
 'log_count/DEBUG': 63,
 'log_count/INFO': 9,
 'log_count/WARNING': 1,
 'response_received_count': 3,
 'scheduler/dequeued': 3,
 'scheduler/dequeued/memory': 3,
 'scheduler/enqueued': 3,
 'scheduler/enqueued/memory': 3,
 'start_time': datetime.datetime(2019, 5, 8, 9, 0, 28, 449885)}
2019-05-08 17:00:29 [scrapy.core.engine] INFO: Spider closed (finished)
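
The log shows two things worth handling after the crawl: every field is a raw string (rank with a trailing dot, price with a leading ¥, `\u3000` full-width spaces in author text), and items arrive out of rank order because the three pages are fetched concurrently. The sketch below is one possible post-processing step, assuming the field formats match those in the log; the function name `clean_item` is hypothetical, and the two sample items are copied from the log.

```python
def clean_item(raw):
    """Convert one scraped item (all strings) into tidied, typed values."""
    return {
        "bang_num": int(raw["bang_num"].rstrip(".")),  # '24.' -> 24
        "book_name": raw["book_name"].strip(),
        # \u3000 is the ideographic (full-width) space seen in several author fields
        "book_author": raw["book_author"].replace("\u3000", " ").strip(),
        "price": float(raw["price"].lstrip("¥")),      # '¥24.90' -> 24.9
    }


# Two items copied from the log, in the order they were scraped.
scraped = [
    {"bang_num": "24.", "book_author": "钱钟书\u3000著", "book_name": "围城", "price": "¥24.90"},
    {"bang_num": "1.", "book_author": "余华", "book_name": "活着(2017年新版)", "price": "¥28.00"},
]

# Clean every item, then restore the ranking order.
books = sorted((clean_item(item) for item in scraped), key=lambda b: b["bang_num"])
for book in books:
    print(book["bang_num"], book["book_name"], book["book_author"], book["price"])
```

A price containing a thousands separator or a missing field would need extra handling; neither appears in this log.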


Original article: https://www.cnblogs.com/www1707/p/10850678.html
