
Scraping Douban novel titles and ratings with Python urllib2

Date: 2017-07-31 11:46:39


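# -*- coding: utf-8 -*-
# Python 2 script: print the title and rating of each book on Douban's "小说" (novel) tag page.
import urllib2
import re

# URL-encoded tag page for "小说"
url = 'https://book.douban.com/tag/%E5%B0%8F%E8%AF%B4'
request = urllib2.Request(url)
response = urllib2.urlopen(request)
content = response.read()
# Match each title attribute up to the following "on..." attribute, and each rating span
reg_0 = re.findall(r'title.+"\s*on', content)
reg_1 = re.findall(r'rating_nums">.*<', content)
for title, score in zip(reg_0, reg_1):
    # The text between the first pair of double quotes is the title
    title = re.split(r'"', title)
    # The text between '>' and '<' is the score
    score = re.split(r'>|<', score)
    print title[1], score[1]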
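
The script above targets Python 2. In Python 3, urllib2 no longer exists and its functionality lives in urllib.request, so the download step might look roughly like the sketch below. The helper names `build_request` and `fetch`, the `Mozilla/5.0` User-Agent string, the 10-second timeout, and the UTF-8 decoding are all assumptions added for illustration and are not part of the original post. The User-Agent header is included on the assumption that the site may refuse requests carrying the default Python one.

```python
# Python 3 sketch of the download step (urllib2 became urllib.request).
import urllib.request

URL = "https://book.douban.com/tag/%E5%B0%8F%E8%AF%B4"


def build_request(url, user_agent="Mozilla/5.0"):
    """Build a Request with a browser-like User-Agent (assumed to be needed)."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Download the page and decode it as UTF-8 (an assumption about its encoding)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch(URL)` would return the page as a `str`, which can then be passed to the same regular expressions. Because the result is text rather than bytes, the patterns can stay ordinary string patterns.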



# Example of the markup the rating pattern matches: <span class="rating_nums">8.6</span>
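
The find-then-split approach can also be written with capture groups, so that `re.findall` returns the title and score directly. The sketch below is a Python 3 variant run against a small inline sample rather than the live page. The `SAMPLE_HTML` content (book names, subject URLs, `onclick` values) is hypothetical, modeled only on the `rating_nums` span shown above and on the `title="..." on...` shape the original pattern implies. `TITLE_RE`, `SCORE_RE`, and `extract` are names introduced here for illustration.

```python
import re

# Hypothetical sample, shaped like the markup the post's patterns target.
SAMPLE_HTML = '''
<a href="https://book.douban.com/subject/0000001/" title="Book One"
   onclick="moreurl(this)">Book One</a>
<span class="rating_nums">8.6</span>
<a href="https://book.douban.com/subject/0000002/" title="Book Two"
   onclick="moreurl(this)">Book Two</a>
<span class="rating_nums">9.1</span>
'''

# Capture the text inside title="..." when it is followed by an "on..." attribute.
TITLE_RE = re.compile(r'title="([^"]+)"\s*on')
# Capture the text between rating_nums"> and the next '<'.
SCORE_RE = re.compile(r'rating_nums">([^<]*)<')


def extract(html):
    """Return (title, score) pairs, matched up by position in the page."""
    titles = TITLE_RE.findall(html)
    scores = SCORE_RE.findall(html)
    return list(zip(titles, scores))


pairs = extract(SAMPLE_HTML)
for title, score in pairs:
    print(title, score)
```

As in the original script, titles and scores are paired purely by order, so a book with no rating span would shift every later pair by one. Parsing each book entry separately would avoid that, at the cost of a longer script.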

 


Original article: http://www.cnblogs.com/lovephysics/p/7262282.html
