Python crawler

Time: 2016-05-25 20:22:24

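Author: anonymous user
Link: https://www.zhihu.com/question/28485416/answer/55894693
Source: Zhihu
Copyright belongs to the author. For commercial reproduction, contact the author for authorization; for non-commercial reproduction, credit the source.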

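#!/usr/bin/python3
# -*- coding: utf-8 -*-
import os
import re

import requests
from bs4 import BeautifulSoup

# Placeholder: the listing URL was removed from the source and is left as-is.
url_base = "URL WAS HERE"

# Make sure the output directory exists before writing any covers.
os.makedirs("covers", exist_ok=True)

_session = requests.session()      # session for the listing pages
DMM_session = requests.session()   # session for the cover images

for page in range(0, 100):
    print(page)
    r = _session.get(url_base + str(page))
    print(r)
    soup = BeautifulSoup(r.content, "html.parser")
    box = soup.find_all("div", class_="item")

    # Collect [cover URL, title, date] for every item on the page.
    av = []
    for item in box:
        av.append([item.div.img["src"], item.div.img["title"], item.find("date").text])

    for src, title, date in av:
        # Replace "ps" with "pl" in the image URL before downloading.
        img = DMM_session.get(src.replace("ps", "pl"))
        # Keep the first 40 characters of the title and drop non-word characters.
        name = re.sub(r"[^\w]", "", title[:40])
        with open("covers/" + date + "__" + name + ".jpg", "wb") as f:
            f.write(img.content)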
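
The script names each cover file from the item's date plus the first 40 characters of its title, with non-word characters stripped. Below is a minimal sketch of that naming step on its own, so it can be checked without any network access. The helper name `cover_filename` and its parameters are hypothetical and not part of the original script. It uses `re.sub`, because `str.replace` treats its argument as literal text rather than as a regular expression.

```python
import re


def cover_filename(date, title, directory="covers", max_title_len=40):
    """Build a path of the form <directory>/<date>__<cleaned title>.jpg.

    Hypothetical helper for illustration; the original script builds the
    path inline.
    """
    # Truncate first, then remove every character that is not a word
    # character. In Python 3, \w also matches non-ASCII letters and digits,
    # so CJK titles are preserved.
    cleaned = re.sub(r"[^\w]", "", title[:max_title_len])
    return f"{directory}/{date}__{cleaned}.jpg"


if __name__ == "__main__":
    print(cover_filename("2016-05-25", "Sample: Title / 01"))
```

Only the title is cleaned here. If the date text can contain path separators such as "/", it would need the same treatment before being used in a file name.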

 

Original article: http://www.cnblogs.com/zrui513/p/5528168.html
