Python web crawler

Posted: 2019-05-20 17:00:24

Tags: xpath tag gecko tree content split file img get

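import os

import requests as app
from lxml import etree


header = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36"
}

def getImg(url, heade=None):
    # Fetch the listing page and parse it into an element tree.
    resp = app.get(url, headers=heade)
    html = etree.HTML(resp.text)
    # Collect the data-original attribute of every <img> element.
    imgs = html.xpath(".//img/@data-original")
    # Make sure the output directory exists before writing to it.
    os.makedirs('./img', exist_ok=True)
    for img in imgs:
        # Use the last path segment of the image URL as the file name.
        filename = img.split('/')[-1]
        # Pass the headers by keyword: the second positional argument of get() is params.
        fileimg = app.get(img, headers=heade)
        with open('./img/{}'.format(filename), 'wb') as f:
            f.write(fileimg.content)


# Crawl listing pages 1 through 5.
for i in range(1, 6):
    url = "http://www.win4000.com/meinvtag4_{}.html".format(i)
    getImg(url, heade=header)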
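
The crawler names each saved file after the last path segment of the image URL, using split on "/". If an image URL carried a query string or fragment, that suffix would end up in the file name. The following is a minimal sketch of a more defensive variant using only the standard library. The helper name filename_from_url and the fallback value "image.bin" are hypothetical and do not come from the original post.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, fallback="image.bin"):
    """Return the last path segment of url, ignoring query and fragment.

    Hypothetical helper for illustration; fallback is an assumed default
    used when the URL path has no final segment.
    """
    # urlsplit separates the path from any ?query and #fragment.
    path = urlsplit(url).path
    # Decode percent-escapes, then keep only the final path component.
    name = posixpath.basename(unquote(path))
    return name or fallback
```

Inside getImg, the line that computes filename could call filename_from_url(img) instead of img.split('/')[-1]; the rest of the loop would stay the same.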


Original article: https://www.cnblogs.com/kjtt/p/10894917.html
