Python Web Scraping: Using BeautifulSoup and requests

Date: 2018-06-14 14:53:08

Tags: col local code center html oid urllib image coding

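requests is an HTTP request library for Python, comparable to Retrofit on Android. Its features include Keep-Alive and connection pooling, cookie persistence, automatic content decompression, HTTP proxies, SSL verification, connection timeouts, and Sessions. At the time of writing, it was compatible with both Python 2 and Python 3.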
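A few of the features listed above (Sessions, cookie persistence, connection timeouts) can be sketched as follows. This is only a minimal illustration: the helper names `make_session` and `fetch_html` and the User-Agent string are hypothetical and do not come from the original post.

```python
import requests


def make_session(user_agent="example-crawler/0.1"):
    """Create a Session (hypothetical helper).

    A Session reuses the underlying connection pool (Keep-Alive) and
    keeps cookies between requests.
    """
    session = requests.Session()
    # Headers set here are sent with every request made through the session.
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch_html(session, url, timeout=10):
    """Fetch a page as text (hypothetical helper).

    `timeout` is in seconds; without it, a request can wait indefinitely.
    """
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.text
```

A call such as `fetch_html(make_session(), url)` would then return the page source as a string, and any cookies set by the server stay on the session for later requests.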

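Installing the third-party libraries (urllib ships with Python's standard library and does not need to be installed):

pip install requests

pip install beautifulsoup4

pip install lxml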

The code for a small crawler is as follows:

# -*- coding: UTF-8 -*-
# Import the standard-library and third-party modules
import urllib.request
from bs4 import BeautifulSoup
import requests

url = 'https://www.phb123.com/junshi/lishi/9679_2.html'
local = "E:\\py\\imgs\\"    # folder where the images are saved (must already exist)
html_doc = requests.get(url).text
soup = BeautifulSoup(html_doc, 'lxml')   # parse html_doc
contents = soup.find_all('center')
x = 1
for con in contents:
    imgs = con.find_all('img')  # get the img tags under each center tag
    for img in imgs:
        # download each image and save it as 1.jpg, 2.jpg, ...
        urllib.request.urlretrieve(img['src'], local + '%s.jpg' % x)
        x = x + 1
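The extraction step can also be tried without touching the network. The sketch below runs the same `find_all('center')` / `find_all('img')` logic on an inline HTML string. The sample markup, the `example.com` URLs, and the helper name `extract_image_urls` are made up for illustration. It also assumes that some `src` values may be relative or missing, so it resolves them with `urljoin` and skips tags without a `src`. It uses the built-in `html.parser` so that lxml is not required.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Made-up sample page, used only for illustration.
SAMPLE_HTML = """
<html><body>
  <center><img src="/uploads/a.jpg"><img src="https://img.example.com/b.jpg"></center>
  <p><img src="/uploads/ignored.jpg"></p>
  <center><img alt="no src"></center>
</body></html>
"""


def extract_image_urls(html_doc, base_url):
    """Return absolute URLs of all img tags found inside center tags."""
    soup = BeautifulSoup(html_doc, "html.parser")
    urls = []
    for center in soup.find_all("center"):
        for img in center.find_all("img"):
            src = img.get("src")  # None if the tag has no src attribute
            if src:
                # Resolve relative paths against the page URL.
                urls.append(urljoin(base_url, src))
    return urls


image_urls = extract_image_urls(SAMPLE_HTML, "https://www.example.com/page/1.html")
print(image_urls)
```

Using `img.get("src")` instead of `img["src"]` avoids a `KeyError` when an image tag has no `src` attribute. The resulting list could then be passed one URL at a time to `urllib.request.urlretrieve`, as in the crawler above.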

Original article: https://www.cnblogs.com/ling-yu/p/9182277.html
