
Python: The BeautifulSoup Parsing Library

Date: 2020-04-21 15:23:08


I. Introduction

[Image]

II. Installation command

pip install beautifulsoup4
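
The examples in this post pass "lxml" as the parser name, and lxml is a separate package from beautifulsoup4. As a minimal sketch, assuming you want the code to run whether or not lxml is installed, the snippet below checks for it and otherwise falls back to Python's built-in "html.parser". The helper name pick_parser is hypothetical and not part of bs4.

```python
import importlib.util

from bs4 import BeautifulSoup


def pick_parser():
    """Hypothetical helper: prefer lxml when it is importable, else use the stdlib parser."""
    return "lxml" if importlib.util.find_spec("lxml") else "html.parser"


parser = pick_parser()
soup = BeautifulSoup("<p>hello</p>", parser)

# Show which parser was chosen and confirm that parsing works
print(parser, soup.p.string)
```

If the fallback is chosen, the rest of the post still works with "html.parser" in place of "lxml", although the two parsers can repair badly formed markup differently.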

III. Basic usage

1. Basic usage

html = '''
<!DOCTYPE html>
<html>
<head>
    <title>故事</title>
</head>
<body>
   <p class="title" name="dromouse"><b>这个是dromouse</b></p>
   <p class="story">Once upon a time there were three little sister;
       and their names were
       <a href="http://www.baidu.com" class="sister" id="link1"><!--GH--></a>
       <a href="http://www.baidu.com/oracle" class="sister" id="link2">Local</a>and
       <a href="http://www.baidu.com/title" class="sister" id="link3">Tillie</a>;
   and they lived at the bottom of a well.</p>
   <p class="story">...</p>

</body>
</html>

'''

from bs4 import BeautifulSoup

# The parser name is a string; the lxml package must be installed separately
soup = BeautifulSoup(html, 'lxml')

# Print the page in a standard, indented format
# (prettify() only returns a string, so it has to be printed)
print(soup.prettify())

# Get the text content of the title node
title = soup.title.string

print(title)
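
Two details of the basic usage above are easy to miss: prettify() returns a string rather than printing anything, and .string only yields text when a tag has exactly one child. The following is a small sketch on a made-up snippet (the markup and the choice of "html.parser" are assumptions for illustration, not from the original post).

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only
snippet = (
    "<html><head><title>Story</title></head>"
    "<body><p>one <b>two</b></p></body></html>"
)
soup = BeautifulSoup(snippet, "html.parser")

pretty = soup.prettify()        # a str; nothing is printed until you print it
print(type(pretty).__name__)

print(soup.title.string)        # <title> has a single text child
print(soup.p.string)            # <p> has two children (text and <b>), so this is None
print(soup.p.get_text())        # get_text() joins all nested text instead
```

When a tag may contain nested markup, get_text() is usually the safer way to read its text.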

2. Node selectors

html = '''
<!DOCTYPE html>
<html>
<head>
    <title>故事</title>
</head>
<body>
   <p class="title" name="dromouse"><b>这个是dromouse</b></p>
   <p class="story">Once upon a time there were three little sister;
       and their names were
       <a href="http://www.baidu.com" class="sister" id="link1"><!--GH--></a>
       <a href="http://www.baidu.com/oracle" class="sister" id="link2">Local</a>and
       <a href="http://www.baidu.com/title" class="sister" id="link3">Tillie</a>;
   and they lived at the bottom of a well.</p>
   <p class="story">...</p>

</body>
</html>

'''

from bs4 import BeautifulSoup

# The parser name is a string; the lxml package must be installed separately
soup = BeautifulSoup(html, 'lxml')

# Print the page in a standard, indented format
print(soup.prettify())

# Get the text content of the title node
title = soup.title.string

# Get the name of the node (the tag name, here "title")
name = soup.title.name

print(name)
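
Beyond .name, a node selected by tag name also exposes its attributes and nested nodes. The sketch below uses a shortened, translated version of the sample markup and "html.parser" (both are assumptions made to keep the example self-contained). It also shows that the first <a> tag, which holds only a comment, returns a Comment object from .string.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment

# Shortened version of the sample page, for illustration only
fragment = '''
<p class="title" name="dromouse"><b>dromouse</b></p>
<p class="story">
  <a href="http://www.baidu.com" class="sister" id="link1"><!--GH--></a>
  <a href="http://www.baidu.com/oracle" class="sister" id="link2">Local</a>
</p>
'''
soup = BeautifulSoup(fragment, "html.parser")

first_p = soup.p                 # selecting by tag name returns the first match
print(first_p.name)              # the tag name
print(first_p.attrs)             # all attributes as a dict
print(first_p["class"])          # class is multi-valued, so it comes back as a list
print(first_p["name"])           # the HTML attribute called "name", not the tag name
print(first_p.b.string)          # selectors can be chained to reach nested nodes

first_a = soup.a
print(first_a["href"])
print(isinstance(first_a.string, Comment))  # the only child is a comment
```

Because Comment is a subclass of the normal string type, checking isinstance(..., Comment) is one way to avoid mistaking comment text for visible page text.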

 


Original article: https://www.cnblogs.com/Crown-V/p/12726000.html
