pip install beautifulsoup4 lxml
# Example: read the text inside the <title> node
html = '''
<!DOCTYPE html>
<html>
<head>
<title>故事</title>
</head>
<body>
<p class="title" name="dromouse"><b>这个是dromouse</b></p>
<p class="story">Once upon a time there were three little sister; and their names were
<a href="http://www.baidu.com" class="sister" id="link1"><!--GH--></a>
<a href="http://www.baidu.com/oracle" class="sister" id="link2">Local</a>and
<a href="http://www.baidu.com/title" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
</body>
</html>
'''

from bs4 import BeautifulSoup

# The 'lxml' parser requires the lxml package to be installed
soup = BeautifulSoup(html, 'lxml')

# Print the page in a standard, indented format
# (prettify() only returns a string, so it has to be printed)
print(soup.prettify())

# Print the text content of the title node
title = soup.title.string
print(title)
# Example: read the tag name of the <title> node
html = '''
<!DOCTYPE html>
<html>
<head>
<title>故事</title>
</head>
<body>
<p class="title" name="dromouse"><b>这个是dromouse</b></p>
<p class="story">Once upon a time there were three little sister; and their names were
<a href="http://www.baidu.com" class="sister" id="link1"><!--GH--></a>
<a href="http://www.baidu.com/oracle" class="sister" id="link2">Local</a>and
<a href="http://www.baidu.com/title" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
</body>
</html>
'''

from bs4 import BeautifulSoup

# The 'lxml' parser requires the lxml package to be installed
soup = BeautifulSoup(html, 'lxml')

# Print the page in a standard, indented format
# (prettify() only returns a string, so it has to be printed)
print(soup.prettify())

# Text content of the title node
title = soup.title.string

# Print the name of the node (the tag name, here "title")
name = soup.title.name
print(name)
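The same sample page can also be used to try attribute access and searching. The following is only a minimal sketch, not part of the original post: it assumes a shortened copy of the HTML above and uses the built-in html.parser instead of lxml so that no extra parser is needed. The variable names (first_p, hrefs, link1_string, and so on) are made up for illustration.

```python
from bs4 import BeautifulSoup, Comment

# Shortened copy of the sample page used above (an assumption for this sketch)
html = """
<html><head><title>故事</title></head>
<body>
<p class="title" name="dromouse"><b>这个是dromouse</b></p>
<p class="story">
<a href="http://www.baidu.com" class="sister" id="link1"><!--GH--></a>
<a href="http://www.baidu.com/oracle" class="sister" id="link2">Local</a>
<a href="http://www.baidu.com/title" class="sister" id="link3">Tillie</a>
</p>
</body></html>
"""

# html.parser ships with Python, so nothing beyond bs4 has to be installed
soup = BeautifulSoup(html, "html.parser")

# soup.p is the first <p> tag; attributes are read like dictionary entries
first_p = soup.p
p_name_attr = first_p["name"]   # the HTML attribute called "name"
p_tag_name = first_p.name       # the tag name itself, not the attribute
p_classes = first_p["class"]    # class is multi-valued, so this is a list

# find_all returns every matching tag; collect the href of each link
hrefs = [a["href"] for a in soup.find_all("a", class_="sister")]

# link1 contains only an HTML comment, so .string is a Comment object
link1_string = soup.find(id="link1").string
is_comment = isinstance(link1_string, Comment)

print(p_name_attr, p_tag_name, p_classes)
print(hrefs)
print(is_comment)
```

Note the difference between first_p["name"] (an attribute written in the HTML) and first_p.name (the tag name). Checking for Comment is worth doing before treating .string as visible text, since the first link in this page holds a comment rather than text.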
Original article: https://www.cnblogs.com/Crown-V/p/12726000.html