BeautifulSoup warning when no parser is specified

Date: 2019-02-09 17:38:41


from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("http://www.pythonscraping.com/pages/page1.html")
bsObj = BeautifulSoup(html.read())  # no parser named here, so this line triggers the warning
print(bsObj.h1)

Running the code produces the following warning:
UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 4 of the file D:/Python/venv/test8.py. To get rid of this warning, pass the additional argument 'features="lxml"' to the BeautifulSoup constructor.

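In other words:
Because no parser was named, BeautifulSoup picked the best HTML parser installed on this system ("lxml"). That is usually harmless, but on another machine or in a different virtual environment a different parser may be selected, and the same code can then behave differently.

The warning was triggered by line 4 of D:/Python/venv/test8.py, which is the BeautifulSoup(...) call. Passing features="lxml" to the BeautifulSoup constructor removes it.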

Fix: specify the parser explicitly. 'lxml' is the usual choice.

from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("http://www.pythonscraping.com/pages/page1.html")
bsObj = BeautifulSoup(html.read(), 'lxml')  # parser named explicitly, so no warning
print(bsObj.h1)
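
If lxml is not available on a given machine, Python's built-in "html.parser" can be named instead, and the parser can also be passed with the features keyword that the warning message mentions. The following is only a minimal sketch of that idea: SAMPLE_HTML is a made-up stand-in for the downloaded page so the snippet runs offline, and parse_with_explicit_parser is a hypothetical helper name, not part of BeautifulSoup.

```python
import warnings

from bs4 import BeautifulSoup

# Hypothetical stand-in for the page fetched with urlopen(), so no network is needed.
SAMPLE_HTML = "<html><body><h1>An Interesting Title</h1></body></html>"


def parse_with_explicit_parser(markup, parser="html.parser"):
    """Parse markup with a named parser.

    Returns the soup plus any "no parser specified" warnings that were
    raised while parsing (expected to be none when a parser is named).
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning instead of printing it
        soup = BeautifulSoup(markup, features=parser)
    guessed = [
        str(w.message)
        for w in caught
        if "No parser was explicitly specified" in str(w.message)
    ]
    return soup, guessed


soup, guessed_parser_warnings = parse_with_explicit_parser(SAMPLE_HTML)
print(soup.h1)
print(guessed_parser_warnings)
```

"html.parser" ships with Python, so it avoids an extra dependency. Switching the default in the helper to "lxml" assumes the lxml package is installed.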


Original article: http://blog.51cto.com/12884584/2348995
