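In Python, there are two ways to create a Scrapy Selector object:
1. Pass the page's HTML document, as a string, to the text parameter of the Selector constructor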
>>> from scrapy.selector import Selector
>>> text = """
<html>
<body>
<h1>hello world</h1>
<h1>hello scrapy</h1>
<b>hello python</b>
<ul>
<li>c++</li>
<li>java</li>
<li>python</li>
</ul>
</body>
</html>
"""
>>> selector = Selector(text=text)
>>> selector
<Selector xpath=None data='<html>\n\t<body>\n\t\t<h1>hello world</h1>\n\t\t'>
2. Construct a Selector object from a Response object
>>> from scrapy.selector import Selector
>>> from scrapy.http import HtmlResponse
>>> text = """
<html>
<body>
<h1>hello world</h1>
<h1>hello scrapy</h1>
<b>hello python</b>
<ul>
<li>c++</li>
<li>java</li>
<li>python</li>
</ul>
</body>
</html>
"""
>>> response = HtmlResponse(url="http://www.example.com", body=text, encoding="utf8")
>>> selector = Selector(response=response)
>>> selector
<Selector xpath=None data='<html>\n\t<body>\n\t\t<h1>hello world</h1>\n\t\t'>
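In practice, you almost never need to create a Selector object by hand. The Response object creates one automatically the first time its selector attribute is accessed, so you can call response.xpath() (or response.css()) directly.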
Original article: https://www.cnblogs.com/soldier-lj/p/9070917.html