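from lxml import etree

html = '''
<li class="tag_1">wanted content 1
    <a>wanted content 2</a>
</li>
'''

selector = etree.HTML(html)
# xpath() returns a list of matching elements
contents = selector.xpath('//li[@class = "tag_1"]')
# take the first (and only) matching <li> element
contents1 = selector.xpath('//li[@class = "tag_1"]')[0]
# string(.) concatenates all text inside the element, including the nested <a>
contents2 = contents1.xpath('string(.)')
# text() returns only the direct text nodes of <li>, as a list
contents3 = selector.xpath('//li[@class = "tag_1"]/text()')
print(contents)   # [<Element li at 0x2c55e88>]
print(contents1)  # <Element li at 0x2c55e88>
print(contents2)
print(contents3)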
Output
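To make the difference between the two queries concrete, here is a minimal sketch on a single-line variant of the same markup. The variant, including the trailing " tail" text, is an assumption chosen so that whitespace is unambiguous; it is not the article's original input.

```python
from lxml import etree

# Single-line variant of the markup (an assumption, to keep whitespace unambiguous)
html = '<li class="tag_1">content 1<a>content 2</a> tail</li>'

li = etree.HTML(html).xpath('//li[@class="tag_1"]')[0]

# string(.) walks every descendant text node and concatenates them
all_text = li.xpath('string(.)')

# text() selects only the text nodes that are direct children of <li>
own_text = li.xpath('text()')

print(all_text)
print(own_text)
```

Here all_text is the single string "content 1content 2 tail", while own_text is a list holding the two direct text nodes, "content 1" and " tail"; the text inside the nested a element is skipped by text().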
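The printed value of contents3 contains '\n' escapes, commas, and similar characters. Because contents3 is a list of text nodes, the commas are the separators between list items, while the '\n' characters come from the line breaks in the markup. We can call replace on each string to swap such characters for whatever characters or spaces we want; for usage details, see https://www.runoob.com/python/att-string-replace.html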
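The cleanup described above could look roughly like the sketch below. The raw_nodes list is a hypothetical stand-in shaped like a text() result; its exact contents are an assumption, not the article's real output. Since replace is a string method, it is applied to each item rather than to the list itself.

```python
# Hypothetical sample shaped like a text() result (contents are an assumption)
raw_nodes = ["wanted content 1\n", " tail text\n"]

# Option 1: str.replace on each item, removing the newline characters
replaced = [node.replace("\n", "") for node in raw_nodes]

# Option 2: strip() trims all surrounding whitespace; empty items are dropped
stripped = [node.strip() for node in raw_nodes if node.strip()]

# Joining the cleaned items gives one string, so the list commas disappear
joined = " ".join(stripped)

print(replaced)
print(stripped)
print(joined)
```

replace only touches the characters it is given, so leading spaces survive in Option 1; strip() is often the simpler choice when the goal is just to trim whitespace around each text node.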
Original article: https://www.cnblogs.com/1061321925wu/p/12297383.html