Tags: read urlopen http sele urllib analysis type href import
from urllib.request import urlopen
from bs4 import BeautifulSoup as BS

url = "http://www.lagou.com"

# (1) Get the response object
response = urlopen(url)

# (2) Read the page source from the response object
html = response.read().decode()

# (3) Create the BS object
bs = BS(html, "html.parser")

# (4) Extract information
a_list = bs.select("a")
for i in a_list:
    print(i)
    # select, find and find_all work on a tag exactly as they do on the bs object,
    # so i can be searched further for nested tags
    # print(i.select("font"))
    # print(type(i))
    # 1) i.get(key): key is the name of the attribute to read
    # print(i.get("href"))
    # 2) Get the text content enclosed by the tag
    print(i.text)
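The same select, get and text calls can be tried without a network request. The following is a minimal sketch, assuming a small made-up HTML string in place of the downloaded page; the snippet and its href values are hypothetical and do not come from lagou.com.

```python
from bs4 import BeautifulSoup as BS

# Hypothetical HTML standing in for a downloaded page (not real lagou.com markup)
html = """
<div>
  <a href="/jobs/1">Python <font>Engineer</font></a>
  <a href="/jobs/2">Data Analyst</a>
  <a>No link here</a>
</div>
"""

bs = BS(html, "html.parser")

links = []
for a in bs.select("a"):
    href = a.get("href")                        # None when the attribute is missing
    text = a.text.strip()                       # text of the tag and its descendants
    fonts = [f.text for f in a.select("font")]  # nested search on the tag itself
    links.append((href, text, fonts))

for item in links:
    print(item)
```

Because get returns None for a missing attribute rather than raising an error, a loop like this can filter out anchors without an href before following any links.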
Original article: https://www.cnblogs.com/ayajing/p/14928296.html