
Scraping NetEase Finance

Date: 2019-12-19 17:40:10


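import requests
from lxml import etree

url = 'http://quotes.money.163.com/old/'
# Send a browser-like User-Agent with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36'
}

# Download the page source
html = requests.get(url=url, headers=headers).text

# Parse the HTML into an element tree that can be queried with XPath
tree = etree.HTML(html)

# First-level entries: <li> items under the node with id="f0-f7" inside the qid="HS" item
content = tree.xpath('//li[@qid="HS"]//li[@id="f0-f7"]/ul/li')
for con in content:
    # Link text of the first-level entry
    one = con.xpath('./a/text()')[0]
    print(one)
    # Second-level entries nested under this first-level entry
    two_list = con.xpath('./ul/li')
    for t in two_list:
        # qid attribute of the second-level entry
        qid = t.xpath('./@qid')[0]
        print(qid)
        # Link text of the second-level entry
        two = t.xpath('./a/text()')[0]
        print(two)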
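
The nesting that the XPath expressions above rely on can be tried offline. The following is only a minimal sketch under stated assumptions: the SAMPLE markup, its category names and its qid values are invented for illustration and are not taken from the real page, and the standard library's xml.etree.ElementTree is swapped in for lxml so the example runs without third-party packages or network access. ElementTree supports only a subset of XPath, so findtext() and get() stand in for text() and @qid, and the hypothetical helper collect_menu returns a dict rather than printing each value.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup that mirrors the nesting the XPath expects.
# The names and qid values below are made up for illustration.
SAMPLE = """
<div>
  <ul>
    <li qid="HS">
      <ul>
        <li id="f0-f7">
          <ul>
            <li>
              <a>Category A</a>
              <ul>
                <li qid="hy001"><a>Item 1</a></li>
                <li qid="hy002"><a>Item 2</a></li>
              </ul>
            </li>
            <li>
              <a>Category B</a>
              <ul>
                <li qid="hy003"><a>Item 3</a></li>
              </ul>
            </li>
          </ul>
        </li>
      </ul>
    </li>
  </ul>
</div>
"""


def collect_menu(root):
    """Map each first-level link text to a list of (qid, link text) pairs."""
    menu = {}
    # Same path as in the scraper, written in ElementTree's XPath subset.
    for con in root.findall(".//li[@qid='HS']//li[@id='f0-f7']/ul/li"):
        one = con.findtext("a")
        if one is None:
            # Skip entries without a link instead of raising an error.
            continue
        menu[one] = [
            (t.get("qid"), t.findtext("a")) for t in con.findall("./ul/li")
        ]
    return menu


menu = collect_menu(ET.fromstring(SAMPLE))
for name, items in menu.items():
    print(name, items)
```

Collecting the values into a dict keeps each second-level entry attached to its parent category, which the flat print() output does not, and skipping entries without a link avoids the IndexError that indexing an empty result with [0] would raise.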


Original source: https://www.cnblogs.com/Iceredtea/p/12069065.html
