
Scraping real-estate news from Xueqiu

Date: 2018-08-15 22:42:32

Tags: aliyun   request   ike   values   web   .com   span   request headers   5.0


Source code:

import requests
import json
import pymysql

# Open the database connection
db = pymysql.connect(host="127.0.0.1", user="root", password="123456", port=3306, database="xueqiu")
# Create a cursor object
cursor = db.cursor()

# Request headers
headers = {
    "Accept": "*/*",
    # "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Connection": "keep-alive",
    # Cookie copied from a browser session; replace it with a current one before running
    "Cookie": "aliyungf_tc=AQAAAO+yOl0mxQEAUhVFeV0ZK5j5OLZs; xq_a_token=584d0cf8d5a5a9809761f2244d8d272bac729ed4; xq_a_token.sig=x0gT9jm6qnwd-ddLu66T3A8KiVA; xq_r_token=98f278457fc4e1e5eb0846e36a7296e642b8138a; xq_r_token.sig=2Uxv_DgYTcCjz7qx4j570JpNHIs; _ga=GA1.2.857846928.1534331621; _gid=GA1.2.1996927600.1534331621; Hm_lvt_1db88642e346389874251b5a1eded6e3=1534331622; Hm_lpvt_1db88642e346389874251b5a1eded6e3=1534331622; u=831534331622164; device_id=6715ed8e4eba695ab8a41bd752dbd204",
    "Host": "xueqiu.com",
    "Referer": "https://xueqiu.com/",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest",
}

max_id = -1
# Loop three times to fetch 3 pages
for i in range(3):
    # Build the URL for the current page
    url = "https://xueqiu.com/v4/statuses/public_timeline_by_category.json?since_id=-1&max_id={}&count=15&category=111".format(max_id)
    # Send the GET request
    response = requests.get(url, headers=headers)
    # print(response.json())
    # Decode the JSON response into a dict
    res = response.json()
    # Remember next_max_id so the next iteration requests the following page
    max_id = res["next_max_id"]
    # print(res["list"])
    for dict_ in res["list"]:
        # print(dict_)
        # The "data" field is itself a JSON string; decode it into a dict
        dic = json.loads(dict_["data"])
        # print(type(dic), dic)
        id = str(dic["id"])
        title = dic["title"]
        description = dic["description"]
        target = dic["target"]
        # SQL statement with placeholders, so quotes inside the text cannot break it
        sql = "insert into news(id,title,description,target) values(%s,%s,%s,%s);"
        params = (id, title, description, target)
        print("Inserting row:\n", params)
        # Execute the statement
        cursor.execute(sql, params)
        # Commit
        db.commit()
# Close the cursor
cursor.close()
# Close the database connection
db.close()
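
The paging logic in the script (start with max_id = -1, then reuse next_max_id from each response) and the double JSON decode of the "data" field can be tried without touching the network. The following is only a sketch: iter_items and fake_fetch are hypothetical names, and the fake response values are made up to mimic the shape the script reads, not real Xueqiu data.

```python
import json


def iter_items(fetch_page, pages=3):
    """Yield decoded items from `pages` consecutive pages.

    fetch_page(max_id) must return a dict shaped like the response the
    script reads: {"next_max_id": ..., "list": [{"data": "<json string>"}]}.
    """
    max_id = -1  # -1 requests the first page, as in the script
    for _ in range(pages):
        res = fetch_page(max_id)
        max_id = res["next_max_id"]
        for entry in res["list"]:
            # "data" is a JSON-encoded string, so it needs a second decode
            yield json.loads(entry["data"])


def fake_fetch(max_id):
    # Hypothetical stand-in for requests.get(url, headers=headers).json()
    start = 100 if max_id == -1 else max_id
    items = [
        {"id": start - k, "title": "t%d" % (start - k),
         "description": "d", "target": "/x"}
        for k in range(2)
    ]
    return {
        "next_max_id": start - 2,
        "list": [{"data": json.dumps(item)} for item in items],
    }


ids = [item["id"] for item in iter_items(fake_fetch, pages=3)]
print(ids)
```

Passing the fetch function in as an argument keeps the paging loop separate from the HTTP call, so the same loop could be pointed at a real requests-based function later.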

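The insert step relies on a news table that the post does not show, and on titles and descriptions that may contain quote characters. Below is a minimal sketch of both points. It uses Python's built-in sqlite3 as a stand-in for MySQL so it runs anywhere. The column types are assumptions, and sqlite3 uses ? placeholders where pymysql uses %s.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL `xueqiu` database
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schema: the column types are assumptions, not from the post
cur.execute(
    "create table news("
    "id text primary key, title text, description text, target text)"
)

# A title containing quote characters, which would break a concatenated statement
row = ("1", "It's a 'quoted' title", "desc", "/1")

# Placeholders let the driver handle quoting of the values
cur.execute(
    "insert into news(id,title,description,target) values(?,?,?,?)", row
)
conn.commit()

stored = cur.execute(
    "select title from news where id = ?", ("1",)
).fetchone()[0]
print(stored)
conn.close()
```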
 


Original article: https://www.cnblogs.com/zhxd-python/p/9484232.html
