
bilibili Bangumi (Anime Series) Rating Crawler

Date: 2018-01-05 23:32:25


I wrote this as practice while taking a Python elective course; it mainly queries the API that bilibili provides.

# -*- coding:utf-8 -*-
# Written for Python 2 (it relies on reload(sys) and sys.setdefaultencoding).

import requests
import json
import csv
import sys

# Set Python 2's default string encoding to utf-8 so that Chinese titles
# can be written on a Windows system whose locale is GBK/GB2312
reload(sys)
sys.setdefaultencoding('utf-8')

def rating(bangumi_id):
    # Query the season info endpoint, which answers in JSONP
    payload = {'callback': 'seasonListCallback'}
    response = requests.get('https://bangumi.bilibili.com/jsonp/seasoninfo/{0}.ver'.format(bangumi_id), params=payload)
    # Strip the 'seasonListCallback(' prefix (19 characters) and the last two characters of the wrapper
    data = json.loads(response.text[19:-2])
    try:
        # Rated entries keep the title and rating under result['media']
        season_id = int(data['result']['season_id'])
        title = '{0}'.format(data['result']['media']['title'])
        score = float(data['result']['media']['rating']['score'])
        count = int(data['result']['media']['rating']['count'])
        is_finish = int(data['result']['is_finish'])
        writer.writerow([season_id, title, score, count, is_finish])
    except KeyError:
        # No rating data: fall back to result['title'] with a score and count of 0
        try:
            season_id = int(data['result']['season_id'])
            title = '{0}'.format(data['result']['title'])
            score = float(0)
            count = int(0)
            is_finish = int(data['result']['is_finish'])
            writer.writerow([season_id, title, score, count, is_finish])
        except KeyError:
            # The expected fields are missing: skip this id
            return None
        return None

if __name__ == '__main__':
    with open('bangumi.csv', 'wb+') as csv_file:
        # writer is a module-level name, so rating() can use it
        writer = csv.writer(csv_file, delimiter=',')
        writer.writerow(['ID', 'Title', 'Score (0 by default)', 'Number of ratings (0 if too few)', 'Finished (1 = finished)'])
        for i in range(7000):
            rating(i)
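
The slice response.text[19:-2] in the script works only while the callback name is exactly seasonListCallback and the wrapper ends in two extra characters. Below is a minimal sketch of a less position-dependent way to unwrap such a body, assuming Python 3 and a response of the form name(...) with an optional trailing semicolon. The helper name unwrap_jsonp and the sample body are hypothetical illustrations; they are not part of the original script or of bilibili's API.

```python
import json
import re

# Matches `name( ... )` with an optional trailing semicolon.
# The callback name is captured generically instead of being counted by hand.
_JSONP_RE = re.compile(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def unwrap_jsonp(text):
    """Return the decoded JSON payload of a JSONP response body.

    Raises ValueError if the text is not shaped like `name(...)`
    or if the payload is not valid JSON.
    """
    match = _JSONP_RE.match(text)
    if match is None:
        raise ValueError('not a JSONP response')
    return json.loads(match.group(1))


# Hypothetical sample body containing only fields that the script reads.
sample = 'seasonListCallback({"result": {"season_id": "1", "title": "demo", "is_finish": "1"}});'
data = unwrap_jsonp(sample)
print(data['result']['title'])
```

In the crawler this would stand in for json.loads(response.text[19:-2]); catching ValueError around the call would let the loop skip ids whose response is not JSONP at all.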

Compiled results (CSV): https://pan.baidu.com/s/1jHX2fJ4
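
For readers on Python 3, the field extraction and CSV writing from the script can be sketched without the encoding workaround, since text is Unicode by default there. This is only an illustration under stated assumptions: the function name extract_row, the short header row, and the two sample dictionaries are hypothetical, and the samples contain only the keys that the script itself reads.

```python
import csv
import io


def extract_row(result):
    """Build one CSV row from the 'result' object, or return None if fields are missing.

    Mirrors the script's logic: prefer the title and rating under
    result['media'], otherwise fall back to result['title'] with zeros.
    """
    try:
        season_id = int(result['season_id'])
        is_finish = int(result['is_finish'])
    except KeyError:
        return None
    try:
        media = result['media']
        return [season_id, str(media['title']),
                float(media['rating']['score']),
                int(media['rating']['count']), is_finish]
    except KeyError:
        pass  # no rating data, try the fallback below
    try:
        return [season_id, str(result['title']), 0.0, 0, is_finish]
    except KeyError:
        return None


# Hypothetical sample entries, not real API output.
rated = {'season_id': '1', 'is_finish': '1',
         'media': {'title': 'Demo A', 'rating': {'score': 9.5, 'count': 1200}}}
unrated = {'season_id': '2', 'is_finish': '0', 'title': 'Demo B'}

# An in-memory buffer keeps the sketch self-contained.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=',')
writer.writerow(['ID', 'Title', 'Score', 'Rating count', 'Finished'])
for entry in (rated, unrated, {}):
    row = extract_row(entry)
    if row is not None:
        writer.writerow(row)

print(buffer.getvalue())
```

To write a real file instead of the buffer, open it in text mode with open('bangumi.csv', 'w', newline='', encoding='utf-8'); passing newline='' is what the csv module documentation recommends for files handed to csv.writer.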


Original article: https://www.cnblogs.com/kagari/p/8207233.html
