
Frequency Analysis of Sina Weibo Post Elements (Word, Screen Name) in Python

Date: 2014-07-10 19:45:37

Tags: data mining, Sina Weibo

CODE:

#!/usr/bin/python 
# -*- coding: utf-8 -*-

'''
Created on 2014-7-9
@author: guaguastd
@name: weiboFrequencyAnalysis.py
'''

# Helper modules from the author's own project (not standard libraries)
from sinaWeiboLogin import sinaWeiboLogin
from sinaWeibo import extractWeiboEntities
from sinaWeiboStatuses import publicTimeline
from sinaWeiboFrequency import weiboFrequencyAnalysis

if __name__ == '__main__':

    # Log in and obtain a client for the Sina Weibo API
    sinaWeiboApi = sinaWeiboLogin()

    # Fetch the 5 newest posts from the public timeline
    weiboNum = 5
    statuses = publicTimeline(sinaWeiboApi, weiboNum)

    # Split the posts into texts, screen names, and words
    status_texts, screen_names, words = extractWeiboEntities(statuses)

    # Print a frequency table for each kind of element
    for label, data in (('Word', words),
                        ('Screen Name', screen_names)):
        weiboFrequencyAnalysis(label, data, weiboNum)
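
The source of extractWeiboEntities is not shown in this post. As a rough sketch only, the extraction step might look like the following. It assumes each status is a dict with a 'text' field and a nested 'user' dict holding 'screen_name', and it tokenizes by splitting on whitespace. The function name extract_entities and the sample data are hypothetical.

```python
def extract_entities(statuses):
    """Return (status_texts, screen_names, words) from a list of status dicts."""
    # Assumed shape of one status: {'text': ..., 'user': {'screen_name': ...}}
    status_texts = [status['text'] for status in statuses]
    screen_names = [status['user']['screen_name'] for status in statuses]
    # Naive tokenization: split every post on whitespace
    words = [word for text in status_texts for word in text.split()]
    return status_texts, screen_names, words


# Hypothetical sample data standing in for an API response
sample = [
    {'text': 'hello weibo hello', 'user': {'screen_name': 'alice'}},
    {'text': 'data mining', 'user': {'screen_name': 'bob'}},
]
texts, names, words = extract_entities(sample)
print(names)
print(words)
```

Whitespace splitting does not segment Chinese text into individual words, so a run of Chinese characters without spaces stays a single token. A dedicated word segmenter would be needed for finer-grained counts.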

RESULT:

+------------------------------------------+-------+
| Word                                     | Count |
+------------------------------------------+-------+
| http://t.cn/8snKY0S                      |     1 |
| [围观]CANNCI千姿百袋2014新款牛皮菱格女包 |     1 |
| 时尚潮流单肩包                           |     1 |
| 浪漫RI系「喜欢请赞                       |     1 |
| ??????                             |     1 |
+------------------------------------------+-------+
+--------------------+-------+
| Screen Name        | Count |
+--------------------+-------+
| 马傻强             |     1 |
| 手机用户2360148561 |     1 |
| 潮流爆款搭V        |     1 |
| star爱上泡面猫     |     1 |
| 美容潮搭健康       |     1 |
+--------------------+-------+
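
The implementation of weiboFrequencyAnalysis is likewise not included here. A minimal sketch of how tables like the ones above could be produced, assuming only the standard library (collections.Counter plus manual string padding), is shown below. The function name frequency_table and its parameters are hypothetical and are not the author's code.

```python
from collections import Counter


def frequency_table(label, data, limit):
    """Build a text table of the `limit` most common items in `data`."""
    rows = Counter(data).most_common(limit)
    # Column widths: wide enough for the header and every cell
    left = max([len(label)] + [len(str(item)) for item, _ in rows])
    right = max([len('Count')] + [len(str(count)) for _, count in rows])
    rule = '+' + '-' * (left + 2) + '+' + '-' * (right + 2) + '+'
    lines = [
        rule,
        '| ' + label.ljust(left) + ' | ' + 'Count'.rjust(right) + ' |',
        rule,
    ]
    for item, count in rows:
        # Items are left-aligned, counts are right-aligned
        lines.append('| ' + str(item).ljust(left) + ' | '
                     + str(count).rjust(right) + ' |')
    lines.append(rule)
    return '\n'.join(lines)


# Hypothetical input; real input would be the extracted words or screen names
words = ['data', 'mining', 'data', 'weibo', 'data', 'mining']
print(frequency_table('Word', words, 5))
```

Note that len() counts characters rather than display columns, so rows containing full-width Chinese characters may not line up exactly in a terminal. With only 5 posts sampled, most items occur once, which is why every count in the tables above is 1.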


Original article: http://blog.csdn.net/guaguastd/article/details/37591245
