Python: Frequency Analysis of Twitter Tweet Elements (Word, Screen Name, Hashtag)

Date: 2014-07-02 11:26:40

Tags: twitter, data mining

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

'''
Created on 2014-7-2
@author: guaguastd
@name: tweet_frequency_analysis.py
'''

# Note: this listing uses Python 3 syntax (input(), print()); the login and
# tweets helper modules from the linked posts must also run under Python 3.

from collections import Counter

# pip install prettytable
from prettytable import PrettyTable

# oauth_login, see http://blog.csdn.net/guaguastd/article/details/31706155
from login import oauth_login

# tweet, see http://blog.csdn.net/guaguastd/article/details/36163301
from tweets import tweet

if __name__ == '__main__':

    # get the twitter access api
    twitter_api = oauth_login()

    while True:
        query = input('\nInput the query (eg. #MentionSomeoneImportantForYou, exit to quit): ')

        if query == 'exit':
            print('Exited successfully.')
            break

        # tweet() returns the status texts plus the screen names,
        # hashtags and words collected from them
        status_texts, screen_names, hashtags, words = tweet(twitter_api, query)

        # print a table of the 10 most frequent items for each element type
        for label, data in (('Word', words),
                            ('Screen Name', screen_names),
                            ('Hashtag', hashtags)):
            pt = PrettyTable(field_names=[label, 'Count'])
            c = Counter(data)
            for kv in c.most_common(10):
                pt.add_row(kv)
            pt.align[label], pt.align['Count'] = 'l', 'r'
            print(pt)
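
The `tweet` helper lives in a separate post and is not shown here. As a rough sketch of what it might do after fetching the search results, assume each status is a dict shaped like a Twitter REST API v1.1 search result (a `text` field plus `entities` containing `user_mentions` and `hashtags`). The function name `extract_elements` and the sample data are hypothetical, and splitting on whitespace is only one possible way to tokenize; the original helper may differ. Written for Python 3.

```python
def extract_elements(statuses):
    """Collect texts, mentioned screen names, hashtags and words.

    Assumes each status dict has 'text' and 'entities' keys in the
    Twitter REST API v1.1 shape; this is an illustrative stand-in,
    not the original tweet() helper.
    """
    status_texts = [s['text'] for s in statuses]
    screen_names = [m['screen_name']
                    for s in statuses
                    for m in s['entities']['user_mentions']]
    hashtags = [h['text']
                for s in statuses
                for h in s['entities']['hashtags']]
    # Naive tokenization: split every text on whitespace
    words = [w for t in status_texts for w in t.split()]
    return status_texts, screen_names, hashtags, words


# Hypothetical sample data, made up for illustration only
sample_statuses = [
    {'text': 'Hello world @john #USA',
     'entities': {'user_mentions': [{'screen_name': 'john'}],
                  'hashtags': [{'text': 'USA'}]}},
    {'text': 'hello again #USA #BEL',
     'entities': {'user_mentions': [],
                  'hashtags': [{'text': 'USA'}, {'text': 'BEL'}]}},
]

status_texts, screen_names, hashtags, words = extract_elements(sample_statuses)
```

The four returned lists have the same roles as the values unpacked from `tweet(twitter_api, query)` in the script, so they can be passed to `Counter` in the same way.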


Result:

Input the query (eg. #MentionSomeoneImportantForYou, exit to quit): Hello world
Length of statuses 99
'next_results'
+-------+-------+
| Word  | Count |
+-------+-------+
| the   |    99 |
| hello |    52 |
| is    |    50 |
| in    |    50 |
| me    |    46 |
| best  |    46 |
| you   |    46 |
| world |    44 |
| it    |    42 |
| tweet |    40 |
+-------+-------+
+--------------+-------+
| Screen Name  | Count |
+--------------+-------+
| Harry_Styles |    39 |
| justinbieber |     6 |
| shots        |     6 |
| john         |     6 |
| WHATCHAKNO   |     4 |
| hatahata88   |     2 |
| Michael5SOS  |     2 |
| Oprah_World  |     1 |
| kuga_aimu    |     1 |
| chriscobbins |     1 |
+--------------+-------+
+--------------+-------+
| Hashtag      | Count |
+--------------+-------+
| MoneyAnthem  |     4 |
| MILLIONBUCKS |     4 |
| New          |     4 |
| MUSTHEAR     |     4 |
| WorldCup2014 |     2 |
| gousa        |     1 |
| Lukaku       |     1 |
| USA          |     1 |
| BEL          |     1 |
| MGWV         |     1 |
+--------------+-------+
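
Each table above lists the ten most frequent items for one element type. The counting step can be tried without Twitter access or prettytable; the following is a minimal sketch using only `collections.Counter`, with hypothetical helper names `top_items` and `format_table` and made-up input data. Written for Python 3.

```python
from collections import Counter


def top_items(items, n=10):
    """Return the n most frequent items as (item, count) pairs."""
    return Counter(items).most_common(n)


def format_table(label, rows):
    """Render (item, count) pairs as plain-text lines:
    items left-aligned, counts right-aligned."""
    width = max([len(label)] + [len(str(item)) for item, _ in rows])
    lines = [f"{label:<{width}}  {'Count':>5}"]
    for item, count in rows:
        lines.append(f"{str(item):<{width}}  {count:>5}")
    return lines


# Made-up sample words, for illustration only
sample_words = ['the', 'hello', 'the', 'world', 'the', 'hello']
rows = top_items(sample_words, n=2)
for line in format_table('Word', rows):
    print(line)
```

`most_common(n)` returns the pairs sorted by descending count, so slicing the full list afterwards (as in `most_common()[:10]`) gives the same rows.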

Input the query (eg. #MentionSomeoneImportantForYou, exit to quit): 




Original article: http://blog.csdn.net/guaguastd/article/details/36345821
