
Analyzing Twitter Users' Favorite Tweets with Python

Date: 2014-08-05 07:22:08

Tags: twitter, data mining

CODE:

#!/usr/bin/python 
# -*- coding: utf-8 -*-

'''
Created on 2014-8-5
@author: guaguastd
@name: analyze_favorite_tweet.py
'''

if __name__ == '__main__':
    
    # json is only needed for the optional debug print in the loop below
    #import json
    
    # search_for_tweet runs the search query
    from search import search_for_tweet
    
    # crawl_followers is imported but not used in this script
    from user import crawl_followers
    
    # twitter_login, see http://blog.csdn.net/guaguastd/article/details/31706155
    from login import twitter_login

    # analyze_favorites_tweet prints the favorites report for one user
    from tweet import analyze_favorites_tweet
    
    # twitter_text provides the mention extractor
    import twitter_text
    
    # get the twitter access api
    twitter_api = twitter_login()
    
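    while True:
        query = raw_input('\nInput the query (eg. #MentionSomeoneImportantForYou, exit to quit): ')
        
        if query == 'exit':
            print 'Successfully exited!'
            break
        
        # search for tweets matching the query; Extractor works on tweet text,
        # so search_for_tweet is assumed to return the results as text
        statuses = search_for_tweet(twitter_api, query) 
        ex = twitter_text.Extractor(statuses)     
        
        # keep only the screen name of every @mention found in the results
        screen_names = ex.extract_mentioned_screen_names_with_indices()
        screen_names = [screen_name['screen_name']
                        for screen_name in screen_names]
                
        # analyze the favorites of each mentioned user
        for screen_name in screen_names:
            #print json.dumps(screen_names, indent=1)  
            analyze_favorites_tweet(twitter_api, screen_name)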
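
The mention-extraction step above can be sketched without the twitter_text dependency. This is a minimal Python 3 illustration, not the library's implementation: the pattern `MENTION_RE` and the helpers `extract_mentions` and `unique_screen_names` are hypothetical names introduced here, and the regular expression only approximates Twitter's screen-name rules (an `@` followed by 1 to 15 letters, digits, or underscores).

```python
import re

# Assumption: a simplified stand-in for twitter_text's mention grammar.
# The lookbehind rejects e-mail-like text such as "a@b.com"; the lookahead
# rejects names longer than 15 characters instead of truncating them.
MENTION_RE = re.compile(r'(?<![A-Za-z0-9_@])@([A-Za-z0-9_]{1,15})(?![A-Za-z0-9_])')


def extract_mentions(text):
    """Return each mentioned screen name with its (start, end) character indices."""
    return [
        {'screen_name': m.group(1), 'indices': (m.start(), m.end())}
        for m in MENTION_RE.finditer(text)
    ]


def unique_screen_names(texts):
    """Collect screen names across several tweet texts, keeping first-seen order."""
    seen = []
    for text in texts:
        for mention in extract_mentions(text):
            name = mention['screen_name']
            if name not in seen:
                seen.append(name)
    return seen


if __name__ == '__main__':
    sample = ['Thanks @snim2 for the talk at @europython',
              'Slides by @snim2 are online']
    print(unique_screen_names(sample))  # ['snim2', 'europython']
```

The script above calls analyze_favorites_tweet once per mention, so a user mentioned several times is analyzed several times. Deduplicating the names first, as `unique_screen_names` does, is one way to avoid the repeated work.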

RESULT:

Input the query (eg. #MentionSomeoneImportantForYou, exit to quit): Core Python
Length of statuses 59
Number of favorites: 200

Common entities in favorites...
+--------+------------------------+
| Count  |                 Entity |
+--------+------------------------+
| 72     |                    the |
| 72     |                     to |
| 57     |                      a |
| 56     |                     of |
| 53     |                     in |
| 44     |                     on |
| 37     |                     is |
| 36     |                    for |
| 34     |                    and |
| 29     |                      I |
| 28     |                    you |
| 24     |                     my |
| 21     |                      - |
| 21     |                     at |
| 19     |                   with |
| 17     |                     be |
| 17     |                     by |
| 15     |                   talk |
| 15     |                    are |
| 15     |                   from |
| 14     |                    The |
| 14     |                   this |
| 13     |                    can |
| 13     |                   that |
| 13     |                  snim2 |
| 12     |                 @snim2 |
| 12     |                     an |
| 11     |                 Python |
| 11     |                   your |
| 11     |                  about |
| 10     |                     it |
| 10     |                    was |
| 10     |                    all |
| 10     |                   ep14 |
| 9      |             europython |
| 9      |                    now |
| 9      |                     or |
| 8      |                    via |
| 7      |                      A |
| 7      |                   Here |
| 7      |                     if |
| 7      |                    not |
| 7      |                    our |
| 7      |                   have |
| 7      |                    who |
| 7      |                  #ep14 |
| 7      |                     as |
| 6      |                    new |
| 6      |                     me |
| 6      |                   just |
| 6      |            #europython |
| 6      |                 slides |
| 6      |                      & |
| 5      |            concurrency |
| 5      |                     My |
| 5      |                IPython |
| 5      |                     so |
| 5      |                   more |
| 5      |                  paper |
| 5      |                   also |
| 5      |                   most |
| 5      |                    see |
| 5      |              available |
| 5      |                  video |
| 5      |                  write |
| 5      |                    out |
| 5      |                  piece |
| 5      |               software |
| 4      |                    has |
| 4      |                   when |
| 4      |                     :) |
| 4      |               Research |
| 4      |                  here: |
| 4      |                   take |
| 4      |                     If |
| 4      |                  being |
| 4      |                   code |
| 4      |                   what |
| 4      |                   help |
| 4      |                 really |
| 4      |                    For |
| 4      |                   some |
| 4      |                     up |
| 4      |                 python |
| 4      |                   This |
| 4      |                  based |
| 4      |                   will |
| 4      |                    You |
| 4      |                     he |
| 3      |                Haskell |
| 3      |            @europython |
| 3      |                   much |
| 3      |                  photo |
| 3      |                #python |
| 3      |                   easy |
| 3      |                   post |
| 3      |                    own |
| 3      |                  #LGBT |
| 3      |                 papers |
| 3      |                   time |
| 3      |                    Our |
| 3      |                    Why |
| 3      |                 answer |
| 3      |                  first |
| 3      |                    one |
| 3      |                   open |
| 3      |                   than |
| 3      |                 ep2014 |
| 3      |                    get |
| 3      |                   LGBT |
| 3      |                   Gaza |
| 3      |                   read |
| 3      |                 Slides |
| 3      |           presentation |
| 3      |                  large |
| 3      |                learned |
| 3      |                  learn |
| 3      |                  don't |
| 3      |                   good |
| 3      |                    did |
| 3      |                 Thanks |
| 3      |                   like |
| 3      |          tweets/second |
| 3      |                    his |
| 3      |                  wrote |
| 3      |                 please |
| 3      |               Software |
| 3      |               analysis |
| 3      |                 Here's |
| 3      |                     .. |
| 3      |                     An |
| 3      |                  great |
| 3      |                    use |
| 3      |                      | |
| 3      |             EuroPython |
| 3      |                 you're |
| 3      |                  their |
| 3      |                    but |
| 3      |                    why |
| 3      |                 should |
| 3      |                  means |
| 3      |                #ep2014 |
| 3      |                keynote |
| 3      |                    day |
| 3      |                   know |
| 3      |                because |
| 3      |                  Great |
| 2      |                  under |
| 2      |                 Amazon |
| 2      |                 Church |
| 2      |                  Group |
| 2      |                  aware |
| 2      |                   must |
| 2      |                   want |
| 2      |                    how |
| 2      |              interview |
| 2      |                  after |
| 2      |                 things |
| 2      |               feedback |
| 2      |                   over |
| 2      |                   them |
| 2      |                  Check |
| 2      |                Shakira |
| 2      |                    got |
| 2      |               messages |
| 2      |                   days |
| 2      |                 Please |
| 2      |               Notebook |
| 2      |       @parallellaboard |
| 2      |                   “Can |
| 2      |                   mine |
| 2      |                Twisted |
| 2      |                     do |
| 2      |           #concurrency |
| 2      |             officially |
| 2      |                     w/ |
| 2      |                   John |
| 2      |                   said |
| 2      |                  never |
| 2      |                   I've |
| 2      |                   been |
| 2      |          twistedmatrix |
| 2      |                   make |
| 2      |                  jobs. |
| 2      |            #EuroPython |
| 2      |                    Use |
| 2      |                    way |
| 2      |                   role |
| 2      |                   test |
| 2      |                 update |
| 2      |        parallellaboard |
| 2      |                  daily |
| 2      |                   Just |
| 2      |                     MT |
| 2      |                     MP |
| 2      |                   It's |
| 2      |              following |
| 2      |                    may |
| 2      |                  Model |
| 2      |                 switch |
| 2      |                     RT |
| 2      |                 tweets |
| 2      |                 WeAreN |
| 2      |                   name |
| 2      |               attended |
| 2      |            programming |
| 2      |                  think |
| 2      |                message |
| 2      |                  short |
| 2      |                     Do |
| 2      |                 online |
| 2      |               science, |
| 2      |                #WeAreN |
| 2      |                  going |
| 2      |                 Growth |
| 2      |                  where |
| 2      |                 #synod |
| 2      |                      3 |
| 2      |                   jobs |
| 2      |                   many |
| 2      |                 Jeremy |
| 2      |                  those |
| 2      |                  these |
| 2      |            engineering |
| 2      |                    GNU |
| 2      |              different |
| 2      |           surveillance |
| 2      |                   week |
| 2      |                   blog |
| 2      |          LindaWoodhead |
| 2      |                  start |
| 2      |                      ? |
| 2      |                    How |
| 2      |                watched |
| 2      |                  trash |
| 2      |                #Python |
| 2      |               coverage |
| 2      |         @LindaWoodhead |
| 2      |                 remote |
| 2      |               consider |
| 2      |                program |
| 2      |                   very |
| 2      |                     St |
| 2      |                   Your |
| 2      |                 github |
| 2      |                 that's |
| 2      |                    its |
| 2      |                    it. |
| 2      |                    it: |
| 2      |                 c_of_e |
| 2      |               research |
| 2      |               together |
| 2      |                without |
| 2      |                nothing |
| 2      |              pre-print |
| 2      |                 during |
| 2      |                   Part |
| 2      |                   last |
| 2      |                  Steve |
| 2      |                  point |
| 2      |                 church |
| 2      |                  Women |
| 2      |                  error |
| 2      |                  arXiv |
| 2      | http://t.co/0yBSWrVaUW |
| 2      |                 person |
| 2      |                  Names |
| 2      |                 docker |
| 2      |           Reproducible |
| 2      |                 source |
| 2      |                popular |
| 2      |                   back |
| 2      |         @twistedmatrix |
| 2      |                     am |
| 2      |                      < |
| 2      |               @PyConUK |
| 2      |                     AV |
| 2      |              Implement |
| 2      |                asyncio |
| 2      |                    Git |
| 2      |                    try |
| 2      |                 making |
| 2      |               involved |
| 2      |           Algorithm?”: |
| 2      |                  tools |
| 2      |                      … |
| 2      |                  Video |
| 2      |                  links |
| 2      |                profile |
| 2      |                  lines |
| 2      |                    One |
| 2      |                   2015 |
| 2      |                    Can |
| 2      |                lecture |
| 2      |                   data |
| 2      |                   need |
| 2      |                  which |
| 2      |                   Some |
| 2      |                 Bishop |
| 2      |                   fact |
| 2      |                  local |
| 2      |               computer |
| 2      |                   Good |
| 2      |                  synod |
| 2      |                passing |
| 2      |                   it's |
| 2      |                PyConUK |
| 2      |               #asyncio |
| 2      |                  intro |
| 2      |                 Oxford |
| 2      |                 single |
| 2      |                 latest |
| 2      |                   CofE |
| 2      |                  async |
| 2      |              Telegraph |
| 2      |                 growth |
| 2      |                Science |
| 2      |                problem |
| 2      |                  this: |
+--------+------------------------+

Some statistics about the content of the favorites...

Lexical diversity (words): 0.605255023184
Lexical diversity (screen names): 1.0
Lexical diversity (hashtags): 0.831932773109
Average words per tweet: 16.175
Number of favorites: 2

Common entities in favorites...
+--------+-------------+
| Count  |      Entity |
+--------+-------------+
| 2      | @AndersInno |
| 2      |  AndersInno |
+--------+-------------+

Some statistics about the content of the favorites...

Lexical diversity (words): 0.9375
Lexical diversity (screen names): 1.0
Lexical diversity (hashtags): 0.75
Average words per tweet: 8.0
Number of favorites: 6

Common entities in favorites...
+--------+--------+
| Count  | Entity |
+--------+--------+
| 4      |    the |
| 3      |     to |
| 2      |     be |
| 2      |     of |
| 2      |   this |
| 2      |     is |
| 2      |     in |
| 2      |      I |
| 2      |      a |
+--------+--------+

Some statistics about the content of the favorites...

Lexical diversity (words): 0.872340425532
Lexical diversity (screen names): 1.0
Lexical diversity (hashtags): 1.0
Average words per tweet: 15.6666666667
Number of favorites: 0

Common entities in favorites...
+--------+--------+
| Count  | Entity |
+--------+--------+
+--------+--------+

Some statistics about the content of the favorites...

No statuses to analyze
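
The figures in each report can be reproduced in outline as follows. This is a sketch in Python 3 under stated assumptions: the post does not show the source of analyze_favorites_tweet, so the function names, the whitespace tokenization, and the threshold of 2 (inferred from the tables, which list no count below 2) are all assumptions. Lexical diversity is taken here as the number of distinct tokens divided by the total number of tokens.

```python
from collections import Counter


def lexical_diversity(tokens):
    """Ratio of distinct tokens to all tokens; None when there is nothing to analyze."""
    if not tokens:
        return None
    return len(set(tokens)) / len(tokens)


def average_words(texts):
    """Mean number of whitespace-separated words per tweet text."""
    if not texts:
        return None
    return sum(len(t.split()) for t in texts) / len(texts)


def common_entities(texts, entity_threshold=2):
    """Return (token, count) pairs seen at least entity_threshold times, most frequent first."""
    counts = Counter(token for t in texts for token in t.split())
    return [(tok, n) for tok, n in counts.most_common() if n >= entity_threshold]


if __name__ == '__main__':
    # Hypothetical tweet texts, not taken from the results above
    texts = ['the talk at ep14', 'the slides from ep14']
    words = [w for t in texts for w in t.split()]
    print(common_entities(texts))    # [('the', 2), ('ep14', 2)]
    print(lexical_diversity(words))  # 0.75
    print(average_words(texts))      # 4.0
```

A value near 1.0 means almost every token is distinct. Returning None for empty input corresponds to the "No statuses to analyze" case in the last report.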


Original article: http://blog.csdn.net/guaguastd/article/details/38378857