Python: Extracting Elements from Twitter Tweets (Text, Screen Names, Hashtags)

Date: 2014-07-01 09:20:10

Tags: twitter, data mining

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

'''
Created on 2014-7-1
@author: guaguastd
@name: tweets.py
'''

import json

# search() is defined in search.py, see http://blog.csdn.net/guaguastd/article/details/35537781
from search import search

# oauth_login() is defined in login.py, see http://blog.csdn.net/guaguastd/article/details/31706155
from login import oauth_login

# Get an authenticated Twitter API handle
twitter_api = oauth_login()

query = input('\nInput the query (eg. #MentionSomeoneImportantForYou): ')

statuses = search(twitter_api, query)

# Stop early if the search returned nothing; statuses[0] would raise IndexError
if not statuses:
    raise SystemExit('No tweets found for query: %s' % query)

# Show the full structure of the first tweet
print(json.dumps(statuses[0], indent=1))

# Extract text, screen names, and hashtags from the tweets
status_texts = [status['text']
                for status in statuses]

screen_names = [user_mention['screen_name']
                for status in statuses
                for user_mention in status['entities']['user_mentions']]

hashtags = [hashtag['text']
            for status in statuses
            for hashtag in status['entities']['hashtags']]

# Compute a collection of all whitespace-separated words from all tweets
words = [w
         for t in status_texts
         for w in t.split()]

# Explore the first 5 items of each list
print(json.dumps(status_texts[0:5], indent=1))
print(json.dumps(screen_names[0:5], indent=1))
print(json.dumps(hashtags[0:5], indent=1))
print(json.dumps(words[0:5], indent=1))
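
The script above needs Twitter credentials and the separate search and login modules, so the extraction step is hard to try on its own. Below is a minimal offline sketch of the same list comprehensions. The sample_statuses data and the extract_elements helper are made up for illustration. They are not part of the original script, and the dictionaries carry only the fields the comprehensions actually read ('text', plus 'user_mentions' and 'hashtags' under 'entities').

```python
import json

# Hypothetical sample data: two hand-written "tweets" holding only the
# fields that the comprehensions below access.
sample_statuses = [
    {
        'text': 'Learning #Python with @alice and @bob',
        'entities': {
            'user_mentions': [{'screen_name': 'alice'}, {'screen_name': 'bob'}],
            'hashtags': [{'text': 'Python'}],
        },
    },
    {
        'text': 'Mining tweets #DataMining',
        'entities': {
            'user_mentions': [],
            'hashtags': [{'text': 'DataMining'}],
        },
    },
]


def extract_elements(statuses):
    """Return (status_texts, screen_names, hashtags, words) for a list of tweets.

    Hypothetical helper that wraps the comprehensions from the script.
    """
    status_texts = [status['text'] for status in statuses]
    screen_names = [mention['screen_name']
                    for status in statuses
                    for mention in status['entities']['user_mentions']]
    hashtags = [hashtag['text']
                for status in statuses
                for hashtag in status['entities']['hashtags']]
    words = [w for t in status_texts for w in t.split()]
    return status_texts, screen_names, hashtags, words


status_texts, screen_names, hashtags, words = extract_elements(sample_statuses)

print(json.dumps(screen_names, indent=1))
print(json.dumps(hashtags, indent=1))
print(json.dumps(words[0:5], indent=1))
```

Note that the entries in words keep their leading '#' and '@' characters because they come from a plain split() of the tweet text, while the values taken from 'entities' do not include those prefixes.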

Original article: http://blog.csdn.net/guaguastd/article/details/36163301
