
A Complete English Word-Frequency Counting Example Using a File

Date: 2017-09-26 21:08:20

Tags: could, string, counting, tar, park, diamond, images, key, word

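1. Read in the string to be analyzed

2. Split it and extract the words

3. Count the words in a dictionary

4. Exclude grammatical (function) words

5. Sort

6. Output the top 20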
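
The six steps above can be sketched as a single function. This is a minimal sketch rather than the code from this post: the function name `top_words`, its parameters, the punctuation list, and the sample sentence are all made up for illustration.

```python
def top_words(text, stop_words, n=20):
    """Return the n most frequent words in text, skipping stop_words."""
    # Step 2: replace punctuation with spaces, lowercase, split on whitespace.
    for ch in ',.?!;:"\n':
        text = text.replace(ch, ' ')
    words = text.lower().split()

    # Steps 3 and 4: count each word in a dictionary, skipping stop words.
    counts = {}
    for word in words:
        if word not in stop_words:
            counts[word] = counts.get(word, 0) + 1

    # Step 5: sort by count, highest first (ties keep first-seen order).
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)

    # Step 6: keep only the top n entries.
    return ranked[:n]


# Step 1: the string is given inline here; it could equally be read from a file.
sample = "The cat and the hat. The cat sat!"
result = top_words(sample, {'the', 'and'}, n=2)
print(result)  # [('cat', 2), ('hat', 1)]
```

Counting with `counts.get(word, 0) + 1` walks the word list once, so the stop-word check and the count happen in the same loop.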

# Write the sample text to 123.txt.
with open('123.txt', 'w') as fo:
    fo.write('''Twinkle, twinkle, little star, How I wonder what you are.
Up above the world so high, Like a diamond in the sky.
Twinkle, twinkle, little star, How I wonder what you are!
When the blazing sun is gone,
When he nothing shines upon,
Then you show your little light,
Twinkle, twinkle, all the night.
Twinkle, twinkle, little star,
How I wonder what you are!
Then the traveler in the dark Thanks you for your tiny spark;
He could not see which way to go, If you did not twinkle so.
Twinkle, twinkle, little star, How I wonder what you are!
Twinkle Twinkle Little Star''')


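# Read the text back in.
with open('123.txt', 'r') as fo:
    A = fo.read()
# Grammatical words to exclude; '' filters out the empty strings left by split(" ").
exc = {'the', 'and', 'to', 'of', 'in', 'a', 'for', 'with', ''}
# Replace punctuation and newlines with spaces (';' added so "spark;" is counted as "spark").
for i in ',.?!;"\n':
    A = A.replace(i, ' ')
A = A.lower()
A = A.split(" ")
dic = {}
keys = set(A)  # set of words that appear; these become the dictionary keys
keys = keys - exc
for i in keys:
    dic[i] = A.count(i)
w = list(dic.items())
w.sort(key=lambda x: x[1], reverse=True)
# Print the 20 most frequent words (fewer if the text has fewer distinct words).
for item in w[:20]:
    print(item)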
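
Calling `A.count(i)` once per distinct word rescans the whole list each time. For a short poem that is fine; for larger text, the standard library's `collections.Counter` does the counting in one pass. The sketch below is an alternative, not the method used in this post: the helper name `count_words`, the regular expression, and the one-line sample text are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Same grammatical words as excluded in the post's code.
STOP_WORDS = {'the', 'and', 'to', 'of', 'in', 'a', 'for', 'with'}


def count_words(text, stop_words=STOP_WORDS):
    # Lowercase, then extract runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    # Counter tallies every remaining word in a single pass.
    return Counter(w for w in words if w not in stop_words)


text = "Twinkle, twinkle, little star, How I wonder what you are."
counts = count_words(text)

# most_common(n) returns (word, count) pairs sorted by count, highest first.
for word, freq in counts.most_common(3):
    print(word, freq)
```

Because the regular expression only matches letters and apostrophes, no separate punctuation-replacement loop is needed, and `most_common(20)` would take the place of the manual sort and slice.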


Original article: http://www.cnblogs.com/honghui/p/7598447.html
