Tags: counting English words in text, python, shell, sort, uniq
Given a plain-text file named test.txt, count how many times each word occurs in it.
Contents of test.txt:
i have have application someday oneday day demo
i have some one coma ideal naive i
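Python implementation:
import re

count = {}
# Read the whole file; the with block closes it automatically
with open('test.txt', 'r') as f:
    b = f.read()
# Note how split is used: split on runs of spaces/newlines, after strip() so no empty string ends up in the list
cd = re.split(r'[ \n]+', b.strip())
print(cd)
for i in cd:
    # Note how get() is used: it returns 0 the first time a word is seen
    count[i] = count.get(i, 0) + 1
print(count)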
Result of running the code (the order of the keys in the dictionary may vary with the Python version):
['i', 'have', 'have', 'application', 'someday', 'oneday', 'day', 'demo', 'i', 'have', 'some', 'one', 'coma', 'ideal', 'naive', 'i']
{'someday': 1, 'i': 3, 'demo': 1, 'naive': 1, 'some': 1, 'one': 1, 'application': 1, 'ideal': 1, 'have': 3, 'coma': 1, 'oneday': 1, 'day': 1}
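The two comments in the code point at split and get(). The following is only a minimal sketch of what each one does, assuming the two lines of test.txt are inlined as a string; the names TEXT and count_words are made up for this illustration and are not part of the original code. It also compares the hand-written tally with collections.Counter from the standard library.

```python
import re
from collections import Counter

# Hypothetical constant: the two lines of test.txt, inlined so the sketch runs without the file.
TEXT = ("i have have application someday oneday day demo\n"
        "i have some one coma ideal naive i\n")

def count_words(text):
    """Hypothetical helper: tally words with a plain dict and get()."""
    count = {}
    for word in text.split():  # split() with no arguments ignores leading/trailing whitespace
        count[word] = count.get(word, 0) + 1  # get() falls back to 0 for a word not seen yet
    return count

# re.split leaves an empty string at the end when the text ends with a newline
print(re.split(r'[ \n]+', TEXT)[-1] == '')

counts = count_words(TEXT)
# collections.Counter produces the same tally in a single call
print(counts == dict(Counter(TEXT.split())))
print(counts['have'], counts['i'], counts['day'])
```

The last line prints 3 3 1, which agrees with the dictionary shown above. If the file may end with a newline, either strip the text before calling re.split or use split() with no arguments, otherwise an empty string gets counted as a word.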
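Shell implementation (tr puts every word on its own line, sort groups identical words together, and uniq -c counts each group):
tr ' ' '\n' < test.txt | sort | uniq -c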
Output:
1 application
1 coma
1 day
1 demo
3 have
3 i
1 ideal
1 naive
1 one
1 oneday
1 some
1 someday
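For comparison, the same sorted listing can be produced in Python. This is only a rough sketch and not part of the original post: the function name sorted_counts is made up for the illustration, the file contents are inlined as a string, and split() breaks on any whitespace, whereas tr in the shell version only replaces spaces.

```python
from collections import Counter

def sorted_counts(text):
    """Hypothetical helper: return "count word" lines ordered alphabetically by word."""
    counts = Counter(text.split())
    # sorted(counts) iterates over the distinct words in alphabetical order
    return ["%d %s" % (counts[word], word) for word in sorted(counts)]

# The two lines of test.txt, inlined so the sketch runs without the file.
text = ("i have have application someday oneday day demo\n"
        "i have some one coma ideal naive i\n")

for line in sorted_counts(text):
    print(line)
```

To list the words by frequency instead of alphabetically, Counter.most_common() returns (word, count) pairs from the most frequent to the least frequent.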
This article comes from the blog “Jason的博客”; please keep this source link when reposting: http://jason83.blog.51cto.com/12723827/1982168
For any English plain-text file, count the occurrences of each word (implemented in both shell and Python)