
Computing Word Frequency

Posted: 2015-08-27 14:47:20


I have already written about this on my own blog.

>>> from collections import Counter
>>> c = Counter()
>>> for ch in 'programming':
...     c[ch] = c[ch] + 1
...
>>> c
Counter({'g': 2, 'm': 2, 'r': 2, 'a': 1, 'i': 1, 'o': 1, 'n': 1, 'p': 1})
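
As a side note, `Counter` also accepts an iterable directly, so the explicit loop above is optional. A minimal sketch using the same sample string (the variable name `top_three` is only for illustration):

```python
from collections import Counter

# Counter accepts any iterable of hashable items; a string yields its characters.
c = Counter('programming')

# most_common(n) returns the n highest counts as (item, count) pairs.
top_three = c.most_common(3)
print(top_three)

# A key that was never seen reads as 0 and is not added to the counter.
print(c['z'])
```

Reading a missing key from a `Counter` returns 0 without inserting it, which is the behaviour the hand-written class further down imitates.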

 

I wrote my own implementation: subclass dict and override __getitem__.

class Counter(dict):
    def __getitem__(self, key):
        # A missing key reads as 0 instead of raising KeyError.
        if key not in self:
            return 0
        return super(Counter, self).__getitem__(key)

c = Counter()
for ch in 'programmingA':
    c[ch] = c[ch] + 1
# Print the counts in sorted key order.
for key in sorted(c.keys()):
    print('{} : {}'.format(key, c[key]))
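
Another way to get the same effect, sketched here as an alternative rather than as the approach used above: when a `dict` subclass defines `__missing__`, `d[key]` calls it for any key that is absent. The class name `ZeroDict` is made up for this illustration.

```python
class ZeroDict(dict):
    """dict subclass whose missing keys read as 0 (hypothetical name)."""

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent.
        # Returning 0 without storing it leaves the dict unchanged on reads.
        return 0


counts = ZeroDict()
for ch in 'programmingA':
    counts[ch] = counts[ch] + 1

# Print the counts in sorted key order.
for key in sorted(counts):
    print('{} : {}'.format(key, counts[key]))
```

This keeps the normal lookup path untouched and only handles the missing-key case, so there is no need to call `super()` by hand.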

 

This reminded me of something else: given a pile of data, group it by first letter, ignoring case.

data = ['a', 'box', 'tea', 'apple', 'banana', 'banner', 'a test!', 'Zoo', 'Z']
t = {}
for e in data:
    # setdefault creates the empty list the first time a letter is seen.
    t.setdefault(e[0].upper(), []).append(e)
print(t)

-------------------------------------
Result: {'A': ['a', 'apple', 'a test!'], 'Z': ['Zoo', 'Z'], 'B': ['box', 'banana', 'banner'], 'T': ['tea']}
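
The same grouping can also be sketched with `collections.defaultdict`, which creates the empty list automatically on first access. This is an alternative to the `setdefault` call, not a correction of it; the names `groups` and `item` are only for illustration.

```python
from collections import defaultdict

data = ['a', 'box', 'tea', 'apple', 'banana', 'banner', 'a test!', 'Zoo', 'Z']

# defaultdict(list) calls list() to build the value for any new key.
groups = defaultdict(list)
for item in data:
    # Upper-casing the first character makes the grouping case-insensitive.
    groups[item[0].upper()].append(item)

# Iterate in sorted key order so the printout does not depend on dict ordering.
for letter in sorted(groups):
    print('{} : {}'.format(letter, groups[letter]))
```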

So the same trick can count characters:

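data = 'programmingA'
t = {}
for e in data:
    # Collect every occurrence of a character under that character as the key.
    t.setdefault(e, []).append(e)
# The length of each list is the number of occurrences.
print({key: len(value) for key, value in t.items()})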
-------------------------------------------------------
{'a': 1, 'A': 1, 'g': 2, 'i': 1, 'm': 2, 'o': 1, 'n': 1, 'p': 1, 'r': 2}
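
Building a list per character only to take its length does more work than needed. A lighter sketch of the same count, assuming only the final numbers matter, uses `dict.get` with a default of 0 (the names `text` and `counts` are only for illustration):

```python
text = 'programmingA'

counts = {}
for ch in text:
    # get() returns 0 for a character seen for the first time.
    counts[ch] = counts.get(ch, 0) + 1

# Print the counts in sorted key order.
for key in sorted(counts):
    print('{} : {}'.format(key, counts[key]))
```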

 


Original article: http://www.cnblogs.com/Citizen/p/4763066.html
