Using collections

Date: 2018-03-05 20:49:38

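While working on a recent project I found a very handy module: collections. It is part of the Python standard library and provides many useful container classes. Below are notes on the classes and methods I find most useful.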
1.Counter
Counter counts how many times each element occurs, with no manual bookkeeping.

import collections

# Count the elements of a list
data1 = ['a', 'b', 'c', 'a', 'b', 'a']
col_1 = collections.Counter(data1)

# A string is counted character by character
data2 = 'python and pyspark'
col_2 = collections.Counter(data2)

print(col_1)
print(col_2)

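Output: Counter({'a': 3, 'b': 2, 'c': 1})
Counter({' ': 2, 'a': 2, 'd': 1, 'h': 1, 'k': 1, 'n': 2, 'o': 1, 'p': 3, 'r': 1, 's': 1, 't': 1, 'y': 2})
(The counts are what matter here; the order in which the entries are displayed can differ between Python versions and environments.)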
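As a small supplementary sketch (the word list here is made up for illustration and is not from my project), a Counter also returns 0 for an element it has never seen instead of raising KeyError, and it can take in more data after it has been created:

```python
import collections

# Hypothetical sample data, invented for this example
words = ['spark', 'python', 'spark', 'hive']
word_counts = collections.Counter(words)

# A missing element counts as 0 instead of raising KeyError,
# and looking it up does not add it to the Counter
missing = word_counts['scala']

# update() adds the counts from another iterable to the existing ones
word_counts.update(['python', 'python'])

print(missing)
print(word_counts['python'])
print(sum(word_counts.values()))
```

This makes Counter convenient for accumulating counts over several batches of data.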

2.most_common
most_common is a method of Counter. It returns the most frequent elements of a Counter together with their counts, for example the top 5 from the example above:

print(col_2.most_common(5))

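Output: [('p', 3), ('a', 2), (' ', 2), ('n', 2), ('y', 2)]
(Elements with equal counts may come back in a different order depending on the Python version.)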
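A short sketch of two related uses, rebuilding the same Counter so that it runs on its own: calling most_common() with no argument returns the full ranking, and slicing that ranking from the end gives the least common elements. The variable names are only for illustration.

```python
import collections

col_2 = collections.Counter('python and pyspark')

# With no argument, most_common() returns every element, most frequent first
ranked = col_2.most_common()

# The single most frequent element
top = col_2.most_common(1)

# Slice the full ranking from the end to get the 2 least common elements
least_two = ranked[:-3:-1]

print(top)
print(len(ranked))
print(least_two)
```

Which two characters show up in least_two depends on how ties are ordered, since many characters here occur exactly once.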

3.defaultdict
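defaultdict avoids the KeyError that a plain dict raises when a key does not exist. When a missing key is accessed, a default value produced by the factory passed to the constructor (here int, which gives 0) is stored under that key and used instead.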

a = collections.defaultdict(int)
for item in data2:
    a[item] += 1
print(a)

Output: defaultdict(int, {' ': 2, 'a': 2, 'd': 1, 'h': 1, 'k': 1, 'n': 2, 'o': 1, 'p': 3, 'r': 1, 's': 1, 't': 1, 'y': 2})
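Counting is only one use. Another common pattern, sketched here with invented (name, language) pairs, is grouping values with defaultdict(list). The sketch also shows a detail that is easy to miss: merely reading a missing key inserts it.

```python
import collections

# Hypothetical (name, language) pairs, invented for this example
pairs = [('abe', 'python'), ('bob', 'scala'), ('abe', 'sql')]

groups = collections.defaultdict(list)
for name, language in pairs:
    # A missing name starts out as an empty list, so append always works
    groups[name].append(language)

# Reading a missing key also inserts it with the default value
before = len(groups)
_ = groups['carol']
after = len(groups)

print(dict(groups))
print(before, after)
```

If that insertion is not wanted, a membership test such as 'carol' in groups or groups.get('carol') checks for a key without creating it.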

4.namedtuple
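namedtuple creates a tuple subclass whose elements have names, so the fields of an immutable tuple can be read by name as well as by position.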

Info = collections.namedtuple('Info', ['name', 'gender', 'age'])
i = Info('abe', 'm', 30)
print(i, i.name)

Output (Python 3): Info(name='abe', gender='m', age=30) abe
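A few more things a namedtuple can do, as a minimal sketch that redefines Info so it runs on its own: it still behaves like a regular tuple, its fields cannot be reassigned, and the helper methods _replace() and _asdict() produce a modified copy and a field-to-value mapping.

```python
import collections

Info = collections.namedtuple('Info', ['name', 'gender', 'age'])
i = Info('abe', 'm', 30)

# Fields are still reachable by position and by unpacking, like a tuple
name, gender, age = i
first = i[0]

# Instances are immutable: assigning to a field raises AttributeError
try:
    i.age = 31
    changed = True
except AttributeError:
    changed = False

# _replace() returns a new instance; _asdict() maps field names to values
older = i._replace(age=31)
as_dict = dict(i._asdict())

print(first, age, changed)
print(older)
print(as_dict)
```

Because _replace() returns a new object, the original i still has age 30 afterwards.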

Tags: collections, data statistics

Original article: http://blog.51cto.com/abezoo/2083210
