jieba库及词频统计

时间：2019-04-03 23:49:43 阅读：232 评论：0 收藏：0 [点我收藏+]

标签：continue alt read div str .com mini mat 单个字符

 1 import jieba
 2 txt = open("C:\\Users\\Administrator\\Desktop\\流浪地球.txt", "r", encoding=‘utf-8‘).read()
 3 words  = jieba.lcut(txt)
 4 counts = {}
 5 for word in words:
 6     if len(word) == 1:  #排除单个字符的分词结果
 7         continue
 8     else:
 9         counts[word] = counts.get(word,0) + 1
10 items = list(counts.items())
11 items.sort(key=lambda x:x[1], reverse=True) 
12 for i in range(10):
13     word, count = items[i]
14     print ("{0:<10}{1:>5}".format(word, count))

技术图片

jieba库及词频统计

标签：continue alt read div str .com mini mat 单个字符

原文地址：https://www.cnblogs.com/wjaihui/p/10652366.html

踩

(0)

评论一句话评论（0）

分享档案

更多>

2021年07月29日 (22)
2021年07月28日 (40)
2021年07月27日 (32)
2021年07月26日 (79)
2021年07月23日 (29)
2021年07月22日 (30)
2021年07月21日 (42)
2021年07月20日 (16)
2021年07月19日 (90)
2021年07月16日 (35)

周排行