Using the jieba library to count word frequencies and build a word cloud

Posted: 2020-04-08 11:34:30

1. Word-frequency count for the text 新时代中国特色社会主义.txt

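import jieba

# Read the whole source text; the with-block closes the file afterwards.
with open("新时代中国特色社会主义.txt", "r", encoding="utf-8") as f:
    txt = f.read()

# Segment the text into a list of words.
words = jieba.lcut(txt)

# Count each word, skipping single-character tokens
# (mostly punctuation and particles).
counts = {}
for word in words:
    if len(word) == 1:
        continue
    counts[word] = counts.get(word, 0) + 1

# Sort by count, highest first, and print the top 20.
# Slicing avoids an IndexError when the text has fewer than 20 distinct words.
items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)
for word, count in items[:20]:
    print("{0:<10}{1:>5}".format(word, count))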
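
The counting loop above can also be written with the standard library's collections.Counter. The following is a minimal sketch rather than the original author's code: the helper name top_words, its parameters n and min_len, and the sample token list are assumptions made for illustration. The sample list stands in for the output of jieba.lcut(txt), so the snippet runs without jieba or the source text.

```python
from collections import Counter


def top_words(words, n=20, min_len=2):
    """Return the n most frequent tokens that have at least min_len characters."""
    counter = Counter(w for w in words if len(w) >= min_len)
    return counter.most_common(n)


# Hypothetical pre-segmented tokens standing in for jieba.lcut(txt).
sample = ["中国", "的", "发展", "中国", "人民", "发展", "中国", "，"]
for word, count in top_words(sample, n=3):
    print("{0:<10}{1:>5}".format(word, count))
```

Counter.most_common(n) does the sorting and truncation in one call, and it returns fewer than n pairs when the input has fewer distinct words, so no index check is needed.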

2. Building a word cloud from the word frequencies

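#GovRptWordCloudv2.py
import jieba
import wordcloud
from imageio import imread

# Mask image: its shape decides where words may be drawn.
mask = imread("dd.png")
with open("新时代中国特色社会主义.txt", "r", encoding="utf-8") as f:
    t = f.read()

# WordCloud tokenizes plain text itself, so join the jieba tokens with spaces.
ls = jieba.lcut(t)
txt = " ".join(ls)

# font_path must point to a font that contains Chinese glyphs. When a mask is
# given, wordcloud uses the mask's shape and ignores width and height.
w = wordcloud.WordCloud(font_path="simkai.ttf", mask=mask, width=1000, height=700,
                        background_color="black", max_words=20)
w.generate(txt)
w.to_file("grwordcloud.png")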
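
The script above lets WordCloud recount the words from the joined text. If the counts from part 1 should drive the cloud directly, they can be passed to WordCloud.generate_from_frequencies instead. This is a hedged sketch, not the original author's method: the helper names top_frequencies and render_cloud, the sample counts, and the output filename grwordcloud_freq.png are assumptions for illustration. The rendering call is left commented out because it needs the wordcloud package and the simkai.ttf font.

```python
def top_frequencies(counts, max_words=20):
    """Keep only the max_words highest entries of a {word: count} dict."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:max_words])


def render_cloud(freqs, out_path, font_path="simkai.ttf"):
    """Draw a word cloud from a {word: count} dict and save it as an image."""
    import wordcloud  # third-party; imported here so top_frequencies has no dependencies
    w = wordcloud.WordCloud(font_path=font_path, width=1000, height=700,
                            background_color="black", max_words=len(freqs))
    w.generate_from_frequencies(freqs)
    w.to_file(out_path)


# Hypothetical sample counts standing in for the counts dict from part 1.
counts = {"中国": 3, "发展": 2, "人民": 1}
freqs = top_frequencies(counts, max_words=2)
print(freqs)
# render_cloud(freqs, "grwordcloud_freq.png")  # requires wordcloud and simkai.ttf
```

With this route the one-character filter from part 1 carries over to the image, because the cloud is drawn only from words that are in the dict.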

Original article: https://www.cnblogs.com/slj-xt/p/12658666.html
