Getting Started with wordcloud

Date: 2019-08-24 22:27:57


Installing wordcloud

Install with pip

python3.6 -m pip install wordcloud -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com

Install with conda

conda install -c conda-forge wordcloud
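
After either install command, it can be useful to confirm that the interpreter you plan to use actually sees the package. The following is a minimal sketch using only the standard library; the helper name is_installed is an assumption made for illustration and is not part of wordcloud.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


# Check the package installed above; prints True or False depending on the environment.
print("wordcloud installed:", is_installed("wordcloud"))
```

If this prints False, the package was most likely installed for a different Python interpreter than the one running the script.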

Quickly generating a word cloud

from wordcloud import WordCloud
import matplotlib.pyplot as plt

sample_text_path = "../data/constitution.txt"

# Read the text
with open(sample_text_path, encoding="utf-8") as f:
    text = f.read()

# Create a WordCloud instance; generate() tokenizes the text and lays out the words
word_cloud = WordCloud().generate(text)

# Display the generated image with matplotlib
plt.imshow(word_cloud, interpolation="bilinear")
plt.axis("off")
plt.show()

# Lower the maximum font size with max_font_size
word_cloud = WordCloud(max_font_size=40).generate(text)
plt.figure()
plt.imshow(word_cloud, interpolation="bilinear")
plt.axis("off")
plt.show()

# width, height, and margin set the image dimensions
# font_path sets the font file to use
# background_color sets the background color (default: black)

# Save the image
word_cloud.to_file("./test.png")

Result:

[Image: the generated word cloud]
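
The quick-start code relies on generate() to split the text into words before laying them out. As a rough sketch of that idea (not the library's actual tokenizer, which does more processing), the word counts can be computed by hand and passed to generate_from_frequencies. The helper names count_words and cloud_from_counts and the sample sentence are assumptions made for illustration.

```python
import re
from collections import Counter


def count_words(text, stopwords=()):
    """Count lowercase words, skipping stopwords and one-character tokens."""
    stop = {w.lower() for w in stopwords}
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if len(t) > 1 and t not in stop)


def cloud_from_counts(frequencies, **options):
    """Build a word cloud from a {word: count} mapping."""
    # Imported here so count_words can be tried without wordcloud installed.
    from wordcloud import WordCloud

    return WordCloud(**options).generate_from_frequencies(frequencies)


frequencies = count_words("We the People of the United States", stopwords={"the", "of"})
print(frequencies.most_common())
# Illustrative option values; width, height, and background_color are the
# parameters mentioned in the comments of the quick-start code.
# cloud_from_counts(frequencies, width=800, height=400, background_color="white")
```

Working from explicit counts is handy when the frequencies come from somewhere other than raw text, such as a database query or a pre-tokenized corpus.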

Generating a word cloud from a mask image, with a custom stopword set

from wordcloud import WordCloud, STOPWORDS
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

sample_text_path = "../data/alice.txt"
sample_image_path = "../data/alice_mask.png"

# Read the text
with open(sample_text_path, encoding="utf-8") as f:
    text = f.read()

# Read the mask image
alice_mask = np.array(Image.open(sample_image_path))

# Start from the built-in English stopwords and add a custom one
stop_words = set(STOPWORDS)
stop_words.add("said")

word_cloud = WordCloud(background_color="white",
                       max_words=2000,
                       mask=alice_mask,
                       stopwords=stop_words,
                       contour_width=3,
                       contour_color="steelblue").generate(text)
# word_cloud.to_file("./alice.png")

# Create the figure before drawing, so no empty extra window appears
plt.figure()
plt.imshow(word_cloud, interpolation="bilinear")
plt.axis("off")
plt.show()

Result:

[Image: the word cloud drawn inside the mask shape]
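
A mask does not have to come from an image file. As far as the wordcloud documentation describes it, pure white (255) entries in the mask array are treated as masked out and every other entry is free to draw on. Under that assumption, here is a minimal sketch that builds a circular mask with NumPy; the function name circular_mask and the default sizes are made up for illustration.

```python
import numpy as np


def circular_mask(size=300, radius=130):
    """Return a size x size uint8 array: 0 inside the circle, 255 (white) outside."""
    y, x = np.ogrid[:size, :size]
    center = size // 2
    outside = (x - center) ** 2 + (y - center) ** 2 > radius ** 2
    return (255 * outside).astype(np.uint8)


mask = circular_mask()
print(mask.shape, mask.dtype)
# The array can then be passed the same way as alice_mask:
# WordCloud(background_color="white", mask=mask).generate(text)
```

Words would then be placed only inside the circle, since the corners of the array are white.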

Customizing font colors

from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

sample_text_path = "../data/alice.txt"
sample_image_path = "../data/alice_mask.png"

# Read the text
with open(sample_text_path, encoding="utf-8") as f:
    text = f.read()

# Read the mask image
alice_mask = np.array(Image.open(sample_image_path))

stop_words = set(STOPWORDS)
stop_words.add("said")

word_cloud = WordCloud(background_color="white",
                       max_words=2000,
                       mask=alice_mask,
                       stopwords=stop_words,
                       contour_width=3,
                       contour_color="steelblue").generate(text)
# word_cloud.to_file("./alice.png")

# Step 1: create the color generator on its own line
image_colors_byImg = ImageColorGenerator(alice_mask)

# Show the word cloud with its default colors first
plt.figure()
plt.imshow(word_cloud, interpolation="bilinear")
plt.axis("off")

# Step 2: recolor inside the imshow call; the generator must be passed as color_func
plt.figure()
plt.imshow(word_cloud.recolor(color_func=image_colors_byImg), interpolation="bilinear")
plt.axis("off")
plt.show()

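With the gray-scale mask image used here, recoloring raises NotImplementedError: Gray-scale images TODO, because ImageColorGenerator needs a color image to sample from. The fix is to switch to a different, color image.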

See: https://blog.csdn.net/heyuexianzi/article/details/76851377
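
To see in advance whether an image will trigger this error, the shape of its array can be checked before it is handed to ImageColorGenerator. The sketch below is one possible approach, not the author's fix: it converts a non-color image to RGB with Pillow so the call no longer fails. The helper names is_color_shape and load_color_array are assumptions made for illustration.

```python
def is_color_shape(shape):
    """True for (height, width, 3) or (height, width, 4) arrays, i.e. RGB or RGBA."""
    return len(shape) == 3 and shape[2] in (3, 4)


def load_color_array(image_path):
    """Load an image as an array, converting gray-scale or palette images to RGB."""
    # Imported here so is_color_shape can be tried without these packages.
    import numpy as np
    from PIL import Image

    image = Image.open(image_path)
    array = np.array(image)
    if not is_color_shape(array.shape):
        array = np.array(image.convert("RGB"))
    return array


print(is_color_shape((600, 400)))     # 2-D array: gray-scale
print(is_color_shape((600, 400, 3)))  # 3-D array with 3 channels: RGB
```

Note that a converted gray-scale image still contains only shades of gray, so the recolored words will be gray too. For genuinely colorful words, a color source image is still needed, as suggested above.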

Adding Chinese word segmentation to handle Chinese text

Using jieba

import jieba


# Segment text with jieba and drop stopwords.
# userdict_list: extra words that jieba should keep as single tokens
# stopwords_path: UTF-8 text file with one stopword per line
def jieba_processing_txt(text, userdict_list, stopwords_path):
    for word in userdict_list:
        jieba.add_word(word)

    # Precise mode (cut_all=False) yields one token at a time
    seg_list = jieba.cut(text, cut_all=False)

    with open(stopwords_path, encoding="utf-8") as f_stop:
        stop_words = set(f_stop.read().splitlines())

    # Keep tokens longer than one character that are not stopwords
    mywordlist = []
    for myword in seg_list:
        myword = myword.strip()
        if myword not in stop_words and len(myword) > 1:
            mywordlist.append(myword)
    return " ".join(mywordlist)
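
The filtering step in the function above does not depend on jieba and can be tried on its own. The sketch below isolates it and then shows one way the segmented text might be rendered. It assumes a font file with Chinese glyphs is passed as font_path, since the default font bundled with wordcloud is not expected to cover them. The names filter_tokens and chinese_word_cloud are hypothetical, and the font path is left for the caller to supply.

```python
def filter_tokens(tokens, stopwords):
    """Keep tokens longer than one character that are not stopwords."""
    stop = set(stopwords)
    stripped = (t.strip() for t in tokens)
    return [t for t in stripped if len(t) > 1 and t not in stop]


def chinese_word_cloud(text, stopwords, font_path):
    """Segment Chinese text with jieba and render it with the given font."""
    # Imported here so filter_tokens can be tried without these packages.
    import jieba
    from wordcloud import WordCloud

    words = filter_tokens(jieba.cut(text, cut_all=False), stopwords)
    cloud = WordCloud(font_path=font_path, background_color="white")
    return cloud.generate(" ".join(words))


print(filter_tokens(["我们", "的", "词云", " ", "我们"], {"的"}))
```

Joining the tokens with spaces lets generate() treat each segmented word as a separate entry, which is the same trick the function above uses in its return statement.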

Reference:

https://github.com/amueller/word_cloud/tree/master/examples

Original article: https://www.cnblogs.com/huangm1314/p/11334567.html
