Tags: etc class pre source python jieba use sse com
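import os

import jieba
import jieba.analyse
import jieba.posseg as pseg

# cleaned_comments is the cleaned comment text (a single string) prepared beforehand
data = cleaned_comments

# Segment with the default dictionary
seg = jieba.lcut(data)
print(seg)

# Load a custom user dictionary, then segment again to compare
mydict = os.path.join(os.getcwd(), "mydict.txt")
jieba.load_userdict(mydict)
seg = jieba.lcut(data)
print(seg)

# Part-of-speech tagging
posseg = pseg.lcut(data)
print(posseg)

# Extract the top 20 keywords (ranked by TF-IDF weight, not by raw count)
extracttext = jieba.analyse.extract_tags(data, topK=20, withWeight=False, allowPOS=())
print(extracttext)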
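Note that `jieba.analyse.extract_tags` ranks terms by TF-IDF weight rather than by how often they occur. If a plain frequency ranking is what you are after, a minimal sketch using `collections.Counter` might look like the following. The token list is a made-up stand-in for the output of `jieba.lcut(data)`, and both the stop-word set and the helper name `top_words` are assumptions for illustration, not part of jieba.

```python
from collections import Counter

# Hypothetical stand-in for the list returned by jieba.lcut(data)
tokens = ["手机", "很", "好", ",", "屏幕", "很", "清晰", ",", "手机", "电池", "耐用", "手机"]

# Illustrative stop-word set (an assumption; jieba does not ship this)
STOP_WORDS = {"很", ",", "的", "了"}


def top_words(tokens, k=20, stop_words=STOP_WORDS):
    """Return the k most frequent tokens with their counts,
    skipping stop words and whitespace-only tokens."""
    counts = Counter(t for t in tokens if t.strip() and t not in stop_words)
    return counts.most_common(k)


print(top_words(tokens, k=3))
```

In a real run you would pass `seg` from the code above in place of the sample list, and probably load the stop words from a file suited to your corpus.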
Original article: http://www.cnblogs.com/zhzhang/p/7208951.html