topic extraction

Date: 2018-01-23 16:40:44


#!/usr/bin/env python
# Author: Mini
import jieba

import jieba.posseg
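sentence = ""  # text to analyse (left empty in the original post)
# Load a custom dictionary so that domain-specific terms are kept as single tokens
jieba.load_userdict("C:/Users/Administrator/Desktop/tripadvisor_gm/tripadvisor_code_python/galaxy_macau_dict.txt")
# Part-of-speech tagging: print every word together with its POS flag
word6 = jieba.posseg.cut(sentence)
for item in word6:
    print(item.word + "," + item.flag)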

import jieba.analyse
# Keyword extraction (TF-IDF based): keep the 2 highest-weighted keywords
tag = jieba.analyse.extract_tags(sentence, topK=2)
print(tag)

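# Tokenize with positions: each item is a (word, start, end) tuple
word8 = jieba.tokenize(sentence)
for item in word8:
    print(item)
# Search mode additionally yields the shorter words contained in long words
word9 = jieba.tokenize(sentence, mode="search")
for item in word9:
    print(item)
print("")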
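# conn = pymysql.connect(host="127.0.0.1", user="root", passwd="wangmianny111", db="galaxy_macau_ad", charset="utf8")
# data = open("C:/Users/Administrator/Desktop/txt1.txt", "r", encoding="utf8").read()
# Workaround for encoding problems: fetch the file over HTTP and decode it as UTF-8, ignoring invalid bytes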
import urllib.request
data=urllib.request.urlopen("http://127.0.0.1/txt1.txt").read().decode("utf-8","ignore")
words=jieba.posseg.cut(data)
datas=""
# POS flags to drop: c (conjunction), d (adverb), x (non-word/punctuation), w (punctuation),
# p (preposition), r (pronoun), nt (organization name), m (numeral).
# Time words (flag "t") were kept explicitly in the original.
drop_flags = {"c", "d", "x", "w", "p", "r", "nt", "m"}
for item in words:
    if item.flag in drop_flags:
        continue
    # print(item.word + "," + item.flag)
    datas += item.word
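# Take the 200 highest-weighted keywords of the filtered text as the topic words
word10 = jieba.analyse.extract_tags(datas, topK=200)
topic = ""
for item in word10:
    topic += item + " "
# Append the topic words to the output file; the with-block closes the file
with open("C:/Users/Administrator/Desktop/topic_test.txt", "a", encoding="utf_8") as fh:
    fh.write(topic)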
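
The part-of-speech filtering step in the script can also be written as a small standalone function, which makes it easier to try out without jieba or the input file. The following is only a sketch: the names `DROP_FLAGS` and `filter_by_flag` and the hand-made sample pairs are hypothetical and do not appear in the original post; the flag set is the one the script blanks out.

```python
# Hypothetical helper, not part of the original script.
# It takes (word, flag) pairs and joins the words whose flag is not dropped.
DROP_FLAGS = frozenset({"c", "d", "x", "w", "p", "r", "nt", "m"})


def filter_by_flag(pairs, drop_flags=DROP_FLAGS):
    """Return the concatenation of all words whose POS flag is not in drop_flags."""
    return "".join(word for word, flag in pairs if flag not in drop_flags)


# Hand-made sample for illustration; real pairs would come from jieba.posseg.cut(data).
sample = [("酒店", "n"), ("和", "c"), ("很", "d"), ("干净", "a"), ("。", "x")]
filtered = filter_by_flag(sample)
print(filtered)
```

With jieba installed, the same function could be fed from the tagger, for example `filter_by_flag((item.word, item.flag) for item in jieba.posseg.cut(data))`, and the result passed to `jieba.analyse.extract_tags` as in the script.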


Original post: https://www.cnblogs.com/rabbittail/p/8336280.html
