
Word Frequency Count

Date: 2018-06-20 22:43:26

Tags: ext   for   range   http   alt   reverse   word   print   replace

# 1. Read the whole text file into a string.
loveFile = open('love.txt', mode='r', encoding='utf-8')
loveText = loveFile.read()
loveFile.close()
print(loveText)

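# 2. Replace punctuation and newlines with spaces.
replaceList = [',', '.', "'", '\n']
for c in replaceList:
    # Assign back to loveText so each replacement is kept.
    loveText = loveText.replace(c, ' ')
print(loveText)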

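# 3. Split the text into a list of words.
# split() with no argument splits on any run of whitespace,
# so consecutive spaces do not produce empty strings.
loveList = loveText.split()
print(loveList)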

# 4. Count each distinct word, sort by frequency, and print the top 20.
loveSet = set(loveList)
print(loveSet)

loveDict = {}
for word in loveSet:
    loveDict[word] = loveList.count(word)

# Print the finished dictionary once, after the counting loop.
print(loveDict)
for d in loveDict:
    print(d, loveDict[d])

wordCountList = list(loveDict.items())
print(wordCountList)
wordCountList.sort(key=lambda x: x[1], reverse=True)
print(wordCountList)

# Slicing avoids an IndexError when there are fewer than 20 distinct words.
for item in wordCountList[:20]:
    print(item)

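# Write the results to loveCount.txt, one "count word" pair per line.
# mode='a' appends, so running the script again adds to the existing file.
loveCountFile = open('loveCount.txt', mode='a', encoding='utf-8')
for i in range(len(wordCountList)):
    loveCountFile.write(str(wordCountList[i][1]) + ' ' + wordCountList[i][0] + '\n')
loveCountFile.close()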
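
The counting and sorting steps above can also be written with `collections.Counter` from the standard library. This is only a sketch of an alternative, not the original author's approach: the helper name `count_words`, its `strip_chars` parameter, and the sample string are made up for illustration, and the punctuation list is assumed to match the one used in step 2.

```python
from collections import Counter


def count_words(text, strip_chars=(",", ".", "'", "\n")):
    """Return a Counter of whitespace-separated words in text.

    count_words and strip_chars are hypothetical names used for
    illustration; they do not appear in the original script.
    """
    # Replace each unwanted character with a space, as in step 2.
    for c in strip_chars:
        text = text.replace(c, " ")
    # split() without arguments drops empty strings, as in step 3.
    return Counter(text.split())


# Made-up sample text standing in for the contents of love.txt.
sample = "love me, love my dog. love\nme"
counts = count_words(sample)

# most_common(n) returns the n most frequent (word, count) pairs,
# sorted from most to least frequent.
top = counts.most_common(2)
print(top)
```

`Counter` makes a single pass over the word list, whereas calling `loveList.count(word)` for every distinct word rescans the whole list each time, so the difference may matter for longer texts.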


Original article: https://www.cnblogs.com/Minwon/p/9206137.html
