Natural Language Processing with Python
Chapter 6.2
import nltk
from nltk.corpus import nps_chat as nchat

def dialogue_act_features(post):
    # Bag-of-words features: one "contains(word)" entry per lowercased token.
    features = {}
    for word in nltk.word_tokenize(post):
        features['contains(%s)' % word.lower()] = True
    return features

def test_dialogue_act_types():
    # Use the first 10,000 posts of the NPS Chat corpus.
    posts = nchat.xml_posts()[:10000]
    featuresets = [(dialogue_act_features(post.text), post.get('class'))
                   for post in posts]
    # Hold out the first 10% for testing and train on the rest.
    size = int(len(featuresets) * 0.1)
    train_set, test_set = featuresets[size:], featuresets[:size]
    classifier = nltk.NaiveBayesClassifier.train(train_set)
    print(nltk.classify.accuracy(classifier, test_set))
    classifier.show_most_informative_features(5)

test_dialogue_act_types()
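The feature extraction and the 10% hold-out split can be tried without downloading the corpus. The following is only a rough sketch: the toy posts and labels are made up, `split_held_out` is a hypothetical helper name, and plain `str.split` stands in for `nltk.word_tokenize`, so punctuation is not separated the way NLTK would do it.

```python
def dialogue_act_features(post, tokenize=str.split):
    """Return one 'contains(word)' feature per distinct lowercased token.

    Assumption: whitespace splitting replaces nltk.word_tokenize here.
    """
    return {'contains(%s)' % word.lower(): True for word in tokenize(post)}


def split_held_out(featuresets, test_fraction=0.1):
    """Hypothetical helper: first `test_fraction` of items become the test set."""
    size = int(len(featuresets) * test_fraction)
    return featuresets[size:], featuresets[:size]


# Made-up (text, label) pairs, for illustration only.
toy_posts = [
    ("hi everyone", "Greet"),
    ("no thanks", "nAnswer"),
    ("Hi there", "Greet"),
    ("what time is it", "whQuestion"),
    ("no way", "nAnswer"),
    ("hello hi", "Greet"),
    ("is anyone here", "ynQuestion"),
    ("I think so", "Statement"),
    ("see you later", "Bye"),
    ("hi again", "Greet"),
]

featuresets = [(dialogue_act_features(text), label) for text, label in toy_posts]
train_set, test_set = split_held_out(featuresets)

print(featuresets[0])                 # feature dict and label for the first post
print(len(train_set), len(test_set))  # sizes of the two splits
```

Each post becomes a dictionary of boolean features paired with its dialogue act label, which is the input format that `nltk.NaiveBayesClassifier.train` expects.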
Output:
0.668
Most Informative Features
   contains(hi) = True            Greet : System =    408.2 : 1.0
    contains(>) = True            Other : System =    384.6 : 1.0
contains(empty) = True            Other : System =    339.4 : 1.0
 contains(part) = True           System : Statem =    302.0 : 1.0
   contains(no) = True           nAnswe : System =    262.3 : 1.0
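Each row of the list above compares two labels: the value on the right is the ratio between the estimated probabilities of the feature under the first and the second label. A minimal sketch of that calculation follows, assuming add-0.5 smoothing over two possible feature values (this should correspond to NLTK's default estimator, but treat it as an assumption). The counts are invented and do not come from the NPS Chat corpus.

```python
def smoothed_prob(count, total, bins=2, gamma=0.5):
    """Smoothed estimate of P(feature = True | label).

    count: posts with this label that contain the word
    total: all posts with this label
    bins:  number of possible feature values (assumed: True or absent)
    gamma: smoothing constant (assumed 0.5)
    """
    return (count + gamma) / (total + bins * gamma)


def likelihood_ratio(count_a, total_a, count_b, total_b):
    """Ratio P(feature | label A) / P(feature | label B)."""
    return smoothed_prob(count_a, total_a) / smoothed_prob(count_b, total_b)


# Invented counts: the word occurs in 40 of 100 posts labelled A
# and in 0 of 200 posts labelled B.
ratio = likelihood_ratio(40, 100, 0, 200)
print("A : B = %8.1f : 1.0" % ratio)
```

A large ratio such as 408.2 : 1.0 for contains(hi) therefore means the word is seen often in Greet posts and almost never in System posts. Without smoothing the zero count would make the ratio undefined.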
Original article: http://www.cnblogs.com/gui0901/p/4454364.html