The content is from this paper: Dependency Tree-based Sentiment Classification using CRFs with Hidden Variables, by Tetsuji Nakagawa.
A typical approach for sentiment classification is to use supervised machine learning algorithms with bag-of-words as features. A subjective sentence is represented as a set of words in the sentence, ignoring word order and head-modifier relation between words. However, sentiment classification is different from traditional topic-based text classification. Topic-based text classification is generally a linearly separable problem. For example, when a document contains some domain-specific words, the document will probably belong to the domain. However, in sentiment classification, sentiment polarities can be reversed. In sentiment classification, a sentence which contains positive (or negative) polarity words does not necessarily have the same polarity as a whole, and we need to consider interactions between words instead of handling words independently.
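The limitation described above can be illustrated with a minimal sketch. It is not the paper's method: the tiny polarity lexicon and the function names `bag_of_words` and `bow_polarity` are hypothetical, chosen only to show that a scorer which treats words independently cannot see a polarity reversal.

```python
from collections import Counter

# Hypothetical toy polarity lexicon (an assumption, not taken from the paper).
LEXICON = {"good": 1, "great": 1, "bad": -1, "boring": -1}

def bag_of_words(sentence):
    """Represent a sentence as word counts, discarding word order."""
    return Counter(sentence.lower().split())

def bow_polarity(sentence):
    """Sum lexicon scores over the bag; every word is scored independently."""
    bag = bag_of_words(sentence)
    return sum(LEXICON.get(word, 0) * count for word, count in bag.items())

plain = "the plot was good"
negated = "the plot was not good"

# Both sentences receive the same positive score, because "not" carries no
# weight of its own and its interaction with "good" is never modeled.
print(bow_polarity(plain), bow_polarity(negated))
```

Both sentences score the same here, although the second one is negative as a whole. Capturing that reversal requires modeling how "not" interacts with "good", which is the kind of word interaction the passage argues for.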
One issue of the approach to use sentence composition and machine learning is that only the whole sentence is labeled with its polarity in general corpora for sentiment classification, and each component of the sentence is not labeled.
Original post: http://www.cnblogs.com/wintor12/p/3777662.html