Tags: combine geo bad enc evel multi wiki modified tip
BLEU is designed to approximate human judgement at a corpus level, and performs badly if used to evaluate the quality of individual sentences.
https://en.wikipedia.org/wiki/BLEU
To produce a score for the whole corpus, the modified precision scores for the segments are combined using the geometric mean, multiplied by a brevity penalty to prevent very short candidates from receiving too high a score.
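The combination described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function names `ngrams` and `corpus_bleu` are made up for this example, it assumes pre-tokenized input with exactly one reference per candidate (the original metric allows several), and it uses uniform weights over n-gram orders 1 to `max_n`.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(candidates, references, max_n=4):
    """Corpus-level BLEU sketch; assumes one reference per candidate."""
    clipped = [0] * max_n  # clipped n-gram matches, summed over all segments
    total = [0] * max_n    # candidate n-gram counts, summed over all segments
    cand_len = ref_len = 0

    for cand, ref in zip(candidates, references):
        cand_len += len(cand)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            cand_counts = ngrams(cand, n)
            ref_counts = ngrams(ref, n)
            # Modified precision: clip each candidate n-gram count
            # by the number of times it occurs in the reference.
            clipped[n - 1] += sum(
                min(count, ref_counts[gram]) for gram, count in cand_counts.items()
            )
            total[n - 1] += sum(cand_counts.values())

    # A zero precision at any order makes the geometric mean zero.
    if cand_len == 0 or min(clipped) == 0:
        return 0.0

    # Geometric mean of the modified precisions (uniform weights).
    log_mean = sum(math.log(c / t) for c, t in zip(clipped, total)) / max_n

    # Brevity penalty: 1 when the candidate corpus is longer than the
    # reference corpus, otherwise exp(1 - r / c).
    bp = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return bp * math.exp(log_mean)
```

For example, with `max_n=1` the candidate "the the the the the the the" against the reference "the cat is on the mat" gets a clipped unigram precision of 2/7 rather than 7/7, and a two-word candidate that matches the start of a six-word reference perfectly is still scaled down by the brevity penalty.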
bilingual evaluation understudy
Original article: http://www.cnblogs.com/yuanjiangw/p/7553408.html