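I looked at three examples: Peking University's wiki2bio, Google's ToTTo, and Microsoft's WIKITABLETEXT. Below is the output of running the ToTTo evaluation on its bundled sample files.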
Running with the following variables:
PREDICTION_PATH : language/totto/sample/output_sample.txt
TARGET_PATH : language/totto/sample/dev_sample.jsonl
OUTPUT_DIR : temp
MODE : test
Writing references.
Writing tables in PARENT format.
Preparing predictions.
Writing predictions.
Running detokenizers.
======== EVALUATE OVERALL ========
Computing BLEU (overall)
BLEU+case.mixed+numrefs.3+smooth.exp+tok.13a+version.1.2.6 = 45.55 86.0/63.5/44.7/31.0 (BP = 0.869 ratio = 0.877 hyp_len = 57 ref_len = 65)
Computing PARENT (overall)
Evaluated 5 examples.
Precision = 76.11 Recall = 43.83 F-score = 53.34
======== EVALUATE OVERLAP SUBSET ========
Computing BLEU (overlap subset)
BLEU+case.mixed+numrefs.3+smooth.exp+tok.13a+version.1.2.6 = 37.15 84.8/56.7/37.0/25.0 (BP = 0.809 ratio = 0.825 hyp_len = 33 ref_len = 40)
Computing PARENT (overlap subset)
Evaluated 3 examples.
Precision = 71.40 Recall = 31.35 F-score = 41.34
======== EVALUATE NON-OVERLAP SUBSET ========
Computing BLEU (non-overlap subset)
BLEU+case.mixed+numrefs.3+smooth.exp+tok.13a+version.1.2.6 = 58.26 87.5/72.7/55.0/38.9 (BP = 0.959 ratio = 0.960 hyp_len = 24 ref_len = 25)
Computing PARENT (non-overlap subset)
Evaluated 2 examples.
Precision = 83.17 Recall = 62.56 F-score = 71.35
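The BLEU lines in the log above can be sanity-checked by hand. The sketch below assumes the standard BLEU definition (brevity penalty times the geometric mean of the four n-gram precisions) and feeds it the rounded numbers from the log. The function names are illustrative and are not part of sacrebleu or the ToTTo scripts. Because the logged precisions are rounded to one decimal, the recomputed score only approximately matches the reported one.

```python
import math


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty: 1 if the hypothesis is at least as
    long as the reference, otherwise exp(1 - ref_len / hyp_len)."""
    if hyp_len == 0:
        return 0.0
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


def bleu_from_log(precisions_pct, hyp_len, ref_len):
    """Recombine logged n-gram precisions (in percent) into a BLEU score."""
    bp = brevity_penalty(hyp_len, ref_len)
    log_mean = sum(math.log(p / 100.0) for p in precisions_pct) / len(precisions_pct)
    return 100.0 * bp * math.exp(log_mean)


# Values copied from the "overall" BLEU line of the log.
overall = bleu_from_log([86.0, 63.5, 44.7, 31.0], hyp_len=57, ref_len=65)

print(round(brevity_penalty(57, 65), 3))  # 0.869, the BP shown in the log
print(round(overall, 2))                  # close to the logged 45.55
```

The same check reproduces BP = 0.809 for the overlap subset (hyp_len = 33, ref_len = 40) and BP = 0.959 for the non-overlap subset (hyp_len = 24, ref_len = 25).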
https://github.com/moses-smt/mosesdecoder.git
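You do not need to download the whole repository. It is enough to have temp/mosesdecoder/scripts/tokenizer/detokenizer.perl in place.
The tok.13a in the BLEU signature refers to sacrebleu.tokenize_13a.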
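One way to get just that one file is sketched below. It assumes the output directory is temp, as in the log, and that the script can be fetched from the raw GitHub URL on the master branch. The helper name ensure_detokenizer and the RAW_URL constant are hypothetical and are not part of the ToTTo scripts.

```python
import os
import urllib.request

# Assumption: raw-file URL for detokenizer.perl on the repository's master branch.
RAW_URL = (
    "https://raw.githubusercontent.com/moses-smt/mosesdecoder/"
    "master/scripts/tokenizer/detokenizer.perl"
)


def ensure_detokenizer(output_dir="temp", download=True):
    """Return the path where the evaluation expects detokenizer.perl,
    fetching the single file if it is missing and download is allowed."""
    path = os.path.join(
        output_dir, "mosesdecoder", "scripts", "tokenizer", "detokenizer.perl"
    )
    if os.path.isfile(path):
        return path  # already prepared, nothing to do
    if not download:
        raise FileNotFoundError(path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    urllib.request.urlretrieve(RAW_URL, path)
    return path

# Example: ensure_detokenizer("temp") before running the evaluation script.
```

Calling ensure_detokenizer("temp") once before the evaluation should leave the file where the log's OUTPUT_DIR expects it. Passing download=False only checks that the file exists.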
https://github.com/xuehuiping/language/tree/master/language/totto
Original post: https://www.cnblogs.com/xuehuiping/p/13576891.html