Generating Text from Tables: Table-to-Text

Date: 2020-09-08 20:44:10

I looked at three examples: Peking University's wiki2bio, Google's ToTTo, and Microsoft's WIKITABLETEXT.

Peking University's wiki2bio

Liu, T., Wang, K., Sha, L., Chang, B., & Sui, Z. (2018). Table-to-text generation by structure-aware seq2seq learning. 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, 4881–4888. https://arxiv.org/pdf/1711.09724.pdf
https://github.com/tyliupku/wiki2bio
Note the data format: the input is not a two-dimensional table.
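To make the point about the data format concrete, here is a minimal sketch of reading one record as a flat list of field/token pairs rather than as rows and columns. The `field_position:token` layout, the tab separator, the `<none>` placeholder, and the function name `parse_box_line` are assumptions made for illustration; check the files in the repository before relying on them.

```python
import re


def parse_box_line(line):
    """Group position-tagged tokens back into one string per field.

    Assumed (hypothetical) layout: tab-separated items such as
    "name_1:john", where the numeric suffix is the token position.
    """
    fields = {}
    for item in line.strip().split("\t"):
        key, _, token = item.partition(":")
        # Strip a trailing "_<position>" suffix if one is present.
        match = re.fullmatch(r"(.+)_(\d+)", key)
        name = match.group(1) if match else key
        if token == "<none>":  # assumed marker for an empty field
            continue
        fields.setdefault(name, []).append(token)
    return {name: " ".join(tokens) for name, tokens in fields.items()}


sample = "name_1:john\tname_2:doe\tbirth_date_1:1\tbirth_date_2:january\tbirth_date_3:1900"
record = parse_box_line(sample)
print(record)
```

Each record is a single set of attribute/value pairs, so there are no row or column indices to track.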

Google's ToTTo

Running with the following variables:
PREDICTION_PATH   : language/totto/sample/output_sample.txt
TARGET_PATH       : language/totto/sample/dev_sample.jsonl 
OUTPUT_DIR        : temp
MODE              : test
Writing references.
Writing tables in PARENT format.
Preparing predictions.
Writing predictions.
Running detokenizers.
======== EVALUATE OVERALL ========
Computing BLEU (overall)
BLEU+case.mixed+numrefs.3+smooth.exp+tok.13a+version.1.2.6 = 45.55 86.0/63.5/44.7/31.0 (BP = 0.869 ratio = 0.877 hyp_len = 57 ref_len = 65)
Computing PARENT (overall)
Evaluated 5 examples.
Precision = 76.11 Recall = 43.83 F-score = 53.34
======== EVALUATE OVERLAP SUBSET ========
Computing BLEU (overlap subset)
BLEU+case.mixed+numrefs.3+smooth.exp+tok.13a+version.1.2.6 = 37.15 84.8/56.7/37.0/25.0 (BP = 0.809 ratio = 0.825 hyp_len = 33 ref_len = 40)
Computing PARENT (overlap subset)
Evaluated 3 examples.
Precision = 71.40 Recall = 31.35 F-score = 41.34
======== EVALUATE NON-OVERLAP SUBSET ========
Computing BLEU (non-overlap subset)
BLEU+case.mixed+numrefs.3+smooth.exp+tok.13a+version.1.2.6 = 58.26 87.5/72.7/55.0/38.9 (BP = 0.959 ratio = 0.960 hyp_len = 24 ref_len = 25)
Computing PARENT (non-overlap subset)
Evaluated 2 examples.
Precision = 83.17 Recall = 62.56 F-score = 71.35
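As a sanity check on the log above, the reported BLEU values can be recomputed from the numbers printed on each BLEU line. The sketch below applies the standard BLEU definition (brevity penalty times the geometric mean of the four n-gram precisions); the function name `bleu_from_components` is made up for illustration and is not part of sacrebleu.

```python
import math


def bleu_from_components(precisions, hyp_len, ref_len):
    """Recombine n-gram precisions (in percent) and lengths into BLEU."""
    # Brevity penalty: 1 when the hypothesis is at least as long as the reference.
    bp = 1.0 if hyp_len >= ref_len else math.exp(1.0 - ref_len / hyp_len)
    # Geometric mean of the n-gram precisions with uniform weights.
    log_mean = sum(math.log(p / 100.0) for p in precisions) / len(precisions)
    return bp, 100.0 * bp * math.exp(log_mean)


# Values copied from the "overall" BLEU line of the log.
bp, bleu = bleu_from_components([86.0, 63.5, 44.7, 31.0], hyp_len=57, ref_len=65)
print(f"BP = {bp:.3f}, BLEU = {bleu:.2f}")
```

The result agrees with the logged score up to the rounding of the printed precisions. The PARENT F-score cannot be rederived the same way: it is not the harmonic mean of the printed precision and recall, which would be expected if the score is averaged per example.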

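Code: https://github.com/google-research/language.git. The sample data runs successfully.
The evaluation script depends on https://github.com/moses-smt/mosesdecoder.git, but you do not need to download the whole repository; having temp/mosesdecoder/scripts/tokenizer/detokenizer.perl in place is enough.
In the code, the sacrebleu call can be written directly as sacrebleu.tokenize_13a.
See https://github.com/xuehuiping/language/tree/master/language/totto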
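The detokenizer setup described above can be sketched as a small helper that fetches only the one script into the expected location. This is an illustration, not part of the ToTTo code: the raw-file URL assumes the script still lives at the same path on the `master` branch, and `fetch_detokenizer` and its `opener` parameter are hypothetical names.

```python
import os
import urllib.request

# Assumption: the file is served at this raw URL (master branch layout).
RAW_URL = (
    "https://raw.githubusercontent.com/moses-smt/mosesdecoder/"
    "master/scripts/tokenizer/detokenizer.perl"
)
# Path relative to OUTPUT_DIR, matching the location named in the text.
REL_PATH = os.path.join("mosesdecoder", "scripts", "tokenizer", "detokenizer.perl")


def fetch_detokenizer(output_dir="temp", opener=urllib.request.urlopen):
    """Download detokenizer.perl into output_dir and return the file path."""
    dest = os.path.join(output_dir, REL_PATH)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    with opener(RAW_URL) as response:
        data = response.read()
    with open(dest, "wb") as handle:
        handle.write(data)
    return dest
```

Calling `fetch_detokenizer("temp")` once before running the evaluation script should leave the file at temp/mosesdecoder/scripts/tokenizer/detokenizer.perl. The `opener` argument exists only so the download step can be swapped out, for example when working offline.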

Microsoft's WIKITABLETEXT

Bao, J., Tang, D., Duan, N., Yan, Z., Lv, Y., Zhou, M., & Zhao, T. (2018). Table-to-text: Describing table region with natural language. 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, 5020–5027. https://arxiv.org/pdf/1805.11234.pdf
https://github.com/tangduyu/Table-Intelligence (I was not able to download it.)

Original post: https://www.cnblogs.com/xuehuiping/p/13576891.html
