ESIM

Posted: 2019-10-03 20:12:07

Tags: alignment, principles, chain, res, inference, format, lan, text classification, paper

Short-Text Matching and Natural Language Inference Model: ESIM

Paper link: http://tongtianta.site/paper/11096

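I. Principles

ESIM stands for "Enhanced Sequential Inference Model"; it comes from the paper "Enhanced LSTM for Natural Language Inference". As the name suggests, it is an enhanced LSTM-based model built specifically for natural language inference. How exactly it enhances the LSTM is what the rest of this post walks through.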

Unlike the previous top models that use very complicated network
architectures, we first demonstrate that carefully designing sequential inference
models based on chain LSTMs can outperform all previous models.
Based on this, we further show that by explicitly considering recursive
architectures in both local inference modeling and inference composition,
we achieve additional improvement.

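The passage above is taken from the abstract of the ESIM paper. In short, ESIM beats other short-text classification algorithms mainly for two reasons:

A carefully designed sequential inference architecture.
It considers both local inference and global inference.

The authors mainly use an inter-sentence attention mechanism (attention between the two sentences, not within one) to perform local inference, and then build global inference on top of it.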

1 Input encoding layer

Not much to say here: each of the two input sentences goes through embedding + BiLSTM.
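
To make this layer a bit more concrete, here is a minimal NumPy sketch of an embedding lookup followed by a bidirectional LSTM. Everything in it is an assumption for illustration: the helper names (`lstm_pass`, `init_lstm`, `bilstm`), the gate order, the random untrained weights, the toy sizes, and the choice to share one encoder between the two sentences. A real model would use a trained framework layer instead.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_pass(x, W, U, bias):
    """Run a single-direction LSTM over x of shape (T, d_in); returns (T, hidden)."""
    hidden = U.shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    outputs = []
    for x_t in x:
        z = x_t @ W + h @ U + bias        # pre-activations, shape (4 * hidden,)
        i, f, g, o = np.split(z, 4)       # assumed gate order: input, forget, cell, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)

def init_lstm(rng, d_in, hidden):
    """Random (untrained) parameters for one direction; illustration only."""
    return (rng.normal(scale=0.1, size=(d_in, 4 * hidden)),
            rng.normal(scale=0.1, size=(hidden, 4 * hidden)),
            np.zeros(4 * hidden))

def bilstm(x, fwd, bwd):
    """Concatenate a left-to-right pass with a right-to-left pass."""
    h_fwd = lstm_pass(x, *fwd)
    h_bwd = lstm_pass(x[::-1], *bwd)[::-1]   # reverse in time, run, reverse back
    return np.concatenate([h_fwd, h_bwd], axis=-1)

rng = np.random.default_rng(0)
vocab_size, emb_dim, hidden = 50, 8, 4       # toy sizes, chosen arbitrarily
embedding = rng.normal(size=(vocab_size, emb_dim))
fwd = init_lstm(rng, emb_dim, hidden)
bwd = init_lstm(rng, emb_dim, hidden)

sentence_a_ids = np.array([3, 17, 42, 8, 1])   # hypothetical token ids
sentence_b_ids = np.array([5, 9, 23])

a_bar = bilstm(embedding[sentence_a_ids], fwd, bwd)   # (5, 2 * hidden)
b_bar = bilstm(embedding[sentence_b_ids], fwd, bwd)   # (3, 2 * hidden)
print(a_bar.shape, b_bar.shape)
```

Each word ends up with one vector that mixes left and right context, which is what the later alignment step operates on.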

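2 Local inference modeling

Before local inference, the two sentences have to be aligned; this is done with soft_align_attention.

How? First, compute the similarity between every word of one sentence and every word of the other, which gives a 2-D similarity matrix.

Only then is local inference carried out on the two sentences. Using the similarity matrix from the previous step together with sentences a and b, each sentence is re-expressed as a similarity-weighted combination of the other sentence's words, with its dimensions unchanged.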
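
The formula image for this step did not survive, so the sketch below assumes the similarity is a plain dot product between the encoded word vectors of the two sentences. The arrays `a_bar` and `b_bar` are hypothetical stand-ins for the BiLSTM outputs, filled with random numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
a_bar = rng.normal(size=(5, 8))   # hypothetical encoded sentence a: 5 words, 8 dims
b_bar = rng.normal(size=(3, 8))   # hypothetical encoded sentence b: 3 words, 8 dims

# e[i, j] is the dot product of word i in a with word j in b.
e = a_bar @ b_bar.T
print(e.shape)   # one row per word of a, one column per word of b
```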
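
A minimal sketch of this soft alignment, assuming a dot-product similarity matrix and a softmax over the other sentence's words. The function names `softmax` and `soft_align` and the random inputs are made up for the example, and padding masks are left out.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(x)
    return exp / exp.sum(axis=axis, keepdims=True)

def soft_align(a_bar, b_bar):
    """Re-express each sentence as an attention-weighted mix of the other one."""
    e = a_bar @ b_bar.T                       # similarity matrix, (len_a, len_b)
    a_tilde = softmax(e, axis=1) @ b_bar      # each word of a as a mix of b's words
    b_tilde = softmax(e, axis=0).T @ a_bar    # each word of b as a mix of a's words
    return a_tilde, b_tilde

rng = np.random.default_rng(0)
a_bar = rng.normal(size=(5, 8))   # hypothetical encoded sentence a
b_bar = rng.normal(size=(3, 8))   # hypothetical encoded sentence b

a_tilde, b_tilde = soft_align(a_bar, b_bar)
print(a_tilde.shape, b_tilde.shape)   # same shapes as a_bar and b_bar
```

Note that `a_tilde` has one row per word of a but is built entirely from b's vectors, which is why the dimensions stay the same.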

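After local inference comes the enhancement of local inference information. The enhancement here computes the difference and the element-wise product between a and the aligned a, which captures how the two differ and makes the later layers' learning easier.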
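
A small sketch of the enhancement step, assuming (as in the ESIM paper) that the original vectors, the aligned vectors, their difference, and their element-wise product are concatenated along the feature axis. The name `enhance` and the random inputs are placeholders for the example.

```python
import numpy as np

def enhance(x_bar, x_tilde):
    """Concatenate [x, aligned x, difference, element-wise product] per word."""
    return np.concatenate(
        [x_bar, x_tilde, x_bar - x_tilde, x_bar * x_tilde], axis=-1
    )

rng = np.random.default_rng(0)
a_bar = rng.normal(size=(5, 8))     # hypothetical encoded sentence a
a_tilde = rng.normal(size=(5, 8))   # hypothetical aligned version of a

m_a = enhance(a_bar, a_tilde)
print(m_a.shape)   # feature size is four times the input feature size
```

The same function would be applied to b and its aligned version to get the second enhanced sequence.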

3 Inference composition

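This is the last step, and it is fairly simple.

A BiLSTM is used once more to extract context information, then both MaxPooling and AvgPooling are applied, and finally a fully connected layer follows. This part is fairly conventional, so there is not much to add.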
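
The pooling and classification part can be sketched roughly as follows. The second BiLSTM is skipped: `v_a` and `v_b` are random stand-ins for its outputs. The hidden size, the tanh activation, the three output classes, and the random weights are all assumptions for illustration.

```python
import numpy as np

def pool(v):
    """Average-pool and max-pool over the time axis, then concatenate."""
    return np.concatenate([v.mean(axis=0), v.max(axis=0)])

rng = np.random.default_rng(0)
v_a = rng.normal(size=(5, 8))   # hypothetical composition-BiLSTM output for a
v_b = rng.normal(size=(3, 8))   # hypothetical composition-BiLSTM output for b

# Fixed-length vector regardless of sentence lengths: [avg_a, max_a, avg_b, max_b].
v = np.concatenate([pool(v_a), pool(v_b)])

hidden, num_classes = 16, 3     # assumed sizes; 3 classes as in typical NLI labels
W1 = rng.normal(scale=0.1, size=(v.size, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.1, size=(hidden, num_classes))
b2 = np.zeros(num_classes)

h = np.tanh(v @ W1 + b1)        # fully connected layer
logits = h @ W2 + b2
probs = np.exp(logits - logits.max())
probs = probs / probs.sum()     # softmax over the classes
print(v.shape, probs)
```

Pooling is what turns two variable-length sequences into one fixed-size vector that the fully connected layer can consume.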

Original post: https://www.cnblogs.com/rise0111/p/11620363.html
