How to Visualize the Attention Layer in a Deep Learning Network

Date: 2020-04-18 16:01:37

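Preface

When training a deep learning model, one often wants a look at the weight distribution of the attention layer, to see which words or word combinations in the input sequence the network attends to most. My short paper mainly studied an attention mechanism over the input sequence based on part-of-speech (POS) tags. The comparison experiment uses a self-attention mechanism over the words themselves.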

[Image]

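Results

The figure below has two columns: word_attention is the result of the model trained with the self-attention mechanism, and POS_attention is the result of the POS-based model.
Compared with word_attention, the POS attention mechanism not only captures the aspect being evaluated, but also uses the POS distribution of the sentiment-bearing words associated with that aspect to attend to sentiment words of the relevant part of speech.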

[Figure: word_attention and POS_attention visualizations, one column each]

Core Code

Visualization Example

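# coding: utf-8
from html import escape

from IPython.display import HTML, display


def highlight(word, attn):
    """Wrap a word in a <span> whose red background deepens with its attention weight."""
    attn = max(0.0, min(1.0, float(attn)))  # clamp so the colour channels stay within 0..255
    shade = int(255 * (1 - attn))
    html_color = '#%02X%02X%02X' % (255, shade, shade)
    return '<span style="background-color: {}">{}</span>'.format(html_color, escape(str(word)))


def mk_html(seq, attns):
    """Build one HTML line from a token sequence and its attention weights."""
    html = ""
    for word, attn in zip(seq, attns):
        html += ' ' + highlight(word, attn)
    return html + "<br>"


batch_size = 1
seqs = [["这", "是", "一个", "测试", "样例", "而已"]]  # sample sentence: "this is just a test example"
attns = [[0.01, 0.19, 0.12, 0.7, 0.2, 0.1]]  # illustrative weights, one per token

for i in range(batch_size):
    text = mk_html(seqs[i], attns[i])
    display(HTML(text))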
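
Softmax attention weights sum to 1 over the sequence, so on long inputs every weight is small and the colours produced by highlight can all look close to white. One option, which is not part of the original post, is to min-max rescale the weights of each sentence before colouring. The helper name rescale below is hypothetical, and the rescaled values are for display only; they are no longer a probability distribution.

```python
def rescale(attns):
    """Min-max rescale one sentence's attention weights to [0, 1] for display.

    Hypothetical helper, not from the original post.
    """
    if not attns:
        return []
    lo, hi = min(attns), max(attns)
    if hi == lo:
        # Uniform attention: no token stands out, so highlight nothing.
        return [0.0 for _ in attns]
    return [(a - lo) / (hi - lo) for a in attns]


# The strongest token maps to 1.0 (full red), the weakest to 0.0 (white).
print(rescale([0.25, 0.5, 0.75]))  # [0.0, 0.5, 1.0]
```

The result can be passed to mk_html in place of the raw weights, for example mk_html(seqs[i], rescale(attns[i])).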

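Hooking into the Model

The model needs to return attention_weight alongside its other outputs. In principle, its dimension should match the length of the input sequence.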
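
To make the idea concrete, here is a minimal sketch of a forward pass that returns the attention weights next to the predictions. It is not the model from the paper: the class name AttnClassifier, the layer sizes, the single word-id input, and pad_id=0 are all assumptions for illustration, and every sequence is assumed to contain at least one non-padding token.

```python
import torch
import torch.nn as nn


class AttnClassifier(nn.Module):
    """Hypothetical classifier that also returns its attention weights."""

    def __init__(self, vocab_size, embed_dim, num_labels, pad_id=0):
        super().__init__()
        self.pad_id = pad_id  # assumption: id 0 marks padding
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_id)
        self.score = nn.Linear(embed_dim, 1)  # one attention score per token
        self.out = nn.Linear(embed_dim, num_labels)

    def forward(self, word_ids):
        emb = self.embed(word_ids)                    # (batch, seq_len, embed_dim)
        scores = self.score(emb).squeeze(-1)          # (batch, seq_len)
        # Padding positions get -inf so softmax assigns them zero weight.
        scores = scores.masked_fill(word_ids == self.pad_id, float("-inf"))
        att_weight = torch.softmax(scores, dim=1)     # (batch, seq_len)
        context = (att_weight.unsqueeze(-1) * emb).sum(dim=1)  # (batch, embed_dim)
        probs = torch.sigmoid(self.out(context))      # per-label probabilities
        return probs, att_weight                      # expose the weights to the caller


model = AttnClassifier(vocab_size=100, embed_dim=16, num_labels=3)
word_ids = torch.tensor([[5, 8, 2, 0, 0]])  # last two positions are padding
probs, att_weight = model(word_ids)
print(att_weight.shape)  # torch.Size([1, 5])
```

The returned att_weight has one entry per input position, including padding, which is why the padded tail has to be trimmed before visualization.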

# load model
import torch

# If the model was trained on a GPU, map it onto the CPU when loading.
# The checkpoint is a fully pickled model; newer PyTorch versions may also need weights_only=False.
model = torch.load("../docs/model_chk/2018-11-07-02:45:37", map_location="cpu")
model.eval()  # switch to inference mode once, before the loop

with torch.no_grad():  # no gradients are needed for visualization
    for batch_idx, samples in enumerate(test_loader):
        v_word = samples['word_vec']
        # Assumption: the original snippet never defines v_pos; 'pos_vec' is a guessed key for the POS-tag ids.
        v_pos = samples['pos_vec']

        final_probs, att_weight = model(v_word, v_pos)

        batch_words = toWords(samples['word_vec'].numpy(), idx_word)  # map ids to words
        batch_att = getAtten(batch_words, att_weight.numpy())  # drop padding: trim attention to each sentence's length
        labels = toLabel(samples['top_label'].numpy())  # gold labels
        pre_labels = toLabel(final_probs.numpy() >= 0.5)  # predicted labels

        for i in range(len(batch_words)):
            text = mk_html(batch_words[i], batch_att[i])
            print(labels[i], pre_labels[i])
            display(HTML(text))
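
The post does not show toWords and getAtten. The sketch below is one possible shape for them, based only on the comments in the loop above. The names to_words and get_atten are hypothetical stand-ins, and the sketch assumes that sequences are right-padded and that id 0 is the padding token.

```python
def to_words(batch_ids, idx_word, pad_id=0):
    """Map each row of token ids to words, stopping at the first padding id.

    Hypothetical stand-in for toWords; assumes right-padding with pad_id.
    """
    batch_words = []
    for row in batch_ids:
        words = []
        for token_id in row:
            if int(token_id) == pad_id:
                break  # everything after this is padding
            words.append(idx_word[int(token_id)])
        batch_words.append(words)
    return batch_words


def get_atten(batch_words, batch_att):
    """Trim each attention row to the length of its unpadded sentence.

    Hypothetical stand-in for getAtten.
    """
    return [
        [float(a) for a in att[:len(words)]]
        for words, att in zip(batch_words, batch_att)
    ]


idx_word = {1: "good", 2: "food"}  # toy id-to-word mapping
words = to_words([[1, 2, 0, 0]], idx_word)
print(words)                                             # [['good', 'food']]
print(get_atten(words, [[0.5, 0.25, 0.125, 0.125]]))     # [[0.5, 0.25]]
```

Both functions accept nested lists as well as NumPy arrays, since they only iterate and slice.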

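Summary

I recommend keeping the visualization separate and editing it in a Jupyter notebook, which makes it easy to debug cell by cell and to copy results. Since the output is rendered as HTML, a notebook is needed to display it.
I will sync the project code to GitHub later. Feel free to get in touch and discuss.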
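
If a notebook is not at hand, one alternative is to write the same HTML strings to a file and open it in a browser. This is a minimal sketch and not something the original post does: save_html is a hypothetical helper, and it assumes each fragment is a string like the ones returned by mk_html.

```python
from pathlib import Path


def save_html(fragments, path):
    """Write HTML fragments into a standalone UTF-8 page (hypothetical helper)."""
    page = '<!DOCTYPE html>\n<html><head><meta charset="utf-8"></head><body>\n'
    page += "\n".join(fragments)
    page += "\n</body></html>\n"
    Path(path).write_text(page, encoding="utf-8")
    return path
```

For example, collecting text from each loop iteration into a list and calling save_html(collected, "attention.html") would produce one page with a highlighted line per sentence.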


Original article: https://www.cnblogs.com/CocoML/p/12726004.html
