Attention in Practice with Keras

License: CC 4.0 BY-SA

Tags:

#Keras #Python #Attention

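This article is a hands-on tutorial on implementing the attention mechanism with Keras, exploring how attention performs feature selection inside a model. Through code examples, it shows how to compute attention weights and apply them to the data, revealing the role attention plays in identifying relevant features.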

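Key points:
This tutorial is a hands-on walkthrough of attention with Keras. Environment:
Win10 + CPU i7-6700
PyCharm 2018
Python 3.6
numpy 1.14.5
Keras 2.0.2
Matplotlib 2.2.2
Important: install exactly these library versions. Keras and TensorFlow are updated frequently, and after an upgrade many functions have either been renamed or removed, so please take the versions seriously.
The accompanying code is in my code repository and you are welcome to download it. Link: Attention in Practice with Keras
Author: Next_Legend, QQ: 1219154092. Interests: artificial intelligence, natural language processing, image processing, neural networks.
Written 2018.8.21 at Tianjin University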
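
Since the tutorial depends on specific library versions, a quick check before running anything can save some debugging. The following is only a minimal sketch: the helper name `check_versions` and the `EXPECTED` dictionary are hypothetical and not part of the author's repository; the version strings are copied from the environment list above.

```python
import importlib

# Versions copied from the environment list above.
EXPECTED = {"numpy": "1.14.5", "keras": "2.0.2", "matplotlib": "2.2.2"}


def check_versions(expected):
    """Return {name: (wanted, found, matches)} for each library.

    `found` is None when the library cannot be imported and "unknown"
    when it imports but exposes no __version__ attribute.
    """
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
            found = getattr(module, "__version__", "unknown")
        except Exception:
            # A broken install can fail with errors other than ImportError.
            found = None
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_versions(EXPECTED).items():
        status = "OK" if ok else "MISMATCH"
        print("{:<12} wanted {:<8} found {!s:<10} {}".format(name, wanted, found, status))
```

A mismatch does not necessarily mean the code will fail, but it is the first thing to rule out if an import or a layer name is not found.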

I. Introduction

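Over the past two years, and especially this year, the attention mechanism and its variants have become increasingly popular, and many top-conference papers use attention in one form or another. Out of curiosity, I put together this hands-on tutorial on attention with Keras. It looks at attention purely from the code side: through a simple example, it explores how the attention mechanism acts as feature selection inside a model.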
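
To make the "feature selection" idea concrete before turning to Keras, here is a rough NumPy sketch of one common form of attention over input features: a softmax produces one weight per feature, and the input is multiplied elementwise by those weights. This is an illustration under assumptions, not the tutorial's model: the function names `softmax` and `apply_attention`, and the random parameters `w` and `b`, are hypothetical stand-ins for what a trained layer would learn.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def apply_attention(x, w, b):
    """Compute per-feature attention weights and re-weight the input.

    x: (n, d) inputs; w: (d, d) and b: (d,) are illustrative parameters.
    Returns (weights, attended), both of shape (n, d).
    """
    scores = x @ w + b          # one score per feature, per sample
    weights = softmax(scores)   # each row is positive and sums to 1
    attended = x * weights      # features with small weights are suppressed
    return weights, attended


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    x = rng.standard_normal((4, 8))
    w = rng.standard_normal((8, 8))  # untrained, random parameters
    b = np.zeros(8)
    weights, attended = apply_attention(x, w, b)
    print(weights.sum(axis=1))  # each row of weights sums to 1
```

With trained parameters, the weight on an informative column would be expected to dominate, which is the sense in which attention selects features.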

II. Code Walkthrough (Part 1)

1. Import the required libraries

import numpy as np
from attention_utils import get_activations, get_data

np.random.seed(1337)  # for reproducibility
from keras.models import *
from keras.layers import Input, Dense, merge
import tensorflow as tf

2. Data generation function

def get_data(n, input_dim, attention_column=1):
    """
    Data generation. x is purely random except that its value at attention_column equals the target y.
    In practice, the network should learn that the target = x[attention_column].
    Therefore, most of its attention should be focused on the value addressed by attention_column.
    :param n: the number of samples to retrieve.
    :param input_dim: the number of dimensions of each element in the series.
    :param attention_column: the column linked to the target. Everything else is purely random.
    :return: x: model inputs, y: model targets
    """
    x = np.random.standard_normal(size=(n, input_dim))
    y = np.random.randint(low=0, high=2, size=(n, 1))
    x[:, attention_column] = y[:,