Time Series Forecasting: Multi-Head Attention + Broad Learning System

Reproducing and Optimizing the Multi-Attn BLS Model


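✨✨ Welcome to Srlua's blog ✨✨

🌟🌟 Thank you, dear readers, for taking the time to read this post.

I'm Srlua小谢, and this is where I share my knowledge and experience. 🎥

I hope we can explore the IT world together and sharpen our skills here. 🔮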

📘📚 Column: 传知代码论文复现


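Overview

In 2024, Liyun Su, Lang Xiong, and Jialing Yang published the paper "Multi-Attn BLS: Multi-head attention mechanism with broad learning system for chaotic time series prediction" in Applied Soft Computing (CiteScore 14.3, impact factor 8.7). To handle the high complexity and nonlinearity of chaotic time series, the paper proposes a new paradigm that combines the broad learning system (BLS) with multi-head self-attention. Earlier work that fused these two highly nonlinear mappings mostly stacked multi-head self-attention layers to extract features and then passed them to a broad learning model for classification or prediction. This paper instead integrates the multi-head attention module directly into the broad learning system, which yields an end-to-end prediction model.

The full version, with the project source code, data, and pretrained model for this detailed reproduction, is available here: (link)

Deep neural networks use residual connections to preserve information, but they take a long time to train. A broad learning model instead uses a cascaded structure to reuse information and keep the original information intact. It is a single, simple, specialized network that needs no retraining, and it combines the fast solution of most machine learning models with the fitting ability of most deep learning models. For a deeper treatment of broad learning, see the original paper (link provided). The paper also argues that multi-head attention can extract key features across different dimensions and levels and make effective use of them. Citing earlier studies, the authors note that models with attention keep information effective by capturing part of the semantic information, and can therefore capture rich information at different levels.

The authors therefore use BLS to expand the dimensionality of chaotic time series data and introduce multi-head attention to extract semantic information at different levels, including linear and nonlinear correlations, chaotic mechanisms, and noise. They also use residual connections to preserve information.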

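Main contributions

1. The paper proposes a new BLS paradigm named "Multi-Attn BLS" for dynamically modeling chaotic time series. Through cascading and attention, the model enriches the fixed features as much as possible and extracts semantic information effectively from chaotic time series systems.
2. Multi-Attn BLS uses multi-head attention with positional encoding to learn complex chaotic time series patterns, and extracts as much semantic information as possible by capturing spatiotemporal relationships.
3. Multi-Attn BLS achieves strong prediction results on three benchmarks, and it is also highly interpretable on chaotic time series.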

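Overall architecture of Multi-Attn BLS

[Figure: overall architecture of Multi-Attn BLS]

Multi-Attn BLS has three main parts: 1) preprocessing of the chaotic time series; 2) reactivation of nonlinear dynamic features through the random mapping of BLS; 3) multi-level semantic information extraction with multi-head attention.

First, following phase space reconstruction theory, Liyun Su, Lang Xiong, and Jialing Yang use the C-C method to determine the embedding dimension and the delay time, which recovers the chaotic system and turns the chaotic time series into a predictable pattern. The reconstructed data is then randomly mapped and enhanced into a high-dimensional system by the feature layer and the enhancement layer of BLS, producing mixed features that contain different patterns of the chaotic time series. Finally, multi-head attention and residual connections extract the spatiotemporal relationships retained in the system, including linear correlation, nonlinear determinism, and noise.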

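Chaotic time series preprocessing: recovering the chaotic system via phase space reconstruction

A chaotic time series is a univariate or multivariate time series generated by a dynamical system. Phase space reconstruction maps the chaotic time series into a higher-dimensional space in order to rebuild a representation of the original dynamical system. According to Takens' embedding theorem, the phase space must be reconstructed to recover the original chaotic attractor and to give the time series of the chaotic system a fixed dimension.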
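To make the reconstruction step concrete, here is a minimal sketch of time-delay embedding, where each reconstructed point is the vector [x(t), x(t+τ), ..., x(t+(m-1)τ)]. The function name `delay_embed` and the values of `m` and `tau` are assumptions for illustration only; the paper selects them with the C-C method, which is not implemented here.

```python
import numpy as np


def delay_embed(series, m, tau):
    """Return delay vectors [x(t), x(t+tau), ..., x(t+(m-1)tau)], one per row."""
    x = np.asarray(series, dtype=float)
    n_vectors = len(x) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series is too short for the chosen m and tau")
    # Column j holds the series shifted by j * tau samples.
    return np.stack([x[j * tau: j * tau + n_vectors] for j in range(m)], axis=1)


# Build one-step-ahead training pairs: each delay vector predicts the next value.
series = np.sin(0.1 * np.arange(200))
m, tau = 3, 2  # hypothetical values, not taken from the paper
X = delay_embed(series, m, tau)[:-1]   # drop the last vector, it has no target
y = series[(m - 1) * tau + 1:]         # value that follows each delay vector
print(X.shape, y.shape)
```

Each row of `X` then has the fixed dimension `m`, which is the property the later mapping layers rely on.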


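Reactivating nonlinear dynamic features with BLS random mapping

[Figure: overall architecture of BLS]

The figure above shows the overall architecture of BLS. Here we only use its mapping ability, that is, the feature node layer and the enhancement node layer (the mapping feature nodes and enhancement feature nodes in the figure). These two layers are built as follows: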

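In the standard BLS formulation, the $i$-th group of feature nodes is a random mapping of the input $X$:

$$Z_i = \phi_i(X W_{e_i} + \beta_{e_i}), \quad i = 1, \dots, n$$

All groups are concatenated as $Z^n = [Z_1, Z_2, \dots, Z_n]$, and the $j$-th group of enhancement nodes is a nonlinear random mapping of the feature nodes:

$$H_j = \xi_j(Z^n W_{h_j} + \beta_{h_j}), \quad j = 1, \dots, m$$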
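The two layers can be sketched in a few lines of NumPy. This is a simplified illustration rather than the author's implementation: the function name `bls_mapping` and all node counts are assumptions, the feature nodes use a plain linear random map, and the sparse fine-tuning and rescaling of the weights found in typical BLS code are left out.

```python
import numpy as np


def bls_mapping(X, n_windows=4, nodes_per_window=6, n_enhance=20, seed=0):
    """Map X (n_samples, n_features) to concatenated feature and enhancement nodes."""
    rng = np.random.default_rng(seed)
    # Append a bias column so each random map includes an offset term.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    # Feature nodes: one random linear map per window, then concatenate.
    windows = [Xb @ rng.standard_normal((Xb.shape[1], nodes_per_window))
               for _ in range(n_windows)]
    Zn = np.hstack(windows)
    # Enhancement nodes: nonlinear random map of all feature nodes.
    Zb = np.hstack([Zn, np.ones((Zn.shape[0], 1))])
    H = np.tanh(Zb @ rng.standard_normal((Zb.shape[1], n_enhance)))
    # Cascade: keep both the feature nodes and the enhancement nodes.
    return np.hstack([Zn, H])


X = np.random.default_rng(1).standard_normal((100, 3))
nodes = bls_mapping(X)
print(nodes.shape)  # 4 * 6 feature nodes + 20 enhancement nodes per sample
```

Keeping the feature nodes next to the enhancement nodes is what the text calls the cascaded structure: the downstream layers see both the linear and the nonlinear view of the input.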

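Multi-level semantic information extraction with multi-head attention

In this part, the authors use stacked multi-head self-attention to process the high-dimensional nodes produced by the BLS mapping, so we focus on how multi-head self-attention works.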

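Multi-head self-attention is derived as follows. First, the multi-head self-attention operation is defined, in its standard form, as:

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)\, W^O$$

where each head is computed as:

$$\mathrm{head}_i = \mathrm{Attention}(Q W_i^Q,\; K W_i^K,\; V W_i^V)$$

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^\top}{\sqrt{d_k}}\right) V$$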
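The per-head computation described above can be sketched in NumPy as follows. This is the textbook scaled dot-product form, not the author's PyTorch code; the function names and the toy dimensions are assumptions for illustration. Slicing the columns of one large projection matrix into `n_heads` parts is equivalent to giving every head its own projection matrix.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """X: (seq_len, d_model); Wq, Wk, Wv, Wo: (d_model, d_model)."""
    seq_len, d_model = X.shape
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    d_k = d_model // n_heads

    def split_heads(M):
        # (seq_len, d_model) -> (n_heads, seq_len, d_k)
        return M.reshape(seq_len, n_heads, d_k).transpose(1, 0, 2)

    Q, K, V = split_heads(X @ Wq), split_heads(X @ Wk), split_heads(X @ Wv)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)   # (n_heads, seq_len, seq_len)
    weights = softmax(scores, axis=-1)                 # each row sums to 1
    heads = weights @ V                                # (n_heads, seq_len, d_k)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo, weights


rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))
Wq, Wk, Wv, Wo = (rng.standard_normal((8, 8)) for _ in range(4))
out, attn = multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads=2)
print(out.shape, attn.shape)
```

The returned attention weights are what make the model inspectable: each row shows how strongly one node attends to every other node.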

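After the data has passed through enough multi-head attention modules, the authors use a fully connected layer to map the result to the output space.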
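As a rough sketch of how these pieces fit together, the snippet below adds positional encoding, stacks attention blocks with residual connections, and finishes with a fully connected readout. Several choices are assumptions and are not taken from the paper or the author's code: the sinusoidal encoding from the original Transformer, a single attention head per block (for brevity), mean pooling before the final layer, and every name and dimension used.

```python
import numpy as np


def sinusoidal_positions(seq_len, d_model):
    """Transformer-style sinusoidal position table; d_model must be even."""
    pos = np.arange(seq_len)[:, None]
    even_dims = np.arange(0, d_model, 2)[None, :]
    angles = pos / np.power(10000.0, even_dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe


def self_attention(H, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over the rows of H."""
    scores = (H @ Wq) @ (H @ Wk).T / np.sqrt(Wq.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ (H @ Wv)


def predict(X, blocks, W_out, b_out):
    """X: (seq_len, d_model); blocks: list of (Wq, Wk, Wv) tuples."""
    H = X + sinusoidal_positions(*X.shape)
    for Wq, Wk, Wv in blocks:
        H = H + self_attention(H, Wq, Wk, Wv)   # residual connection
    # Assumed readout: average over positions, then a fully connected layer.
    return H.mean(axis=0) @ W_out + b_out


rng = np.random.default_rng(0)
seq_len, d_model = 6, 8
X = rng.standard_normal((seq_len, d_model))
blocks = [tuple(0.1 * rng.standard_normal((d_model, d_model)) for _ in range(3))
          for _ in range(2)]
W_out, b_out = rng.standard_normal((d_model, 1)), np.zeros(1)
print(predict(X, blocks, W_out, b_out).shape)
```

The residual form `H + self_attention(H, ...)` means that a block whose value projection is zero passes its input through unchanged, which is the sense in which residual connections preserve information.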

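Core code reproduction

In this post we focus on how MultiAttn-BLS fuses multi-head self-attention with the BLS model; reproducing the time series preprocessing is not the focus. Below is my reproduction of the MultiAttn-BLS code, with the model built in the PyTorch framework: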

```python
# BLS mapping layer
import numpy as np
from sklearn import preprocessing
from numpy import random
from scipy import linalg as LA


def show_accuracy(predictLabel, Label):
    # Fraction of rows whose arg-max class matches the label, rounded to 5 decimals.
    count = 0
    label_1 = Label.argmax(axis=1)
    predlabel = predictLabel.argmax(axis=1)
    for j in list(range(Label.shape[0])):
        if label_1[j] == predlabel[j]:
            count += 1
    return (round(count / len(Label), 5))


def tansig(x):
    return (2 / (1 + np.exp(-2 * x))) - 1


def sigmoid(data):
    return 1.0 / (1 + np.exp(-data))


def linear(data):
    return data


def tanh(data):
    return (np.exp(data) - np.exp(-data)) / (np.exp(data) + np.exp(-data))


def relu(data):
    return np.maximum(data, 0)


def pinv(A, reg):
    # Ridge-regularized pseudo-inverse: (reg * I + A^T A)^-1 A^T.
    # np.linalg.inv is used because np.mat is not available in NumPy 2.x.
    return np.linalg.inv(reg * np.eye(A.shape[1]) + A.T.dot(A)).dot(A.T)


def shrinkage(a, b):
    # Soft-thresholding: shrink every entry of a toward zero by b.
    z = np.maximum(a - b, 0) - np.maximum(-a - b, 0)
    return z


def sparse_bls(A, b):  # A: mapped nodes of each window; b: input data with bias added
    lam = 0.001
    itrs = 50
    AA = A.T.dot(A)
    m = A.shape[1]
    n = b.shape[1]
    x1 = np.zeros([m, n])
    wk = x1
    ok = x1
    uk = x1
    L1 = np.linalg.inv(AA + np.eye(m))
    L2 = (L1.dot(A.T)).dot(b)
    # Iterative updates with soft-thresholding for a sparse solution.
    for i in range(itrs):
        ck = L2 + np.dot(L1, (ok - uk))
        ok = shrinkage(ck + uk, lam)
        uk = uk + ck - ok
        wk = ok
    return wk


def generate_mappingFeaturelayer(train_x, FeatureOfInputDataWithBias, N1, N2, OutputOfFeatureMappingLayer, u=0):
    Beta1OfEachWindow = list()
    distOfMaxAndMin = []
    minOfEachWindow = []
    for i in range(N2):
        random.seed(i + u)
        # weightOfEachWindow = 2 * random.randn(   (the listing breaks off here)
```
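Two helpers in the listing are easier to follow with a small numeric example: `shrinkage` is the soft-thresholding operator used inside `sparse_bls`, and `pinv` is a ridge-regularized pseudo-inverse. The sketch below re-implements both so that it runs on its own; `ridge_pinv` is a hypothetical name, and using `np.linalg.solve` instead of an explicit matrix inverse is a substitution made here, not something the original code does.

```python
import numpy as np


def shrinkage(a, b):
    """Soft-thresholding: shrink each entry of a toward zero by b."""
    return np.maximum(a - b, 0) - np.maximum(-a - b, 0)


def ridge_pinv(A, reg):
    """Solve (reg * I + A^T A) P = A^T for P, the regularized pseudo-inverse."""
    return np.linalg.solve(reg * np.eye(A.shape[1]) + A.T @ A, A.T)


# Entries with magnitude at most 1 are set to zero; the rest move 1 closer to zero.
values = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(shrinkage(values, 1.0))

# With a tiny regularizer, the pseudo-inverse recovers known weights from noise-free data.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))
W_true = np.array([[1.0], [-2.0], [0.5]])
Y = A @ W_true
W_hat = ridge_pinv(A, 1e-8) @ Y
print(np.round(W_hat.ravel(), 4))
```

A larger `reg` trades exactness for stability when the columns of `A` are nearly collinear, which is common for randomly generated nodes.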