Multivariate LSTM-FCNs for Time Series Classification: Paper Study Notes

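The paper proposes a new multivariate time series classification model that improves on existing univariate time series classification models by adding squeeze-and-excitation blocks to the fully convolutional block. The new model performs well on many datasets and requires little data preprocessing.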

Abstract

Over the past decade, multivariate time series classification has
received great attention. We propose transforming the existing
univariate time series classification models, the Long Short Term
Memory Fully Convolutional Network (LSTM-FCN) and Attention LSTM-FCN (ALSTM-FCN), into a multivariate time series classification model by
augmenting the fully convolutional block with a squeeze-and-excitation
block to further improve accuracy. Our proposed models outperform most
state-of-the-art models while requiring minimum preprocessing. The
proposed models work efficiently on various complex multivariate time
series classification tasks such as activity recognition or action
recognition. Furthermore, the proposed models are highly efficient at
test time and small enough to deploy on memory constrained systems.

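In short: over the past decade, multivariate time series classification has attracted wide attention. The authors convert two existing univariate classifiers, the Long Short Term Memory Fully Convolutional Network (LSTM-FCN) and the Attention LSTM-FCN (ALSTM-FCN), into multivariate classifiers by adding a squeeze-and-excitation block to the fully convolutional block to improve accuracy. The resulting models outperform most state-of-the-art models while needing minimal preprocessing, and they handle complex multivariate tasks such as activity recognition and action recognition. They are also efficient at test time and small enough to deploy on memory-constrained systems.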
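To make the squeeze-and-excitation idea from the abstract more concrete, here is a minimal sketch in plain Python. It follows the general squeeze-and-excitation recipe (global average pooling, a bottleneck of two fully connected layers, then channel-wise rescaling) applied to a 1D feature map. The function name `squeeze_excite`, the weight layout, the omission of biases, and the reduction ratio used in the example are all assumptions for illustration; the exact configuration used in the paper is not given in this excerpt.

```python
import math

def squeeze_excite(feature_map, w_reduce, w_expand):
    """Hypothetical squeeze-and-excitation step on a 1D feature map.

    feature_map: list of C channels, each a list of T values.
    w_reduce:    (C // r) rows of C weights (bottleneck layer, biases omitted).
    w_expand:    C rows of (C // r) weights (expansion layer, biases omitted).
    """
    # Squeeze: global average pooling over time, one scalar per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_map]
    # Excitation, part 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    # Excitation, part 2: expansion layer followed by a sigmoid gate per channel.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[g * v for v in ch] for g, ch in zip(gates, feature_map)]

# Usage with made-up values: 4 channels, 3 time steps, reduction ratio 2.
x = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [-1.0, 1.0, 0.0], [4.0, 4.0, 4.0]]
w_reduce = [[0.0] * 4 for _ in range(2)]  # zero weights: every gate is sigmoid(0)
w_expand = [[0.0] * 2 for _ in range(4)]
y = squeeze_excite(x, w_reduce, w_expand)
```

With all-zero weights every gate equals sigmoid(0) = 0.5, so each channel is simply halved. With trained weights, the gates would differ per channel, which is how the block emphasizes some feature maps over others.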

1. Introduction

Time series data is used in various fields of studies, ranging from
weather readings to psychological signals [1, 2, 3, 4]. A time series
is a sequence of data points in a time domain, typically in a uniform
interval [5]. There is a significant increase of time series data
being collected by sensors [6]. A time series dataset can be
univariate, where a sequence of measurements from the same variable
is collected, or multivariate, where a sequence of measurements from
multiple variables or sensors is collected [7]. Over the past decade,
multivariate time series classification has received significant
interest. Multivariate time series classifications are applied in
healthcare [8], phoneme classification [9], activity recognition,
object recognition, and action recognition [10, 11, 12, 13]. In this
paper, we propose two deep learning models that outperform existing
algorithms.

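Time series data is used in many fields of study, ranging from weather readings to psychological signals. A time series is a sequence of data points in the time domain, typically sampled at a uniform interval.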

Several time series classification algorithms have been developed
over the years. Distance based methods along with k-nearest neighbors
have proven to be successful in classifying multivariate time series [14]. Plenty of research indicates Dynamic Time Warping (DTW)
as the best distance-based measure to use along with k-NN [15].

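Several time series classification algorithms have been developed over the years. Distance-based methods combined with k-nearest neighbors have been successful at classifying multivariate time series, and much research points to Dynamic Time Warping (DTW) as the best distance-based measure to use with k-NN [15].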
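As a rough illustration of the DTW plus k-NN baseline mentioned above, the sketch below implements the textbook dynamic-programming form of DTW for multivariate series and a k-nearest-neighbor vote on top of it. The function names `dtw_distance` and `knn_dtw_predict`, the Euclidean per-step distance, and the toy data are assumptions for this example, not details taken from the paper.

```python
import math

def dtw_distance(a, b):
    """DTW distance between two multivariate series.

    a, b: lists of time steps; each time step is a tuple with one value
    per variable. The two series may have different lengths.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])  # Euclidean distance per step
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def knn_dtw_predict(train, query, k=1):
    """Majority vote among the k training series closest to `query` under DTW.

    train: list of (series, label) pairs.
    """
    nearest = sorted(train, key=lambda item: dtw_distance(item[0], query))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

# Usage with toy two-variable series.
train = [
    ([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)], "up"),
    ([(2.0, 2.0), (1.0, 1.0), (0.0, 0.0)], "down"),
]
query = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
prediction = knn_dtw_predict(train, query, k=1)
```

The query is the "up" series with its first step repeated, so DTW can align the two at zero cost even though their lengths differ. This tolerance to local time shifts is the usual argument for DTW over a plain point-by-point distance.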

In addition to distance-based metrics, other algorithms are used.
Typically, feature-based classification algorithms rely heavily on the
features being extracted from the time series data [16]. However,
feature extraction is arduous because intrinsic features of time
series data are challenging to capture. For this reason,
distance-based approaches are more successful in classifying
multivariate time series data [17]. Hidden State Conditional Random
Field (HCRF) and Hidden Unit Logistic Model (HULM) are two successful
feature-based algorithms which have led to state-of-the-art results on
various benchmark datasets, ranging from online character recognition
to activity recognition [18]. HCRF is a computationally expensive
algorithm that detects latent structures of the input time series data
using a chain of k-nominal latent variables. The number of parameters
in the model increases linearly with the total number of latent states
required [19]. Further, datasets that require a large number of latent
states tend to overfit the data. To overcome this, HULM proposes using
H binary stochastic hidden units to model 2^H latent structures of the
data with only O(H) parameters. Results indicate HULM outperforming
HCRF on most datasets [18].
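The counting argument behind HULM can be checked with a few lines of Python: H binary hidden units jointly take 2^H distinct configurations, while the number of units (and hence the parameters attached to them) grows only linearly in H. The helper name `hidden_configurations` is made up for this sketch, and the snippet only enumerates states; it does not implement HULM itself.

```python
from itertools import product

def hidden_configurations(num_units):
    """Enumerate every joint state of `num_units` binary hidden units."""
    return list(product((0, 1), repeat=num_units))

# Usage: 3 binary units already give 2**3 = 8 joint configurations.
configs = hidden_configurations(3)
```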

Traditional models, such as the naive logistic model (NL) and Fisher
kernel learning (FKL) [20], show strong performance on a wide variety
of time series classification problems. The NL model is a
linear logistic model that makes a prediction by summing the inner
products between the model weights and feature vectors over time,
which is followed by a softmax function [18]. The FKL model is
effective on time series classification problems when based on Hidden
Markov Models (HMM). Subsequently, the features or representation from
the FKL model are used to train a linear SVM to make a final
prediction [20, 21].

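Traditional models such as the naive logistic (NL) model show strong performance on many time series classification problems. The NL model is a linear logistic model: it predicts by summing, over time, the inner products between the model weights and the feature vectors, and then applies a softmax function.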
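The description of the NL model translates almost directly into code. The sketch below sums the inner products between per-class weights and the feature vector at each time step, then applies a softmax over classes. Sharing one weight vector per class across all time steps, the bias term, and the function name `naive_logistic_predict` are assumptions for this illustration; the excerpt does not specify those details.

```python
import math

def naive_logistic_predict(series, weights, bias):
    """Class probabilities from a linear logistic model summed over time.

    series:  T time steps, each a feature vector of length D.
    weights: K weight vectors of length D, one per class (assumed shared
             across time steps).
    bias:    K bias values, one per class (assumed).
    """
    scores = []
    for w_k, b_k in zip(weights, bias):
        # Sum of inner products <w_k, x_t> over all time steps t.
        total = sum(sum(w * x for w, x in zip(w_k, x_t)) for x_t in series)
        scores.append(b_k + total)
    # Numerically stable softmax over the K class scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]

# Usage with toy values: two time steps, two features, two classes.
probs = naive_logistic_predict(
    series=[[1.0, 0.0], [1.0, 0.0]],
    weights=[[1.0, 0.0], [0.0, 1.0]],
    bias=[0.0, 0.0],
)
```

In this toy case the class scores are 2 and 0, so the first class gets the larger probability.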

Another common approach for multivariate time series classification is
by applying dimensional reduction techniques or by concatenating all
dimensions of a multivariate time series into a univariate time
series. Symbolic Representation for Multivariate Time Series (SMTS)
[22] applies a random forest on the multivariate time series to
partition it into leaf nodes, each represented by a word to form a
codebook. Every word is used with another random forest to classify
the multivariate time series. Learned Pattern Similarity (LPS) [23] is
a similar model that extracts segments from the multivariate time
series. These segments are used to train regression trees to find
dependencies between them. Each node is represented by a word.
Finally, these words are used with a similarity measure to classify
the unknown multivariate time series. Ultra Fast Shapelets (UFS) [24]
obtains random shapelets from the multivariate time series and applies
a linear SVM or a Random Forest classifier. Subsequently, UFS was
enhanced by computing derivatives as features (dUFS) [24]. The
Auto-Regressive (AR) kernel [25] applies an AR kernel-based distance
measure to classify the multivariate time series. Auto-Regressive
forests for multivariate time series modeling (mv-ARF) [26] uses a
tree ensemble, where the trees are trained with different time lags.
Most recently, WEASEL+MUSE [27] builds a multivariate feature vector
using a classical bag of patterns approach on each variable with
various sliding window sizes to capture discrete features, words, and
pairs of words. Subsequently, feature selection is used to remove
non-discriminative features using a Chi-squared test. The final
classification is obtained using a logistic classifier on the final
feature vector.
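The simplest of the approaches listed above, concatenating all dimensions of a multivariate series into one univariate series, can be sketched as follows. The variable-by-variable layout and the function name `concatenate_dimensions` are assumptions; other orderings, such as interleaving variables per time step, are equally possible.

```python
def concatenate_dimensions(series):
    """Flatten a multivariate series into one univariate series.

    series: T time steps, each a sequence of D variable values.
    Returns a list of length D * T: all values of variable 0 in time
    order, then all values of variable 1, and so on.
    """
    num_vars = len(series[0])
    return [step[d] for d in range(num_vars) for step in series]

# Usage: 3 time steps with 2 variables each.
flat = concatenate_dimensions([[1, 10], [2, 20], [3, 30]])
```

The flattened series can then be passed to any univariate classifier, at the cost of losing the explicit alignment between variables at the same time step.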

Deep learning has also yielded promising results for multivariate time
series classification. In 2014, Yi et al. proposed using a Multi-Channel
Deep Convolutional Neural Network (MC-DCNN) for multivariate time
series classification. MC-DCNN takes input from each variable to
detect latent features. The latent features from each channel are fed
into an MLP to perform classification [17]. This paper proposes two
deep learning models for multivariate time series classification.
These proposed models require minimal preprocessing and are tested on
35 datasets, obtaining strong performances in most of them.
Performance is the classification accuracy of a model on a particular
dataset. The rest of the paper is ordered as follows. Background works
are discussed in Section 2. We present the architecture of the two
proposed models in Section 3. In Section 4, we discuss the dataset,
evaluate the models on them, present our results and analyze our
findings. In Section 5, we draw our conclusion.

2. Background Works

2.1 Recurrent Neural Networks


Recurrent Neural Networks (RNN) are a form of neural networks that
display temporal behavior through the direct connections betw
