Related Work on Time Series Imputation Models

Latest recommended article published 2024-08-26 09:18:16


License: CC 4.0 BY-SA

Tags: #learning-methods #paper-reading #paper-notes

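This article surveys methods for time series imputation, covering traditional techniques such as KNN, Kriging, MICE, ARIMA, and VAR, as well as applications of RNNs, autoregressive models, matrix factorization, VAEs, and GANs. In recent years, attention-based multivariate imputation models and diffusion models such as SSGN, CSDI, and MIDM have shown potential for improving imputation accuracy. These methods account for correlations across variables and for temporal dependencies in order to handle missing values in large-scale datasets.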

Some early studies fill in missing values using spatial relationships or neighboring series, for example KNN [12], [13] and Kriging [32].
[12] T. Hastie, R. Tibshirani, and J. Friedman, “The elements of statistical learning: data mining, inference, and prediction,” 2009.
[13] L. Beretta and A. Santaniello, “Nearest neighbor imputation algorithms: a critical evaluation,” BMC Medical Informatics and Decision Making, vol. 16, no. 3, pp. 197–208, 2016.
[32] M. L. Stein, Interpolation of Spatial Data: Some Theory for Kriging. Springer Science & Business Media, 1999.
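MICE fills in missing values using chained equations: each incomplete variable is imputed in turn from a model conditioned on the other variables.
Ian R White, Patrick Royston, and Angela M Wood. 2011. Multiple imputation using chained equations: issues and guidance for practice. Statistics in Medicine 30, 4 (2011), 377–399.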
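The chained-equation idea can be sketched in a few lines. The example below is a minimal, deterministic illustration for a two-column table, not the full MICE procedure: real MICE draws imputations from a predictive distribution and produces several completed datasets. The names `fit_line` and `chained_impute` are hypothetical helpers introduced here for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return mean_y - b * mean_x, b


def chained_impute(rows, n_iter=10):
    """Single-chain, deterministic imputation for a two-column table (None = missing)."""
    missing = [[v is None for v in r] for r in rows]
    filled = [list(r) for r in rows]
    # Step 1: initialise every missing cell with its column mean.
    for j in range(2):
        observed = [r[j] for r in rows if r[j] is not None]
        mean = sum(observed) / len(observed)
        for i in range(len(rows)):
            if missing[i][j]:
                filled[i][j] = mean
    # Step 2: cycle over the columns, regressing each one on the other and
    # overwriting only the cells that were originally missing.
    for _ in range(n_iter):
        for target in range(2):
            other = 1 - target
            train = [r for i, r in enumerate(filled) if not missing[i][target]]
            a, b = fit_line([r[other] for r in train], [r[target] for r in train])
            for i, r in enumerate(filled):
                if missing[i][target]:
                    r[target] = a + b * r[other]
    return filled


table = [[1.0, 2.0], [2.0, 4.0], [3.0, None], [None, 8.0], [5.0, 10.0]]
completed = chained_impute(table)
```

In this toy table the second column is exactly twice the first, so the chain recovers 6.0 and 4.0 for the two missing cells. With more than two variables, each regression would condition on all the other columns.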
Representative autoregressive models such as ARIMA and VAR can also be used to impute missing values.
George EP Box, Gwilym M Jenkins, Gregory C Reinsel, and Greta M Ljung. 2015. Time series analysis: forecasting and control. John Wiley & Sons.
Eric Zivot and Jiahui Wang. 2006. Vector autoregressive models for multivariate time series. Modeling financial time series with S-PLUS® (2006), 385–429.
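As a rough illustration of autoregressive imputation, the sketch below fits an AR(1) model, the simplest member of the ARIMA family, on adjacent observed pairs and then forecasts forward into each gap. It assumes a univariate series with `None` marking missing values. The function name `ar1_impute` is hypothetical, and a full ARIMA or VAR treatment would typically use a state-space smoother rather than this forward-only fill.

```python
def ar1_impute(series):
    """Fit x_t = c + phi * x_{t-1} by least squares, then fill gaps by forecasting."""
    # Training pairs are consecutive time steps where both values are observed.
    pairs = [(a, b) for a, b in zip(series, series[1:])
             if a is not None and b is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two adjacent observed pairs")
    n = len(pairs)
    mean_prev = sum(a for a, _ in pairs) / n
    mean_next = sum(b for _, b in pairs) / n
    var = sum((a - mean_prev) ** 2 for a, _ in pairs)
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in pairs)
    phi = cov / var if var else 0.0
    c = mean_next - phi * mean_prev
    filled = list(series)
    # Forward pass: each gap is predicted from the (possibly imputed) previous value.
    for t in range(1, len(filled)):
        if filled[t] is None and filled[t - 1] is not None:
            filled[t] = c + phi * filled[t - 1]
    return filled, c, phi


# Toy series following x_t = 1 + 0.5 * x_{t-1}, with one value hidden.
observed = [10.0, 6.0, 4.0, None, 2.5, 2.25]
filled, c, phi = ar1_impute(observed)
```

A gap at the very start of the series is left as `None`, because there is no earlier value to forecast from.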
KNN selects the nearest neighbors of an incomplete sample and fills each missing value with the mean of the neighbors' values.
Andrew T Hudak, Nicholas L Crookston, Jeffrey S Evans, David E Hall, and Michael J Falkowski. 2008. Nearest neighbor imputation of species-level, plot-scale forest structure attributes from LiDAR data. Remote Sensing of Environment 112, 5 (2008), 2232–2245.
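The neighbor-averaging rule above can be sketched as follows. This is a minimal illustration under stated assumptions: rows are samples (or aligned series), `None` marks a missing entry, and distance is the root-mean-square difference over the columns both rows have observed. The name `knn_impute` is a hypothetical helper, not an API from the cited work.

```python
import math


def knn_impute(rows, k=2):
    """Fill each None entry with the mean of that column over the k nearest rows."""

    def distance(a, b):
        # Compare only the columns that are observed in both rows.
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors must have column j observed and be comparable.
            candidates = [(distance(row, other), other[j])
                          for m, other in enumerate(rows)
                          if m != i and other[j] is not None]
            candidates = [cand for cand in candidates if cand[0] != math.inf]
            candidates.sort(key=lambda cand: cand[0])
            nearest = candidates[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.8],
    [10.0, 20.0, 30.0],
]
result = knn_impute(data, k=2)
```

Here the two nearest rows to the incomplete one are the first and third, so the gap is filled with the mean of 3.0 and 2.8, roughly 2.9. The distant fourth row is ignored.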
Matrix factorization (MF) decomposes the incomplete data matrix into two low-rank matrices and uses their product to estimate the missing values.
Morten Morup, Daniel M Dunlavy, Evrim Acar, and Tamara Gibson Kolda. 2010. Scalable tensor factorizations with missing data. Technical Report. Sandia National Laboratories (SNL), Albuquerque, NM, and Livermore, CA . . . .
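A minimal sketch of the low-rank idea follows. It fits two factor matrices by stochastic gradient descent on the observed entries only, then reads the missing entries off their product. This is a plain matrix version chosen for brevity, not the tensor method of the cited report; the name `mf_impute` and the learning rate, epoch count, and initialisation are all illustrative assumptions.

```python
import random


def mf_impute(matrix, rank=1, lr=0.01, epochs=3000, seed=0):
    """Approximate matrix ~ U V^T using observed entries; fill None from the product."""
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    U = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_cols)]
    observed = [(i, j, matrix[i][j])
                for i in range(n_rows) for j in range(n_cols)
                if matrix[i][j] is not None]
    for _ in range(epochs):
        for i, j, value in observed:
            pred = sum(U[i][r] * V[j][r] for r in range(rank))
            err = value - pred
            # Gradient step on the squared error of this single observed entry.
            for r in range(rank):
                u, v = U[i][r], V[j][r]
                U[i][r] += lr * err * v
                V[j][r] += lr * err * u
    # Keep observed entries as they are; use the low-rank product for the gaps.
    return [[matrix[i][j] if matrix[i][j] is not None
             else sum(U[i][r] * V[j][r] for r in range(rank))
             for j in range(n_cols)]
            for i in range(n_rows)]


# Rank-1 toy matrix (outer product of [1, 2, 3] and [1, 2, 4]) with one entry hidden.
incomplete = [
    [1.0, 2.0, 4.0],
    [2.0, 4.0, None],
    [3.0, 6.0, 12.0],
]
completed = mf_impute(incomplete)
```

Because the toy matrix is exactly rank 1, the hidden entry is determined by the observed ones and the estimate should land close to 8. On real data the rank and a regularisation term would need tuning.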

The methods above struggle to fit large datasets, and their imputation accuracy is limited.

To improve representational capacity, [12] applies a self-training mechanism to multivariate imputation.
Tae-Min Choi, Ji-Su Kang, and Jong-Hwan Kim. 2020. RDIS: Random drop imputation with self-training for incomplete time series data. arXiv preprint arXiv:2010.10075 (2020).
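The "random drop" part of that title can be illustrated in isolation: hide a fraction of the observed values so that they serve as known targets for an imputer. The sketch below shows only this masking step, with a trivial mean-fill imputer standing in for a learned model. It is not the RDIS architecture or its self-training loop, and `random_drop` and `mean_fill` are hypothetical names.

```python
import random


def random_drop(series, drop_rate=0.3, seed=0):
    """Hide a random subset of observed values; return the masked copy and hidden indices."""
    rng = random.Random(seed)
    observed = [t for t, v in enumerate(series) if v is not None]
    n_drop = max(1, int(len(observed) * drop_rate))
    dropped = sorted(rng.sample(observed, n_drop))
    masked = list(series)
    for t in dropped:
        masked[t] = None
    return masked, dropped


def mean_fill(series):
    """Placeholder imputer: replace every None with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]


series = [1.0, None, 3.0, 4.0, None, 6.0, 7.0, 8.0]
masked, dropped = random_drop(series)
imputed = mean_fill(masked)
# The error is measured only where the true value is known, i.e. on the dropped positions.
mae = sum(abs(imputed[t] - series[t]) for t in dropped) / len(dropped)
```

The error on the dropped positions gives a training or validation signal even though the originally missing values have no ground truth.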
[31, 35] propose attention-based multivariate imputation models.
[31] Satya Narayan Shukla and Benjamin M Marlin. 2021. Multi-time attention networks for irregularly sampled time series. arXiv preprint arXiv:2101.10318 (2021).

[35] Qiuling Suo, Weida Zhong, Guangxu Xun, Jian