[NeurIPS 2024] AutoTimes: Autoregressive Time Series Forecasters via Large Language Models

Paper: [2402.02370] AutoTimes: Autoregressive Time Series Forecasters via Large Language Models

Code: GitHub - thuml/AutoTimes: Official implementation for "AutoTimes: Autoregressive Time Series Forecasters via Large Language Models"

The English here is typed entirely by hand! It is my summarizing and paraphrasing of the original paper. Some unavoidable spelling and grammar mistakes may slip in; if you spot any, feel free to point them out in the comments! This post is written as personal notes, so read with caution.

Contents

1. Thoughts

2. Section-by-Section Reading of the Paper

2.1. Abstract

2.2. Introduction

2.3. Related Work

2.3.1. Autoregressive Models

2.3.2. Large Language Models for Time Series

2.3.3. Multimodal Language Models

2.4. Method

2.4.1. Modality Alignment

2.4.2.  Next Token Prediction

2.4.3.  In-Context Forecasting

2.5. Experiments

2.5.1. Time Series Forecasting

2.5.2. Zero-Shot Forecasting

2.5.3. In-Context Forecasting

2.5.4. Method Analysis

2.6. Conclusion


1. Thoughts

(1) The figures in these LLM papers are all adorably drawn, like cute little kids; lovely.

2. Section-by-Section Reading of the Paper

2.1. Abstract

        ①Existing LLM-based time series analysis ignores the inherent autoregressive property and decoder-only architecture of LLMs; AutoTimes instead repurposes the LLM as an autoregressive time series forecaster

corpora  n. the main body of anything; a complete collection (plural of corpus)

revitalize  v. to give new life or vigor to; to revive (= revitalise)

2.2. Introduction

        ①Existing approaches:

where the non-autoregressive formulation is inconsistent with the LLM's token-by-token structure, so the authors aim to obtain a consistent (autoregressive) representation

2.3. Related Work

2.3.1. Autoregressive Models

        ①Existing LLMs are essentially autoregressive models

        ②Autoregressive models excel at multi-step generation

2.3.2. Large Language Models for Time Series

        ①Lists existing LLM-based methods for time series forecasting

        ②Functions of each model:

2.3.3. Multimodal Language Models

        ①To avoid the separation between time series and text prompts, they use the timestamps themselves as embeddings

2.4. Method

        ①Lookback observations: \mathbf{x}_{1:L}=\{\mathbf{x}_{1},\ldots,\mathbf{x}_{L}\}\in\mathbb{R}^{L\times C}, where L is the number of lookback time steps and C is the number of variates

        ②Task: predict future F time steps \mathbf{x}_{L+1:L+F}=\{\mathbf{x}_{L+1},\ldots,\mathbf{x}_{L+F}\}\in\mathbb{R}^{F\times C}

        ③Timestamps \mathbf{a}_{t} (e.g. 2016/07/05 00:00:00) are also added to enhance forecasting; the task is then formulated as (a shape-only sketch follows the formula):

f:(\mathbf{x}_{1:L},\mathbf{a}_{1:L+F})\mapsto\mathbf{\hat{x}}_{L+1:L+F}
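
To make the shapes concrete, here is a minimal, shape-only sketch of this formulation (my own toy code; the variable names `lookback`, `timestamps`, and `forecast_placeholder` are illustrative and not from the official repository):

```python
# Toy illustration of the AutoTimes forecasting task's shapes (not official code).
import torch

L, F, C = 672, 96, 7                       # lookback length, forecast horizon, number of variates
lookback = torch.randn(L, C)               # x_{1:L} in R^{L x C}
# Textual timestamps a_{1:L+F}; the date is fixed here purely for brevity.
timestamps = [f"2016/07/05 {h % 24:02d}:00:00" for h in range(L + F)]

# The forecaster f maps (x_{1:L}, a_{1:L+F}) to an estimate of x_{L+1:L+F} in R^{F x C}.
forecast_placeholder = torch.zeros(F, C)   # shape of \hat{x}_{L+1:L+F}
print(lookback.shape, len(timestamps), forecast_placeholder.shape)
```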

2.4.1. Modality Alignment

(1)Time series token

        ①Pipeline of the time series forecasting method:

        ②For a single variate x_{t}\in\mathbb{R} at time point t and context length NS, the i-th segment of length S is:

\mathbf{s}_{i}=\{x_{(i-1)S+1},\ldots,x_{iS}\}\in\mathbb{R}^{S},i=1,\ldots,N.

        ③They align time series tokens and language tokens by:

\text{SegmentEmbedding}(\cdot):\mathbb{R}^S\mapsto\mathbb{R}^D

\mathbf{SE}_{i}=\text{SegmentEmbedding}(\mathbf{s}_{i}),i=1,\ldots,N,

where the dimension D is chosen to align with the LLM's embedding space (a sketch follows below)
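
A minimal sketch of this tokenization step, assuming SegmentEmbedding is a learnable linear map \mathbb{R}^S\mapsto\mathbb{R}^D (the formula only fixes its input/output dimensions; the layer choice here is my assumption, not the official implementation):

```python
# Sketch of time series tokenization: split a single-variate context of length N*S
# into N segments and embed each into the LLM's hidden dimension D.
import torch
import torch.nn as nn

S, N, D = 96, 7, 4096                  # segment length, number of segments, LLM hidden size
x = torch.randn(N * S)                 # single variate over the context window

segments = x.reshape(N, S)             # s_i = {x_{(i-1)S+1}, ..., x_{iS}}, i = 1..N

segment_embedding = nn.Linear(S, D)    # assumed form of SegmentEmbedding: R^S -> R^D
SE = segment_embedding(segments)       # SE_i, shape (N, D), aligned with the LLM embedding size
print(SE.shape)                        # torch.Size([7, 4096])
```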

(2)Position embedding

        ①Beginning-of-sequence <bos> and end-of-sequence <eos> design:

where \mathbf{TE}_{i}=\text{SelectLast}\left(\mathrm{LLM}(\text{TimestampTemplate}(\mathbf{s}_{i}))\right) \in \mathbb{R}^D. (What is SelectLast? It selects the hidden state of the last token from the LLM's output over the verbalized timestamps, so the segment's timestamp text is condensed into a single D-dimensional embedding.)

        ②The final embedding is:

\mathbf{E}_{i}=\mathbf{SE}_{i}+\mathbf{TE}_{i}
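
A sketch of how the timestamp position embedding could be obtained and added, following the formula literally: verbalize the segment's timestamps with a template, feed the text through the frozen LLM, and keep the last token's hidden state. GPT-2 stands in for LLaMA-7B so the snippet runs locally, and the template wording is my own guess:

```python
# Sketch of TE_i = SelectLast(LLM(TimestampTemplate(...))) and E_i = SE_i + TE_i.
# GPT-2 is a small stand-in for LLaMA-7B; the prompt template is a placeholder.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
llm = AutoModel.from_pretrained("gpt2").eval()
D = llm.config.hidden_size

def timestamp_embedding(start_ts: str, end_ts: str) -> torch.Tensor:
    prompt = f"From {start_ts} to {end_ts}"       # TimestampTemplate (hypothetical wording)
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden = llm(**inputs).last_hidden_state  # (1, num_tokens, D)
    return hidden[0, -1]                          # SelectLast: keep the last token's embedding

TE_i = timestamp_embedding("2016/07/05 00:00:00", "2016/07/08 23:00:00")
SE_i = torch.randn(D)                             # segment embedding from the previous sketch
E_i = SE_i + TE_i                                 # final token embedding fed into the LLM
print(E_i.shape)                                  # torch.Size([768]) for GPT-2
```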

2.4.2.  Next Token Prediction

        ①For each embedding \mathbf{E}_{i}, the model predicts the embedding of the next segment:

\{\hat{\mathbf{E}}_2,\ldots,\hat{\mathbf{E}}_{N+1}\}=\mathrm{LLMLayers}(\{\mathbf{E}_1,\ldots,\mathbf{E}_N\})

        ②These are further projected back into segments by:

\hat{\mathbf{s}}_i=\text{SegmentProjection}(\hat{\mathbf{E}}_i),i=2,\ldots,N+1

        ③Loss:

\mathcal{L}_{\mathrm{MSE}}=\frac{1}{NS}\sum||\mathbf{s}_{i}-\mathbf{\hat{s}}_{i}||_{2}^{2},i=2,\ldots,N

        ④Multi-step prediction (the forecast is rolled out segment by segment):

\mathbf{\hat{s}}_{i}=\text{LLMForecaster}(\mathbf{s}_{<i}),i=1,\ldots,\frac{F}{S}
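
Putting 2.4.2 together, here is a self-contained training-and-rollout sketch. The frozen LLM is replaced by a small causal Transformer encoder so the code runs anywhere; SegmentEmbedding/SegmentProjection as linear layers and all hyperparameters are my assumptions, not the official repository:

```python
# Sketch of next-token training (MSE on shifted segments) and autoregressive rollout.
import torch
import torch.nn as nn

S, N, D = 96, 7, 768
embed = nn.Linear(S, D)                    # stand-in for SegmentEmbedding
project = nn.Linear(D, S)                  # stand-in for SegmentProjection
llm_layers = nn.TransformerEncoder(        # stand-in for the frozen, causal LLM blocks
    nn.TransformerEncoderLayer(d_model=D, nhead=8, batch_first=True), num_layers=2)

def forecast_next(segments: torch.Tensor) -> torch.Tensor:
    """segments: (n, S) -> predictions \hat{s}_2..\hat{s}_{n+1}: (n, S)."""
    E = embed(segments).unsqueeze(0)                              # (1, n, D)
    mask = nn.Transformer.generate_square_subsequent_mask(E.size(1))
    E_hat = llm_layers(E, mask=mask)                              # causal next-token features
    return project(E_hat).squeeze(0)                              # (n, S)

# Training step: compare \hat{s}_i with s_i on the positions that have ground truth.
segments = torch.randn(N, S)                      # s_1..s_N from the context window
pred = forecast_next(segments)                    # \hat{s}_2..\hat{s}_{N+1}
loss = ((pred[:-1] - segments[1:]) ** 2).mean()   # MSE over i = 2..N
loss.backward()

# Inference: generate F/S future segments, feeding each prediction back as context.
with torch.no_grad():
    ctx = segments.clone()
    for _ in range(2):                     # e.g. F/S = 2 rollout steps
        nxt = forecast_next(ctx)[-1:]      # newest predicted segment
        ctx = torch.cat([ctx, nxt], dim=0)
print(ctx[-2:].shape)                      # the forecast segments, (2, 96)
```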

2.4.3.  In-Context Forecasting

        ①Task demonstrations for in-context learning in LLMs are paired questions and answers:

\mathcal{C}=\{g(x^{(1)},y^{(1)}),\ldots,g(x^{(m)},y^{(m)})\}

where g(\cdot) denotes the template that transforms each question and answer into natural language

        ②Analogously, the context \mathcal{C} is extended with m time series prompts \mathrm{tsp}^{(j)} taken from earlier history:

\mathcal{C}=\{\mathrm{tsp}^{(j)}=\mathbf{x}_{\leq t_j}|\text{earlier historical time series}\},j=1,\ldots,m,t_j\leq L

        ③In-context forecasting process:
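
A toy sketch of the in-context extension: segments from an earlier period are simply concatenated in front of the lookback as time series prompts, and the same frozen forecaster is rolled out without any parameter update. The forecaster below is a trivial placeholder (a persistence forecast) so the example is self-contained; in AutoTimes it would be the LLM-based forecaster sketched above:

```python
# Toy sketch of in-context forecasting: prepend time series prompts, then roll out.
import torch

def forecast_next(ctx: torch.Tensor) -> torch.Tensor:
    # Placeholder for the frozen-LLM forecaster: returning the context unchanged
    # makes ctx[-1:] a simple persistence forecast so this file runs standalone.
    return ctx

def in_context_forecast(prompts, lookback, steps):
    """prompts: list of (n_j, S) tensors from earlier periods; lookback: (N, S)."""
    ctx = torch.cat(prompts + [lookback], dim=0)   # extended context: tsp^(1..m) + x_{1:L}
    for _ in range(steps):                         # same autoregressive rollout as before
        nxt = forecast_next(ctx)[-1:]
        ctx = torch.cat([ctx, nxt], dim=0)
    return ctx[-steps:]                            # forecast segments

S = 96
prompts = [torch.randn(2, S)]                      # one time series prompt from earlier history
lookback = torch.randn(2, S)
future = in_context_forecast(prompts, lookback, steps=1)
print(future.shape)                                # torch.Size([1, 96])
```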

2.5. Experiments

2.5.1. Time Series Forecasting

        ①Datasets: ETTh1, ECL, Traffic, Weather, and Solar-Energy for long-term forecasting; the M4 competition for short-term forecasting

        ②Baselines: LLM4TS methods: TimeLLM, UniTime, and FPT; deep forecasters: iTransformer, DLinear, PatchTST, and TimesNet; short-term forecasters: Koopa, N-HiTS and N-BEATS

        ③Backbone: LLaMA-7B

        ④Short-term forecasting performance:

        ⑤Long-term forecasting performance table:

2.5.2. Zero-Shot Forecasting

        ①Zero-shot performance:

2.5.3. In-Context Forecasting

        ①In-context forecasting formulation:

f:(\{x_{1:2F}\},x_{t+1:t+F},\mathbf{a}_{t+1:t+2F})\mapsto\hat{x}_{t+F+1:t+2F}

        ②In-context performance:

2.5.4. Method Analysis

        ①Backbone ablation:

        ②Efficiency of LLMs:

        ③Training and inference time:

        ④LLM4TS ablation:

        ⑤LoRA combined performance:

2.6. Conclusion

        ~
