[NeurIPS 2024] AutoTimes: Autoregressive Time Series Forecasters via Large Language Models

Paper: [2402.02370] AutoTimes: Autoregressive Time Series Forecasters via Large Language Models

Code: GitHub - thuml/AutoTimes: Official implementation for "AutoTimes: Autoregressive Time Series Forecasters via Large Language Models"

The English here is typed entirely by hand! It is my summarizing and paraphrasing of the original paper. Some unavoidable spelling and grammar mistakes may slip in; if you spot any, feel free to point them out in the comments! This post is written as personal notes, so read with caution.

Contents

1. Thoughts

2. Section-by-Section Reading of the Paper

2.1. Abstract

2.2. Introduction

2.3. Related Work

2.3.1. Autoregressive Models

2.3.2. Large Language Models for Time Series

2.3.3. Multimodal Language Models

2.4. Method

2.4.1. Modality Alignment

2.4.2.  Next Token Prediction

2.4.3.  In-Context Forecasting

2.5. Experiments

2.5.1. Time Series Forecasting

2.5.2. Zero-Shot Forecasting

2.5.3. In-Context Forecasting

2.5.4. Method Analysis

2.6. Conclusion


1. Thoughts

(1) The figures in these LLM papers are all adorably drawn, like cute little kids; lovely.

2. Section-by-Section Reading of the Paper

2.1. Abstract

        ①Existing LLM-based time series analysis ignores the inherent autoregressive property and decoder-only architecture of LLMs; AutoTimes instead repurposes the LLM as an autoregressive time series forecaster

corpora  n. the main body of anything; a complete collection (plural of corpus)

revitalize  v. to give new life or vigor to; to revive (= revitalise)

2.2. Introduction

        ①Existing approaches:

where the non-autoregressive formulation is inconsistent with the LLM's token-by-token structure, so the authors aim to obtain a consistent (autoregressive) representation

2.3. Related Work

2.3.1. Autoregressive Models

        ①Existing LLMs are essentially autoregressive models

        ②Autoregressive models excel at multi-step generation

2.3.2. Large Language Models for Time Series

        ①Lists existing LLM-based methods for time series forecasting

        ②Functions of each model:

2.3.3. Multimodal Language Models

        ①To avoid the separation between time series and text prompts, they use the timestamps themselves as embeddings

2.4. Method

        ①Lookback observations: \mathbf{x}_{1:L}=\{\mathbf{x}_{1},\ldots,\mathbf{x}_{L}\}\in\mathbb{R}^{L\times C}, where L is the number of lookback time steps and C is the number of variates

        ②Task: predict future F time steps \mathbf{x}_{L+1:L+F}=\{\mathbf{x}_{L+1},\ldots,\mathbf{x}_{L+F}\}\in\mathbb{R}^{F\times C}

        ③Timestamps \mathbf{a}_{t} (e.g. 2016/07/05 00:00:00) are also added to enhance forecasting; the task is then formulated as (a shape-only sketch follows the formula):

f:(\mathbf{x}_{1:L},\mathbf{a}_{1:L+F})\mapsto\mathbf{\hat{x}}_{L+1:L+F}
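
To make the shapes concrete, here is a minimal, shape-only sketch of this formulation (my own toy code; the variable names `lookback`, `timestamps`, and `forecast_placeholder` are illustrative and not from the official repository):

```python
# Toy illustration of the AutoTimes forecasting task's shapes (not official code).
import torch

L, F, C = 672, 96, 7                       # lookback length, forecast horizon, number of variates
lookback = torch.randn(L, C)               # x_{1:L} in R^{L x C}
# Textual timestamps a_{1:L+F}; the date is fixed here purely for brevity.
timestamps = [f"2016/07/05 {h % 24:02d}:00:00" for h in range(L + F)]

# The forecaster f maps (x_{1:L}, a_{1:L+F}) to an estimate of x_{L+1:L+F} in R^{F x C}.
forecast_placeholder = torch.zeros(F, C)   # shape of \hat{x}_{L+1:L+F}
print(lookback.shape, len(timestamps), forecast_placeholder.shape)
```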

2.4.1. Modality Alignment

(1)Time series token

        ①Pipeline of the time series forecasting method:

        ②For a single variate x_{t}\in\mathbb{R} at time point t and context length NS, the i-th segment of length S is:

\mathbf{s}_{i}=\{x_{(i-1)S+1},\ldots,x_{iS}\}\in\mathbb{R}^{S},i=1,\ldots,N.

        ③They align time series tokens and language tokens by:

\text{SegmentEmbedding}(\cdot):\mathbb{R}^S\mapsto\mathbb{R}^D

\mathbf{SE}_{i}=\text{SegmentEmbedding}(\mathbf{s}_{i}),i=1,\ldots,N,

where the dimension D is chosen to align with the LLM's embedding space (a sketch follows below)
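
A minimal sketch of this tokenization step, assuming SegmentEmbedding is a learnable linear map \mathbb{R}^S\mapsto\mathbb{R}^D (the formula only fixes its input/output dimensions; the layer choice here is my assumption, not the official implementation):

```python
# Sketch of time series tokenization: split a single-variate context of length N*S
# into N segments and embed each into the LLM's hidden dimension D.
import torch
import torch.nn as nn

S, N, D = 96, 7, 4096                  # segment length, number of segments, LLM hidden size
x = torch.randn(N * S)                 # single variate over the context window

segments = x.reshape(N, S)             # s_i = {x_{(i-1)S+1}, ..., x_{iS}}, i = 1..N

segment_embedding = nn.Linear(S, D)    # assumed form of SegmentEmbedding: R^S -> R^D
SE = segment_embedding(segments)       # SE_i, shape (N, D), aligned with the LLM embedding size
print(SE.shape)                        # torch.Size([7, 4096])
```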

(2)Position embedding

        ①Beginning-of-sequence <bos> and end-of-sequence <eos> design:

where \mathbf{TE}_{i}=\text{SelectLast}\left(\mathrm{LLM}(\text{TimestampTemplate}(\mathbf{s}_{i}))\right) \in \mathbb{R}^D. (What is SelectLast? It selects the hidden state of the last token from the LLM's output over the verbalized timestamps, so the segment's timestamp text is condensed into a single D-dimensional embedding.)

        ②The final embedding is:

\mathbf{E}_{i}=\mathbf{SE}_{i}+\mathbf{TE}_{i}
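
A sketch of how the timestamp position embedding could be obtained and added, following the formula literally: verbalize the segment's timestamps with a template, feed the text through the frozen LLM, and keep the last token's hidden state. GPT-2 stands in for LLaMA-7B so the snippet runs locally, and the template wording is my own guess:

```python
# Sketch of TE_i = SelectLast(LLM(TimestampTemplate(...))) and E_i = SE_i + TE_i.
# GPT-2 is a small stand-in for LLaMA-7B; the prompt template is a placeholder.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
llm = AutoModel.from_pretrained("gpt2").eval()
D = llm.config.hidden_size

def timestamp_embedding(start_ts: str, end_ts: str) -> torch.Tensor:
    prompt = f"From {start_ts} to {end_ts}"       # TimestampTemplate (hypothetical wording)
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden = llm(**inputs).last_hidden_state  # (1, num_tokens, D)
    return hidden[0, -1]                          # SelectLast: keep the last token's embedding

TE_i = timestamp_embedding("2016/07/05 00:00:00", "2016/07/08 23:00:00")
SE_i = torch.randn(D)                             # segment embedding from the previous sketch
E_i = SE_i + TE_i                                 # final token embedding fed into the LLM
print(E_i.shape)                                  # torch.Size([768]) for GPT-2
```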

2.4.2.  Next Token Prediction

        ①For each embedding \mathbf{E}_{i}, the model predicts the embedding of the next segment:

\{\hat{\mathbf{E}}_2,\ldots,\hat{\mathbf{E}}_{N+1}\}=\mathrm{LLMLayers}(\{\mathbf{E}_1,\ldots,\mathbf{E}_N\})

        ②These are further projected back into segments by:

\hat{\mathbf{s}}_i=\text{SegmentProjection}(\hat{\mathbf{E}}_i),i=2,\ldots,N+1

        ③Loss:

\mathcal{L}_{\mathrm{MSE}}=\frac{1}{NS}\sum||\mathbf{s}_{i}-\mathbf{\hat{s}}_{i}||_{2}^{2},i=2,\ldots,N

        ④Multi-step prediction (the forecast is rolled out segment by segment):

\mathbf{\hat{s}}_{i}=\text{LLMForecaster}(\mathbf{s}_{<i}),i=1,\ldots,\frac{F}{S}
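
Putting 2.4.2 together, here is a self-contained training-and-rollout sketch. The frozen LLM is replaced by a small causal Transformer encoder so the code runs anywhere; SegmentEmbedding/SegmentProjection as linear layers and all hyperparameters are my assumptions, not the official repository:

```python
# Sketch of next-token training (MSE on shifted segments) and autoregressive rollout.
import torch
import torch.nn as nn

S, N, D = 96, 7, 768
embed = nn.Linear(S, D)                    # stand-in for SegmentEmbedding
project = nn.Linear(D, S)                  # stand-in for SegmentProjection
llm_layers = nn.TransformerEncoder(        # stand-in for the frozen, causal LLM blocks
    nn.TransformerEncoderLayer(d_model=D, nhead=8, batch_first=True), num_layers=2)

def forecast_next(segments: torch.Tensor) -> torch.Tensor:
    """segments: (n, S) -> predictions \hat{s}_2..\hat{s}_{n+1}: (n, S)."""
    E = embed(segments).unsqueeze(0)                              # (1, n, D)
    mask = nn.Transformer.generate_square_subsequent_mask(E.size(1))
    E_hat = llm_layers(E, mask=mask)                              # causal next-token features
    return project(E_hat).squeeze(0)                              # (n, S)

# Training step: compare \hat{s}_i with s_i on the positions that have ground truth.
segments = torch.randn(N, S)                      # s_1..s_N from the context window
pred = forecast_next(segments)                    # \hat{s}_2..\hat{s}_{N+1}
loss = ((pred[:-1] - segments[1:]) ** 2).mean()   # MSE over i = 2..N
loss.backward()

# Inference: generate F/S future segments, feeding each prediction back as context.
with torch.no_grad():
    ctx = segments.clone()
    for _ in range(2):                     # e.g. F/S = 2 rollout steps
        nxt = forecast_next(ctx)[-1:]      # newest predicted segment
        ctx = torch.cat([ctx, nxt], dim=0)
print(ctx[-2:].shape)                      # the forecast segments, (2, 96)
```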

2.4.3.  In-Context Forecasting

        ①Task demonstrations for in-context learning in LLMs are paired questions and answers:

\mathcal{C}=\{g(x^{(1)},y^{(1)}),\ldots,g(x^{(m)},y^{(m)})\}

where g(\cdot) denotes the template that transforms each question and answer into natural language

        ②Analogously, the context \mathcal{C} is extended with m time series prompts \mathrm{tsp}^{(j)} taken from earlier history:

\mathcal{C}=\{\mathrm{tsp}^{(j)}=\mathbf{x}_{\leq t_j}|\text{earlier historical time series}\},j=1,\ldots,m,t_j\leq L

        ③In-context forecasting process:
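
A toy sketch of the in-context extension: segments from an earlier period are simply concatenated in front of the lookback as time series prompts, and the same frozen forecaster is rolled out without any parameter update. The forecaster below is a trivial placeholder (a persistence forecast) so the example is self-contained; in AutoTimes it would be the LLM-based forecaster sketched above:

```python
# Toy sketch of in-context forecasting: prepend time series prompts, then roll out.
import torch

def forecast_next(ctx: torch.Tensor) -> torch.Tensor:
    # Placeholder for the frozen-LLM forecaster: returning the context unchanged
    # makes ctx[-1:] a simple persistence forecast so this file runs standalone.
    return ctx

def in_context_forecast(prompts, lookback, steps):
    """prompts: list of (n_j, S) tensors from earlier periods; lookback: (N, S)."""
    ctx = torch.cat(prompts + [lookback], dim=0)   # extended context: tsp^(1..m) + x_{1:L}
    for _ in range(steps):                         # same autoregressive rollout as before
        nxt = forecast_next(ctx)[-1:]
        ctx = torch.cat([ctx, nxt], dim=0)
    return ctx[-steps:]                            # forecast segments

S = 96
prompts = [torch.randn(2, S)]                      # one time series prompt from earlier history
lookback = torch.randn(2, S)
future = in_context_forecast(prompts, lookback, steps=1)
print(future.shape)                                # torch.Size([1, 96])
```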

2.5. Experiments

2.5.1. Time Series Forecasting

        ①Datasets: ETTh1, ECL, Traffic, Weather, and Solar-Energy for long-term forecasting; the M4 competition for short-term forecasting

        ②Baselines: LLM4TS methods: TimeLLM, UniTime, and FPT; deep forecasters: iTransformer, DLinear, PatchTST, and TimesNet; short-term forecasters: Koopa, N-HiTS and N-BEATS

        ③Backbone: LLaMA-7B

        ④Short-term forecasting performance:

        ⑤Long-term forecasting performance table:

2.5.2. Zero-Shot Forecasting

        ①Zero-shot performance:

2.5.3. In-Context Forecasting

        ①In-context forecasting formulation:

f:(\{x_{1:2F}\},x_{t+1:t+F},\mathbf{a}_{t+1:t+2F})\mapsto\hat{x}_{t+F+1:t+2F}

        ②In-context performance:

2.5.4. Method Analysis

        ①Backbone ablation:

        ②Efficiency of LLMs:

        ③Training and inference time:

        ④LLM4TS ablation:

        ⑤LoRA combined performance:

2.6. Conclusion

        ~
