Predicting forex up / down / flat movements with an LSTM

Reference paper: "Forecasting directional movement of Forex data using LSTM with technical and macroeconomic indicators"

Data format: the label column is the prediction target, determined with the threshold method from the paper.
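
The labeling step itself is not shown in this post. As a rough sketch of what a threshold rule could look like: the `close` series, the threshold of 0.005 and the helper name `threshold_labels` are all illustrative assumptions, and the encoding (0 = no action, 1 = decrease, 2 = increase) is the one used by profit_accuracy further down. The paper's exact rule and threshold may differ.

```python
import numpy as np
import pandas as pd

def threshold_labels(close: pd.Series, threshold: float = 0.005) -> pd.Series:
    """Label each row by the change to the next row's closing price.

    0 = no action (|change| <= threshold), 1 = decrease, 2 = increase.
    The threshold value is a made-up example, not taken from the paper.
    """
    change = close.shift(-1) - close  # next-step price change
    labels = np.where(change > threshold, 2,
                      np.where(change < -threshold, 1, 0))
    out = pd.Series(labels, index=close.index, name="label")
    # the last row has no next price, so it cannot be labeled
    return out.iloc[:-1]

prices = pd.Series([1.100, 1.110, 1.108, 1.095, 1.096])  # toy prices
labels = threshold_labels(prices)
print(labels.tolist())  # [2, 0, 1, 0]
```

Small moves inside the threshold band become class 0, which is what makes the "no action" class possible.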

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import torch
from torch import nn
from torch.nn import functional as F
import torch.optim as optim

# Load the macroeconomic data and split it chronologically; the last 20% is held out for testing
data_ma = pd.read_csv("data_macro.csv")
Xma_train = data_ma.iloc[:int(0.8*data_ma.shape[0]),1:-1]
Xma_test = data_ma.iloc[int(0.8*data_ma.shape[0]):,1:-1]
yma_train = data_ma.iloc[:int(0.8*data_ma.shape[0]),-1:].values
yma_test = data_ma.iloc[int(0.8*data_ma.shape[0]):,-1:].values

def scaler(X_train,X_test):
    """
    Min-max normalization: fit on the training split only, then apply to both splits
    """
    mm = MinMaxScaler()
    mm.fit(X_train)
    mmX_train = mm.transform(X_train)
    mmX_test = mm.transform(X_test)
    return mmX_train,mmX_test
Xma_train,Xma_test = scaler(Xma_train,Xma_test)
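
Because the scaler is fitted on the training split only, test values outside the training range are mapped outside [0, 1]. A tiny illustration with made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

train = np.array([[1.0], [2.0], [3.0]])  # toy training column: min 1, max 3
test = np.array([[2.5], [4.0]])          # 4.0 lies above the training max

mm = MinMaxScaler().fit(train)           # statistics come from train only
scaled = mm.transform(test).ravel()
print(scaled)  # [0.75 1.5 ]
```

This is expected behavior rather than a bug: fitting on the full dataset would leak information about the test period into training.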

def get_seqdata(X,y,seq_len=13):
    """
    Build the LSTM inputs: slide a window of length seq_len over X; each window takes the label of its last row
    """
    n = X.shape[0]
    seq_data = []
    seq_y = []
    for i in range(n-seq_len+1):
        seq_data.append(X[i:i+seq_len,:])
        seq_y.append(y[i+seq_len-1])
    return np.array(seq_data),np.array(seq_y)
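
To make the windowing concrete, here is the same function applied to toy data (20 rows, 8 features; the values are arbitrary). With seq_len=13 there are 20 - 13 + 1 = 8 windows, and window i carries the label of row i + 12.

```python
import numpy as np

def get_seqdata(X, y, seq_len=13):
    # same logic as the function above, repeated so this snippet runs on its own
    n = X.shape[0]
    seq_data, seq_y = [], []
    for i in range(n - seq_len + 1):
        seq_data.append(X[i:i + seq_len, :])
        seq_y.append(y[i + seq_len - 1])
    return np.array(seq_data), np.array(seq_y)

X = np.arange(20 * 8, dtype=float).reshape(20, 8)  # toy features
y = np.arange(20).reshape(20, 1)                   # toy labels: the row index

seq_X, seq_y = get_seqdata(X, y)
print(seq_X.shape)    # (8, 13, 8)
print(seq_y.shape)    # (8, 1)
print(seq_y.ravel())  # [12 13 14 15 16 17 18 19]
```

Note that seq_y keeps a trailing dimension of size 1, which matters later when choosing a loss function.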

def tensor_transform():
    """
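    Build windowed datasets and convert them to torch tensors
    """
    # reshape to (batch, seq_len, features)
    X_train,y_train = get_seqdata(Xma_train,yma_train)
    X_test,y_test = get_seqdata(Xma_test,yma_test)
    # convert to torch tensors: float features, integer class labels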
    X_train = torch.from_numpy(X_train).float()
    X_test = torch.from_numpy(X_test).float()
    y_train = torch.from_numpy(y_train).long()
    y_test = torch.from_numpy(y_test).long()
    return X_train,X_test,y_train,y_test

class LSTM(nn.Module):
    """
    LSTM classifier: the output at the last time step is mapped to class scores
    """
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super(LSTM, self).__init__()
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def logits(self, x):
        # nn.LSTM initializes h0 and c0 to zeros when no state is passed
        out, _ = self.lstm(x)
        # keep only the last time step: (batch, hidden_size) -> (batch, output_size)
        return self.fc(out[:, -1, :])

    def forward(self, x):
        # predicted class index; argmax is not differentiable,
        # so training uses logits() instead
        return torch.argmax(self.logits(x), dim=1)

# LSTM hyperparameters
input_size = 8      # must equal the number of feature columns (columns 1:-1 of the CSV)
hidden_size = 16
num_layers = 2
output_size = 3     # three classes: 0, 1, 2
lr = 0.0001
num_epochs = 100

def model_train(X_train,y_train):
    """
    Train the model on the full training set and return it
    """
    model = LSTM(input_size, hidden_size, num_layers, output_size)
    # cross-entropy for single-label, 3-class targets; it expects raw logits
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=lr)
    hist = np.zeros(num_epochs)
    # CrossEntropyLoss needs class indices of shape (batch,), not (batch, 1)
    target = y_train.reshape(-1)
    for t in range(num_epochs):
        l = criterion(model.logits(X_train), target)
        print("Epoch ", t, "loss: ", l.item())
        hist[t] = l.item()
        optimizer.zero_grad()
        l.backward()
        optimizer.step()
    return model
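
One thing to keep in mind when training a classifier like this: argmax is not differentiable, so a loss computed from predicted class indices cannot update the weights. A toy sketch of the difference (the linear layer and the data are stand-ins, not the model above):

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(4, 3)            # toy stand-in for a model's output layer
x = torch.randn(5, 4)              # 5 samples, 4 features
y = torch.tensor([0, 2, 1, 2, 0])  # toy class indices

# Path 1: argmax output. The result is an integer tensor with no graph attached.
pred = torch.argmax(layer(x), dim=1)
print(pred.requires_grad)          # False

# Path 2: raw logits fed to CrossEntropyLoss. Gradients reach the weights.
loss = nn.CrossEntropyLoss()(layer(x), y)
loss.backward()
print(layer.weight.grad is not None)  # True
```

Calling requires_grad_(True) on a loss built from argmax output silences the error from backward(), but the loss is then a leaf tensor and no parameter ever receives a gradient.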


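def profit_accuracy(df: pd.DataFrame):
    """
    Custom accuracy over tradable predictions only (y_pred != 0).
    Labels: 0 = no action, 1 = decrease, 2 = increase.
    @return: the accuracy, or None if it cannot be computed
    """
    if "y_pred" not in df.columns or "y_true" not in df.columns:
        print("Invalid input: expected columns y_true and y_pred")
        return
    print("tradable predictions: ", (df.y_pred != 0).sum(), "\ttotal:", df.y_pred.shape[0])

    true_dec = df.query("y_true == 1 and y_pred == 1").shape[0]
    true_inc = df.query("y_true == 2 and y_pred == 2").shape[0]
    false_dec_noact = df.query("y_true == 0 and y_pred == 1").shape[0]
    false_inc_noact = df.query("y_true == 0 and y_pred == 2").shape[0]
    false_inc_dec = df.query("y_true == 1 and y_pred == 2").shape[0]
    false_dec_inc = df.query("y_true == 2 and y_pred == 1").shape[0]
    traded = true_dec + true_inc + false_dec_inc + false_inc_dec + false_dec_noact + false_inc_noact
    if traded == 0:
        # the model never predicted 1 or 2, so the ratio is undefined
        print("no tradable predictions; accuracy is undefined")
        return
    accuracy = (true_dec + true_inc) / traded
    print('accuracy:',accuracy)
    return accuracy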
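
A small worked example of this metric on made-up labels. It uses an equivalent, more compact formulation: keep the rows where the model acts (y_pred != 0), then take the share with the correct direction.

```python
import pandas as pd

# toy labels: 0 = no action, 1 = decrease, 2 = increase
df = pd.DataFrame({
    "y_true": [0, 1, 2, 2, 1, 0, 2, 0],
    "y_pred": [0, 1, 2, 1, 0, 2, 2, 0],
})

traded = df[df.y_pred != 0]                       # rows where the model trades
correct = (traded.y_true == traded.y_pred).sum()  # trades in the right direction
acc = correct / len(traded)
print(len(traded), correct, acc)  # 5 3 0.6

plain = (df.y_true == df.y_pred).mean()           # ordinary accuracy, for contrast
print(plain)  # 0.625
```

Correct "no action" predictions raise ordinary accuracy but are ignored here, so a model that mostly predicts 0 cannot score well on this metric just by staying out of the market.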

if __name__ == '__main__':
    # train the model
    X_train, X_test, y_train, y_test = tensor_transform()
    model = model_train(X_train,y_train)

    # predict on the test set and compute the metric
    with torch.no_grad():
        y_pred = model(X_test)
    result = pd.DataFrame({
        'y_true': y_test.numpy().reshape(-1),
        'y_pred': y_pred.numpy().reshape(-1),
    })
    accuracy = profit_accuracy(result)
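
### Using a BiLSTM model to predict stock price movements

#### Data preprocessing

Data preprocessing is a crucial step in applying a BiLSTM model to stock price prediction. Typically, the historical closing prices within a chosen time window are taken as input features and converted into a form the network can consume. In Matlab, for example, the training data can be extracted and reshaped as follows:

```matlab
new_data = data.close; % prediction target is the closing price
train_data = new_data(1:round(numel(new_data)*0.7)); % first 70% of the data is used for training
train_prices = reshape(train_data', [], 1); % convert to a column vector
```

The raw values should then be standardized or otherwise scaled so that they suit the training algorithm.

#### Building the BiLSTM model architecture

The model structure determines how well long-term dependencies in the time series can be captured. One possible BiLSTM layer configuration is:

```matlab
layers = [
    sequenceInputLayer(inputSize,'Name','input')
    bilstmLayer(hiddenUnits,'OutputMode','sequence','Name','bilstm')
    fullyConnectedLayer(outputSize,'Name','fc')
    regressionLayer('Name','output')];
```

Here `hiddenUnits` is the number of hidden units, `inputSize` is the number of input features, and `outputSize` depends on the task: 2 for binary classification (whether the price goes up or down), 1 for regression[^1]. As written, the list ends in `regressionLayer`, so it matches the regression case; a classification variant would end in `softmaxLayer` and `classificationLayer` instead.

#### Model training and evaluation

With the above in place, training can begin. In practice, part of the data should be held out as a test set for the final performance check, and cross-validation can be added to get a more reliable estimate of generalization. For hyperparameter selection and tuning, experience reported in other work can serve as a guide[^2].

Once trained, the model can estimate short-term price trends from historical market data. Because markets are complex and noisy, no prediction should be treated as a fully reliable basis for decisions.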
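
For readers following the PyTorch code earlier in this post, a bidirectional counterpart could be sketched as below. The class name `BiLSTMClassifier` and all sizes are illustrative assumptions, not part of the Matlab example. It concatenates the final hidden state of each direction, since the backward direction's output at the last time step has only seen a single input.

```python
import torch
from torch import nn

class BiLSTMClassifier(nn.Module):
    """Single-layer bidirectional LSTM followed by a linear classifier."""

    def __init__(self, input_size, hidden_units, output_size):
        super().__init__()
        self.bilstm = nn.LSTM(input_size, hidden_units,
                              batch_first=True, bidirectional=True)
        # forward and backward states are concatenated, hence 2 * hidden_units
        self.fc = nn.Linear(2 * hidden_units, output_size)

    def forward(self, x):
        # h_n has shape (2, batch, hidden_units) for one bidirectional layer:
        # index 0 is the forward direction, index 1 the backward direction
        _, (h_n, _) = self.bilstm(x)
        summary = torch.cat([h_n[0], h_n[1]], dim=1)
        return self.fc(summary)  # raw logits, suitable for nn.CrossEntropyLoss

model = BiLSTMClassifier(input_size=8, hidden_units=16, output_size=2)
logits = model(torch.randn(4, 13, 8))  # 4 toy sequences, 13 steps, 8 features
print(logits.shape)  # torch.Size([4, 2])
```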