2024.01.13 Weight Decay as a Solution to Overfitting

Norms

Some of the most useful operators in linear algebra are norms. Informally, the norm of a vector tells us how big the vector is. The notion of size considered here is not about dimensionality, but about the magnitude of the components.

A norm sounds very much like a measure of distance. The non-negativity and the triangle inequality familiar from Euclidean distance and the Pythagorean theorem may provide some intuition. In fact, the Euclidean distance is an $L_2$ norm: suppose the elements of the $n$-dimensional vector $\mathbf{x}$ are $x_1,\ldots,x_n$; its $L_2$ norm is the square root of the sum of the squares of its elements:

$$\|\mathbf{x}\|_2=\sqrt{\sum_{i=1}^n x_i^2},$$

where the subscript 2 is often omitted, i.e. $\|\mathbf{x}\|$ is equivalent to $\|\mathbf{x}\|_2$. In code, we can compute the $L_2$ norm of a vector as follows.

u = torch.tensor([3.0, -4.0])
torch.norm(u)

Output:

tensor(5.)

In deep learning, we more often work with the squared $L_2$ norm. You will also frequently encounter the $L_1$ norm, defined as the sum of the absolute values of the vector's elements:

$$\|\mathbf{x}\|_1=\sum_{i=1}^n |x_i|$$
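For illustration (this snippet is not part of the original listing), the $L_1$ norm of the same vector u can be computed by composing the absolute value with the sum:

import torch

u = torch.tensor([3.0, -4.0])
torch.abs(u).sum()  # L1 norm: |3| + |-4| = 7, i.e. tensor(7.)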

Principle of Weight Decay

When training parametric machine learning models, weight decay is one of the most widely used regularization techniques; it is also commonly called $L_2$ regularization. The technique measures the complexity of a function by its distance from zero. But how exactly should we measure the distance between a function and zero? There is no single correct answer. In fact, whole areas of mathematics, including parts of functional analysis and the theory of Banach spaces, are devoted to this question.

One simple approach is to measure the complexity of a linear function $f(\mathbf{x})=\mathbf{w}^\top\mathbf{x}$ by some norm of its weight vector, for example $\|\mathbf{w}\|^2$. The most common way to keep the weight vector small is to add its norm as a penalty term to the problem of minimizing the loss: instead of minimizing only the prediction loss on the training labels, we minimize the sum of the prediction loss and the penalty term. Now, if the weight vector grows too large, the learning algorithm will concentrate on minimizing the weight norm $\|\mathbf{w}\|^2$ instead. That is exactly what we want.

For example, the loss function of linear regression is

$$L(\mathbf{w},b)=\frac{1}{n}\sum_{i=1}^n \frac{1}{2}\left(\mathbf{w}^\top\mathbf{x}^{(i)}+b-y^{(i)}\right)^2$$

where $\mathbf{x}^{(i)}$ are the features of example $i$, $y^{(i)}$ is its label, and $(\mathbf{w},b)$ are the weight and bias parameters. To penalize the size of the weight vector, we must somehow add $\|\mathbf{w}\|^2$ to the loss function, but how should the model trade off the standard loss against this additional penalty? In practice, we characterize this trade-off via a non-negative regularization constant $\lambda$, a hyperparameter that we fit using validation data:

$$L(\mathbf{w},b)+\frac{\lambda}{2}\|\mathbf{w}\|^2$$

For $\lambda = 0$, we recover the original loss function. For $\lambda > 0$, we restrict the size of $\|\mathbf{w}\|$. We still divide by 2 so that when we take the derivative of the quadratic term, the 2 and 1/2 cancel, making the update expression look neat and simple. Why do we use the squared norm here rather than the standard norm (i.e. the Euclidean distance)? We do this for computational convenience: by squaring the $L_2$ norm, we remove the square root and are left with the sum of the squares of the components of the weight vector, which makes the derivative of the penalty easy to compute.
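As a minimal sketch of this objective (the toy tensors and the name lambd below are illustrative, not from the original), the penalized loss is just the ordinary squared loss plus $\frac{\lambda}{2}\|\mathbf{w}\|^2$:

import torch

# Illustrative toy data: 5 examples with 3 features each
X = torch.randn(5, 3)
y = torch.randn(5, 1)
w = torch.randn(3, 1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lambd = 3.0  # non-negative regularization constant

data_loss = ((X @ w + b - y) ** 2 / 2).mean()  # L(w, b)
penalty = lambd / 2 * torch.sum(w ** 2)        # (lambda / 2) * ||w||^2
objective = data_loss + penalty                # quantity actually minimized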

The minibatch stochastic gradient descent update for $L_2$-regularized regression is

$$\mathbf{w}\leftarrow(1-\eta\lambda)\,\mathbf{w}-\frac{\eta}{|\mathcal{B}|}\sum_{i\in\mathcal{B}}\mathbf{x}^{(i)}\left(\mathbf{w}^\top\mathbf{x}^{(i)}+b-y^{(i)}\right)$$

As before, we update $\mathbf{w}$ based on the difference between the estimate and the observation, but at the same time we also shrink the size of $\mathbf{w}$ towards zero. This is why the method is sometimes called weight decay: looking at the penalty term alone, the optimization algorithm decays the weights at every step of training. Compared with feature selection, weight decay offers a continuous mechanism for adjusting the complexity of a function: smaller values of $\lambda$ constrain $\mathbf{w}$ less, while larger values of $\lambda$ constrain it more.
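A minimal sketch of this update rule in isolation (the tensors below are illustrative stand-ins; in the implementations that follow, the same decay arises implicitly from the gradient of the penalty term):

import torch

w = torch.randn(3, 1)          # current weights (illustrative)
grad_data = torch.randn(3, 1)  # stands in for the minibatch gradient of the data loss
eta, lambd = 0.003, 3.0        # learning rate and regularization constant

# w <- (1 - eta * lambda) * w - eta * grad_data
w = (1 - eta * lambd) * w - eta * grad_data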

Implementing Weight Decay from Scratch

1 Constructing the data

import torch
from torch import nn
from torch.utils import data


def synthetic_data(w, b, num_examples):
    """生成y=Xw+b+噪声"""
    X = torch.normal(0, 1, (num_examples, len(w)))
    y = torch.matmul(X, w) + b
    y += torch.normal(0, 0.01, y.shape)
    return X, y.reshape((-1, 1))


def load_array(data_arrays, batch_size, is_train=True):
    """构造一个PyTorch数据迭代器"""
    dataset = data.TensorDataset(*data_arrays)
    return data.DataLoader(dataset, batch_size, shuffle=is_train)


n_train, n_test, num_inputs, batch_size = 20, 100, 200, 5
true_w, true_b = torch.ones((num_inputs, 1)) * 0.01, 0.05
train_data = synthetic_data(true_w, true_b, n_train)
train_iter = load_array(train_data, batch_size)
test_data = synthetic_data(true_w, true_b, n_test)
test_iter = load_array(test_data, batch_size, is_train=False)

2 Initializing model parameters

def init_params():
    w = torch.normal(0, 1, size=(num_inputs, 1), requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    return [w, b]

3 Defining the $L_2$ norm penalty

def l2_penalty(w):
    return torch.sum(w.pow(2)) / 2
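As a quick sanity check (not in the original post), this penalty equals half the squared $L_2$ norm returned by torch.norm:

import torch

w = torch.tensor([3.0, -4.0])
print(torch.sum(w.pow(2)) / 2)  # tensor(12.5000)
print(torch.norm(w) ** 2 / 2)   # tensor(12.5000), the same value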

4 Defining the model and loss function

def linreg(X, w, b): 
    """线性回归模型"""
    return torch.matmul(X, w) + b


def squared_loss(y_hat, y):
    """均方损失"""
    return (y_hat - y.reshape(y_hat.shape)) ** 2 / 2


def sgd(params, lr, batch_size):
    """小批量随机梯度下降"""
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size
            param.grad.zero_()

5 Defining the loss evaluation function

class Accumulator:  #@save
    """在n个变量上累加"""
    def __init__(self, n):
        self.data = [0.0] * n

    def add(self, *args):
        self.data = [a + float(b) for a, b in zip(self.data, args)]

    def reset(self):
        self.data = [0.0] * len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


def evaluate_loss(net, data_iter, loss):
    """评估给定数据集上模型的损失"""
    metric = Accumulator(2)  # 损失的总和,样本数量
    for X, y in data_iter:
        out = net(X)
        y = y.reshape(out.shape)
        l = loss(out, y)
        metric.add(l.sum(), l.numel())
    return metric[0] / metric[1]

6 Defining the training function

def train(lambd):
    w, b = init_params()
    net, loss = lambda X: linreg(X, w, b), squared_loss
    num_epochs, lr = 100, 0.003
    for epoch in range(num_epochs):
        for X, y in train_iter:
            # Add the L2 norm penalty term.
            # Broadcasting makes l2_penalty(w) a vector of length batch_size.
            l = loss(net(X), y) + lambd * l2_penalty(w)
            l.sum().backward()
            sgd([w, b], lr, batch_size)
        if (epoch + 1) % 10 == 0:
            print(evaluate_loss(net, train_iter, loss))
            print(evaluate_loss(net, test_iter, loss))
    print('L2 norm of w:', torch.norm(w).item())

7 Training without regularization

We now run this code with lambd = 0, disabling weight decay. Note that the training error decreases while the test error does not, which means the model is overfitting badly.

train(lambd=0)

Output:

6.718327045440674
111.89330017089844
0.769484794139862
110.775845413208
0.12477019131183624
110.69607948303222
0.023347004130482674
110.73360847473144
0.004730253014713526
110.76803649902344
0.0010142373736016451
110.78975273132325
0.0002265329472720623
110.80156417846679
5.242526312940754e-05
110.80791412353516
1.2483596856327495e-05
110.81120208740235
3.029539379895141e-06
110.8129093170166
L2 norm of w: 14.017784118652344

The resulting training/test loss curves are shown in the figure (image-20240113203220922).

8 Using weight decay

Next, we run the code with weight decay enabled. Note that the training error increases here while the test error decreases. This is exactly the effect we expect from regularization.

train(lambd=3)

Output:

3.4674706935882567
40.52239170074463
0.21071776300668715
19.59771007537842
0.017328211199492217
9.598615684509276
0.0028875970747321844
4.72015061378479
0.0016041443683207034
2.3360534596443174
0.0014270563377067446
1.169337533712387
0.001352554582990706
0.5967587316036225
0.0013034804665949195
0.31446478426456453
0.0012782146106474102
0.1743188387155533
0.0012224957521539182
0.103997398391366
L2 norm of w: 0.37360531091690063

Training/test loss curves with weight decay (image-20240113204401135).

Concise Implementation of Weight Decay

1 Constructing the data

import torch
from torch import nn
from torch.utils import data


def synthetic_data(w, b, num_examples):
    """生成y=Xw+b+噪声"""
    X = torch.normal(0, 1, (num_examples, len(w)))
    y = torch.matmul(X, w) + b
    y += torch.normal(0, 0.01, y.shape)
    return X, y.reshape((-1, 1))


def load_array(data_arrays, batch_size, is_train=True):
    """构造一个PyTorch数据迭代器"""
    dataset = data.TensorDataset(*data_arrays)
    return data.DataLoader(dataset, batch_size, shuffle=is_train)


n_train, n_test, num_inputs, batch_size = 20, 100, 200, 5
true_w, true_b = torch.ones((num_inputs, 1)) * 0.01, 0.05
train_data = synthetic_data(true_w, true_b, n_train)
train_iter = load_array(train_data, batch_size)
test_data = synthetic_data(true_w, true_b, n_test)
test_iter = load_array(test_data, batch_size, is_train=False)

2 Defining the loss evaluation function

class Accumulator:  #@save
    """在n个变量上累加"""
    def __init__(self, n):
        self.data = [0.0] * n

    def add(self, *args):
        self.data = [a + float(b) for a, b in zip(self.data, args)]

    def reset(self):
        self.data = [0.0] * len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


def evaluate_loss(net, data_iter, loss):
    """评估给定数据集上模型的损失"""
    metric = Accumulator(2)  # 损失的总和,样本数量
    for X, y in data_iter:
        out = net(X)
        y = y.reshape(out.shape)
        l = loss(out, y)
        metric.add(l.sum(), l.numel())
    return metric[0] / metric[1]

3 Defining the training function

In the code below, we specify the weight decay hyperparameter directly via weight_decay when instantiating the optimizer. By default, PyTorch decays weights and biases at the same time. Here we set weight_decay only for the weights, so the bias parameter b will not decay.

def train_concise(wd):
    net = nn.Sequential(nn.Linear(num_inputs, 1))
    for param in net.parameters():
        param.data.normal_()
    loss = nn.MSELoss(reduction='none')
    num_epochs, lr = 100, 0.003
    # The bias parameter does not decay
    trainer = torch.optim.SGD([
        {"params":net[0].weight,'weight_decay': wd},
        {"params":net[0].bias}], lr=lr)
    for epoch in range(num_epochs):
        for X, y in train_iter:
            trainer.zero_grad()
            l = loss(net(X), y)
            l.mean().backward()
            trainer.step()
        if (epoch + 1) % 10 == 0:
            print(evaluate_loss(net, train_iter, loss))
            print(evaluate_loss(net, test_iter, loss))
    print('L2 norm of w:', net[0].weight.norm().item())
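For comparison, if there were no need to exempt the bias from decay, weight_decay could be passed once for all parameters; a minimal sketch (this variant is not part of the original post):

import torch
from torch import nn

# Illustrative standalone example with the same 200 input features as above
net = nn.Sequential(nn.Linear(200, 1))
# Passing weight_decay here decays every parameter, bias included,
# unlike the per-parameter-group setup in train_concise above.
trainer = torch.optim.SGD(net.parameters(), lr=0.003, weight_decay=3)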

4 Training without regularization

We now run this code with the weight decay set to 0, disabling it. Note that the training error decreases while the test error does not, which means the model is overfitting badly.

train_concise(0)

Output:

0.7942435622215271
147.39787200927734
0.026174949016422033
147.1616212463379
0.0011757630854845047
147.12282012939454
5.728952528443188e-05
147.11509490966796
2.8377404078128167e-06
147.11357345581055
1.435657850379357e-07
147.11323837280273
7.450126027208626e-09
147.11319473266602
3.930717806799322e-10
147.11319717407227
4.2250853510283905e-11
147.1131950378418
1.3290707423507797e-11
147.1131904602051
L2 norm of w: 13.41625690460205

The resulting training/test loss curves are shown in the figure (image-20240113203220922).

5 Using weight decay

Next, we run the code with weight decay enabled. Note that the training error increases here while the test error decreases. This is exactly the effect we expect from regularization.

train_concise(3)

Output:

0.44908093512058256
79.26161060333251
0.014182116836309433
38.39419065475464
0.005516740120947361
18.69626650810242
0.0047397946938872336
9.167555465698243
0.004471739660948515
4.554769878387451
0.004199517332017422
2.32014231801033
0.004032817389816046
1.2366495299339295
0.003825438302010298
0.7090656226873397
0.003553233714774251
0.45032487854361536
0.003349584457464516
0.32078503906726835
L2 norm of w: 0.3820874094963074

Training/test loss curves with weight decay (image-20240113204401135).
