Why use mode='absolute' to move the cursor index back to its initial position?

This post walks through how database cursors work in Python: how the .scroll method sets the cursor position, and how the .fetch* family of methods returns data based on that position, with a worked example covering both relative and absolute scroll modes.

【Question】: why does the second step use mode='absolute' to move the index back to its initial position?

self.cur.execute(condition)
self.cur.scroll(0, mode='absolute')  # move the cursor index back to its initial position
results = self.cur.fetchall()  # return every row buffered in the cursor

【My understanding】: the second step moves the cursor back to the origin so that the next step can fetch all the rows.

1. My table contains 3 rows of data;

2. Step one executes the SQL through the cursor, so the result set is first buffered in the cursor; no argument there.
3. Step two calls .scroll on the cursor. What exactly this step does is less obvious, so let's see how the Python source explains it (its logic is walked through in the worked example below).
4. Step three calls fetchall() on the cursor; likewise, the Python source explains what it does (also quoted in the worked example below).

【Going further】: setting the cursor position

The cursor.scroll(position, mode="relative" | "absolute") method moves the cursor either relative to its current position or to an absolute position.
Parameters:
position: the cursor position (positions start at 0)
mode: how position is interpreted (relative, the default: relative to the current position, i.e. wherever the cursor is when scroll is called; absolute: an absolute position)
For example:
mode='relative', position=1: move the cursor to the current position + 1, i.e. one row forward
mode='absolute', position=2: move the cursor to index 2, no matter where it currently is
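These rules can be sketched with a minimal pure-Python mock (a hypothetical MockCursor class, not part of MySQLdb, that mirrors the rownumber bookkeeping from the MySQLdb source quoted in the worked example below):

```python
class MockCursor:
    """Toy cursor mimicking MySQLdb's client-side row buffer (hypothetical)."""
    def __init__(self, rows):
        self._rows = tuple(rows)
        self.rownumber = 0

    def scroll(self, value, mode='relative'):
        if mode == 'relative':
            r = self.rownumber + value
        elif mode == 'absolute':
            r = value
        else:
            raise ValueError("unknown scroll mode %r" % mode)
        # Note: with an empty buffer, even position 0 is rejected here.
        if not 0 <= r < len(self._rows):
            raise IndexError("out of range")
        self.rownumber = r

    def fetchall(self):
        result = self._rows[self.rownumber:]
        self.rownumber = len(self._rows)
        return result


cur = MockCursor([(1, 'test1'), (2, 'weqwe'), (3, 'weqw')])
cur.scroll(1, mode='relative')   # rownumber: 0 -> 1
cur.scroll(0, mode='absolute')   # rownumber reset to 0, wherever it was
print(cur.fetchall())            # -> ((1, 'test1'), (2, 'weqwe'), (3, 'weqw'))
```

The mock makes the asymmetry plain: relative mode moves from wherever the cursor happens to be, while absolute mode pins it to a fixed index.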

【Worked example】

# hrwang
# coding:utf-8
import MySQLdb
connection = MySQLdb.connect(host="127.0.0.1", port=3306, user="root", passwd="root", db='test_interface', charset='utf8')
cursor = connection.cursor()
# execute() returns the number of rows in the result set
nums = cursor.execute("select * from test_xml")  # self._rows = ((1L, u'test1'), (2L, u'weqwe'), (3L, u'weqw'))
print(nums)  # number of rows affected/returned
'''
    fetchone() source:
    def fetchone(self):
        """Fetches a single row from the cursor. None indicates that
        no more rows are available."""
        self._check_executed()
        if self.rownumber >= len(self._rows): return None
        result = self._rows[self.rownumber]
        self.rownumber = self.rownumber+1
        return result
'''
print(cursor.fetchone())  # after this the cursor moves to index 1, i.e. self.rownumber = 1
cursor.scroll(1, mode='relative')  # relative mode: current index + 1, so the cursor is now at position 2
'''
    fetchall() source:
    def fetchall(self):
        """Fetchs all available rows from the cursor."""
        self._check_executed()
        if self.rownumber:
            result = self._rows[self.rownumber:]
        else:
            result = self._rows
        self.rownumber = len(self._rows)
        return result
'''
print(cursor.fetchall())  # result = self._rows[2:], i.e. only the third row
'''
    fetchmany() source:
    def fetchmany(self, size=None):
        """Fetch up to size rows from the cursor. Result set may be smaller
        than size. If size is not defined, cursor.arraysize is used."""
        self._check_executed()
        end = self.rownumber + (size or self.arraysize)
        result = self._rows[self.rownumber:end]
        self.rownumber = min(end, len(self._rows))
        return result
'''
print(cursor.fetchmany(size=1))  # before this call self.rownumber = 3, so result = ()
cursor.scroll(0, mode="absolute")  # absolute mode: reset the cursor to 0
print(cursor.fetchall())  # fetchall starts from self.rownumber = 0, so result = self._rows
# Output:
3
(1L, u'test1')
((3L, u'weqw'),)
()
((1L, u'test1'), (2L, u'weqwe'), (3L, u'weqw'))

################################################
Why this update:
Today I was querying a table that contained no rows, and self.cur.scroll(0, mode='absolute') raised an "out of range" error. If you read the source carefully the reason is plain to see, but I can't understand why it was written that way; does an empty result set really mean the method can't be used? In the end I neither found a justification nor patched the source; I quietly inserted a row instead.
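One way around it (a hypothetical helper, not from the original post): since scroll validates 0 <= position < len(rows), an empty result set makes even position 0 out of range, so the rewind can simply be guarded:

```python
def safe_rewind(cursor):
    """Rewind a cursor to row 0, tolerating an empty result set.

    Hypothetical helper: MySQLdb's scroll raises IndexError when the
    buffered result set is empty (0 <= 0 < 0 is false), so catch it.
    """
    try:
        cursor.scroll(0, mode='absolute')
    except IndexError:
        pass  # no rows buffered; nothing to rewind
```

With this guard there is no need to insert a dummy row just to make scroll(0, mode='absolute') succeed on an empty table.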

Link: https://www.jianshu.com/p/2b327f1ddba1

 
