Day 31

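Try taking the earlier heart disease project notebook (.ipynb) and reorganizing it into a standard project layout, following today's example project. Think about which parts could be reused in future projects.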

# src/data/data_loader.py
from typing import Optional

import pandas as pd
from sklearn.model_selection import train_test_split


def load_data(file_path: str) -> Optional[pd.DataFrame]:
    """Load the heart disease dataset; return None if the file is missing."""
    try:
        return pd.read_csv(file_path)
    except FileNotFoundError:
        print(f"Error: file {file_path} not found")
        return None


def preprocess_data(df: pd.DataFrame) -> pd.DataFrame:
    """Preprocess the heart disease dataset."""
    # Handle missing values by dropping incomplete rows
    df = df.dropna()

    # Standardization / normalization
    # ...

    return df


def split_data(df: pd.DataFrame, target_col: str, test_size: float = 0.2, random_state: int = 42):
    """Split the data into training and test sets."""
    X = df.drop(target_col, axis=1)
    y = df[target_col]
    # Returns X_train, X_test, y_train, y_test
    return train_test_split(X, y, test_size=test_size, random_state=random_state)
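
# src/features/feature_engineering.py
from sklearn.preprocessing import StandardScaler


def build_features(X_train, X_test):
    """Build and transform features."""
    # Fit the scaler on the training set only, then apply it to the test set,
    # so that no test-set statistics leak into training
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)

    # Feature selection / extraction
    # ...

    return X_train_scaled, X_test_scaled, scaler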

# src/models/model_training.py
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
import joblib


def train_model(X_train, y_train):
    """Train a random forest classifier."""
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    return model


def evaluate_model(model, X_test, y_test):
    """Evaluate the model on the test set and return its predictions."""
    y_pred = model.predict(X_test)
    print("Classification report:")
    print(classification_report(y_test, y_pred))

    print("Confusion matrix:")
    print(confusion_matrix(y_test, y_pred))

    return y_pred


def save_model(model, model_path: str):
    """Save the model to a file."""
    joblib.dump(model, model_path)


def load_model(model_path: str):
    """Load a model from a file."""
    return joblib.load(model_path)
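
The three modules above still need something that calls them in order. The post does not show an entry point, so the following is only a minimal sketch of one possible wiring. The file name `main.py`, the function `run_pipeline`, and the `steps` dictionary are all assumptions made for illustration. The step functions are passed in rather than imported, so the orchestration can be reused (or tested with stand-ins) without depending on a specific package layout.

```python
# main.py (hypothetical entry point, not part of the original post)
from typing import Any, Callable, Dict


def run_pipeline(
    file_path: str,
    target_col: str,
    steps: Dict[str, Callable[..., Any]],
) -> Dict[str, Any]:
    """Run load -> preprocess -> split -> scale -> train -> evaluate in order.

    `steps` maps the function names used in the modules above
    (load_data, preprocess_data, split_data, build_features,
    train_model, evaluate_model) to callables with the same signatures.
    """
    df = steps["load_data"](file_path)
    if df is None:
        # load_data reports a missing file by returning None
        raise FileNotFoundError(file_path)

    df = steps["preprocess_data"](df)
    X_train, X_test, y_train, y_test = steps["split_data"](df, target_col)
    X_train_scaled, X_test_scaled, scaler = steps["build_features"](X_train, X_test)

    model = steps["train_model"](X_train_scaled, y_train)
    y_pred = steps["evaluate_model"](model, X_test_scaled, y_test)

    # Return the fitted objects so the caller can decide what to persist
    return {"model": model, "scaler": scaler, "y_pred": y_pred}
```

With the modules importable, `steps` would presumably be built from the real functions, for example `{"load_data": load_data, "preprocess_data": preprocess_data, ...}`, and the returned model could then be handed to `save_model`. The data path and the target column name depend on the dataset and are left to the caller.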

@浙大疏锦行
