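C4top - Judgments About a Heap (Min-Heap)

This post simulates a min-heap with an array and works through a concrete problem: inserting elements one at a time, then judging relationships between elements in the heap (root, sibling, parent, child).
Judgments About a Heap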

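Insert a given sequence of numbers, in order, into an initially empty min-heap H[]. Then decide whether each of a series of statements is true. The statements take the following forms:

  • x is the root
  • x and y are siblings
  • x is the parent of y
  • x is a child of y

Input format:

The first line of each test contains two positive integers N (≤ 1000) and M (≤ 20): the number of elements to insert and the number of statements to judge. The next line gives N integers in the range [-10000, 10000] to be inserted into an initially empty min-heap. Each of the following M lines gives one statement. Every key mentioned in a statement is guaranteed to exist in the heap.

Output format:

For each statement, print T on its own line if the statement is true, otherwise print F.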

Sample input:

5 4
46 23 26 24 10
24 is the root
26 and 23 are siblings
46 is the parent of 23
23 is a child of 10

Sample output:

F
T
F
T
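
To see where the sample answers come from, here is a minimal sketch that inserts the sample keys one at a time with sift-up and, for contrast, builds a heap bottom-up from the same keys. The helper names (`push_one`, `build_by_insertion`, `build_bottom_up`) are illustrative assumptions, not part of the original solution; index 0 of each vector is an unused placeholder so that the heap is 1-based.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sift-up insertion into a 1-based min-heap stored in h (h[0] is a placeholder).
void push_one(std::vector<int>& h, int x) {
    assert(!h.empty());  // the placeholder at index 0 must already be present
    h.push_back(x);
    std::size_t i = h.size() - 1;
    while (i > 1 && h[i / 2] > h[i]) {
        std::swap(h[i / 2], h[i]);
        i /= 2;
    }
}

// Insert the keys one at a time, which is what the problem asks for.
std::vector<int> build_by_insertion(const std::vector<int>& keys) {
    std::vector<int> h(1, 0);
    for (int x : keys) push_one(h, x);
    return h;
}

// Bottom-up construction (sift-down from the last internal node), for contrast only.
std::vector<int> build_bottom_up(const std::vector<int>& keys) {
    std::vector<int> h(1, 0);
    h.insert(h.end(), keys.begin(), keys.end());
    const std::size_t n = keys.size();
    for (std::size_t start = n / 2; start >= 1; --start) {
        std::size_t i = start;
        while (2 * i <= n) {
            std::size_t c = 2 * i;                    // left child
            if (c + 1 <= n && h[c + 1] < h[c]) ++c;   // pick the smaller child
            if (h[i] <= h[c]) break;
            std::swap(h[i], h[c]);
            i = c;
        }
    }
    return h;
}
```

For the sample keys 46 23 26 24 10, one-at-a-time insertion leaves 10 23 26 46 24 at indices 1 to 5: 10 is the root, 23 and 26 are siblings, 46 is a leaf, and 23 is a child of 10, which gives F, T, F, T. The bottom-up build leaves 10 23 26 24 46 instead, so the two construction methods are not interchangeable for this problem.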
 
  • Time limit: 400ms
  • Memory limit: 64MB
  • Code length limit: 16kB
  • Judge program: system default
  • Author: 陈越
  • Affiliation: 浙江大学
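Approach
Simulate a min-heap with an array and update it after every number read; do not build the heap in one pass after all input has been read.
Use the distinguishing features of each statement to tell which kind of judgment is being asked for, and extract the numeric values from it.
With 1-based indexing, if a parent is at index i, its children are at indices i*2 and i*2+1.
Note that keys may be negative. If you read each statement as a whole line (spaces included), you also need to consume the newline left over after the numeric input.
The input-parsing code is really ugly... sorry about that (っ*´Д`)っ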
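
As a possible alternative to the character-by-character parsing, the four statement forms can be matched with `sscanf`. This is only a sketch: the `Kind` enum, the `Statement` struct and `parse_statement` are names introduced here for illustration. It relies on `%d` accepting a leading minus sign and on `%n` being stored only when every literal before it has matched.

```cpp
#include <cstdio>

// Statement kinds, numbered like the flag values used in the solution.
enum Kind { UNKNOWN = 0, ROOT = 1, SIBLINGS = 2, PARENT = 3, CHILD = 4 };

struct Statement {
    Kind kind;
    int x;
    int y;
};

// Try each statement form in turn. `end` stays 0 unless the whole pattern,
// including the trailing literal text, matched.
Statement parse_statement(const char* line) {
    int x = 0, y = 0, end = 0;

    if (std::sscanf(line, "%d is the root%n", &x, &end) == 1 && end > 0)
        return Statement{ROOT, x, 0};

    end = 0;
    if (std::sscanf(line, "%d and %d are siblings%n", &x, &y, &end) == 2 && end > 0)
        return Statement{SIBLINGS, x, y};

    end = 0;
    if (std::sscanf(line, "%d is the parent of %d%n", &x, &y, &end) == 2 && end > 0)
        return Statement{PARENT, x, y};

    end = 0;
    if (std::sscanf(line, "%d is a child of %d%n", &x, &y, &end) == 2 && end > 0)
        return Statement{CHILD, x, y};

    return Statement{UNKNOWN, 0, 0};
}
```

A line read with `cin.getline(s,100)` can be passed directly as `parse_statement(s)`; the returned `kind` then plays the role of `flag`, and `x` and `y` the roles of `a` and `b`.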

Solution program
#include<cstdio>
#include<cstring>
#include<cmath>
#include<set>
#include<cstdlib>
#include<iostream>
#include<algorithm>
using namespace std;
#define MAXN 200010
#define INF 0x3f3f3f3f
int heap[MAXN],sz=0;
int cmp(int a,int b)
{
    return a>b;
}
void push(int x)
{
    int i=++sz;
    while(i>1)
    {
        int p=i/2;// index of the parent node
        if(heap[p]<=x) break;// heap order already holds here
        heap[i]=heap[p];// move the parent's value down
        i=p;
    }
    heap[i]=x;
}

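// Remove and return the minimum. Not needed for this problem; kept for completeness.
// Uses the same 1-based layout as push().
int pop()
{
    int ret=heap[1];
    int x=heap[sz--];// last element, to be re-inserted from the root
    int i=1;
    while(i*2<=sz)
    {
        int a=i*2,b=i*2+1;
        if(b<=sz&&heap[b]<heap[a]) a=b;// pick the smaller child
        if(heap[a]>=x) break;
        heap[i]=heap[a];
        i=a;
    }
    heap[i]=x;
    return ret;
}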
int main()
{
#ifdef ONLINE_JUDGE
#else
    freopen("G:/cbx/read.txt","r",stdin);
    //freopen("G:/cbx/out.txt","w",stdout);
#endif
    ios::sync_with_stdio(false);
    cin.tie(0);
    int n,m;
    cin>>n>>m;
    for(int i=0; i<n; ++i)
    {
        int x;
        cin>>x;
        push(x);
    }
    int flag=-1;
    char s[100];
    cin.getline(s,100);// consume the newline left after the last number
    for(int i=0; i<m; ++i)
    {
        int a=0,b=0,pos=0,minu=1;
        flag=-1;// reset the statement kind for every line
        cin.getline(s,100);
        //cout<<s<<endl;
        int len=strlen(s);
        if(s[len-1]=='t')// ends with "root": "x is the root"
        {
            flag=1;
            for(int j=0; j<len; ++j)
            {
                if(s[j]=='-')
                {
                    minu=-1;
                    continue;
                }
                a*=10;
                a+=int(s[j]-'0');
                if(s[j+1]==' ')
                {
                    a*=minu;
                    break;
                }
            }
        }
        else if(s[len-1]=='s')// ends with "siblings": "x and y are siblings"
        {
            flag=2;
            for(int j=0; j<len; ++j)
            {
                if(s[j]=='-')
                {
                    minu=-1;
                    continue;
                }
                a*=10;
                a+=int(s[j]-'0');
                if(s[j+1]==' ')
                {
                    a*=minu;
                    minu=1;
                    pos=j+6;
                    break;
                }
            }
            for(int j=pos; j<len; ++j)
            {
                if(s[j]=='-')
                {
                    minu=-1;
                    continue;
                }
                b*=10;
                b+=int(s[j]-'0');
                if(s[j+1]==' ')
                {
                    b*=minu;
                    break;
                }
            }
        }
        else
        {
            int cnt=0;
            for(int j=0; j<len; ++j)
            {
                if(s[j]==' ') ++cnt;
                if(cnt==2)
                {
                    if(s[j+1]=='t')
                    {
                        flag=3;// "x is the parent of y"
                        for(int k=0; k<len; ++k)
                        {
                            if(s[k]=='-')
                            {
                                minu=-1;
                                continue;
                            }
                            a*=10;
                            a+=int(s[k]-'0');
                            if(s[k+1]==' ')
                            {
                                a*=minu;
                                minu=1;
                                pos=k+19;
                                break;
                            }
                        }
                        for(int k=pos; k<len; ++k)
                        {
                            if(s[k]=='-')
                            {
                                minu=-1;
                                continue;
                            }
                            b*=10;
                            b+=int(s[k]-'0');
                            if(s[k+1]=='\0')
                            {
                                b*=minu;
                                break;
                            }
                        }
                    }
                    else if(s[j+1]=='a')
                    {
                        flag=4;// "x is a child of y"
                        for(int k=0; k<len; ++k)
                        {
                            if(s[k]=='-')
                            {
                                minu=-1;
                                continue;
                            }
                            a*=10;
                            a+=int(s[k]-'0');
                            if(s[k+1]==' ')
                            {
                                a*=minu;
                                minu=1;
                                pos=k+16;
                                break;
                            }
                        }
                        for(int k=pos; k<len; ++k)
                        {
                            if(s[k]=='-')
                            {
                                minu=-1;
                                continue;
                            }
                            b*=10;
                            b+=int(s[k]-'0');
                            if(s[k+1]=='\0')
                            {
                                b*=minu;
                                break;
                            }
                        }
                    }
                    if(flag!=-1) break;
                }
            }
        }
        //cout<<a<<" "<<b<<endl;
        if(flag==1)// "x is the root"
        {
            if(heap[1]==a) cout<<"T"<<endl;
            else cout<<"F"<<endl;
        }
        else if(flag==2)// "x and y are siblings"
        {
            int j=0;
            for(int k=1; k<=n; ++k)
                if(heap[k]==a)
                {
                    j=k;
                    break;
                }
            j/=2;// parent of a (0 if a is the root)
            // both children must lie inside the heap, otherwise a has no sibling
            if(j!=0&&j*2+1<=n&&((heap[j*2]==a&&heap[j*2+1]==b)||(heap[j*2]==b&&heap[j*2+1]==a)))
                cout<<"T"<<endl;
            else cout<<"F"<<endl;
        }
        else if(flag==3)// "x is the parent of y"
        {
            int j=0;
            for(int k=1; k<=n; ++k)
                if(heap[k]==a)
                {
                    j=k;
                    break;
                }
            // only look at child slots that are inside the heap
            if(j!=0&&((j*2<=n&&heap[j*2]==b)||(j*2+1<=n&&heap[j*2+1]==b)))
                cout<<"T"<<endl;
            else cout<<"F"<<endl;
        }
        else if(flag==4)// "x is a child of y"
        {
            int j=0;
            for(int k=1; k<=n; ++k)
                if(heap[k]==a)
                {
                    j=k;
                    break;
                }
            // with 1-based indexing the only parent of j is j/2; the root has none
            if(j>1&&heap[j/2]==b)
                cout<<"T"<<endl;
            else cout<<"F"<<endl;
        }
    }
}
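
The linear scans in the judgment step can also be replaced by a value-to-index table, after which each statement reduces to index arithmetic. The sketch below assumes that all keys are distinct (the problem text quoted here does not state this explicitly) and that every queried key exists, as the problem guarantees. The names `index_of`, `is_root`, `are_siblings`, `is_parent` and `is_child` are illustrative and do not appear in the solution above.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

typedef std::unordered_map<int, int> PosMap;

// Map each key to its index in a 1-based heap array (h[0] is an unused placeholder).
PosMap index_of(const std::vector<int>& h) {
    assert(!h.empty());  // the placeholder at index 0 must be present
    PosMap pos;
    for (std::size_t i = 1; i < h.size(); ++i) pos[h[i]] = static_cast<int>(i);
    return pos;
}

bool is_root(const PosMap& pos, int x) {
    return pos.at(x) == 1;
}

// Two different nodes are siblings exactly when they share the parent index.
bool are_siblings(const PosMap& pos, int x, int y) {
    int px = pos.at(x), py = pos.at(y);
    return px != py && px / 2 == py / 2;
}

// x is the parent of y when y's parent index (index / 2) is x's index.
// For the root, index / 2 is 0, which never equals a valid index.
bool is_parent(const PosMap& pos, int x, int y) {
    return pos.at(y) / 2 == pos.at(x);
}

bool is_child(const PosMap& pos, int x, int y) {
    return is_parent(pos, y, x);
}
```

With N ≤ 1000 and M ≤ 20 the linear scan is already fast enough, so this is a matter of readability rather than speed; `pos.at` throws if a key is missing, which makes a violated guarantee visible instead of silently answering F.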


# VIF 检验多重共线性 from statsmodels.stats.outliers_influence import variance_inflation_factor vif_data = pd.DataFrame() vif_data["特征"] = features.columns vif_data["VIF"] = [variance_inflation_factor(scaled_features, i) for i in range(scaled_features.shape[1])] vif_data.to_csv('./results/problem2/vif_values.csv', index=False, encoding='utf_8_sig') # Durbin-Watson 检验自相关 with open('./results/problem2/durbin_watson.txt', 'w') as f: f.write(f"乙醇转化率模型 Durbin-Watson 检验: {sm.stats.durbin_watson(residuals_conv):.4f}\n") f.write(f"C4烯烃选择性模型 Durbin-Watson 检验: {sm.stats.durbin_watson(residuals_select):.4f}") # 回归系数输出 regression_coefficients = pd.DataFrame({ '特征': ['截距'] + list(features.columns), '乙醇转化率系数': [model_conv.params[0]] + list(model_conv.params[1:]), 'C4烯烃选择性系数': [model_select.params[0]] + list(model_select.params[1:]) }) regression_coefficients.to_csv('./results/problem2/regression_coefficients.csv', index=False, encoding='utf_8_sig') # 箱线图统计量输出 unique_categories = sorted(df['Co负载量(wt%)'].dropna().unique()) boxplot_stats_list = [] for val in unique_categories: stats = df[df['Co负载量(wt%)'] == val]['乙醇转化率(%)'].describe() stats_dict = { 'Co负载量(wt%)': val, 'count': stats['count'], 'mean': stats['mean'], 'std': stats['std'], 'min': stats['min'], '25%': stats['25%'], '50%': stats['50%'], '75%': stats['75%'], 'max': stats['max'] } boxplot_stats_list.append(stats_dict) boxplot_stats = pd.DataFrame(boxplot_stats_list) boxplot_stats.to_csv('./results/problem2/boxplot_stats.csv', index=False, encoding='utf_8_sig') boxplot_stats.to_pickle('./results/problem2/boxplot_stats.pkl') temp_co_conversion = df.groupby(['温度', 'Co负载量(wt%)'])['乙醇转化率(%)'].agg(['mean', 'std', 'count']).reset_index() temp_co_conversion.columns = ['温度', 'Co负载量(wt%)', '平均乙醇转化率(%)', '标准差', '样本量'] temp_co_conversion.to_csv('./results/problem2/interaction_temp_co_loading.csv', index=False, encoding='utf_8_sig') # 折线图置信区间 plt.figure(figsize=(10, 6)) sns.lineplot(x='温度', y='乙醇转化率(%)', hue='Co负载量(wt%)', ci='sd', data=df) 
plt.title('温度与Co负载量对乙醇转化率的交互影响(含标准差)') plt.grid(True) plt.tight_layout() plt.savefig('./results/problem2/temp_vs_conversion_by_co_loading_with_error.png', dpi=300, bbox_inches='tight') plt.show() return regression_coefficients, boxplot_stats #问题3 def analyze_c4_olefin_yield(df): # 计算C4烯烃收率 df['C4烯烃收率(%)'] = df['乙醇转化率(%)'] * df['C4烯烃选择性(%)'] / 100 # 找出最高C4烯烃收率的实验 max_yield_row = df.loc[df['C4烯烃收率(%)'].idxmax()] print(f"\n最高C4烯烃收率为{max_yield_row['C4烯烃收率(%)']:.2f}%") print(f"对应的催化剂组合为:{max_yield_row['催化剂组合']}") print(f"对应的温度为:{max_yield_row['温度']}°C") # 分析不同温度下的C4烯烃收率 plt.figure(figsize=(12, 6)) sns.lineplot(x='温度', y='C4烯烃收率(%)', hue='催化剂组合编号', data=df, marker='o') plt.title('不同催化剂组合下C4烯烃收率随温度变化趋势', fontsize=14) plt.xlabel('温度 (°C)', fontsize=12) plt.ylabel('C4烯烃收率 (%)', fontsize=12) plt.grid(True) plt.legend(title='催化剂组合', bbox_to_anchor=(1.05, 1), loc='upper left') plt.tight_layout() plt.savefig('./results/problem3/c4_olefin_yield_vs_temperature.png', dpi=300, bbox_inches='tight') plt.show() # 找出温度低于350°C时的最佳组合 low_temp_df = df[df['温度'] < 350] if not low_temp_df.empty: max_yield_low_temp_row = low_temp_df.loc[low_temp_df['C4烯烃收率(%)'].idxmax()] print(f"\n温度低于350°C时最高C4烯烃收率为{max_yield_low_temp_row['C4烯烃收率(%)']:.2f}%") print(f"对应的催化剂组合为:{max_yield_low_temp_row['催化剂组合']}") print(f"对应的温度为:{max_yield_low_temp_row['温度']}°C") else: print("\n没有温度低于350°C的实验数据") # 保存包含C4烯烃收率的数据 df.to_csv('./processed_data/attachment1_with_yield.csv', index=False, encoding='utf_8_sig') return max_yield_row, max_yield_low_temp_row if 'max_yield_low_temp_row' in locals() else None def plot_c4_olefin_yield_comparison(df): # 按催化剂组合和温度分组统计平均C4烯烃收率 grouped = df.groupby(['催化剂组合编号', '温度'])['C4烯烃收率(%)'].mean().reset_index() # 找出每个催化剂组合的最佳温度 best_temp_per_catalyst = grouped.loc[grouped.groupby('催化剂组合编号')['C4烯烃收率(%)'].idxmax()] best_temp_per_catalyst = best_temp_per_catalyst.sort_values(by='C4烯烃收率(%)', ascending=False) # 绘制柱状图显示不同催化剂组合的最佳C4烯烃收率 plt.figure(figsize=(14, 8)) sns.barplot(x='催化剂组合编号', 
y='C4烯烃收率(%)', data=best_temp_per_catalyst, palette='viridis') plt.title('不同催化剂组合的最佳C4烯烃收率对比', fontsize=16) plt.xlabel('催化剂组合编号', fontsize=14) plt.ylabel('最佳C4烯烃收率 (%)', fontsize=14) plt.xticks(rotation=45) plt.grid(True, axis='y', linestyle='--', alpha=0.7) plt.tight_layout() plt.savefig('./results/problem3/best_c4_olefin_yield_by_catalyst.png', dpi=300, bbox_inches='tight') plt.show() # 找出前5个最佳催化剂组合 top_5_catalysts = best_temp_per_catalyst.head(5) print("\n前5个最佳催化剂组合:") for i, (_, row) in enumerate(top_5_catalysts.iterrows()): print(f"{i+1}. 催化剂组合:{row['催化剂组合编号']}") print(f" 最佳温度:{row['温度']}°C") print(f" C4烯烃收率:{row['C4烯烃收率(%)']:.2f}%") print() return best_temp_per_catalyst def analyze_catalyst_components_impact(df): # 解析催化剂成分 df = parse_catalyst_components(df) # 使用统一的数据解析函数 # 计算C4烯烃收率 df['C4烯烃收率(%)'] = df['乙醇转化率(%)'] * df['C4烯烃选择性(%)'] / 100 # 分析Co负载量的影响 co_loading_impact = df.groupby('Co负载量(wt%)')['C4烯烃收率(%)'].mean().reset_index() plt.figure(figsize=(10, 6)) sns.barplot(x='Co负载量(wt%)', y='C4烯烃收率(%)', data=co_loading_impact, palette='viridis') plt.title('Co负载量对C4烯烃收率的影响', fontsize=14) plt.xlabel('Co负载量 (wt%)', fontsize=12) plt.ylabel('平均C4烯烃收率 (%)', fontsize=12) plt.grid(True, axis='y') plt.tight_layout() plt.savefig('./results/problem3/co_loading_impact.png', dpi=300, bbox_inches='tight') plt.show() # 分析装料比的影响 feed_ratio_impact = df.groupby('装料比(Co/SiO2:HAP)')['C4烯烃收率(%)'].mean().reset_index() plt.figure(figsize=(10, 6)) sns.barplot(x='装料比(Co/SiO2:HAP)', y='C4烯烃收率(%)', data=feed_ratio_impact, palette='viridis') plt.title('装料比对C4烯烃收率的影响', fontsize=14) plt.xlabel('装料比 (Co/SiO2:HAP)', fontsize=12) plt.ylabel('平均C4烯烃收率 (%)', fontsize=12) plt.grid(True, axis='y') plt.tight_layout() plt.savefig('./results/problem3/feed_ratio_impact.png', dpi=300, bbox_inches='tight') plt.show() # 分析乙醇浓度的影响 ethanol_conc_impact = df.groupby('乙醇浓度(ml/min)')['C4烯烃收率(%)'].mean().reset_index() plt.figure(figsize=(10, 6)) sns.barplot(x='乙醇浓度(ml/min)', y='C4烯烃收率(%)', 
data=ethanol_conc_impact, palette='viridis') plt.title('乙醇浓度对C4烯烃收率的影响', fontsize=14) plt.xlabel('乙醇浓度 (ml/min)', fontsize=12) plt.ylabel('平均C4烯烃收率 (%)', fontsize=12) plt.grid(True, axis='y') plt.tight_layout() plt.savefig('./results/problem3/ethanol_conc_impact.png', dpi=300, bbox_inches='tight') plt.show() return co_loading_impact, feed_ratio_impact, ethanol_conc_impact if __name__ == "__main__": df1 = preprocess_attachment1('C:/Users/Yeah/Desktop/数模/第五题/B/附件1.xlsx') df2 = preprocess_attachment2('C:/Users/Yeah/Desktop/数模/第五题/B/附件2.xlsx') df1.to_csv('./processed_data/attachment1_processed.csv', index=False, encoding='utf_8_sig') df2.to_csv('./processed_data/attachment2_processed.csv', index=False, encoding='utf_8_sig') plot_grouped_line_with_regression( data=df1, x_col='温度', y_col='乙醇转化率(%)', group_col='催化剂组合编号', title='不同催化剂组合下乙醇转化率随温度变化趋势(含线性拟合)', xlabel='温度 (°C)', ylabel='乙醇转化率 (%)', save_name='ethanol_conversion_regression' ) plot_grouped_line_with_regression( data=df1, x_col='温度', y_col='C4烯烃选择性(%)', group_col='催化剂组合编号', title='不同催化剂组合下C4烯烃选择性随温度变化趋势(含线性拟合)', xlabel='温度 (°C)', ylabel='C4烯烃选择性 (%)', save_name='c4_selectivity_regression' ) plot_ethanol_vs_c4_with_regression( data=df1, ethanol_col='乙醇转化率(%)', c4_col='C4烯烃选择性(%)', group_col='催化剂组合编号', title='乙醇转化率与C4烯烃选择性关系图(按催化剂组合)', xlabel='乙醇转化率 (%)', ylabel='C4烯烃选择性 (%)', save_name='ethanol_vs_c4_regression' ) analyze_attachment2_data(df2) # 问题2 analyze_catalyst_effects(df1) #问题3 # 分析C4烯烃收率 max_yield_row, max_yield_low_temp_row = analyze_c4_olefin_yield(df1) # 绘制C4烯烃收率对比图 best_catalysts = plot_c4_olefin_yield_comparison(df1) # 分析催化剂成分的影响 co_loading_impact, feed_ratio_impact, ethanol_conc_impact = analyze_catalyst_components_impact(df1)