Batch Splitting and Merging Excel Files with Pandas

Latest recommended article published on 2026-01-01 23:33:02

License: CC 4.0 BY-SA

Article tags:

#pandas #excel

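Split one large Excel file evenly into several Excel files
Merge several small Excel files into one large Excel file, tagging each row with its source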

import pandas as pd

work_dir = r'C:\TELCEL_MEXICO_BOT\A\Weather.xlsx'
df_source = pd.read_excel(work_dir)
print(df_source.head())

         ymd bWendu yWendu tianqi fengxiang fengji  aqi aqiInfo  aqiLevel
0 2025-01-01    3°C   -6°C   晴~多云       西北风   1-2级   59       优         2
1 2025-01-02    2°C   -9°C      阴       东南风   3-4级   48       优         1
2 2025-01-03    2°C   -2°C   晴~多云        西风   4-8级   28       良         1
3 2025-01-04    0°C   -4°C   晴~多云        东风   2-5级   30       良         1
4 2025-01-05    3°C   -1°C     小雨        东风   3-5级   25       良         1

print(df_source.index)
RangeIndex(start=0, stop=19, step=1)

print(df_source.shape)
(19, 9)

total_row_count = df_source.shape[0]
print(total_row_count)
19

1. Split one large Excel file evenly into several Excel files

Use df.iloc to split one large DataFrame into several smaller DataFrames
Use DataFrame.to_excel to save each smaller DataFrame to its own Excel file
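
As a quick illustration of why iloc suits this job: positional slices are half-open, and a slice that runs past the last row is simply shorter (or empty) rather than an error, so no bounds checking is needed. The sketch below uses a hypothetical 19-row frame (`demo`, with a made-up `value` column) in place of the weather data.

```python
import pandas as pd

# Hypothetical stand-in for df_source: 19 rows, one made-up column.
demo = pd.DataFrame({"value": range(19)})

first = demo.iloc[0:4]     # positions 0, 1, 2, 3 (the end position is exclusive)
tail = demo.iloc[16:20]    # end is past the last row, so the slice is just shorter
beyond = demo.iloc[20:24]  # start is past the last row: an empty frame, no error

print(len(first), len(tail), len(beyond))
```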

Compute the number of rows in each split file

# This large Excel file will be split among the people listed below

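user_names = ['张飞','刘备','关羽','诸葛亮','曹操','张辽']
# Number of rows (tasks) per person, rounded up so that no row is left over
# Note: with 19 rows and 6 people this gives 4, so the first four people get
# 4 rows each, the fifth gets 3, and the last one gets an empty sheet
split_size = total_row_count // len(user_names)
if total_row_count % len(user_names) != 0:
    split_size += 1

print(split_size)
4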
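
The round-up step can also be written as a single integer expression. The sketch below does that and lists how many rows each person receives when fixed-size chunks are cut off in order. The helper names `ceil_div` and `chunk_sizes` are made up for illustration and are not part of pandas.

```python
def ceil_div(total: int, parts: int) -> int:
    """Integer division rounded up, without going through floating point."""
    return -(-total // parts)


def chunk_sizes(total: int, parts: int) -> list[int]:
    """Rows each part receives when fixed-size chunks are cut off in order."""
    size = ceil_div(total, parts)
    return [max(0, min(size, total - i * size)) for i in range(parts)]


split_size = ceil_div(19, 6)
sizes = chunk_sizes(19, 6)
print(split_size, sizes)
```

For 19 rows and 6 people the sizes come out as 4, 4, 4, 4, 3, 0, which makes the empty last chunk easy to spot before any file is written.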

Split into several DataFrames

df_subs = []
for idx, user_name in enumerate(user_names):
    # start position for iloc
    begin = idx * split_size
    # end position for iloc (exclusive)
    end = begin + split_size
    # slice the DataFrame by position with iloc
    df_sub = df_source.iloc[begin:end]
    # store each sub-DataFrame in the list
    df_subs.append((idx, user_name, df_sub))
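
If an empty or much smaller last chunk is undesirable, one possible alternative (not what the article does) is to spread the remainder over the first few people so that chunk sizes differ by at most one. A minimal sketch, assuming a hypothetical helper `split_balanced` and a made-up 19-row frame in place of df_source:

```python
import pandas as pd


def split_balanced(df: pd.DataFrame, names: list[str]) -> dict[str, pd.DataFrame]:
    """Split df into len(names) consecutive parts whose sizes differ by at most one."""
    base, extra = divmod(len(df), len(names))
    parts = {}
    begin = 0
    for i, name in enumerate(names):
        # the first `extra` parts receive one additional row
        end = begin + base + (1 if i < extra else 0)
        parts[name] = df.iloc[begin:end]
        begin = end
    return parts


demo = pd.DataFrame({"value": range(19)})  # hypothetical 19-row stand-in
parts = split_balanced(demo, ["张飞", "刘备", "关羽", "诸葛亮", "曹操", "张辽"])
print({name: len(part) for name, part in parts.items()})
```

With 19 rows and 6 names the first person gets 4 rows and everyone else gets 3, and the parts still cover the rows in their original order.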

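## Save each DataFrame to its own Excel file
for idx, user_name, df_sub in df_subs:
    # the user name goes after the last underscore so that it can be recovered when merging
    file_name = fr'C:\TELCEL_MEXICO_BOT\A\Weather_{idx}_{user_name}.xlsx'
    df_sub.to_excel(file_name, index=False, engine='openpyxl')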
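
Because the merge step later recovers the person's name from the file name, it can help to keep building and parsing the name next to each other. The sketch below assumes the naming pattern Weather_{idx}_{user_name}.xlsx. Both helper functions are hypothetical, and the parsing would break for user names that themselves contain an underscore.

```python
from pathlib import Path


def part_file_name(idx: int, user_name: str) -> str:
    """Build a name such as 'Weather_0_张飞.xlsx' (assumed naming pattern)."""
    return f"Weather_{idx}_{user_name}.xlsx"


def user_from_file_name(file_name: str) -> str:
    """Recover the user name: the text after the last underscore, minus the extension."""
    return Path(file_name).stem.split("_")[-1]


name = part_file_name(0, "张飞")
print(name, user_from_file_name(name))
```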

2. Merge several small Excel files into one large Excel file

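Walk the folder to get the list of Excel files to merge
Read each file into a DataFrame and add a column to each one that marks its source
Use pd.concat to merge the DataFrames in one batch
Write the merged DataFrame to an Excel file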
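
The tagging and concatenation steps can be tried without touching the disk. The sketch below uses two small made-up frames as stand-ins for the files, and the column name `source` is an arbitrary choice for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for two of the small files, keyed by the person they came from.
parts = {
    "张飞": pd.DataFrame({"ymd": ["2025-01-01", "2025-01-02"], "aqi": [59, 48]}),
    "刘备": pd.DataFrame({"ymd": ["2025-01-03"], "aqi": [28]}),
}

# assign() returns a copy with the extra column, so the input frames stay untouched.
tagged = [df.assign(source=name) for name, df in parts.items()]

# ignore_index=True renumbers the rows 0..n-1 instead of repeating each part's index.
merged = pd.concat(tagged, ignore_index=True)
print(merged)
```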

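## Code and detailed explanation obtained directly from DeepSeek
import os
import pandas as pd

# Define the paths
input_folder = r'C:\TELCEL_MEXICO_BOT\A\excel_split_merge'
output_name = 'merged_excel.xlsx'
output_file = os.path.join(input_folder, output_name)

# Collect all Excel files, skipping the merged output of a previous run
# and Excel's temporary lock files (names starting with '~$')
excel_files = [
    f for f in os.listdir(input_folder)
    if f.endswith('.xlsx') and f != output_name and not f.startswith('~$')
]

# Check whether any files were found
if not excel_files:
    print("No Excel files found!")
else:
    # Collect one DataFrame per file, then concatenate them once at the end
    frames = []

    # Loop over all Excel files
    for file in excel_files:
        file_path = os.path.join(input_folder, file)
        # Extract the name: the part after the last underscore, without the extension
        stem = os.path.splitext(file)[0]
        if '_' not in stem:
            print(f"Unexpected file name format: {file}")
            continue  # skip files that do not follow the naming pattern
        name = stem.split('_')[-1]

        df = pd.read_excel(file_path)  # read the Excel file
        df['姓名'] = name  # add the source column ('姓名' means "name")
        frames.append(df)

    # Merge the data and save it to a new Excel file
    if frames:
        merged_df = pd.concat(frames, ignore_index=True)
        merged_df.to_excel(output_file, index=False)
        print(f"Merge complete, file saved to: {output_file}")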
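
After merging, it may be worth checking that nothing was lost or duplicated along the way. The sketch below compares a merged frame, minus its source column, with the original, ignoring row order because os.listdir does not guarantee any particular file order. The helper `same_rows` and the sample data are hypothetical.

```python
import pandas as pd


def same_rows(original: pd.DataFrame, merged: pd.DataFrame, source_col: str) -> bool:
    """True if merged, minus its source column, holds the same rows as original (any order)."""
    cols = list(original.columns)
    restored = merged.drop(columns=[source_col])[cols]
    left = original.sort_values(cols).reset_index(drop=True)
    right = restored.sort_values(cols).reset_index(drop=True)
    return left.equals(right)


# Made-up data: three rows split into two parts and merged back in a different order.
original = pd.DataFrame({"ymd": ["2025-01-01", "2025-01-02", "2025-01-03"], "aqi": [59, 48, 28]})
merged = pd.concat(
    [
        original.iloc[2:].assign(**{"姓名": "刘备"}),
        original.iloc[:2].assign(**{"姓名": "张飞"}),
    ],
    ignore_index=True,
)
print(same_rows(original, merged, "姓名"))
```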