FIRST_ROWS Optimizer Mode: Wrong Results from Fuzzy Matching with Linguistic Sorting

The title is rather long, but nothing shorter would describe the problem accurately.

In detail: under the FIRST_ROWS optimizer mode, if the session's sorting and comparison are set to linguistic mode (that is, case-insensitive) and a LIKE fuzzy query is run against a column, the query may return wrong results.

For a detailed discussion of case-insensitive queries, see: http://yangtingkun.itpub.net/post/468/460324


Let's look at the symptom directly:

SQL> CREATE TABLE T1 (ID NUMBER PRIMARY KEY, NAME VARCHAR2(30));

Table created.

SQL> CREATE INDEX IND_T1_NAME ON T1(NAME);

Index created.

SQL> INSERT INTO T1 SELECT ROWNUM, CHR(64 + ROWNUM)
2 FROM ALL_OBJECTS WHERE ROWNUM <= 26;

26 rows created.

SQL> COMMIT;

Commit complete.

SQL> ALTER SESSION SET NLS_COMP = LINGUISTIC;

Session altered.

SQL> ALTER SESSION SET NLS_SORT = BINARY_CI;

Session altered.

SQL> SELECT * FROM T1 WHERE NAME LIKE 'a%';

ID NAME
---------- ------------------------------
1 A

SQL> SELECT /*+ FIRST_ROWS */ * FROM T1 WHERE NAME LIKE 'a%';

no rows selected

Change any one of the key factors mentioned above and the wrong result disappears:

SQL> SELECT /*+ ALL_ROWS */ * FROM T1 WHERE NAME LIKE 'a%';

ID NAME
---------- ------------------------------
1 A

SQL> SELECT /*+ FIRST_ROWS */ * FROM T1 WHERE NAME = 'a';

ID NAME
---------- ------------------------------
1 A

SQL> ALTER SESSION SET NLS_SORT = BINARY;

Session altered.

SQL> ALTER SESSION SET NLS_COMP = BINARY;

Session altered.

SQL> SELECT /*+ FIRST_ROWS */ * FROM T1 WHERE NAME LIKE 'A%';

ID NAME
---------- ------------------------------
1 A

SQL> ALTER SESSION SET NLS_COMP = LINGUISTIC;

Session altered.

SQL> ALTER SESSION SET NLS_SORT = BINARY_CI;

Session altered.

SQL> SELECT /*+ FIRST_ROWS */ * FROM T1 WHERE NAME LIKE 'A%';

ID NAME
---------- ------------------------------
1 A

SQL> SELECT /*+ FIRST_ROWS */ * FROM T1 WHERE NAME LIKE 'a';

no rows selected

The queries above show that the problem is directly tied to the combination of FIRST_ROWS, the LIKE operator, and linguistic sorting. Now let's see what execution plan Oracle chooses in the failing case:

SQL> SET AUTOT ON EXP
SQL> SELECT /*+ FIRST_ROWS */ * FROM T1 WHERE NAME LIKE 'a';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3350237141

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 30 | 35 (0)| 00:00:01 |
|* 1 | VIEW | index$_join$_001 | 1 | 30 | 35 (0)| 00:00:01 |
|* 2 | HASH JOIN | | | | | |
|* 3 | INDEX RANGE SCAN | IND_T1_NAME | 1 | 30 | 3 (34)| 00:00:01 |
| 4 | INDEX FAST FULL SCAN| SYS_C006622 | 1 | 30 | 33 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("NAME" LIKE 'a')
2 - access(ROWID=ROWID)
3 - access("NAME" LIKE 'a')

Note
-----
- dynamic sampling used for this statement

Because the index does not contain the data in the form a linguistic comparison needs, Oracle must access the table to produce the final result; the execution plan above is therefore wrong:

SQL> SELECT * FROM T1 WHERE NAME = 'a';

ID NAME
---------- ------------------------------
1 A

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 30 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 1 | 30 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(NLSSORT("NAME",'nls_sort=''BINARY_CI''')=HEXTORAW('6100')
)

Note
-----
- dynamic sampling used for this statement

SQL> SELECT /*+ INDEX(T1) */ * FROM T1 WHERE NAME = 'a';

ID NAME
---------- ------------------------------
1 A

Execution Plan
----------------------------------------------------------
Plan hash value: 159298173

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 30 | 827 (1)| 00:00:10 |
|* 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 30 | 827 (1)| 00:00:10 |
| 2 | INDEX FULL SCAN | SYS_C006622 | 26 | | 26 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(NLSSORT("NAME",'nls_sort=''BINARY_CI''')=HEXTORAW('6100') )

Note
-----
- dynamic sampling used for this statement

The two plans above reveal the crux of the problem: Oracle cannot resolve a linguistic comparison from an ordinary B-tree index; it must access the table, or a suitable function-based index. See the link at the beginning of this article for details.
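As an aside, if a case-insensitive query needs index support rather than a full table scan, the usual technique is a function-based index on the NLSSORT expression that the optimizer rewrites the predicate into. A minimal sketch (the index name IND_T1_NAME_CI is made up here):

```sql
-- Function-based index matching the session's BINARY_CI linguistic sort.
-- With NLS_COMP = LINGUISTIC and NLS_SORT = BINARY_CI, Oracle rewrites
-- NAME = 'a' into NLSSORT("NAME",'nls_sort=''BINARY_CI''') = ..., which
-- this index can satisfy.
CREATE INDEX IND_T1_NAME_CI ON T1 (NLSSORT(NAME, 'NLS_SORT=BINARY_CI'));
```

Note that even such an index helps only equality and range comparisons; it is the plain index IND_T1_NAME that the buggy plan misuses here.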

Under the FIRST_ROWS optimizer mode, however, when the operation is a LIKE, the optimizer picks an invalid plan: it replaces the table access with an index range scan, which causes the wrong result:

SQL> SELECT /*+ INDEX_JOIN(T1 IND_T1_NAME SYS_C006622) */ *
2 FROM T1
3 WHERE NAME LIKE 'a';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3350237141

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 30 | 35 (0)| 00:00:01 |
|* 1 | VIEW | index$_join$_001 | 1 | 30 | 35 (0)| 00:00:01 |
|* 2 | HASH JOIN | | | | | |
|* 3 | INDEX RANGE SCAN | IND_T1_NAME | 1 | 30 | 3 (34)| 00:00:01 |
| 4 | INDEX FAST FULL SCAN| SYS_C006622 | 1 | 30 | 33 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("NAME" LIKE 'a')
2 - access(ROWID=ROWID)
3 - access("NAME" LIKE 'a')

Note
-----
- dynamic sampling used for this statement

Here FIRST_ROWS was not used, yet the INDEX_JOIN hint alone reproduces the same wrong result. The way to work around the problem, then, is to use a hint that prevents the index range scan from occurring:

SQL> ALTER SESSION SET OPTIMIZER_MODE = FIRST_ROWS;

Session altered.

SQL> SELECT * FROM T1 WHERE NAME LIKE 'a';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3350237141

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 30 | 35 (0)| 00:00:01 |
|* 1 | VIEW | index$_join$_001 | 1 | 30 | 35 (0)| 00:00:01 |
|* 2 | HASH JOIN | | | | | |
|* 3 | INDEX RANGE SCAN | IND_T1_NAME | 1 | 30 | 3 (34)| 00:00:01 |
| 4 | INDEX FAST FULL SCAN| SYS_C006622 | 1 | 30 | 33 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("NAME" LIKE 'a')
2 - access(ROWID=ROWID)
3 - access("NAME" LIKE 'a')

Note
-----
- dynamic sampling used for this statement

SQL> SELECT /*+ FULL(T1) */ * FROM T1 WHERE NAME LIKE 'a';

ID NAME
---------- ------------------------------
1 A

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 30 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 1 | 30 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("NAME" LIKE 'a')

Note
-----
- dynamic sampling used for this statement

SQL> SELECT /*+ NO_INDEX(T1) */ * FROM T1 WHERE NAME LIKE 'a';

ID NAME
---------- ------------------------------
1 A

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 30 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 1 | 30 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("NAME" LIKE 'a')

Note
-----
- dynamic sampling used for this statement

A search on Metalink turns up Oracle Doc ID Note:5252496.8, which explicitly documents this bug; it is fixed in Oracle 10.2.0.4 and 11.1.0.6.
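Until the fix is in place, it is worth confirming exactly which comparison and sort settings a session is running with before trusting LIKE results; NLS_SESSION_PARAMETERS is the standard view for that:

```sql
-- Quick check of the session-level settings that trigger this bug
SELECT PARAMETER, VALUE
FROM NLS_SESSION_PARAMETERS
WHERE PARAMETER IN ('NLS_COMP', 'NLS_SORT');
```

If this shows NLS_COMP = LINGUISTIC with a case-insensitive NLS_SORT such as BINARY_CI, LIKE queries on affected versions should be protected with a FULL or NO_INDEX hint as demonstrated above.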

open(log_path, 'w', encoding='utf-8') as f: # 覆盖写入最新变更 f.write(f"========================================\n") f.write(f"CLM 变更日志\n") f.write(f"========================================\n") f.write(f"时间: {timestamp}\n") f.write(f"地区码: {locale_id}\n") f.write(f"总 TX 条目数: {total_entries}\n") f.write(f"\n") if not any(changes.values()): f.write(" 本次运行无任何变更,所有文件已是最新状态。\n") else: if changes['added_ranges']: f.write(f" 新增 RANGE ({len(changes['added_ranges'])}):\n") for r in sorted(changes['added_ranges']): f.write(f" → {r}\n") f.write(f"\n") if changes['removed_ranges']: f.write(f" 删除 RANGE ({len(changes['removed_ranges'])}):\n") for r in sorted(changes['removed_ranges']): f.write(f" → {r}\n") f.write(f"\n") if changes['modified_ranges']: f.write(f" 修改 RANGE ({len(changes['modified_ranges'])}):\n") for r in sorted(changes['modified_ranges']): f.write(f" → {r}\n") f.write(f"\n") other_adds = changes['other_additions'] other_dels = changes['other_deletions'] if other_adds or other_dels: f.write(f" 其他变更:\n") for line in other_adds[:10]: f.write(f" ➕ {line}\n") for line in other_dels[:10]: f.write(f" ➖ {line}\n") if len(other_adds) > 10 or len(other_dels) > 10: f.write(f" ... 
还有 {len(other_adds) + len(other_dels) - 20} 处未显示\n") f.write(f"\n") f.write(f" 输出目录: {output_dir}\n") f.write(f"备份文件: {Path(self.target_c_file).with_suffix('.c.bak')}\n") f.write(f"========================================\n") print(f" 已保存变更日志 → {log_path}") def save_channel_set_map_to_config(self): """将当前 channel_set_map 写回 config.json 的 channel_set_map 字段""" try: # 清理:只保留 fallback 类型的 RANGE(可正则匹配) valid_keys = [ k for k in self.channel_set_map.keys() if re.match(r'RANGE_[\dA-Z]+_\d+M_\d+_\d+', k) # 如 RANGE_2G_20M_1_11 ] filtered_map = {k: v for k, v in self.channel_set_map.items() if k in valid_keys} # 更新主配置中的字段 self.config["channel_set_map"] = filtered_map # 使用过滤后的版本 with open(self.config_file_path, 'w', encoding='utf-8') as f: json.dump(self.config, f, indent=4, ensure_ascii=False) print(f" 已成功将精简后的 channel_set_map 写回配置文件: {filtered_map}") except Exception as e: print(f" 写入配置文件失败: {e}") raise def convert(self, file_path): # =============== 每次都更新备份 C 文件 =============== c_source = Path(self.target_c_file) c_backup = c_source.with_suffix(c_source.suffix + ".bak") if not c_source.exists(): raise FileNotFoundError(f"目标 C 文件不存在: {c_source}") ext = os.path.splitext(file_path)[-1].lower() if ext == ".xlsx": wb = load_workbook(file_path, data_only=True) sheets = [{"sheet": ws, "format": "xlsx"} for ws in wb.worksheets] elif ext == ".xls": wb = xlrd.open_workbook(file_path) sheets = [{"sheet": ws, "format": "xls"} for ws in wb.sheets()] else: raise ValueError("仅支持 .xls 或 .xlsx 文件") for i, ws_obj in enumerate(sheets): sheet_name = wb.sheet_names()[i] if ext == ".xls" else ws_obj["sheet"].title config = self.match_sheet_to_config(sheet_name) if config: self.convert_sheet_with_config(ws_obj, sheet_name, config) self.generate_outputs() def parse_excel(self): """ 【UI 兼容】供 PyQt UI 调用的入口方法 将当前 self.input_file 中的数据解析并填充到 tx_limit_entries """ print(f"📂 开始解析: {self.input_file}") if not os.path.exists(self.input_file): print(f"❌ 文件不存在: {self.input_file}") raise 
FileNotFoundError(...) else: print(f"✅ 文件已找到,大小: {os.path.getsize(self.input_file)} 字节") if not hasattr(self, 'input_file') or not self.input_file: raise ValueError("未设置 input_file 属性!") if not os.path.exists(self.input_file): raise FileNotFoundError(f"文件不存在: {self.input_file}") print(f" 开始解析 Excel 文件: {self.input_file}") try: self.convert(self.input_file) # 调用已有逻辑 print(f" Excel 解析完成,共生成 {len(self.tx_limit_entries)} 条 TX 限幅记录") except Exception as e: print(f" 解析失败: {e}") raise if __name__ == "__main__": import os # 切换到脚本所在目录(可选,根据实际需求) script_dir = os.path.dirname(__file__) os.chdir(script_dir) # 直接使用默认参数(或从其他地方获取) config_path = "config/config.json" output_dir = "output" locale_id = None # 或指定默认值,如 "DEFAULT" display_name = None # 或指定默认值 input_file = "input/Archer BE900US 2.xlsx" # 创建转换器实例并执行 converter = ExcelToCLMConverter( config_path=config_path, output_dir=output_dir, locale_id=locale_id, locale_display_name=display_name ) converter.convert(input_file)
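The change-detection step in generate_outputs (read the old file, compare it line by line against the freshly rendered content, and write only when something differs) can be sketched in isolation. This is a minimal sketch, not the converter's actual code; the helper name write_if_changed is hypothetical.

```python
from pathlib import Path
import tempfile

def write_if_changed(path: Path, new_content: str) -> str:
    """Write new_content to path only when it differs from what is on disk.

    Returns 'created', 'updated', or 'unchanged' (mirroring the three
    log branches in generate_outputs).
    """
    if not path.exists():
        path.write_text(new_content, encoding='utf-8')
        return 'created'
    old_lines = path.read_text(encoding='utf-8').splitlines()
    if old_lines == new_content.splitlines():
        return 'unchanged'  # skip the write, but callers continue (e.g. manifest)
    path.write_text(new_content, encoding='utf-8')
    return 'updated'

with tempfile.TemporaryDirectory() as d:
    p = Path(d) / "tx_limit_table.c"
    print(write_if_changed(p, "int x = 1;\n"))  # → created
    print(write_if_changed(p, "int x = 1;\n"))  # → unchanged
    print(write_if_changed(p, "int x = 2;\n"))  # → updated
```

Comparing splitlines() rather than raw strings makes the check indifferent to a missing trailing newline, which is usually what you want when the new content comes from a template engine.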
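The segment-ordering idea in build_ht_structure (group entries by rate level and bandwidth, emit segments in one fixed order, and set a MORE flag on every segment except the last one actually present) can also be shown standalone. The entry shape and the base/ext/ext4 classification mirror the code above; the short flag names and build_segments helper are simplifications for illustration.

```python
from collections import defaultdict

# Fixed emission order for (level, bw) segments, as in build_ht_structure.
ORDER = [("base", "20"), ("base", "40"),
         ("ext", "20"), ("ext", "40"),
         ("ext4", "20"), ("ext4", "40")]

def classify_level(rate_set_macro):
    # EXT4 must be tested before EXT, since "EXT4" also contains "EXT".
    if "EXT4" in rate_set_macro:
        return "ext4"
    if "EXT" in rate_set_macro:
        return "ext"
    return "base"

def build_segments(entries):
    groups = defaultdict(list)
    for e in entries:
        groups[(classify_level(e["rate_set_macro"]), str(e["bw"]))].append(e)
    # Only segments that actually exist participate in the MORE chaining.
    active = [key for key in ORDER if key in groups]
    segments = []
    for i, (level, bw) in enumerate(active):
        flags = [f"WIDTH_{bw}"]
        if i < len(active) - 1:  # every segment but the last chains onward
            flags.append("MORE")
        segments.append({"level": level, "bw": bw,
                         "flags": flags, "count": len(groups[(level, bw)])})
    return segments

segs = build_segments([
    {"rate_set_macro": "RATE_SET_HT", "bw": 20},
    {"rate_set_macro": "RATE_SET_HT_EXT", "bw": 20},
    {"rate_set_macro": "RATE_SET_HT_EXT4", "bw": 40},
])
print([(s["level"], s["bw"], "MORE" in s["flags"]) for s in segs])
# → [('base', '20', True), ('ext', '20', True), ('ext4', '40', False)]
```

Enumerating only the active segments is the important detail: indexing into the full ORDER list would leave MORE unset on an intermediate segment whenever earlier combinations are missing.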