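Bug when running the DeepInf model: train.py: error: the following arguments are required: --file_dir

Fixing the DeepInf training error: train.py requires the --file_dir argument
Running train.py for the DeepInf model fails with an error saying that the --file_dir argument is missing. The cause lies in Python command-line argument parsing: the script requires a data directory, and none was supplied on the command line.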

%run train.py
usage: train.py [-h] [--verbosity VERBOSITY]
                [--tensorboard-log TENSORBOARD_LOG] [--model MODEL]
                [
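The error above is the standard message argparse prints when a required option is absent. The following is a minimal sketch of that behavior, not DeepInf's actual parser: `build_parser` and `try_parse` are hypothetical helpers, only the `--file_dir` flag name comes from the error message, and `./data` is a placeholder path.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for train.py's parser; only --file_dir is taken
    # from the error message, everything else is assumed for illustration.
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("--file_dir", type=str, required=True,
                        help="directory that holds the input data")
    return parser


def try_parse(argv):
    """Return (namespace, None) on success, or (None, exit_code) if argparse rejects argv."""
    try:
        return build_parser().parse_args(argv), None
    except SystemExit as exc:
        # argparse prints the usage line and the error to stderr, then exits.
        return None, exc.code


# No arguments: reproduces "the following arguments are required: --file_dir".
missing, missing_code = try_parse([])

# Supplying the flag succeeds; "./data" is a placeholder, not the real dataset path.
ok, ok_code = try_parse(["--file_dir", "./data"])
print(missing_code, ok.file_dir)
```

In an IPython or Jupyter session the same fix is to pass the flag on the magic line, for example `%run train.py --file_dir <your data directory>`, since `%run` forwards everything after the script name as command-line arguments.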
(vllm-deepseek) [root@lmogpu63(15.95) ~]# CUDA_VISIBLE_DEVICES=7 swift sft --torch_dtype 'bfloat16' --model '/opt/tmp/lora/20250918_181334/model' --dataset '/opt/tmp/lora/20250918_181334/dataset/train.jsonl' --split_dataset_ratio '0.2' --max_length '2048' --use_galore 'True' --galore_rank '64' --task_type 'causal_lm' --per_device_train_batch_size '14' --per_device_eval_batch_size '28' --learning_rate '1e-4' --num_train_epochs '2' --gradient_accumulation_steps '4' --eval_steps '500' --output_dir '/opt/tmp/lora/20250918_181334/model_lora' --neftune_noise_alpha '0' --report_to 'tensorboard' --add_version False
run sh: `/opt/tpapp/conda/envs/vllm-deepseek/bin/python /opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/swift/cli/sft.py --torch_dtype bfloat16 --model /opt/tmp/lora/20250918_181334/model --dataset /opt/tmp/lora/20250918_181334/dataset/train.jsonl --split_dataset_ratio 0.2 --max_length 2048 --use_galore True --galore_rank 64 --task_type causal_lm --per_device_train_batch_size 14 --per_device_eval_batch_size 28 --learning_rate 1e-4 --num_train_epochs 2 --gradient_accumulation_steps 4 --eval_steps 500 --output_dir /opt/tmp/lora/20250918_181334/model_lora --neftune_noise_alpha 0 --report_to tensorboard --add_version False`
[INFO:swift] Successfully registered `/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/swift/llm/dataset/data/dataset_info.json`.
Could not find the bitsandbytes CUDA binary at PosixPath('/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda128.so')
Could not load bitsandbytes native library: /opt/tpapp/conda/envs/vllm-deepseek/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so)
Traceback (most recent call last):
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 85, in <module>
    lib = get_native_library()
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 72, in get_native_library
    dll = ct.cdll.LoadLibrary(str(binary_path))
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /opt/tpapp/conda/envs/vllm-deepseek/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so)

CUDA Setup failed despite CUDA being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them to your LD_LIBRARY_PATH.
If you suspect a bug, please take the information from python -m bitsandbytes and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues

Traceback (most recent call last):
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/transformers/hf_argparser.py", line 258, in _add_dataclass_arguments
    type_hints: dict[str, type] = get_type_hints(dtype)
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/typing.py", line 1833, in get_type_hints
    value = _eval_type(value, base_globals, base_locals)
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/typing.py", line 329, in _eval_type
    ev_args = tuple(_eval_type(a, globalns, localns, recursive_guard) for a in t.__args__)
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/typing.py", line 329, in <genexpr>
    ev_args = tuple(_eval_type(a, globalns, localns, recursive_guard) for a in t.__args__)
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/typing.py", line 327, in _eval_type
    return t._evaluate(globalns, localns, recursive_guard)
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/typing.py", line 694, in _evaluate
    eval(self.__forward_code__, globalns, localns),
  File "<string>", line 1, in <module>
NameError: name 'ParallelismConfig' is not defined. Did you mean: 'parallelism_config'?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/swift/cli/sft.py", line 10, in <module>
    sft_main()
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/swift/llm/train/sft.py", line 331, in sft_main
    return SwiftSft(args).main()
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/swift/llm/train/sft.py", line 27, in __init__
    super().__init__(args)
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/swift/llm/base.py", line 19, in __init__
    self.args = self._parse_args(args)
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/swift/llm/base.py", line 31, in _parse_args
    args, remaining_argv = parse_args(self.args_class, args)
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/swift/utils/utils.py", line 144, in parse_args
    parser = HfArgumentParser([class_type])
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/transformers/hf_argparser.py", line 143, in __init__
    self._add_dataclass_arguments(dtype)
  File "/opt/tpapp/conda/envs/vllm-deepseek/lib/python3.10/site-packages/transformers/hf_argparser.py", line 260, in _add_dataclass_arguments
    raise RuntimeError(
RuntimeError: Type resolution failed for <class 'swift.llm.argument.train_args.TrainArguments'>. Try declaring the class in global scope or removing line of `from __future__ import annotations` which opts in Postponed Evaluation of Annotations (PEP 563)
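<think>We are resolving an error raised by the YOLOv4 dataset conversion script: `sb.py: error: the following arguments are required: --base-dir`. The error shows that the required argument `--base-dir` (and possibly `--output-dir`) was not supplied when the script ran. We analyze and fix the problem step by step below.

### Problem analysis

1. **Reading the error message**:
   - The script `sb.py` requires the `--base-dir` argument, but the command did not provide it.
   - Possible causes: the command really does not include `--base-dir`, or the argument is malformed.
2. **Argument-passing rules**:
   - Python's `argparse` library expects a long option that takes a value (such as `--base-dir`) to be followed by that value, separated by a space or an equals sign.
   - Check the command format: `python sb.py --base-dir "path" --output-dir "path"`.

### Solution

#### Step 1: Check the command

Make sure the command contains every required argument in the correct format. For example:

```bash
python sb.py --base-dir "D:\software_accessible\yolov4\pytorch-YOLOv4-master\databag" --output-dir "D:\software_accessible\yolov4\pytorch-YOLOv4-master\yolov4_output"
```

#### Step 2: Verify the script's argument-parsing code

If the command is correct but the error persists, the argument-parsing logic in `sb.py` may be at fault. Check that the script defines `--base-dir` correctly. A standard parsing example:

```python
import argparse


def parse_args():
    parser = argparse.ArgumentParser(description='Convert dataset to YOLO format')
    parser.add_argument('--base-dir', required=True, help='Base directory of the dataset')
    parser.add_argument('--output-dir', required=True, help='Output directory for YOLO format')
    # Example of an optional (non-required) argument
    parser.add_argument('--split-ratio', type=float, default=0.8, help='Train/val split ratio')
    return parser.parse_args()


if __name__ == '__main__':
    args = parse_args()
    # argparse exposes --base-dir as args.base_dir (hyphens become underscores)
    print(f"Base directory: {args.base_dir}")
    print(f"Output directory: {args.output_dir}")
```

Key points:
- `required=True` marks the argument as required.
- The option name must match the one used on the command line exactly (for example `--base-dir`).

#### Step 3: Handle Windows path separators

The backslash `\` is an escape character inside Python string literals. A path passed on the command line reaches the script unchanged, so this only matters for paths written in Python source. Two options:

1. On the command line, use forward slashes `/` instead of backslashes:
   ```bash
   python sb.py --base-dir "D:/software_accessible/yolov4/pytorch-YOLOv4-master/databag" ...
   ```
2. For a path hard-coded in Python source, use a raw string (prefix the literal with `r`) or double the backslashes. The `r` prefix is Python syntax and must not be typed in the shell, where it would become part of the argument:
   ```python
   base_dir = r"D:\software_accessible\yolov4\pytorch-YOLOv4-master\databag"
   # or
   base_dir = "D:\\software_accessible\\yolov4\\pytorch-YOLOv4-master\\databag"
   ```

#### Step 4: Run a test

Retest with simplified paths to rule out problems caused by path complexity:

```bash
python sb.py --base-dir "C:/test_dataset" --output-dir "C:/yolo_output"
```

### Common error checks

- **Misspelled option names**: check the spelling of `--base-dir` and `--output-dir` (the separator is a hyphen `-`, not an underscore `_`).
- **Unquoted values**: if a path contains spaces, it must be wrapped in quotes (for example `"D:/path with space"`).
- **Outdated script**: make sure the `sb.py` in use is the latest version (an older version may take different arguments).

### Additional note: YOLOv4 dataset directory structure

After the conversion script runs successfully, the output directory should contain the following structure[^4]:

```
yolov4_output/
├── labels/       # YOLO-format annotation file (.txt) for each image
├── train.txt     # list of training-set image paths
└── val.txt       # list of validation-set image paths
```

§§ Related questions §§
1. How does Python's argparse module handle required and optional arguments?
2. How should escape characters in file paths be handled correctly on Windows?
3. How can the dataset conversion be verified before YOLOv4 training?
4. If the dataset annotations are in COCO format, how should the conversion script be modified for YOLOv4?</think>

### Fixing the missing-argument error in the YOLOv4 dataset conversion script

The error you hit, `sb.py: error: the following arguments are required: --base-dir`, means the script ran without the required `--base-dir` argument. A step-by-step solution follows.

#### 1. **Check the command format (key fix)**

- **Cause**: the argument is absent from the command or is malformed.
- **Correct format**:
  ```bash
  python sb.py --base-dir "D:\software_accessible\yolov4\pytorch-YOLOv4-master\databag" --output-dir "D:\software_accessible\yolov4\pytorch-YOLOv4-master\yolov4_output"
  ```
- **Note**:
  - There **must be a space** (or an equals sign) between the option name and its value.
  - A path that contains spaces or special characters **must be wrapped in double quotes**.
  - The raw-string prefix `r` is Python syntax and applies only to paths written in Python source. On the command line, pass the quoted path directly, optionally with forward slashes:
    ```bash
    python sb.py --base-dir "D:/software_accessible/yolov4/pytorch-YOLOv4-master/databag" ...
    ```

#### 2. **Verify the script's argument-parsing logic**

Check that `sb.py` defines its argument parser correctly (typically with the `argparse` library):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--base-dir', required=True, type=str)  # required=True makes the argument mandatory
parser.add_argument('--output-dir', required=True, type=str)
args = parser.parse_args()
```

- If `required=True` is omitted, argparse no longer enforces the argument, and a missing value silently becomes `None`[^1].

#### 3. **Path format handling (Windows-specific)**

```python
# Add path normalization to sb.py
import os


def main():
    # `parser` is the ArgumentParser defined in step 2
    args = parser.parse_args()
    base_dir = os.path.normpath(args.base_dir)  # normalize the Windows path
    output_dir = os.path.normpath(args.output_dir)
```

#### 4. **Test a minimal runnable command**

Test the basic functionality with simplified paths:

```bash
python sb.py --base-dir "C:\test" --output-dir "C:\output"
```

#### 5. **Common error troubleshooting table**

| Symptom | Fix |
|---------|-----|
| Path contains spaces | Wrap the whole path in double quotes |
| Backslash escaping problems | Use forward slashes `/` on the command line; use a raw string `r"path"` only inside Python source |
| Misspelled option name | Check the spelling of `--base-dir` and `--output-dir` |
| Script not updated | Make sure the latest version of `sb.py` is in use |

> **Key note**: YOLOv4 requires the dataset directory to contain `images` and `annotations` subdirectories; otherwise the script fails even when the arguments are correct[^4].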
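To make the path and naming points concrete, here is a small sketch under stated assumptions: `build_parser` is a hypothetical reconstruction of the `sb.py` parser using only the two flags named in the error discussion, and the argument list stands in for what the shell would hand to the script. It illustrates that argparse maps `--base-dir` to the attribute `base_dir`, and that a backslash path arrives as literal text that `pathlib` can treat the same as its forward-slash spelling.

```python
import argparse
from pathlib import PureWindowsPath


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction: only the two flag names come from the text.
    parser = argparse.ArgumentParser(prog="sb.py")
    parser.add_argument("--base-dir", required=True)
    parser.add_argument("--output-dir", required=True)
    return parser


# Simulated command line; the raw string here is Python source, so the r prefix is valid.
argv = ["--base-dir", r"C:\test", "--output-dir", "C:/output"]
args = build_parser().parse_args(argv)

# Hyphens in option names become underscores in attribute names.
base = PureWindowsPath(args.base_dir)
out = PureWindowsPath(args.output_dir)

# Both separator styles describe the same Windows path once parsed.
print(base.as_posix(), out.as_posix())
```

`PureWindowsPath` is used so the sketch behaves the same on any operating system; inside a script that actually touches the filesystem on Windows, `pathlib.Path` would be the usual choice.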