(vlm) face8@jamesdeMac-Studio vlm % python train_vlm\ copy.py
✅ MPS device available
🛠️ System configuration:
- Device: mps
- Memory: 256.00GB total
- Training arguments: ScriptArguments(train_path='train.jsonl', valid_path='valid.jsonl', model_name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', output_dir='./output_lora_qwen25vl_instruct', per_device_train_batch_size=1, gradient_accumulation_steps=4, num_train_epochs=3, logging_steps=5, save_steps=100, eval_steps=100, image_size=672, learning_rate=2e-05, warmup_steps=50, weight_decay=0.01, lora_rank=16, lora_alpha=32, lora_dropout=0.05, fp16=False, bf16=False, max_steps=-1, gradient_checkpointing=True, seed=42, report_to='none', enable_mps_fallback=True, debug_mode=True, max_retries=3)
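For reference, the ScriptArguments repr above corresponds to a dataclass roughly like the sketch below. The field types are inferred from the printed values, and the printed values may include command-line overrides; the real definition in `train_vlm copy.py` may differ (for example by attaching HfArgumentParser metadata).

```python
# Sketch of the ScriptArguments dataclass as printed in the log above.
# Types are inferred from the printed values; not the script's actual source.
from dataclasses import dataclass

@dataclass
class ScriptArguments:
    train_path: str = "train.jsonl"
    valid_path: str = "valid.jsonl"
    model_name_or_path: str = "Qwen/Qwen2.5-VL-7B-Instruct"
    output_dir: str = "./output_lora_qwen25vl_instruct"
    per_device_train_batch_size: int = 1
    gradient_accumulation_steps: int = 4
    num_train_epochs: int = 3
    logging_steps: int = 5
    save_steps: int = 100
    eval_steps: int = 100
    image_size: int = 672
    learning_rate: float = 2e-05
    warmup_steps: int = 50
    weight_decay: float = 0.01
    lora_rank: int = 16
    lora_alpha: int = 32
    lora_dropout: float = 0.05
    fp16: bool = False
    bf16: bool = False
    max_steps: int = -1
    gradient_checkpointing: bool = True
    seed: int = 42
    report_to: str = "none"
    enable_mps_fallback: bool = True
    debug_mode: bool = True
    max_retries: int = 3
```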
Loading checkpoint shards: 100%|███| 5/5 [00:05<00:00, 1.03s/it]
🔧 Model moved to MPS device
✅ Gradient checkpointing enabled
trainable params: 35,090,432 || all params: 8,324,397,056 || trainable%: 0.4215
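The trainable percentage reported by PEFT is consistent with the raw counts:

```python
# Quick sanity check on the printed LoRA ratio: trainable / total parameters.
trainable, total = 35_090_432, 8_324_397_056
print(f"trainable%: {100 * trainable / total:.4f}")  # 0.4215, matching the log line
```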
✅ LoRA configuration loaded
✅ Dataset loaded: path=train.jsonl, total lines=934, valid samples=934
✅ Dataset loaded: path=valid.jsonl, total lines=104, valid samples=104
✅ Dataset preparation complete
/Users/face8/works/vlm/train_vlm copy.py:440: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `Trainer.__init__`. Use `processing_class` instead.
trainer = Trainer(
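This FutureWarning can be silenced by passing the processor through `processing_class`, as the warning itself suggests. Every other argument name in the sketch below is a placeholder for whatever the script already builds.

```python
# Passing the processor via `processing_class` instead of the deprecated
# `tokenizer` argument. model, training_args, train_dataset, eval_dataset,
# collate_fn and processor stand in for the script's own objects.
from transformers import Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=collate_fn,
    processing_class=processor,  # was: tokenizer=processor
)
```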
✅ Trainer created
🔍 Pre-training environment check:
🔍 Received 2 samples
Sample 0: image=images/00316.jpg, instruction length=141, answer length=4
Sample 1: image=images/00653.jpg, instruction length=141, answer length=9
📊 Batch built: images=2, input ID shape=torch.Size([2, 1024])
✅ Environment check passed
🚀 Starting training...
Memory usage before training: total=256.00GB, used=109.07GB, available=146.02GB
Currently training with a batch size of: 1
***** Running training *****
Num examples = 934
Num Epochs = 3
Instantaneous batch size per device = 1
Total train batch size (w. parallel, distributed & accumulation) = 4
Gradient Accumulation steps = 4
Total optimization steps = 702
Number of trainable parameters = 35,090,432
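The reported step count is consistent with the dataset size and accumulation settings: 934 samples at per-device batch size 1 with 4 accumulation steps give 234 optimizer steps per epoch, and 3 epochs give 702.

```python
import math

# 934 training samples, per-device batch size 1, gradient accumulation 4
steps_per_epoch = math.ceil(934 / (1 * 4))  # 234 optimizer steps per epoch
total_steps = 3 * steps_per_epoch           # 702, matching "Total optimization steps"
```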
🚀 Training started
Memory usage at training start: total=256.00GB, used=108.17GB, available=146.92GB
🔄 Starting epoch 0
/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/torch/utils/data/dataloader.py:683: UserWarning: 'pin_memory' argument is set as true but not supported on MPS now, then device pinned memory won't be used.
warnings.warn(warn_msg)
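The pin_memory warning is harmless on MPS (pinned host memory is a CUDA feature), and it can be suppressed by disabling it in the training arguments. `dataloader_pin_memory` is a standard TrainingArguments field; whether the script exposes it is an assumption.

```python
from transformers import TrainingArguments

# Only the relevant field is shown; on MPS pinned memory is ignored anyway,
# so this simply removes the warning.
training_args = TrainingArguments(
    output_dir="./output_lora_qwen25vl_instruct",
    dataloader_pin_memory=False,
)
```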
🔍 Received 1 sample
Sample 0: image=images/00722.jpg, instruction length=141, answer length=6
📊 Batch built: images=1, input ID shape=torch.Size([1, 1024])
🔍 Received 1 sample
Sample 0: image=images/00689.jpg, instruction length=141, answer length=3
📊 Batch built: images=1, input ID shape=torch.Size([1, 1024])
🔍 Received 1 sample
Sample 0: image=images/00458.jpg, instruction length=141, answer length=18
📊 Batch built: images=1, input ID shape=torch.Size([1, 1024])
🔍 Received 1 sample
Sample 0: image=images/00915.jpg, instruction length=141, answer length=12
📊 Batch built: images=1, input ID shape=torch.Size([1, 1024])
🔍 Received 1 sample
Sample 0: image=images/00161.jpg, instruction length=141, answer length=12
📊 Batch built: images=1, input ID shape=torch.Size([1, 1024])
❌ Error during training: 'NoneType' object is not iterable
❌ Main program terminated abnormally: 'NoneType' object is not iterable
Traceback (most recent call last):
File "/Users/face8/works/vlm/train_vlm copy.py", line 480, in <module>
main()
File "/Users/face8/works/vlm/train_vlm copy.py", line 466, in main
trainer.train()
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/transformers/trainer.py", line 2207, in train
return inner_training_loop(
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/transformers/trainer.py", line 2549, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/transformers/trainer.py", line 3750, in training_step
loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/transformers/trainer.py", line 3837, in compute_loss
outputs = model(**inputs)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/peft/peft_model.py", line 1757, in forward
return self.base_model(
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/peft/tuners/tuners_utils.py", line 193, in forward
return self.model.forward(*args, **kwargs)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/transformers/utils/generic.py", line 943, in wrapper
output = func(self, *args, **kwargs)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 1487, in forward
outputs = self.model(
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 1228, in forward
image_embeds = self.get_image_features(pixel_values, image_grid_thw)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 1178, in get_image_features
image_embeds = self.visual(pixel_values, grid_thw=image_grid_thw)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 441, in forward
rotary_pos_emb = self.rot_pos_emb(grid_thw)
File "/Users/face8/miniconda3/envs/vlm/lib/python3.9/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 361, in rot_pos_emb
for t, h, w in grid_thw:
TypeError: 'NoneType' object is not iterable
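The traceback shows the batch reaching the model with `pixel_values` set but `image_grid_thw=None`, so the vision tower's `rot_pos_emb` has nothing to iterate over. A common cause is a custom data collator that keeps the processed pixel values but drops (or never produces) the `image_grid_thw` tensor that the Qwen2.5-VL processor returns. Below is a minimal collator sketch that keeps both; the sample field names (`image`, `instruction`, `answer`) are assumptions about the JSONL schema, not taken from `train_vlm copy.py`.

```python
# Sketch of a Qwen2.5-VL data collator that preserves `image_grid_thw`.
# The sample keys ("image", "instruction", "answer") are assumptions about
# the JSONL schema; adapt them to the actual dataset.
from PIL import Image
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")

def collate_fn(samples):
    texts, images = [], []
    for s in samples:
        messages = [
            {"role": "user", "content": [
                {"type": "image"},
                {"type": "text", "text": s["instruction"]},
            ]},
            {"role": "assistant", "content": [{"type": "text", "text": s["answer"]}]},
        ]
        texts.append(processor.apply_chat_template(messages, tokenize=False))
        images.append(Image.open(s["image"]).convert("RGB"))

    # The processor returns input_ids, attention_mask, pixel_values AND
    # image_grid_thw; all of them must reach the model's forward().
    batch = processor(text=texts, images=images, padding=True, return_tensors="pt")

    labels = batch["input_ids"].clone()
    labels[labels == processor.tokenizer.pad_token_id] = -100  # ignore padding in the loss
    batch["labels"] = labels
    return batch
```

If the data is loaded as a Hugging Face `datasets.Dataset` of raw dicts, setting `remove_unused_columns=False` in TrainingArguments is usually also needed so these fields survive until the collator.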