Qwen2.5-Omni-7B: SFT Fine-Tuning and Model Merging

Latest recommended article published on 2025-12-05 11:17:30

License: CC 4.0 BY-SA

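Dataset format and sample

The dataset contains multimodal information (video and audio alongside text). Each sample is structured as follows: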

{
    "messages": [
      {
        "content": "<video><audio>What is the video describing?",
        "role": "user"
      },
      {
        "content": "A girl who is drawing a picture of a guitar and feel nervous.",
        "role": "assistant"
      }
    ],
    "videos": [
      "mllm_demo_data/4.mp4"
    ],
    "audios": [
      "mllm_demo_data/4.mp3"
    ]
}
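Before training, it can help to confirm that the placeholders in each sample line up with its media lists. The sketch below assumes that every `<video>` and `<audio>` placeholder in the messages should correspond to exactly one entry in `videos` and `audios` (and likewise `<image>` and `images`). The function name `check_sample` is hypothetical and is not part of LLaMA-Factory.

```python
import json

# Placeholder token -> key of the media list it is assumed to pair with.
PLACEHOLDERS = {"<image>": "images", "<video>": "videos", "<audio>": "audios"}


def check_sample(sample: dict) -> list:
    """Return human-readable mismatches between placeholders and media lists."""
    text = "".join(msg.get("content", "") for msg in sample.get("messages", []))
    problems = []
    for token, key in PLACEHOLDERS.items():
        n_tokens = text.count(token)          # placeholders found in the chat text
        n_files = len(sample.get(key, []))    # files listed for that modality
        if n_tokens != n_files:
            problems.append(
                f"{token}: {n_tokens} placeholder(s), {n_files} file(s) in '{key}'"
            )
    return problems


sample = json.loads("""
{
  "messages": [
    {"content": "<video><audio>What is the video describing?", "role": "user"},
    {"content": "A girl who is drawing a picture of a guitar and feel nervous.", "role": "assistant"}
  ],
  "videos": ["mllm_demo_data/4.mp4"],
  "audios": ["mllm_demo_data/4.mp3"]
}
""")

# An empty list means the placeholder counts match the media lists.
print(check_sample(sample))
```

Running the same check over every record in a dataset file before launching a long job is cheaper than discovering a mismatch partway through preprocessing.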

SFT fine-tuning workflow

SFT fine-tuning is run with LLaMA-Factory on four GPUs. The main parameters are:

llamafactory-cli train \
    --stage sft \
    --do_train True \
    --model_name_or_path /workspace/Qwen2___5-Omni-7B \
    --preprocessing_num_workers 16 \
    --finetuning_type lora \
    --template qwen2_omni \
    --flash_attn auto \
    --dataset_dir data \
    --dataset mllm_audio_demo,mllm_video_demo,mllm_demo \
    --cutoff_len 2048 \
    --learning_rate 5e-05 \
    --num_train_epochs 10.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --packing False \
    --enable_thinking True \
    --report_to none \
    --output_dir saves/Qwen2.5-Omni-7B/lora/train_2025-09-22-08-20-53 \
    --fp16 True \
    --plot_loss True \
    --trust_remote_code True \
    --ddp_timeout 180000000 \
    --include_num_input_tokens_seen True \
    --optim adamw_torch \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target all \
    --freeze_vision_tower True \
    --freeze_multi_modal_projector True \
    --image_max_pixels 589824 \
    --image_min_pixels 1024 \
    --video_max_pixels 65536 \
    --video_min_pixels 256

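Key parameters

model_name_or_path: path to the base model
dataset_dir: directory containing the custom datasets
output_dir: directory where the training output is written
num_train_epochs: number of training epochs
cutoff_len: maximum sequence length in tokens; set it according to your data
per_device_train_batch_size: batch size per GPU
save_steps: number of steps between checkpoint saves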
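The batch-related settings interact: with four GPUs, a per-device batch of 2, and 8 gradient accumulation steps, each optimizer step sees 4 × 2 × 8 = 64 samples. The sketch below works through that arithmetic and gives a rough estimate of how many optimizer steps (and therefore `save_steps` checkpoints) a run produces. The sample count of 1000 is a made-up figure for illustration, and the trainer's exact rounding may differ from this estimate.

```python
import math


def global_batch_size(num_gpus: int, per_device_batch: int, grad_accum: int) -> int:
    """Samples consumed per optimizer step across all GPUs."""
    return num_gpus * per_device_batch * grad_accum


def estimate_steps(num_samples: int, num_epochs: int, global_batch: int) -> int:
    """Rough optimizer-step count; real trainers may round differently."""
    return math.ceil(num_samples / global_batch) * num_epochs


batch = global_batch_size(num_gpus=4, per_device_batch=2, grad_accum=8)
steps = estimate_steps(num_samples=1000, num_epochs=10, global_batch=batch)  # 1000 is hypothetical

print("global batch size:", batch)  # 64
print("estimated optimizer steps:", steps)
print("checkpoints at save_steps=100:", steps // 100)
```

If the estimated step count is below `save_steps`, no intermediate checkpoint is written, so it is worth checking this before a long run.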

Merging model parameters

Command to merge the model parameters:

llamafactory-cli export examples/merge_lora/qwen2_5omni_lora_sft.yaml

Sample merge configuration file:

### Note: DO NOT use quantized model or quantization_bit when merging lora adapters
### model
model_name_or_path: /workspace/Qwen2___5-Omni-7B
adapter_name_or_path: saves/Qwen2.5-Omni-7B/lora/train_2025-09-19-09-42-22
template: qwen2_omni
trust_remote_code: true

### export
export_dir: output/qwen2_5omni_lora_sft
export_size: 5
export_device: cpu  # choices: [cpu, auto]
export_legacy_format: false
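A quick pre-flight check on the merge settings can catch the mistakes the note in the config warns about. The sketch below takes the settings as a plain dictionary (loading the YAML file is left out), and assumes the adapter directory holds a PEFT-style `adapter_config.json`. The function name `merge_config_problems` is hypothetical.

```python
from pathlib import Path


def merge_config_problems(cfg: dict) -> list:
    """Return a list of likely problems in a LoRA merge configuration."""
    problems = []
    # The config note says not to merge against a quantized model.
    if cfg.get("quantization_bit") is not None:
        problems.append("quantization_bit is set; merge against the unquantized base model")
    adapter = cfg.get("adapter_name_or_path")
    if not adapter:
        problems.append("adapter_name_or_path is missing")
    elif not (Path(adapter) / "adapter_config.json").is_file():
        # Assumption: the adapter was saved in PEFT format.
        problems.append(f"no adapter_config.json under {adapter}")
    if not cfg.get("export_dir"):
        problems.append("export_dir is missing")
    return problems


cfg = {
    "model_name_or_path": "/workspace/Qwen2___5-Omni-7B",
    "adapter_name_or_path": "saves/Qwen2.5-Omni-7B/lora/train_2025-09-19-09-42-22",
    "template": "qwen2_omni",
    "export_dir": "output/qwen2_5omni_lora_sft",
}

for problem in merge_config_problems(cfg):
    print("warning:", problem)
```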

Common problems and solutions

Problem:
Qwen2.5-Omni Inference Error after Full-SFT: KeyError: 'qwen2_5_omni_thinker'

Cause:
Fine-tuning saves only omni.thinker, which must be merged back into the original model: [thinker + talker] -> [omni]

Solution:
See LLaMA-Factory Pull Request #7537.
Merge with the script:

python3 ./scripts/qwen_omni_merge.py merge_lora \
  --base_model_path="/workspace/Qwen2___5-Omni-7B" \
  --lora_checkpoint_path="/app/saves/Qwen2.5-Omni-7B/lora/train_2025-10-13-03-14-01" \
  --save_path="output/qwen2_5omni_lora_sft"
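The [thinker + talker] -> [omni] step can be pictured as writing the fine-tuned thinker weights back into a copy of the full model's weights while leaving everything else untouched. The sketch below is a conceptual illustration only: plain lists stand in for weight tensors, the `thinker.` key prefix is an assumption about how the full model names its submodule, and `merge_thinker` is a hypothetical helper, not the logic of `scripts/qwen_omni_merge.py`.

```python
def merge_thinker(full_state: dict, thinker_state: dict, prefix: str = "thinker.") -> dict:
    """Return a copy of full_state with thinker weights replaced by the tuned ones."""
    merged = dict(full_state)  # shallow copy; the original mapping is not modified
    for name, weight in thinker_state.items():
        key = prefix + name
        if key not in merged:
            raise KeyError(f"{key} not found in the full model")
        merged[key] = weight
    return merged


# Toy weights: the talker entry must survive the merge unchanged.
full = {
    "thinker.layer.weight": [0.0, 0.0],
    "talker.layer.weight": [1.0, 1.0],
}
tuned_thinker = {"layer.weight": [0.5, 0.5]}

merged = merge_thinker(full, tuned_thinker)
print(merged)
```

Raising on an unknown key, rather than silently adding it, makes a naming mismatch between the two checkpoints visible immediately.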

vLLM inference compatibility adjustment

To run inference with vLLM, change Qwen2_5OmniForConditionalGeneration to Qwen2_5OmniModel in the config.json of the merged model directory.
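The rename can be scripted instead of edited by hand. The sketch below assumes the class name is stored in the `architectures` list of the merged model directory's `config.json`, and it writes a `config.json.bak` backup first. The helper name `swap_architecture` is hypothetical.

```python
import json
from pathlib import Path


def swap_architecture(
    model_dir: str,
    old: str = "Qwen2_5OmniForConditionalGeneration",
    new: str = "Qwen2_5OmniModel",
) -> bool:
    """Rename an entry in config.json's "architectures" list. Returns True if changed."""
    config_path = Path(model_dir) / "config.json"
    raw = config_path.read_text(encoding="utf-8")
    config = json.loads(raw)
    architectures = config.get("architectures", [])
    if old not in architectures:
        return False  # nothing to rename; leave the file untouched
    # Keep a backup so the change is easy to revert.
    config_path.with_name("config.json.bak").write_text(raw, encoding="utf-8")
    config["architectures"] = [new if name == old else name for name in architectures]
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return True


# Example call (path taken from the export step above):
# swap_architecture("output/qwen2_5omni_lora_sft")
```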

Inference test

CUDA_VISIBLE_DEVICES=4,5,6,7 llamafactory-cli webchat \
    --model_name_or_path /app/output/qwen2_5omni_lora_sft_100 \
    --template qwen2_omni \
    --finetuning_type lora
