I. Environment Setup
bash
# Hardware: at least one GPU with 24 GB of VRAM is recommended (e.g. 3090, A10, A100)
# Python version: 3.8+
# Install the core libraries
pip install torch torchvision torchaudio
pip install transformers==4.37.0
pip install datasets accelerate peft bitsandbytes
pip install wandb tensorboard  # logging
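Before going further, it can help to confirm that the machine matches the recommendations above. The following is a minimal sketch, not part of the original setup: the function name `check_environment` and the 23 GiB threshold are assumptions (a card sold as 24 GB may report slightly less total memory), and the check degrades gracefully if `torch` is not installed yet.

```python
import sys

MIN_PYTHON = (3, 8)      # Python 3.8+ as recommended above
MIN_VRAM_GIB = 23.0      # assumption: a "24 GB" card may report slightly under 24 GiB


def check_environment():
    """Return a dict describing how this machine compares to the recommendations."""
    report = {
        "python_ok": sys.version_info >= MIN_PYTHON,
        "torch_installed": False,
        "cuda_available": False,
        "gpus": [],
    }
    try:
        import torch  # imported lazily so the check also runs before installation
    except (ImportError, OSError):
        return report

    report["torch_installed"] = True
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        for index in range(torch.cuda.device_count()):
            props = torch.cuda.get_device_properties(index)
            vram_gib = props.total_memory / 1024 ** 3
            report["gpus"].append({
                "name": props.name,
                "vram_gib": round(vram_gib, 1),
                "meets_recommendation": vram_gib >= MIN_VRAM_GIB,
            })
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `gpus` comes back empty or no entry meets the recommendation, a smaller model or quantized loading via `bitsandbytes` may be worth considering before starting a run.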
II. Data Preparation
1. Dataset format (JSON recommended)
json
[
  {
    "instruction": "Write a poem about spring",
    "input": "",
    "output": "A spring breeze gently brushes the green willow tips..."
  },
  {
    "instruction": "Translate the following sentence into English",
    "input": "今天天气真好",
    "output": "The weather is nice today."
  }
]
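A quick structural check on records in this format can catch missing fields before training starts. The sketch below is illustrative only: `validate_records`, `build_prompt`, and the prompt template are hypothetical names and choices, not something the original tutorial prescribes.

```python
import json

REQUIRED_KEYS = ("instruction", "input", "output")


def validate_records(records):
    """Raise ValueError if a record lacks a required field or holds a non-string value."""
    for index, record in enumerate(records):
        for key in REQUIRED_KEYS:
            if key not in record:
                raise ValueError(f"record {index} is missing '{key}'")
            if not isinstance(record[key], str):
                raise ValueError(f"record {index}: '{key}' must be a string")
        if not record["instruction"].strip():
            raise ValueError(f"record {index} has an empty instruction")
    return records


def build_prompt(record):
    """Join instruction and optional input into one prompt (hypothetical template)."""
    if record["input"]:
        return (
            f"Instruction: {record['instruction']}\n"
            f"Input: {record['input']}\n"
            "Response: "
        )
    return f"Instruction: {record['instruction']}\nResponse: "


# Two records in the instruction/input/output format shown above.
sample = json.loads("""
[
  {"instruction": "Write a poem about spring", "input": "",
   "output": "A spring breeze gently brushes the green willow tips..."},
  {"instruction": "Translate the following sentence into English",
   "input": "今天天气真好", "output": "The weather is nice today."}
]
""")

if __name__ == "__main__":
    for record in validate_records(sample):
        print(build_prompt(record) + record["output"])
        print("---")
```

An empty `input` is treated as "no extra context", so the prompt simply omits that line for such records.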
2. Data preprocessing
python
from datasets import load_dataset

# Load the dataset
dataset = load_datase