开源大模型食用指南技术文档-优快云博客

开源大模型食用指南技术文档

【免费下载链接】self-llm 《开源大模型食用指南》针对中国宝宝量身打造的基于Linux环境快速微调（全参数/Lora）、部署国内外开源大模型（LLM）/多模态大模型（MLLM）教程项目地址: https://gitcode.com/datawhalechina/self-llm

1. 安装指南

1.1 环境要求

操作系统：Linux（推荐Ubuntu 20.04+）
Python版本：3.8+
GPU：NVIDIA显卡（建议显存≥16GB）
CUDA：11.7+
其他依赖：git, conda/pip

1.2 基础环境配置

# 克隆项目
git clone https://github.com/datawhalechina/self-llm.git
cd self-llm

# 创建conda环境
conda create -n llm python=3.10
conda activate llm

# 安装基础依赖
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt

2. 项目使用说明

2.1 模型选择建议

初学者推荐：Qwen1.5/InternLM2/MiniCPM
中文场景：ChatGLM/ERNIE
多模态需求：MiniCPM-o/Hunyuan3D

2.2 典型使用流程

选择目标模型（如Qwen3-8B）
查看对应模型的部署文档
下载模型权重（需自行获取授权）
执行部署脚本
通过API或Web界面交互

3. 项目API使用文档

3.1 通用API调用示例

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Qwen/Qwen3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).cuda()

inputs = tokenizer("你好，请介绍一下你自己", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0]))

3.2 特色API功能

vLLM部署：支持高并发推理
Gradio部署：快速构建Web界面
LangChain集成：支持知识库扩展
SwanLab可视化：训练过程监控

4. 项目安装方式

4.1 标准安装（推荐）

# 针对特定模型的安装（以Qwen3为例）
cd models/Qwen3
bash install.sh

4.2 Docker快速部署

# 以Qwen3为例
docker pull registry.codewithgpu.com/datawhalechina/self-llm/Qwen3
docker run -p 8000:8000 --gpus all -it registry.codewithgpu.com/datawhalechina/self-llm/Qwen3

4.3 Windows支持（有限）

通过LMStudio工具部署
仅支持部分量化模型
性能可能受限

注意事项：

大模型部署需要显存≥模型参数的2倍
首次运行会自动下载模型权重（需配置HF_TOKEN）
微调训练建议使用A100/A800等专业卡
中文输入建议添加system prompt优化效果

创作声明：本文部分内容由AI辅助生成（AIGC），仅供参考