QLoRA Fine-Tuning of the DeepSeek-7B-chat Model

In this post, I walk through fine-tuning the DeepSeek-7B-chat model with QLoRA. QLoRA is an efficient fine-tuning technique that makes it possible to adapt a large language model to a custom task on limited compute resources.

1. Environment Setup

First, install the required Python packages:

pip install transformers datasets pandas peft bitsandbytes accelerate

The versions used in this post are:
transformers 4.38.0
peft 0.10.0
accelerate 0.26.0
datasets 2.14.6
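
If you want to confirm your environment matches, the installed versions can be checked with a small optional snippet (not part of the original setup):

import transformers, peft, accelerate, datasets

# Print the installed versions; they should match the ones listed above
print(transformers.__version__)   # 4.38.0
print(peft.__version__)           # 0.10.0
print(accelerate.__version__)     # 0.26.0
print(datasets.__version__)       # 2.14.6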

2. Data Preparation

We fine-tune on a HarmonyOS training set. The data needs to be converted into the standard instruction-input-output format:

import pandas as pd
from datasets import Dataset

# Read the training data and wrap it in a Hugging Face Dataset
df = pd.read_json('train.json')
ds = Dataset.from_pandas(df)
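
For reference, each record in train.json is expected to carry instruction, input, and output fields. The example below is purely illustrative; the actual HarmonyOS data is not shown in this post:

[
  {
    "instruction": "Answer the following HarmonyOS/ArkTS development question.",
    "input": "How do I declare a state variable inside an ArkTS component?",
    "output": "Use the @State decorator, e.g. @State count: number = 0."
  }
]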

3. Model Loading and Quantization Configuration

We load DeepSeek-7B-chat with the Hugging Face transformers library and apply a quantization configuration to reduce memory usage.

Install huggingface_hub:

pip install -U huggingface_hub

Set a download mirror via HF_ENDPOINT if needed. On Windows (PowerShell):

$env:HF_ENDPOINT = "https://hf-mirror.com"

On Linux:

export HF_ENDPOINT=https://hf-mirror.com 

Download the model weights:

huggingface-cli download deepseek-ai/deepseek-llm-7b-chat --local-dir ./model_temp/deepseek-llm-7b-chat --local-dir-use-symlinks False

Then load the tokenizer and apply the quantization configuration in Python:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Load the tokenizer from the locally downloaded model directory
tokenizer = AutoTokenizer.from_pretrained('./model_temp/deepseek-llm-7b-chat/', use_fast=False, trust_remote_code=True)
tokenizer.padding_side = 'right'

# Create the 4-bit (NF4) quantization config
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.half,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True
)

# Load the model with the quantization config
model = AutoModelForCausalLM.from_pretrained(
    './model_temp/deepseek-llm-7b-chat/',
    trust_remote_code=True, 
    torch_dtype=torch.half, 
    low_cpu_mem_usage=True,
    quantization_config=quantization_config
)
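
Two optional steps are often useful at this point; neither is strictly required by the script above. One is checking the 4-bit model's memory footprint, the other is enabling input gradients, a common precaution when training a quantized model with gradient checkpointing as we do in section 6:

# Rough sanity check: report the quantized model's memory footprint
print(f"Memory footprint: {model.get_memory_footprint() / 1024**3:.2f} GB")

# Commonly needed so gradient checkpointing works with a frozen, quantized base model
model.enable_input_require_grads()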

4. Data Processing and Formatting

We define a processing function that formats each training example into inputs the model can consume:

def process_func(example):
    MAX_LENGTH = 384
    # Tokenize the prompt (instruction + input) and the response separately,
    # using DeepSeek's "User: ... Assistant: ..." chat format
    instruction = tokenizer(f"User: {example['instruction']+example['input']}\n\n", add_special_tokens=False)
    response = tokenizer(f"Assistant: {example['output']}<|end▁of▁sentence|>", add_special_tokens=False)
    input_ids = instruction["input_ids"] + response["input_ids"] + [tokenizer.pad_token_id]
    attention_mask = instruction["attention_mask"] + response["attention_mask"] + [1]
    # Mask the prompt tokens with -100 so the loss is computed only on the response
    labels = [-100] * len(instruction["input_ids"]) + response["input_ids"] + [tokenizer.pad_token_id]
    if len(input_ids) > MAX_LENGTH:
        input_ids = input_ids[:MAX_LENGTH]
        attention_mask = attention_mask[:MAX_LENGTH]
        labels = labels[:MAX_LENGTH]
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "labels": labels
    }

tokenized_id = ds.map(process_func, remove_columns=ds.column_names)
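
As an optional sanity check, one processed example can be decoded back into text to verify the User/Assistant template and the label masking (field names follow process_func above):

# Decode the full input sequence of the first sample
print(tokenizer.decode(tokenized_id[0]["input_ids"]))

# Decode only the unmasked label tokens (prompt positions are set to -100)
print(tokenizer.decode([t for t in tokenized_id[0]["labels"] if t != -100]))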

5. LoRA Configuration and Application

We use the PEFT library to configure and apply LoRA, which greatly reduces the number of trainable parameters:

from peft import LoraConfig, TaskType, get_peft_model

# Configure LoRA
config = LoraConfig(
    task_type=TaskType.CAUSAL_LM, 
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    inference_mode=False,
    r=4,
    lora_alpha=32,
    lora_dropout=0.1
)

# Apply the LoRA config to the model
model = get_peft_model(model, config)
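
It is worth confirming that only the LoRA matrices are now trainable; PEFT provides a helper for this:

# Report trainable vs. total parameters; only the injected LoRA weights should be trainable
model.print_trainable_parameters()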

6. Training Configuration and Execution

We train with the Trainer class from the transformers library:

from transformers import TrainingArguments, Trainer, DataCollatorForSeq2Seq

# Configure training arguments
args = TrainingArguments(
    output_dir="./output/DeepSeek",
    per_device_train_batch_size=16,
    gradient_accumulation_steps=1,
    logging_steps=10,
    num_train_epochs=30,
    save_steps=100,
    learning_rate=1e-4,
    save_on_each_node=True,
    gradient_checkpointing=True,
    optim="paged_adamw_32bit"
)

# Create the trainer
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_id,
    data_collator=DataCollatorForSeq2Seq(tokenizer=tokenizer, padding=True),
)

# Start training
trainer.train()
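
If training is interrupted, it can usually be resumed from the latest checkpoint saved under output_dir (standard Trainer behavior; this requires at least one checkpoint to exist):

# Resume from the most recent checkpoint in output_dir instead of starting over
trainer.train(resume_from_checkpoint=True)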

7. Model Testing and Saving

After training finishes, we can test the model and save the result:

def test_model(text):
    # Build the same User prompt format used during training and generate a reply
    inputs = tokenizer(f"User: {text}\n\n", return_tensors="pt")
    outputs = model.generate(**inputs.to(model.device), max_new_tokens=100)
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(f"Input: {text}")
    print(f"Output: {result}")
    return result

# Example test (asks, in Chinese, how to write a constructor for an ArkTS class with imageUrl and isAdd properties)
test_model("在ArkTS中,如何为一个包含imageUrl(字符串类型)和isAdd(布尔类型)属性的类创建构造函数?")

# Save the fine-tuned LoRA adapter weights
model.save_pretrained("./output/DeepSeek/final_model")
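
To reuse the saved adapter later, the base model can be reloaded and the adapter attached with PEFT. A minimal sketch, assuming the same local paths as above:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Reload the base model (the 4-bit quantization config could be reused here as well)
base = AutoModelForCausalLM.from_pretrained(
    './model_temp/deepseek-llm-7b-chat/',
    trust_remote_code=True,
    torch_dtype=torch.half
)
tokenizer = AutoTokenizer.from_pretrained('./model_temp/deepseek-llm-7b-chat/', use_fast=False, trust_remote_code=True)

# Attach the fine-tuned LoRA adapter saved above
model = PeftModel.from_pretrained(base, "./output/DeepSeek/final_model")
model.eval()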

8. Waiting for the Run to Finish

(Screenshot of the training run.)
