Open-Source Project Tutorial: LongChat

Article link: https://blog.youkuaiyun.com/gitblog_00851/article/details/147007440

LongChat: the official repository for LongChat and LongEval. Project address: https://gitcode.com/gh_mirrors/lo/LongChat

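1. Project Overview

LongChat is an open-source project for training and evaluating long-context chatbot models. Developed by Dacheng Li and collaborators, it supports context lengths of up to 32K tokens with Llama 2 based models; the commands in this tutorial use the 16K variants. Its goal is to advance the long-context capabilities of open-source chatbots by providing efficient tools and models.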

2. Quick Start

Before using LongChat, make sure Python 3.10 and the related dependencies are available in your environment.

Environment Setup

conda create -n longeval python=3.10
conda activate longeval
pip install longchat

To build the project from source instead, use the following commands:

git clone https://github.com/DachengLi1/LongChat/
cd LongChat/
pip install -e .

If you need to test very long sequence lengths, install FlashAttention (published on PyPI as flash-attn):

pip install flash-attn
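
After the setup steps, a quick sanity check can confirm that the environment looks as expected. The following is a minimal sketch, not part of the LongChat repository: the helper names are hypothetical, treating 3.10 as a minimum version is an assumption, and `flash_attn` is assumed to be the import name of the optional FlashAttention package.

```python
import importlib.util
import sys

REQUIRED_PYTHON = (3, 10)  # version named in the setup instructions (assumed minimum)


def python_ok(version=None, required=REQUIRED_PYTHON):
    """Return True if the interpreter version is at least `required`."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= required


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    # `longchat` is the package installed above; `flash_attn` is the assumed
    # import name of the optional FlashAttention package.
    for name in ("longchat", "flash_attn"):
        print(f"{name}: {'found' if is_installed(name) else 'missing'}")
```

A "missing" result for `flash_attn` is only a concern if you plan to run the very long sequence tests.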

Model Training

The following example command trains a LongChat model with distributed training:

python -m torch.distributed.run --nproc_per_node=8 \
longchat/train/fine_tune/train_condense_16K.py \
--model_name_or_path <path-to-llama> \
--data_path data/dummy_conversation.json \
--bf16 \
--output_dir outputs \
--num_train_epochs 3 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 1 \
--evaluation_strategy no \
--save_strategy steps \
--save_steps 1000 \
--save_total_limit 1 \
--learning_rate 2e-5 \
--weight_decay 0. \
--warmup_ratio 0.03 \
--lr_scheduler_type "cosine" \
--logging_steps 1 \
--fsdp "full_shard auto_wrap" \
--fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
--tf32 True \
--model_max_length 16384 \
--gradient_checkpointing True \
--lazy_preprocess True

Note that this script assumes 8 A100 GPUs and uses the dummy data in the repository purely as an example. Adapt it to your own setup.
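
When adapting the command to different hardware, it helps to see how the batch-related flags combine. The sketch below works through the arithmetic for the values in the example command and for a hypothetical 4-GPU machine. The function names are illustrative only. Keeping the global batch size constant is an assumption about what you may want to preserve; the source does not recommend it.

```python
def global_batch_size(num_gpus, per_device_batch, grad_accum_steps):
    """Sequences contributing to one optimizer step across all GPUs."""
    return num_gpus * per_device_batch * grad_accum_steps


def accum_steps_for(target_global_batch, num_gpus, per_device_batch):
    """Gradient accumulation steps needed to reach `target_global_batch`."""
    per_step = num_gpus * per_device_batch
    if target_global_batch % per_step != 0:
        raise ValueError("target batch is not a multiple of num_gpus * per_device_batch")
    return target_global_batch // per_step


# Values taken from the example command:
# --nproc_per_node=8, --per_device_train_batch_size 1, --gradient_accumulation_steps 1
reference = global_batch_size(num_gpus=8, per_device_batch=1, grad_accum_steps=1)

# Upper bound on tokens per step: every sequence at --model_max_length 16384.
max_tokens_per_step = reference * 16384

# Hypothetical 4-GPU machine: keep the same global batch via accumulation.
steps_on_4_gpus = accum_steps_for(reference, num_gpus=4, per_device_batch=1)

print(reference, max_tokens_per_step, steps_on_4_gpus)  # 8 131072 2
```

This only preserves the batch size. With fewer GPUs there is also less aggregate memory for FSDP to shard across, so a 16K-token run may still not fit.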

3. Use Cases and Best Practices

Model Evaluation

To evaluate a LongChat model on the coarse-grained topic benchmark, use the following commands:

cd longeval
python3 eval.py --model-name-or-path lmsys/longchat-13b-16k --task topics --longchat_flash_attn

For a new model, pick a task from the task list ("topics" or "lines") and replace <your-model> with the path to your model:

python3 eval.py --model-name-or-path <your-model> --task <task>

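If your model needs memory-efficient FlashAttention to run the very long tests and you hit out-of-memory problems, please open an issue on the repository. The commands the authors used are included in the release blog post.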
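
To run both tasks for one model without retyping the command, the invocation can be assembled programmatically. This is a minimal sketch that uses only the flags shown above. `build_eval_command` is a hypothetical helper, not something the repository provides.

```python
import shlex


def build_eval_command(model, task, flash_attn=False):
    """Return the argument list for one eval.py run (to be launched from longeval/)."""
    if task not in ("topics", "lines"):
        raise ValueError(f"unknown task: {task!r}")
    cmd = ["python3", "eval.py", "--model-name-or-path", model, "--task", task]
    if flash_attn:
        # Flag taken from the LongChat example command above.
        cmd.append("--longchat_flash_attn")
    return cmd


if __name__ == "__main__":
    for task in ("topics", "lines"):
        cmd = build_eval_command("lmsys/longchat-13b-16k", task, flash_attn=True)
        print(shlex.join(cmd))
        # To actually run it: subprocess.run(cmd, cwd="longeval", check=True)
```

Building the command as a list rather than a single string avoids quoting problems when a model path contains spaces.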

Generating New Test Cases

To generate new test cases, use the following command:

python3 generate_testcases.py <path-to-generate-testcases-configuration>

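Replace <path-to-generate-testcases-configuration> with the path to a YAML file that holds the test-case generation settings. longeval/generate_testcases_configs.yaml provides a default configuration; adjust its options to customize the generated test cases.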
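
To give a feel for what a retrieval-style test case involves, here is a toy illustration of a "lines"-like prompt: a list of numbered records followed by a question about one of them. It is a sketch under assumptions and does not reproduce the repository's generator. The record wording, the function name, and the parameters are all hypothetical, and the real output format and YAML options may differ.

```python
import random


def make_line_retrieval_case(num_lines, seed=0):
    """Build a toy prompt of numbered records plus the expected answer."""
    rng = random.Random(seed)  # seeded so the same case can be regenerated
    values = [rng.randint(10000, 99999) for _ in range(num_lines)]
    lines = [
        f"line {i}: REGISTER_CONTENT is <{v}>"
        for i, v in enumerate(values, start=1)
    ]
    target = rng.randint(1, num_lines)  # which record the question asks about
    question = f"What is the REGISTER_CONTENT in line {target}?"
    prompt = "\n".join(lines + [question])
    return prompt, values[target - 1]


prompt, expected = make_line_retrieval_case(num_lines=5)
print(prompt)
print("expected answer:", expected)
```

Increasing `num_lines` lengthens the prompt, which is how a test of this kind probes longer contexts.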