Llama 2 ONNX Project User Guide

史跃骏Erika

Licensed under CC BY-SA 4.0

Article link: https://blog.youkuaiyun.com/gitblog_01072/article/details/142838093

Llama-2-Onnx project repository: https://gitcode.com/gh_mirrors/ll/Llama-2-Onnx

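1. Project Overview

Llama 2 ONNX is a version of the Llama 2 models that Microsoft optimized and open-sourced in the ONNX format. Llama 2 is a collection of pretrained and fine-tuned generative text models provided by Meta. Subject to the Llama Community License Agreement, the project lets users use, modify, redistribute, and create derivative works of the optimized version contributed by Microsoft.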

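Key Features

Optimized build: the models are provided in ONNX format for efficient inference.
Multiple variants: several model sizes and precisions are available (for example 7B and 13B, float16 and float32).
Licensing: the project is governed by the Llama Community License Agreement; use and modification are permitted under the terms of that agreement.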

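2. Quick Start

2.1 Preparing the Environment

Before you begin, make sure Git LFS (Large File Storage) is installed so that the large model files can be downloaded efficiently.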

# Install Git LFS (Debian/Ubuntu; the script below configures the apt repository)
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
git lfs install
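
Before cloning, it can help to confirm that both Git and Git LFS are actually reachable from the shell. The following is a minimal sketch, not part of the project: the helper name `tool_version` is hypothetical, and it only assumes that `git --version` and `git lfs version` print a version line when the tools are installed.

```python
import shutil
import subprocess
from typing import List, Optional


def tool_version(command: List[str]) -> Optional[str]:
    """Return the first output line of `command`, or None if it is unavailable."""
    # The executable must be on PATH before we try to run it.
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        # For example, `git lfs version` fails when Git LFS is not installed.
        return None
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    for cmd in (["git", "--version"], ["git", "lfs", "version"]):
        version = tool_version(cmd)
        status = version if version is not None else "NOT FOUND"
        print(" ".join(cmd), "->", status)
```

If the second line reports NOT FOUND, repeat the installation steps above before continuing.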

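2.2 Cloning the Project and Initializing a Submodule

Each Llama 2 model variant is stored in its own Git submodule. Clone the repository, then initialize and update only the submodule for the variant you need, replacing <chosen_submodule> with its name.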

# Clone the project
git clone https://github.com/microsoft/Llama-2-Onnx.git
cd Llama-2-Onnx

# Initialize and update the chosen submodule
git submodule init <chosen_submodule>
git submodule update
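
The submodule name encodes the model variant. The sketch below shows one way to assemble it from a size, a precision, and a fine-tuned flag. It rests on an assumption: the only name that appears in this guide is 7B_FT_float16, so the naming scheme for other variants (and for non-fine-tuned models, written here without the FT part) is inferred and should be checked against the repository's .gitmodules file. The function names are hypothetical.

```python
from typing import List

# Sizes and precisions mentioned in this guide.
VALID_SIZES = ("7B", "13B")
VALID_PRECISIONS = ("float16", "float32")


def submodule_name(size: str, precision: str, fine_tuned: bool = True) -> str:
    """Build a submodule name such as '7B_FT_float16' (naming scheme assumed)."""
    if size not in VALID_SIZES:
        raise ValueError(f"unknown model size: {size!r}")
    if precision not in VALID_PRECISIONS:
        raise ValueError(f"unknown precision: {precision!r}")
    parts = [size] + (["FT"] if fine_tuned else []) + [precision]
    return "_".join(parts)


def submodule_commands(name: str) -> List[List[str]]:
    """Return the two git commands from this section for the given submodule."""
    return [
        ["git", "submodule", "init", name],
        ["git", "submodule", "update"],
    ]


if __name__ == "__main__":
    for command in submodule_commands(submodule_name("7B", "float16")):
        print(" ".join(command))
```

The script only prints the commands; run them from inside the cloned Llama-2-Onnx directory.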

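2.3 Running the Example Code

The project ships a minimum working example and a more complete chat application example. The following command runs the minimum working example with the fine-tuned 7B float16 model:

# Run the minimum working example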
python MinimumExample/Example_ONNX_LlamaV2.py \
    --onnx_file 7B_FT_float16/ONNX/LlamaV2_7B_FT_float16.onnx \
    --embedding_file 7B_FT_float16/embeddings.pth \
    --tokenizer_path tokenizer.model \
    --prompt "What is the lightest element?"
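
The command above fails if the submodule was not downloaded, so checking the file arguments first can save a confusing error. The following sketch builds the same argument list for a given submodule and reports which files are missing. The helper names are hypothetical, and the path layout for submodules other than 7B_FT_float16 is an assumption extrapolated from the command above.

```python
import shlex
from pathlib import Path, PurePosixPath
from typing import List

SCRIPT = "MinimumExample/Example_ONNX_LlamaV2.py"
FILE_FLAGS = ("--onnx_file", "--embedding_file", "--tokenizer_path")


def build_example_command(
    submodule: str, prompt: str, tokenizer_path: str = "tokenizer.model"
) -> List[str]:
    """Build the argument list for the minimum example (path layout assumed)."""
    base = PurePosixPath(submodule)
    return [
        "python", SCRIPT,
        "--onnx_file", str(base / "ONNX" / f"LlamaV2_{submodule}.onnx"),
        "--embedding_file", str(base / "embeddings.pth"),
        "--tokenizer_path", tokenizer_path,
        "--prompt", prompt,
    ]


def missing_files(command: List[str], root: str = ".") -> List[str]:
    """Return the script and file arguments that do not exist under `root`."""
    candidates = [command[1]]
    candidates += [command[command.index(flag) + 1] for flag in FILE_FLAGS]
    return [path for path in candidates if not (Path(root) / path).exists()]


if __name__ == "__main__":
    cmd = build_example_command("7B_FT_float16", "What is the lightest element?")
    print(shlex.join(cmd))
    for path in missing_files(cmd):
        print("missing:", path)
```

Run it from the repository root; an empty "missing" list suggests the submodule and tokenizer are in place and the command can be executed as printed.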