
LLaVA-HR Installation and Usage Tutorial

LLaVA-HR: High-Resolution Large Language-Vision Assistant. Project page: https://gitcode.com/gh_mirrors/ll/LLaVA-HR

1. Project Directory Structure

The LLaVA-HR repository is laid out as follows:

LLaVA-HR/
├── assets/
├── llava_hr/
├── playground/
│   └── data/
│       └── prompts/
├── scripts/
├── .gitattributes
├── .gitignore
├── Evaluation.md
├── LICENSE
├── README.md
└── pyproject.toml

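Directory overview

assets/: static resource files for the project.
llava_hr/: the main source directory, containing the model implementation and related functionality.
playground/data/prompts/: prompt data used for training and evaluation.
scripts/: shell scripts for training, evaluating, and launching the project.
.gitattributes: Git attributes configuration.
.gitignore: Git ignore rules.
Evaluation.md: the evaluation guide.
LICENSE: the project license.
README.md: the project introduction and usage instructions.
pyproject.toml: the Python project configuration file.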
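
After cloning, a quick way to confirm that a checkout matches the layout above is to test for each listed entry. The following is a minimal sketch: the function name `check_layout` is hypothetical and not part of the repository, and the entry list is simply copied from the tree above.

```shell
# check_layout ROOT: print every expected entry that is missing under ROOT.
# Returns 0 if all entries exist, 1 otherwise.
# NOTE: hypothetical helper; the entry list mirrors the tree in this tutorial.
check_layout() {
    cl_root=${1:-.}
    cl_missing=0
    for cl_entry in assets llava_hr playground/data/prompts scripts \
                    Evaluation.md LICENSE README.md pyproject.toml; do
        if [ ! -e "$cl_root/$cl_entry" ]; then
            echo "missing: $cl_entry"
            cl_missing=1
        fi
    done
    return "$cl_missing"
}
```

Calling `check_layout LLaVA-HR` from the parent directory prints nothing when every entry is present and one `missing: ...` line per absent entry otherwise.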

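2. Launch Scripts

The launch scripts for LLaVA-HR live mainly under the scripts/ directory. The key ones are:

scripts/v1_5/pretrain_llava_hr.sh: runs low-resolution pretraining.
scripts/v1_5/train_eval_llava_hr.sh: runs high-resolution instruction tuning.
scripts/v1_5/eval.sh: runs model evaluation.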

Launch steps

Clone the repository:

git clone https://github.com/luogen1996/LLaVA-HR.git
cd LLaVA-HR

Install the dependencies:

conda create -n llava-hr python=3.10 -y
conda activate llava-hr
pip install --upgrade pip
pip install -e .
pip install ninja
pip install flash-attn --no-build-isolation
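
The commands above assume that git and conda are already installed. Building flash-attn from source generally also needs a CUDA toolchain, so checking for `nvcc` can save a failed build (that check is an assumption about your setup, not something the project states). A small helper, hypothetically named `missing_tools`, can report what is absent from PATH before you start:

```shell
# missing_tools TOOL...: print each tool that is not found on PATH.
# Returns 0 if every tool is available, 1 otherwise.
# NOTE: hypothetical helper, not shipped with LLaVA-HR.
missing_tools() {
    mt_status=0
    for mt_tool in "$@"; do
        if ! command -v "$mt_tool" >/dev/null 2>&1; then
            echo "not found: $mt_tool"
            mt_status=1
        fi
    done
    return "$mt_status"
}
```

For example, `missing_tools git conda nvcc` prints one line per missing tool and returns a non-zero status if any of them is absent.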

Start pretraining:
```
bash scripts/v1_5/pretrain_llava_hr.sh
```

Start instruction tuning:

bash scripts/v1_5/train_eval_llava_hr.sh

Run the evaluation:
```
bash scripts/v1_5/eval.sh
```
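
Since the three stages are run one after another, they can be chained so that a failure in one stage stops the rest. This is a sketch under the assumption that each script can be launched without arguments, as in the commands above; `run_stages` is a hypothetical helper, not a script shipped with the project.

```shell
# run_stages SCRIPT...: run each script with bash, in order.
# Stops and returns 1 at the first script that is missing or exits non-zero.
# NOTE: hypothetical helper; assumes the scripts need no arguments.
run_stages() {
    for rs_script in "$@"; do
        if [ ! -f "$rs_script" ]; then
            echo "not found: $rs_script" >&2
            return 1
        fi
        echo "running: $rs_script"
        bash "$rs_script" || { echo "failed: $rs_script" >&2; return 1; }
    done
}
```

From the repository root, the full pipeline would then be `run_stages scripts/v1_5/pretrain_llava_hr.sh scripts/v1_5/train_eval_llava_hr.sh scripts/v1_5/eval.sh`.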

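3. Configuration File

The main configuration file for LLaVA-HR is pyproject.toml, which declares the project's dependencies and build settings. The following is an illustrative excerpt; the exact fields and version pins in the repository may differ: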

[tool.poetry]
name = "LLaVA-HR"
version = "0.1.0"
description = "High-Resolution Large Language-Vision Assistant"
authors = ["Gen Luo <gen.luo@example.com>"]

[tool.poetry.dependencies]
python = "^3.10"
torch = "^1.10.0"
transformers = "^4.12.0"
...

[tool.poetry.dev-dependencies]
pytest = "^6.2.5"
...
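
To see what a given table of the real pyproject.toml declares, you can print just that section. The sketch below is a line-based filter, not a TOML parser; `toml_section` is a hypothetical helper, and the table name in the usage line comes from the illustrative excerpt above, so it may not exist in the actual file.

```shell
# toml_section FILE NAME: print the non-empty lines under the [NAME] header,
# stopping at the next header line.
# NOTE: hypothetical helper; matches headers literally, line by line.
toml_section() {
    awk -v want="[$2]" '
        /^\[/ { inside = ($0 == want); next }   # any header toggles the state
        inside && NF { print }                  # emit non-blank lines in the section
    ' "$1"
}
```

For instance, `toml_section pyproject.toml tool.poetry.dependencies` would list the dependency lines of that table, or print nothing if the file has no such table.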