AttackVLM Open-Source Project Tutorial

AttackVLM [NeurIPS-2023] (Annual Conference on Neural Information Processing Systems). Project repository: https://gitcode.com/gh_mirrors/at/AttackVLM

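1. Project Overview

AttackVLM is an open-source project for evaluating the adversarial robustness of large vision-language models (VLMs). It targets the most realistic and high-risk setting, in which the adversary has only black-box access to the system and tries to deceive the model into returning a targeted response by subtly manipulating the most vulnerable modality (e.g., vision). By crafting adversarial examples and transferring them to other VLMs, AttackVLM offers a quantitative understanding of the adversarial vulnerability of large VLMs and calls for a more thorough examination of their potential security flaws before they are deployed in practice.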

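2. Quick Start

2.1 Environment Preparation

First, make sure your system meets the following requirements:

Platform: Linux
Hardware: A100 PCIe 40G GPU
Dependencies: lmdb, tqdm, wandb, torchvision, etc.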
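
The requirements above can be checked before installing anything. The following is a minimal preflight sketch, not part of AttackVLM: the function names `check_platform` and `check_gpu` are hypothetical, and the script only reports what it finds rather than enforcing the A100 requirement.

```shell
# Preflight sketch (hypothetical helper names, not part of AttackVLM).

check_platform() {
  # Succeeds only on Linux, the platform listed in the requirements.
  [ "$(uname -s)" = "Linux" ]
}

check_gpu() {
  # Prints GPU name and total memory if nvidia-smi is on PATH; fails otherwise.
  command -v nvidia-smi >/dev/null 2>&1 || return 1
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
}

if check_platform; then
  echo "platform: Linux"
else
  echo "platform: $(uname -s) (not the listed platform)"
fi

if ! check_gpu; then
  echo "gpu: nvidia-smi not found or no GPU visible"
fi
```

If the GPU line shows a card with less than 40 GB of memory, the batch sizes used later in this tutorial may need to be reduced.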

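2.2 Install Dependencies

# Create and activate the conda environment.
# Note: the environment name "ldm" matches the environment.yaml shipped with the
# CompVis Stable Diffusion repository (cloned in 2.4), so this command is presumably
# meant to be run from inside that checkout.
conda env create -f environment.yaml
conda activate ldm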

2.3 Clone the Project

git clone https://github.com/yunqing-me/AttackVLM.git
cd AttackVLM

2.4 Generate Target Images

Use Stable Diffusion to generate the target images:

# Clone the Stable Diffusion project
git clone https://github.com/CompVis/stable-diffusion.git
cd stable-diffusion

# Run the target image generation script (txt2img_coco.py)
python txt2img_coco.py \
  --ddim_eta 0.0 \
  --n_samples 10 \
  --n_iter 1 \
  --scale 7.5 \
  --ddim_steps 50 \
  --plms \
  --skip_grid \
  --ckpt /_model_pool/sd-v1-4-full-ema.ckpt \
  --from-file '/path/to/your/coco_captions_file.txt' \
  --outdir '/path/to/your/targeted_images'
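
The `--from-file` argument points at a text file of captions. The sketch below shows one way to draw a random subset from a larger caption file, assuming the file holds one caption per line (as the upstream Stable Diffusion `txt2img.py` reads it; whether `txt2img_coco.py` does the same is an assumption). The function name `sample_captions` and the file names in the usage comment are hypothetical.

```shell
# Sketch: sample N captions from a larger file (hypothetical helper, not part of AttackVLM).

sample_captions() {
  # usage: sample_captions <all_captions.txt> <n> <out.txt>
  src=$1
  n=$2
  out=$3
  # Drop blank lines first so every sampled line is a usable prompt,
  # then pick n lines at random without replacement.
  grep -v '^[[:space:]]*$' "$src" | shuf -n "$n" > "$out"
}

# Example call with hypothetical file names:
# sample_captions coco_captions_all.txt 10000 coco_captions_10k.txt
```

The resulting file can then be passed to `--from-file`. Because `shuf` is unseeded here, each run produces a different subset; keep the output file if the experiment needs to be repeated.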

2.5 Run the Adversarial Attack

# Activate the unidiffuser environment
conda activate unidiffuser

# Run the adversarial attack
bash _train_adv_img_trans.sh

# Generate text responses for the adversarial images
python _eval_i2t_dataset.py \
  --batch_size 100 \
  --mode i2t \
  --img_path 'dir of white-box transfer images' \
  --output 'dir of white-box transfer captions'
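
Before running the evaluation it can help to confirm that the image directory passed to `--img_path` actually contains images. The following sketch counts image files recursively; the helper name `count_images` is hypothetical, and the assumption that images are stored as `.png`, `.jpg`, or `.jpeg` files is not confirmed by the project.

```shell
# Sketch: count image files under a directory (hypothetical helper, not part of AttackVLM).

count_images() {
  # usage: count_images <dir>
  # Prints the number of .png/.jpg/.jpeg files found recursively (case-insensitive).
  find "$1" -type f \( -iname '*.png' -o -iname '*.jpg' -o -iname '*.jpeg' \) | wc -l | tr -d ' '
}

# Example call, using the same placeholder as the command above:
# count_images 'dir of white-box transfer images'
```

A count of 0 usually means the attack step wrote its output somewhere else or did not finish.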