OpenAI CLIP (code usage examples)

leaves dancing in the wind

Last modified 2023-03-31 10:22:51

Tags: computer vision, artificial intelligence, PyTorch, natural language processing

First published 2023-03-09 19:08:42

Article link: https://blog.youkuaiyun.com/weixin_43860330/article/details/129428618

OpenAI CLIP (bridging NLP and CV)

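OpenAI released Contrastive Language-Image Pre-training (CLIP) in January 2021. It is a multimodal model trained with contrastive learning on text-image pairs: by contrasting each image with its corresponding text description, the model learns which texts and images match. It is open source and multimodal, and it can be used zero-shot, few-shot, or with supervised training.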
Schematic from the original paper:
[Figure: CLIP schematic from the original paper]
Pseudocode for the core algorithm, from the original paper:
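
As a rough illustration of the contrastive objective, the NumPy sketch below computes a symmetric cross-entropy loss over a batch of already-computed image and text embeddings. It is a minimal sketch, not the authors' code: the function names and the fixed `temperature=0.07` are assumptions made for illustration (the paper learns the temperature during training), and the image and text encoders are left out entirely.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over rows, computed in a numerically stable way."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss for n matched (image, text) embedding pairs.

    image_emb, text_emb: arrays of shape [n, d]; row i of each is a matched pair.
    temperature: fixed here for simplicity (an assumption of this sketch).
    """
    image_emb = l2_normalize(image_emb)
    text_emb = l2_normalize(text_emb)

    # Pairwise cosine similarities, scaled by the temperature: shape [n, n].
    logits = image_emb @ text_emb.T / temperature

    # The correct text for image i is text i, so the targets are the diagonal.
    labels = np.arange(logits.shape[0])
    loss_images = cross_entropy(logits, labels)    # image -> text direction
    loss_texts = cross_entropy(logits.T, labels)   # text -> image direction
    return (loss_images + loss_texts) / 2
```

Calling `clip_contrastive_loss(image_emb, text_emb)` on a batch where row i of each array belongs to the same pair gives a low loss when matched pairs are more similar than mismatched ones, which is the behavior training pushes the two encoders toward.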

Original OpenAI CLIP project:

https://github.com/openai/CLIP

Usage

(1) Original version
Installation:

$ conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0
$ pip install ftfy regex tqdm
$ pip install git+https://github.com/openai/CLIP.git

If you have no GPU or CUDA, it also runs directly on the CPU.
Source code:

import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("cat.png")).unsqueeze(0).to(device)  # cat.png: a local test image (the upstream example uses CLIP.png, the CLIP flow diagram)
text = clip.tokenize( ["cat in basket", "python", "a cute cat","pytorch"