Computer Vision in Practice: From ViT to SAM and FastSAM
1. Visualizing Attention Scores of a Hugging Face ViT
In computer vision, understanding which parts of an image a model attends to is essential. Hugging Face's ViT (Vision Transformer) models give us a convenient way to inspect this. The following code, written for Python 3.9, visualizes ViT attention scores:
from PIL import Image
import torch.nn.functional as F
import matplotlib.pyplot as plt
from transformers import ViTImageProcessor, ViTModel

# DINO-pretrained ViT-S/8; its self-attention maps are well suited to visualization
model_name = "facebook/dino-vits8"
model = ViTModel.from_pretrained(model_name, add_pooling_layer=False)
feature_extractor = ViTImageProcessor.from_pretrained(model_name, size=480)

image = Image.open('d:/robin.jpg')
# Run the pre-trained ViT and request the per-layer attention scores
pixel_values = feature_extractor(images=image, return_tensors='pt').pixel_values
outputs = model(pixel_values, output_attentions=True)
# outputs.attentions holds one tensor per layer, shape (batch, heads, tokens, tokens)
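With the attention tensors in hand, the usual DINO-style visualization takes the [CLS] token's attention from the last layer, averages it over heads, and reshapes it to the patch grid before upsampling it to image resolution. The sketch below illustrates that step on a dummy tensor rather than the real model output (the helper name `cls_attention_map` and the small 12x12 demo grid are assumptions for illustration; with `dino-vits8` at a 480-pixel input the grid would be 60x60, since 480 / 8 = 60):

```python
import torch
import torch.nn.functional as F

def cls_attention_map(attentions, grid_size):
    """Average the [CLS] token's attention over heads and reshape to a 2D grid.

    attentions: one layer's attention, shape (batch, heads, tokens, tokens),
    where token 0 is [CLS] and the remaining tokens are image patches.
    """
    # [CLS] row, excluding its attention to itself -> (batch, heads, num_patches)
    cls_attn = attentions[:, :, 0, 1:]
    cls_attn = cls_attn.mean(dim=1)          # average over attention heads
    return cls_attn.reshape(-1, *grid_size)  # (batch, grid_h, grid_w)

# Demo on a dummy attention tensor: a 12x12 patch grid plus one [CLS] token
num_tokens = 12 * 12 + 1
attn = torch.softmax(torch.randn(1, 6, num_tokens, num_tokens), dim=-1)
amap = cls_attention_map(attn, (12, 12))
print(amap.shape)  # torch.Size([1, 12, 12])

# Upsample to image resolution for overlaying with matplotlib's imshow
amap_up = F.interpolate(amap.unsqueeze(1), size=(96, 96), mode='bilinear')
print(amap_up.shape)  # torch.Size([1, 1, 96, 96])
```

On the real model you would pass `outputs.attentions[-1]` (the last layer) as `attentions`, then overlay the upsampled map on the input image with `plt.imshow(..., alpha=0.5)`.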