To determine whether an image is a re-shot copy (phone-to-phone or phone-to-screen), originals and their re-shot counterparts are placed in two separate folders to build a binary-classification dataset, and a pretrained ResNet18 model is fine-tuned on it, reaching about 99% classification accuracy. Similarly, to detect blurred images, a binary dataset of original and blurred images is built and a pretrained ResNet18 is fine-tuned the same way, with results comparable to the remake classifier. To find duplicate images within a folder, a feature vector is extracted for each image and the cosine similarity between every pair of images is computed; if the similarity reaches a preset threshold, the two images are judged to be duplicates.
The implementation details of each part are described below.
1. Determining whether an image is a remake
As noted above, first prepare the training, validation, and test sets. Here the training set has just over 5,000 images and the validation set just over 1,000. Once the data is ready, load it before training:
import os
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from torchvision import datasets, models, transforms
from PIL import Image

# Target input size
target_size = (590, 849)
# Data preprocessing
transform = transforms.Compose([
    transforms.Resize(target_size),  # resize images to 590x849
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])
# Load the datasets
train_dataset = datasets.ImageFolder('F:/ajnewfortraintest/train', transform=transform)
test_dataset = datasets.ImageFolder('F:/ajnewfortraintest/test', transform=transform)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)
Here, batch_size is set to 32.
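As a quick sanity check on these numbers (assuming roughly 5,000 training images, as mentioned above), the number of batches per epoch works out as:

```python
import math

# Hypothetical dataset size from the text: ~5,000 training images
num_train_images = 5000
batch_size = 32

# A DataLoader yields ceil(N / batch_size) batches per epoch
batches_per_epoch = math.ceil(num_train_images / batch_size)
print(batches_per_epoch)  # → 157
```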
Next, initialize the model, as shown below:
# Check for a previously saved best model
best_model_path = 'model/best_model.pth'
if os.path.exists(best_model_path):
    model = models.resnet18(weights=None)  # do not load pretrained weights
    num_features = model.fc.in_features
    model.fc = nn.Linear(num_features, 2)  # two classes
    if torch.cuda.is_available():
        model.load_state_dict(torch.load(best_model_path))
    else:
        model.load_state_dict(torch.load(best_model_path, map_location=torch.device('cpu')))
    print("Loaded the best model for training.")
else:
    model = models.resnet18(weights="DEFAULT")
    num_features = model.fc.in_features
    model.fc = nn.Linear(num_features, 2)  # two classes
    print("No best model found, using pretrained ResNet18 model.")
# Freeze all convolutional layer parameters
for param in model.parameters():
    param.requires_grad = False
In the code above, since this is binary classification, the model's final fully connected layer is replaced with one that has 2 output features.
Here I use the Adam optimizer with an initial learning rate of 0.01:
# Unfreeze the final layer
for param in model.fc.parameters():
    param.requires_grad = True
# Move the model to the GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.fc.parameters(), lr=0.01)
# Learning-rate scheduler
scheduler = StepLR(optimizer, step_size=3, gamma=0.1)
# Variable for tracking the best model
best_accuracy = 0.0
writer = SummaryWriter('runs/experiment_remake')
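StepLR(optimizer, step_size=3, gamma=0.1) multiplies the learning rate by gamma every step_size epochs, i.e. lr * gamma ** (epoch // step_size). A minimal pure-Python sketch of the schedule this produces with the settings used here:

```python
def steplr_schedule(base_lr, step_size, gamma, num_epochs):
    """Learning rate at each epoch under a StepLR-style decay."""
    return [base_lr * gamma ** (epoch // step_size) for epoch in range(num_epochs)]

# The settings used above: lr=0.01, step_size=3, gamma=0.1, 10 epochs
lrs = steplr_schedule(0.01, 3, 0.1, 10)
print(lrs)
# epochs 0-2: 0.01, epochs 3-5: 0.001, epochs 6-8: 0.0001, epoch 9: 1e-05
```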
Training then runs epoch by epoch, batch by batch, for 10 epochs. After each epoch, the loss and accuracy are computed and written to the log, the model is saved, and the model is evaluated on the validation set; if the current accuracy exceeds the best so far, the current weights are saved as the best model. The training function is as follows:
def train():
    # Unfreeze the final layer
    for param in model.fc.parameters():
        param.requires_grad = True
    # Move the model to the GPU if available
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    # Loss function and optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.fc.parameters(), lr=0.01)
    # Learning-rate scheduler
    scheduler = StepLR(optimizer, step_size=3, gamma=0.1)
    # Variable for tracking the best model
    best_accuracy = 0.0
    writer = SummaryWriter('runs/experiment_remake')
    # Train the model
    for epoch in range(10):
        model.train()
        try:
            i = 0
            batch_index = 0
            total = 0
            correct = 0
            for images, labels in train_loader:
                i += images.shape[0]
                print(f'Epoch {epoch}: loaded {i} training images')
                images, labels = images.to(device), labels.to(device)
                optimizer.zero_grad()
                outputs = model(images)
                loss = criterion(outputs, labels)
                # Compute the running accuracy
                _, predicted = torch.max(outputs.data, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()
                accuracy = 100 * correct / total
                loss.backward()
                optimizer.step()
                # Log the loss
                writer.add_scalar('Loss/train', loss.item(), epoch * len(train_loader) + batch_index)
                # Log the accuracy
                writer.add_scalar('Accuracy/train', accuracy, epoch * len(train_loader) + batch_index)
                batch_index += 1
        except Exception as e:
            print(e)
        # Update the learning rate at the end of each epoch
        scheduler.step()
        model.eval()
        correct = 0
        total = 0
        with torch.no_grad():
            for images, labels in test_loader:
                images, labels = images.to(device), labels.to(device)
                outputs = model(images)
                _, predicted = torch.max(outputs.data, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()
        accuracy = 100 * correct / total
        print(f'Epoch {epoch + 1}, Accuracy: {accuracy}%')
        # Save the model after every epoch
        torch.save(model.state_dict(), f'model/jud_remark_model_epoch_{epoch + 1}.pth')
        # Save the best model
        if (correct / total) > best_accuracy:
            best_accuracy = (correct / total)
            torch.save(model.state_dict(), 'model/best_model.pth')
            print(f'Best model saved with accuracy: {accuracy}%')
Before validation, model.eval() must be called to measure the model's true performance. model.eval() puts the model in evaluation mode, which is essential for the correct behavior of layers such as Dropout and Batch Normalization: in evaluation mode, Dropout is disabled, and Batch Normalization uses the mean and variance learned during training instead of the current batch's statistics.
The steps above complete the training process; next, the model needs to be tested:
def predict(url):
    # Move the model to the GPU if available
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()  # make sure the model is in evaluation mode
    # Load the test image
    image_path = url
    image = Image.open(image_path).convert('RGB')
    # Preprocess the test image
    input_tensor = transform(image)
    input_batch = input_tensor.unsqueeze(0)  # create a mini-batch as the model input
    # Run inference on the GPU if available
    if torch.cuda.is_available():
        input_batch = input_batch.to('cuda')
    # Predict with the model
    with torch.no_grad():
        output = model(input_batch)
    # Convert the output to probabilities
    probabilities = torch.nn.functional.softmax(output[0], dim=0)
    # Get the predicted class
    _, predicted_idx = torch.max(probabilities, 0)
    predicted_label = predicted_idx.item()
    class_labels = ['original', 'reproduced']  # replace with your own class names
    print(f"Predicted class: {class_labels[predicted_label]}")
    print(f"Probabilities: {probabilities}")
    return class_labels[predicted_label]
In the code above, model.eval() is called before testing, as in validation, so that the model's true performance is measured.
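For reference, the softmax used in predict() simply exponentiates the logits and normalizes them so they sum to 1. A minimal NumPy sketch with hypothetical two-class logits:

```python
import numpy as np

def softmax(logits):
    # subtract the max for numerical stability
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()

# hypothetical two-class output of the model
logits = np.array([2.0, 0.5])
probs = softmax(logits)
print(probs)                   # class probabilities, summing to 1
print(int(np.argmax(probs)))   # → 0, i.e. the first class is predicted
```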
Below are the accuracy and loss curves seen in TensorBoard during training:
The blur-detection model is trained with exactly the same code; only the training, validation, and test sets differ.
2. Detecting duplicate images
First prepare the folder of images. The program then loads each image, extracts its feature vector with ResNet50, computes the cosine similarity between every pair of feature vectors, and checks whether the similarity reaches a preset threshold: if it does, the two images are duplicates; otherwise they are not. The code is as follows:
import torch
from torchvision import models, transforms
from PIL import Image
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
import os

# Load a pretrained ResNet model
model = models.resnet50(weights='DEFAULT')
model = torch.nn.Sequential(*list(model.children())[:-1])  # drop the final fully connected layer
model.eval()

# Image preprocessing
preprocess = transforms.Compose([
    transforms.Resize((500, 888)),  # resize images to 500x888
    transforms.CenterCrop(224),  # center-crop to 224x224
    transforms.ToTensor(),  # convert to a Tensor
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),  # normalize
])

def get_file_paths(folder_path):
    file_paths = []
    for root, dirs, files in os.walk(folder_path):
        for file in files:
            file_path = os.path.join(root, file)
            file_paths.append(file_path)
    return file_paths

def extract_features(img_path, model):
    img = Image.open(img_path).convert('RGB')
    img_tensor = preprocess(img)
    img_tensor = img_tensor.unsqueeze(0)  # add a batch dimension
    with torch.no_grad():
        features = model(img_tensor)
    features = features.view(features.size(0), -1).numpy()  # flatten the feature vector
    return features

def find_duplicates(image_paths, threshold=0.95):
    features = [extract_features(img_path, model) for img_path in image_paths]
    features = np.vstack(features)  # stack the features into a 2-D array
    similarity_matrix = cosine_similarity(features)
    duplicates = []
    for i in range(len(similarity_matrix)):
        for j in range(i + 1, len(similarity_matrix)):
            if similarity_matrix[i][j] >= threshold:
                duplicates.append((image_paths[i], image_paths[j]))
    return duplicates

# Example usage
image_paths = get_file_paths("F:/aj/hnzr/150000_20240219160200/0124021939893728_1141255686_20240219")
duplicates = find_duplicates(image_paths)
print("Detected duplicates:", duplicates)
In the code above, a ResNet50 model extracts the image features; model = torch.nn.Sequential(*list(model.children())[:-1]) removes the final fully connected layer, and model.eval() puts the model in evaluation mode. The similarity threshold is set to threshold=0.95. Finally, the paths of the duplicate images are printed. In testing, this achieved the expected results.
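Cosine similarity measures the angle between two feature vectors, ignoring their magnitude, which is why scaled copies of the same features still score as duplicates. A minimal NumPy sketch with toy vectors:

```python
import numpy as np

def cosine_sim(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])   # same direction as a, different scale
c = np.array([3.0, -1.0, 0.0])  # different direction

print(cosine_sim(a, b))          # ≈ 1.0: parallel vectors count as duplicates
print(cosine_sim(a, c) >= 0.95)  # → False: below the 0.95 threshold
```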