Machine Learning HW3
Task
Image classification

Data download
Baidu Netdisk:
Link: https://pan.baidu.com/s/1gEiw4nIYDA4puMIhqBI5Og?pwd=pwyl
Extraction code: pwyl
Results
Passed every baseline up to and including the strong baseline; the public score is 0.2 short of the boss baseline.

Improvements
2.1 Simple Baseline (acc>0.50099)
Run the sample code provided by the course as-is.
2.2 Medium Baseline (acc>0.73207)
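Apply data augmentation to the dataset and train for longer. The image transforms serve two purposes: they make the model more robust, and they effectively enlarge the training set.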
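    import torchvision.transforms as transforms

    # Test images are only resized and converted to tensors
    test_tfm = transforms.Compose([
        transforms.Resize((128, 128)),
        transforms.ToTensor(),
    ])

    # Training images are additionally augmented with several commonly effective transforms
    train_tfm = transforms.Compose([
        # Resize the image into a fixed shape (height = width = 128)
        transforms.Resize((128, 128)),
        # You may add some transforms here.
        transforms.RandomHorizontalFlip(p=0.5),  # flip horizontally with 50% probability
        transforms.RandomVerticalFlip(p=0.5),  # flip vertically with 50% probability
        transforms.RandomCrop(128, padding=10),  # pad 10 px on each side, then crop back to 128x128
        # Left disabled. For 3-channel input, torchvision's RandomGrayscale returns a
        # 3-channel image (r == g == b), so it would still match in_channels=3.
        # transforms.RandomGrayscale(p=0.1),
        transforms.ColorJitter(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.1),  # jitter brightness, contrast, saturation and hue
        # transforms.RandomInvert(),  # invert the image colors
        # ToTensor() should be the last one of the transforms.
        transforms.ToTensor(),
    ])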
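To make the effect of `RandomCrop(128, padding=10)` concrete, here is a small standard-library sketch of its geometry. It is not torchvision code; the variable names are illustrative, and it only assumes the behaviour described in the comment above (pad every side by 10 pixels, then pick a random 128x128 window).

```python
import random

SIZE, PADDING = 128, 10
padded = SIZE + 2 * PADDING   # side length after padding every side: 148
max_start = padded - SIZE     # largest valid top/left corner of the crop: 20

rng = random.Random(0)
shifts = []
for _ in range(1000):
    top = rng.randint(0, max_start)    # randint is inclusive on both ends
    left = rng.randint(0, max_start)
    # Offset of the crop window relative to the unpadded image
    shifts.append((top - PADDING, left - PADDING))

print(padded, max_start)
print(shifts[:3])
```

Each training sample is therefore a copy of the image shifted by at most 10 pixels in either direction, which is why the same file can yield a different input every epoch.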
2.3 Strong Baseline (acc>0.81872)
Model design: use a residual neural network.
I use resnet50 here, without pretrained weights.
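    import torch.nn as nn
    import torchvision

    def ResNet1():
        # ResNet-50 with randomly initialized weights (no pretraining)
        model = torchvision.models.resnet50(weights=None)
        # conv1 already expects 3 input channels, so it needs no change.
        # Replace the 1000-class ImageNet head with one that outputs 11 classes.
        model.fc = nn.Sequential(nn.Flatten(), nn.Linear(2048, 512), nn.LeakyReLU(0.1), nn.BatchNorm1d(512), nn.Dropout(0.2), nn.Linear(512, 11))
        return model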
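As a quick sanity check of the replacement head, the sketch below runs the same layer stack on random features. It assumes only that PyTorch is installed; the random tensor is a stand-in for the 2048-dimensional pooled features that ResNet-50 passes to `fc`, not real data.

```python
import torch
from torch import nn

# Same layer stack as the replacement `fc` head in the post
head = nn.Sequential(
    nn.Flatten(),
    nn.Linear(2048, 512),
    nn.LeakyReLU(0.1),
    nn.BatchNorm1d(512),
    nn.Dropout(0.2),
    nn.Linear(512, 11),
)

# Stand-in for the pooled features of a batch of 4 images
features = torch.randn(4, 2048)

head.eval()  # inference mode: dropout off, BatchNorm uses running statistics
with torch.no_grad():
    logits = head(features)

print(logits.shape)  # one score per class for each image
```

One practical consequence of the `BatchNorm1d` layer: in training mode it needs more than one sample per batch, so a trailing batch of size 1 raises an error unless it is dropped.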
2.4 Boss Baseline (acc>0.88446)
Use pretrained weights and strengthen the data augmentation further.
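    import torch.nn as nn
    import torchvision

    def ResNet1():
        # Start from ImageNet-pretrained weights. IMAGENET1K_V1 is what the legacy
        # pretrained=True flag loaded; ResNet50_Weights needs torchvision >= 0.13.
        model = torchvision.models.resnet50(weights=torchvision.models.ResNet50_Weights.IMAGENET1K_V1)
        # Replace the 1000-class ImageNet head with one that outputs 11 classes.
        model.fc = nn.Sequential(nn.Flatten(), nn.Linear(2048, 512), nn.LeakyReLU(0.1), nn.BatchNorm1d(512), nn.Dropout(0.2), nn.Linear(512, 11))
        return model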
Use cross-validation.
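    import os

    import torch
    from torch.utils.data import random_split

    # myseed is not defined in this snippet; it is assumed to be the seed set elsewhere in the training script
    files = sorted([os.path.join("./food-11/cross_validation", x) for x in os.listdir("./food-11/cross_validation") if x.endswith(".jpg")])
    # Images with the same label sit next to each other in the sorted list, so cutting
    # it into 4 contiguous slices is wrong: each fold would cover only some of the classes.
    # flod_1, flod_2, flod_3, flod_4 = files[:3324], files[3324:6648], files[6648:9972], files[9972:]
    # Correct approach: split at random. The four sizes must sum to len(files).
    flod_1_size = 3324
    flod_2_size = 3324
    flod_3_size = 3324
    flod_4_size = 3324
    # Random split without overlap between folds
    flod_1, flod_2, flod_3, flod_4 = random_split(files, [flod_1_size, flod_2_size, flod_3_size, flod_4_size], generator=torch.Generator().manual_seed(myseed))
    # cross_files is passed as the files argument of k_flod(); len(cross_files) == 4
    cross_files = [list(flod_1), list(flod_2), list(flod_3), list(flod_4)]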
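The post does not show the body of `k_flod()`, so the following is only a sketch of how four folds like `cross_files` are typically rotated: each fold serves once as the validation set while the other three are merged for training. The helper name `k_fold_splits` and the toy file names are hypothetical.

```python
def k_fold_splits(folds):
    """Yield (train_files, valid_files) once per fold."""
    for i, valid_files in enumerate(folds):
        # Merge every fold except the i-th into the training list
        train_files = [f for j, fold in enumerate(folds) if j != i for f in fold]
        yield train_files, valid_files


# Toy stand-in for cross_files: 4 folds of 2 file names each
toy_folds = [
    ["a.jpg", "b.jpg"],
    ["c.jpg", "d.jpg"],
    ["e.jpg", "f.jpg"],
    ["g.jpg", "h.jpg"],
]

splits = list(k_fold_splits(toy_folds))
for round_idx, (train_files, valid_files) in enumerate(splits):
    print(round_idx, len(train_files), len(valid_files))
```

Training one model per round gives four checkpoints, each validated on data it never saw during training.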
Use an ensemble to combine the results of several models.
    import csv
    import os

    import numpy as np
    import torch

    import deal  # the author's own module (FoodDataset1, test_tfm, DataLoader)

    # pad4 is not defined in this post; it is assumed to be the helper from the
    # course sample code that zero-pads an index to 4 digits.
    def ts1(model, device, k, batch_size, _exp_name):
        _dataset_dir = "./food-11"
        test_set = deal.FoodDataset1(os.path.join(_dataset_dir, "test"), tfm=deal.test_tfm)
        test_loader = deal.DataLoader(test_set, batch_size=batch_size, shuffle=False, num_workers=7, pin_memory=True)
        item = 0

        # Testing and generate prediction CSV
        model_best = model().to(device)
        model_best.load_state_dict(torch.load(f"{_exp_name}_best.ckpt"))
        model_best.eval()
        prediction = []
        with torch.no_grad():
            for data, _ in test_loader:
                test_pred = model_best(data.to(device)).cpu().numpy()
                test_label = np.argmax(test_pred, axis=1)
                # Emit one row per image so the CSV stays valid for any batch size
                for scores, label in zip(test_pred, test_label):
                    # Row layout: id, the 11 class scores, predicted category
                    a = [pad4(item)]
                    a.extend(scores.tolist())
                    a.append(str(int(label)))
                    prediction.append(a)
                    item += 1
        # newline='' prevents blank lines between rows on Windows
        with open(_exp_name + 'submission.csv', 'w', newline='') as fp:
            writer = csv.writer(fp)
            writer.writerow(["id", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "Category"])
            for p11 in prediction:
                writer.writerow(p11)
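The post does not show how the per-model CSV files are merged, so the following is a minimal sketch of one common choice: sum the class scores of every model for each image id and take the highest-scoring class (for the argmax, summing is equivalent to averaging). The function name `ensemble_rows` and the toy 3-class data are hypothetical; the real files written by `ts1` have 11 score columns.

```python
import csv
import io


def ensemble_rows(csv_texts, num_classes=11):
    """Sum per-class scores across models and return (id, category) pairs."""
    totals = {}  # image id -> summed class scores
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            acc = totals.setdefault(row["id"], [0.0] * num_classes)
            for c in range(num_classes):
                acc[c] += float(row[str(c)])
    result = []
    for image_id in sorted(totals):
        summed = totals[image_id]
        best = max(range(num_classes), key=lambda c: summed[c])
        result.append((image_id, best))
    return result


# Toy 3-class outputs from two models, in the same column layout as ts1
model_a_csv = "id,0,1,2,Category\n0000,2.0,1.0,0.0,0\n0001,0.1,0.2,0.3,2\n"
model_b_csv = "id,0,1,2,Category\n0000,0.0,2.5,0.0,1\n0001,0.0,0.1,0.9,2\n"

print(ensemble_rows([model_a_csv, model_b_csv], num_classes=3))
# [('0000', 1), ('0001', 2)]
```

For image `0000` the two models disagree (classes 0 and 1), and the summed scores settle on class 1. The scores written by `ts1` are raw model outputs; whether to normalize them (for example with softmax) before combining is a design choice the post does not state.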
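Summary
Reaching the boss baseline in this assignment can be quite easy: you can enlarge the training set without limit and then use the weights of a pretrained model. I did not take that route myself, hoping instead to make a breakthrough with single-model cross-validation.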





