The images in the Stanford Dogs dataset come in arbitrary sizes. They need to be resized so that the shortest side is 250 px (the other side scaled by the same factor), then randomly cropped to 224*224, and finally LMDB and mean files are generated for the test and validation sets so the network can be trained.
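For the LMDB and mean-file step at the end of this pipeline, Caffe's stock command-line tools (convert_imageset, compute_image_mean) are the usual route. The block below is only a minimal sketch, assuming Caffe is built under /opt/caffe, that the cropped images sit under an assumed output root, and that per-split list files ("relative/path label" per line) already exist; none of these paths come from the original post.

# Sketch: pack cropped images into LMDB, then compute the mean file,
# using Caffe's stock tools via subprocess. All paths are assumptions.
import subprocess

CAFFE_TOOLS = '/opt/caffe/build/tools'           # assumed Caffe build location
ROOT = '/home/data/StanfordDogsDataset/crop/'    # assumed output dir of the crop step

for split in ('test', 'val'):                    # the two splits named above
    # LISTFILE holds "relative/image/path label" lines, one per image
    subprocess.check_call([
        CAFFE_TOOLS + '/convert_imageset', '--shuffle', '--backend=lmdb',
        ROOT + split + '/', ROOT + split + '.txt', ROOT + split + '_lmdb'])
    subprocess.check_call([
        CAFFE_TOOLS + '/compute_image_mean',
        ROOT + split + '_lmdb', ROOT + split + '_mean.binaryproto'])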
Concurrent batch resize + random crop
In Jupyter:
# Random crop: resize so the shortest side is 250, zero-pad 10 px on every
# side, then take a random 224*224 crop.
import cv2
import numpy as np

def random_crop(num):
    image = data[num]  # `data` is a dict filled by the reader threads below
    img_h = image.shape[0]
    img_w = image.shape[1]
    # Scale so the shorter side becomes 250; the longer side follows the same factor
    if img_h > img_w:
        rate = 250.0 / img_w
    else:
        rate = 250.0 / img_h
    image = cv2.resize(image, (0, 0), fx=rate, fy=rate, interpolation=cv2.INTER_CUBIC)
    img_h = image.shape[0]
    img_w = image.shape[1]
    # Zero-pad 10 px on each side, so the shortest side becomes 250 + 10*2 = 270
    padding = 10
    img_pad = np.zeros([img_h + 2 * padding, img_w + 2 * padding, 3], np.uint8)
    img_pad[padding:padding + img_h, padding:padding + img_w, 0:3] = image
    # Random top-left corner; valid offsets run from 0 to (padded side - 224) inclusive
    nh = np.random.randint(0, img_pad.shape[0] - 224 + 1)
    nw = np.random.randint(0, img_pad.shape[1] - 224 + 1)
    new_img = img_pad[nh:nh + 224, nw:nw + 224]
    return new_img
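A quick sanity check in the notebook (the sample filename here is made up for illustration):

# Every crop should come back as exactly 224x224x3, wherever the random corner lands
data = {0: cv2.imread('/home/data/StanfordDogsDataset/merge/test/sample.jpg')}  # assumed sample file
print(random_crop(0).shape)  # expected: (224, 224, 3)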
Concurrency: ten threads read ten images at a time, ten processes run the random crop on them, and ten more processes write the results back to disk (for the write step, threads and processes turned out to be equally fast; most likely the writes are disk-I/O-bound and cv2.imwrite releases the GIL, so threads parallelize just as well). A sketch of how the stages fit together follows the code below.
def get_data(i, img_path_i):
    # Reader: load one image into the shared dict (I/O-bound, so threads are enough)
    data[i] = cv2.imread(img_path_i)

def io(img_data, path):
    # Writer: cast to uint8 before writing, in case the crop comes back as another dtype
    img = np.uint8(img_data)
    cv2.imwrite(path, img)
from multiprocessing import Pool
path='/home/data/StanfordDogsDataset/merge/test/'
new_path='/home/data/S
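The original snippet is cut off here (the truncated new_path is left as-is above). Below is a minimal sketch of how the three stages could be wired together, assuming Linux's fork start method (which is what lets the worker processes see the data dict and the functions defined above) and assuming an output directory for the cropped images:

import os
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool

path = '/home/data/StanfordDogsDataset/merge/test/'
new_path = '/home/data/StanfordDogsDataset/crop/test/'  # assumed; the original line is truncated
os.makedirs(new_path, exist_ok=True)

data = {}
img_names = sorted(os.listdir(path))

BATCH = 10
for start in range(0, len(img_names), BATCH):
    batch = img_names[start:start + BATCH]
    # Stage 1: ten threads read ten images into the shared dict
    with ThreadPool(BATCH) as readers:
        readers.starmap(get_data, [(i, path + name) for i, name in enumerate(batch)])
    # Stage 2: ten processes crop; forked children inherit the populated `data`
    with Pool(BATCH) as croppers:
        crops = croppers.map(random_crop, range(len(batch)))
    # Stage 3: ten processes write the crops out (threads would do just as well here)
    with Pool(BATCH) as writers:
        writers.starmap(io, [(img, new_path + name) for img, name in zip(crops, batch)])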