Caffe: assorted bugs when creating your own dataset

gn102038

This article is reposted from: http://blog.youkuaiyun.com/dcxhun3/article/details/51966921

Categories: caffe study notes, deep learning

So far, the bugs mainly come up when create_imagenet.sh (from examples/imagenet) generates the lmdb data.

bug 1 mkdir *_val_lmdb failed

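This usually happens because the directory already exists at the target path, which causes a conflict. At first I deleted it by hand every time; eventually I realized it is much simpler to add these statements to create_imagenet.sh: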

 rm -rf $EXAMPLE/mytask_train_lmdb  
 rm -rf $EXAMPLE/mytask_val_lmdb  
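
The same cleanup can also be done from Python, for example when the list files are generated by a Python script anyway. The following is only a minimal sketch: the function name `remove_stale_lmdb` is hypothetical, and the directory names are taken from the two `rm -rf` lines above.

```python
import os
import shutil


def remove_stale_lmdb(example_dir, names=("mytask_train_lmdb", "mytask_val_lmdb")):
    """Delete leftover LMDB directories so they can be created afresh.

    Returns the list of directories that were actually removed.
    """
    removed = []
    for name in names:
        path = os.path.join(example_dir, name)
        if os.path.isdir(path):
            shutil.rmtree(path)  # Python counterpart of `rm -rf path`
            removed.append(path)
    return removed
```

Calling `remove_stale_lmdb("examples/mytask")` before the conversion step has the same effect as the two shell lines; directories that do not exist are simply skipped.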

bug 2 Images at the specified path cannot be found: could not open or find file

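Case 1: I generated the txt label file under Windows cmd, so the paths in it used backslashes, which I had not noticed. The simplest fix is to open the txt file and replace the backslashes with forward slashes. Alternatively, run make_list.py on Linux and the problem does not occur at all.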
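
Instead of editing the txt file by hand, the replacement can be scripted. This is a small sketch under the assumption that backslashes only ever appear as path separators in the list file; the helper names `to_forward_slashes` and `fix_list_file` are made up for illustration.

```python
def to_forward_slashes(line):
    """Replace Windows-style backslashes with forward slashes."""
    return line.replace("\\", "/")


def fix_list_file(path_in, path_out):
    """Copy a list file, converting every backslash to a forward slash."""
    with open(path_in) as fin, open(path_out, "w") as fout:
        for line in fin:
            fout.write(to_forward_slashes(line))
```

For example, a line such as `cat\001.bmp 0` becomes `cat/001.bmp 0`.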

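Case 2: this one puzzled me for a long time. The paths were clearly correct, so why were the files still not found? It turned out the culprit was the \t escape character in the Python script: the separator between the image name and the label was written as a tab (\t). Writing a single space ' ' instead solved the problem: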

 # fout.write('%s\t%d\n' % (image_list[i][0], image_list[i][1]))  # old: tab separator
 fout.write('%s %d\n' % (image_list[i][0], image_list[i][1]))  # space, not \t
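
If a list file with tab separators already exists, it can be repaired without regenerating it. The sketch below assumes, as the fix above suggests, that each line should end in exactly one space followed by an integer label; `normalize_separator` is a hypothetical helper, not part of Caffe.

```python
def normalize_separator(line):
    """Return 'path label' with exactly one space before the label.

    The line is split at the last run of whitespace (tab or spaces), so
    image paths that themselves contain spaces are left untouched.
    Raises ValueError for blank or malformed lines.
    """
    path, label = line.rstrip("\r\n").rsplit(None, 1)
    return "%s %d" % (path, int(label))
```

Applying it to every line of the old file and writing the results back out, one per line, yields a list in the space-separated form.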

Once everything is correct, lmdb generation starts. The dataset is fairly large (378430 images), so this takes a while.
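
Because the conversion is slow on a dataset of this size, it can be worth checking the list file before starting it. The following is a rough pre-flight sketch, not something from the original workflow: `find_missing` is a hypothetical helper that assumes the space-separated `path label` format described above and paths relative to the data root.

```python
import os


def find_missing(list_path, data_root):
    """Return the image paths in the list file that do not exist under data_root."""
    missing = []
    with open(list_path) as fin:
        for line in fin:
            line = line.rstrip("\r\n")
            if not line:
                continue
            # Everything before the last space is treated as the image path.
            rel_path = line.rsplit(" ", 1)[0]
            if not os.path.isfile(os.path.join(data_root, rel_path)):
                missing.append(rel_path)
    return missing
```

An empty result means every listed image was found; a list file that still contains backslashes or tab separators will typically show up here as missing entries.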

Code 1

make_list.py

 import os
 import random
 import argparse
   
 def list_image(root, recursive, exts):  
     image_list = []  
     if recursive:  
         cat = {}  
         for path, subdirs, files in os.walk(root,True):  
             print(path)
             for fname in files:  
                 fpath = os.path.join(path,fname)  
                 suffix = os.path.splitext(fname)[1].lower()  
                 if os.path.isfile(fpath) and (suffix in exts):  
                     if path not in cat:  
                         cat[path] = len(cat)  
                     image_list.append((os.path.relpath(fpath, root), cat[path]))  
                #    print fpath,cat[path]  
     else:  
         for fname in os.listdir(root):  
             fpath = os.path.join(root, fname)  
             suffix = os.path.splitext(fname)[1].lower()  
             if os.path.isfile(fpath) and (suffix in exts):  
                 image_list.append((os.path.relpath(fpath, root), 0))  
     return image_list  
   
 def write_list(path_out, image_list):  
     with open(path_out, 'w') as fout:  
         for fname, label in image_list:
             # Separate path and label with a single space, not '\t'.
             fout.write('%s %d\n' % (fname, label))
 def make_list(prefix_out, root, recursive, exts, num_chunks, train_ratio):  
     image_list = list_image(root, recursive, exts)  
     random.shuffle(image_list)  
     N = len(image_list)  
     chunk_size = (N + num_chunks - 1) // num_chunks  # ceiling division, stays an int
     for i in range(num_chunks):
         chunk = image_list[i*chunk_size:(i+1)*chunk_size]  
         if num_chunks > 1:  
             str_chunk = '_%d'%i  
         else:  
             str_chunk = ''  
         if train_ratio < 1:  
             sep = int(chunk_size*train_ratio)  
             write_list(prefix_out+str_chunk+'_train.txt', chunk[:sep])  
             write_list(prefix_out+str_chunk+'_val.txt', chunk[sep:])  
         else:  
             write_list(prefix_out+str_chunk+'.txt', chunk)  
   
 def main():  
     parser = argparse.ArgumentParser(
         formatter_class=argparse.ArgumentDefaultsHelpFormatter,
         description='Make image list files that are '
                     'required by im2rec')
     parser.add_argument('root', help='path to folder that contain images.')
     parser.add_argument('prefix', help='prefix of output list files.')
     # nargs='+' reads extensions as separate words; type=list would split one string into characters.
     parser.add_argument('--exts', nargs='+', default=['.bmp'],
         help='list of acceptable image extensions.')
     parser.add_argument('--chunks', type=int, default=1, help='number of chunks.')
     parser.add_argument('--train_ratio', type=float, default=1.0,
         help='Fraction of images to use for training.')
     # type=bool treats any non-empty string (even "False") as True, so parse the text explicitly.
     parser.add_argument('--recursive', default=True,
         type=lambda s: s.lower() in ('1', 'true', 'yes'),
         help='If true recursively walk through subdirs and assign an unique label '
              'to images in each folder. Otherwise only include images in the root folder '
              'and give them label 0.')
     args = parser.parse_args()  
       
     make_list(args.prefix, args.root, args.recursive,  
         args.exts, args.chunks, args.train_ratio)  
   
 if __name__ == '__main__':  
     main()  
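
After make_list.py has run, a quick way to sanity-check its output is to count how many images each label received. This is only an illustrative sketch: `count_labels` is a hypothetical helper, and it assumes each non-empty line ends with an integer label separated by whitespace.

```python
from collections import Counter


def count_labels(list_path):
    """Count how many lines of a list file carry each integer label."""
    counts = Counter()
    with open(list_path) as fin:
        for line in fin:
            line = line.strip()
            if line:
                label = int(line.rsplit(None, 1)[1])
                counts[label] += 1
    return counts
```

With `--recursive` enabled, the number of distinct labels should match the number of sub-folders that contain images.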

Code 2

create_imagenet.sh

 #!/usr/bin/env sh  
 # Create the imagenet lmdb inputs  
 # N.B. set the path to the imagenet train + val data dirs  
 EXAMPLE=examples/mytask  
 DATA=/mnt/hgfs/caffe  
 TOOLS=build/tools  
 TRAIN_DATA_ROOT=/mnt/hgfs/caffe/train/  
 VAL_DATA_ROOT=/mnt/hgfs/caffe/val/  
 # Set RESIZE=true to resize the images to 256x256. Leave as false if images have  
 # already been resized using another tool.  
 RESIZE=true  
 if $RESIZE; then  
   RESIZE_HEIGHT=256  
   RESIZE_WIDTH=256  
 else  
   RESIZE_HEIGHT=0  
   RESIZE_WIDTH=0  
 fi  
 if [ ! -d "$TRAIN_DATA_ROOT" ]; then  
   echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"  
   echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \  
        "where the ImageNet training data is stored."  
   exit 1  
 fi  
 if [ ! -d "$VAL_DATA_ROOT" ]; then  
   echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"  
   echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \  
        "where the ImageNet validation data is stored."  
   exit 1  
 fi  
 echo "Creating train lmdb..."  
 rm -rf $EXAMPLE/mytask_train_lmdb  
 rm -rf $EXAMPLE/mytask_val_lmdb  
 GLOG_logtostderr=1 $TOOLS/convert_imageset \  
     --resize_height=$RESIZE_HEIGHT \  
     --resize_width=$RESIZE_WIDTH \  
     --shuffle \  
     $TRAIN_DATA_ROOT \  
     $DATA/train.txt \  
     $EXAMPLE/mytask_train_lmdb  
 echo "Train lmdb done!"  
 echo "Creating val lmdb..."  
 GLOG_logtostderr=1 $TOOLS/convert_imageset \  
     --resize_height=$RESIZE_HEIGHT \  
     --resize_width=$RESIZE_WIDTH \  
     --shuffle \  
     $VAL_DATA_ROOT \  
     $DATA/val.txt \  
     $EXAMPLE/mytask_val_lmdb  
 echo "val lmdb done!"  
 echo "Done."
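
For readers who prefer to drive the conversion from Python, the two convert_imageset calls in the script can be mirrored with `subprocess`. This is a sketch under assumptions: the helper names `build_convert_cmd` and `run_convert` are invented here, and the flags and argument order are copied from the shell script above rather than from any other source.

```python
import os
import subprocess


def build_convert_cmd(tools, data_root, list_file, out_db, resize=256):
    """Build the argument list for one convert_imageset call, as in the script."""
    return [
        tools + "/convert_imageset",
        "--resize_height=%d" % resize,
        "--resize_width=%d" % resize,
        "--shuffle",
        data_root,
        list_file,
        out_db,
    ]


def run_convert(cmd):
    """Run the command with GLOG_logtostderr=1 and return its exit code."""
    env = dict(os.environ, GLOG_logtostderr="1")
    return subprocess.call(cmd, env=env)
```

A call such as `run_convert(build_convert_cmd("build/tools", "/mnt/hgfs/caffe/train/", "/mnt/hgfs/caffe/train.txt", "examples/mytask/mytask_train_lmdb"))` corresponds to the "Creating train lmdb..." step; passing `resize=0` matches the `RESIZE=false` branch.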