For PNet, run gen_PNet_tfrecords.py once. It generates the tfrecords file "../../DATA/imglists/PNet/train_PNet_landmark.tfrecord_shuffle", which is used to train PNet.
For RNet, run gen_RNet_tfrecords.py four times to generate the pos, part, neg, and landmark tfrecords files, which are used to train RNet:
'../../DATA/imglists/RNet/pos_landmark.tfrecord_shuffle'
'../../DATA/imglists/RNet/part_landmark.tfrecord_shuffle'
'../../DATA/imglists/RNet/neg_landmark.tfrecord_shuffle'
'../../DATA/imglists/RNet/landmark_landmark.tfrecord_shuffle'
For ONet, run gen_ONet_tfrecords.py three times to generate the pos, part, and neg tfrecords files:
'../../DATA/imglists/ONet/pos_landmark.tfrecord_shuffle'
'../../DATA/imglists/ONet/part_landmark.tfrecord_shuffle'
'../../DATA/imglists/ONet/neg_landmark.tfrecord_shuffle'
The landmark file '../../DATA/imglists/RNet/landmark_landmark.tfrecord_shuffle' is shared with RNet. Together these files are used to train ONet.
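A training directory usually contains train, test, or val folders, each holding thousands of image or text files. Stored as scattered individual files, they waste disk space and are slow and tedious to read one by one. Loading them also takes a lot of memory, and some large datasets cannot be loaded in one go. The TFRecord format is a much more sensible way to store this data. TFRecord uses the "Protocol Buffer" binary encoding and keeps the data in one contiguous binary file, so only that single file has to be read. This is simple and fast, and it suits large training sets especially well. When the training data is large, it can also be split across several TFRecord files to improve processing efficiency.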
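To make the contrast with thousands of small files concrete, here is a simplified sketch of a record file: each sample is appended to one binary file as a length prefix followed by its payload, and the samples are read back one at a time. It only illustrates the idea. The real TFRecord format also stores CRC checksums with every record, and in practice the TensorFlow writer would be used. The names write_records and read_records are hypothetical.

```python
import os
import struct
import tempfile


def write_records(path, payloads):
    """Write every payload into one file as <8-byte length><payload>."""
    with open(path, 'wb') as f:
        for payload in payloads:
            f.write(struct.pack('<Q', len(payload)))  # little-endian uint64 length
            f.write(payload)


def read_records(path):
    """Yield payloads one at a time, without loading the whole file."""
    with open(path, 'rb') as f:
        while True:
            header = f.read(8)
            if len(header) < 8:  # end of file reached
                return
            (length,) = struct.unpack('<Q', header)
            yield f.read(length)


# Placeholder byte strings stand in for real image data.
samples = [b'image-0', b'image-1-bytes', b'']
record_path = os.path.join(tempfile.mkdtemp(), 'demo.records')
write_records(record_path, samples)
restored = list(read_records(record_path))
```

Because the reader is a generator, only one record is held in memory at a time, which is the property that makes this kind of layout friendly to large datasets.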
Generating the TFRecord, from: https://www.jianshu.com/p/b480e5fcb638
if __name__ == '__main__':
    dir = '../../DATA/'
    net = 'PNet'
    output_directory = '../../DATA/imglists/PNet'
    run(dir, net, output_directory, shuffling=True)
1. The run function
- _get_output_filename(output_dir, name, net) returns the output path: tf_filename = "../../DATA/imglists/PNet/train_PNet_landmark.tfrecord"
- if shuffling: shuffles the order of the samples
- for i, image_example in enumerate(dataset): image_example is a dict
- _add_to_tfrecord(filename, image_example, tfrecord_writer): reads the image region using the image and the TXT file, and converts it to a TFRecord entry
- with tf.python_io.TFRecordWriter(tf_filename) as tfrecord_writer: tf.python_io.TFRecordWriter(tf_filename) is the TFRecord writer, and tfrecord_writer.write(tf_example.SerializeToString()) writes each example into the tfrecord file
- tf_filename = tf_filename + '_shuffle': the output file becomes "../../DATA/imglists/PNet/train_PNet_landmark.tfrecord_shuffle", which holds the TFRecord data
def run(dataset_dir, net, output_dir, name='MTCNN', shuffling=False):
    """
    dataset_dir: dataset root directory, e.g. '../../DATA/'
    output_dir: output directory, e.g. '../../DATA/imglists/PNet'
    """
    # tfrecord name
    tf_filename = _get_output_filename(output_dir, name, net)
    if tf.gfile.Exists(tf_filename):
        print('Dataset files already exist. Exiting without re-creating them.')
        return
    # Get the dataset and shuffle it.
    dataset = get_dataset(dataset_dir, net=net)  # list of dicts with the keys 'filename', 'label' and 'bbox'
    # filenames = dataset['filename']
    if shuffling:
        tf_filename = tf_filename + '_shuffle'
        # random.seed(12345454)
        random.shuffle(dataset)  # shuffles the list in place
    # Process the dataset files: write the data to the tfrecord
    print('lala')  # leftover debug output
    with tf.python_io.TFRecordWriter(tf_filename) as tfrecord_writer:
        for i, image_example in enumerate(dataset):  # image_example is a dict
            if (i+1) % 100 == 0:
                sys.stdout.write('\r>> %d/%d images has been converted' % (i+1, len(dataset)))
                # sys.stdout.write('\r>> Converting image %d/%d' % (i + 1, len(dataset)))
            sys.stdout.flush()
            filename = image_example['filename']
            _add_to_tfrecord(filename, image_example, tfrecord_writer)
    # Finally, write the labels file:
    # labels_to_class_names = dict(zip(range(len(_CLASS_NAMES)), _CLASS_NAMES))
    # dataset_utils.write_label_file(labels_to_class_names, dataset_dir)
    print('\nFinished converting the MTCNN dataset!')
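The file-naming and shuffling steps of run can be tried out without TensorFlow. The body of _get_output_filename is not shown in these notes, so the version below is an assumption that only reproduces the PNet path quoted earlier. The plan_output helper and the fixed seed are also additions for illustration (the original leaves random.seed commented out); the seed simply makes the shuffled order repeatable.

```python
import random


def _get_output_filename(output_dir, name, net):
    # Assumed implementation: rebuilds the PNet path quoted in the notes.
    # `name` is kept for signature compatibility but is unused here.
    return '%s/train_%s_landmark.tfrecord' % (output_dir, net)


def plan_output(dataset, output_dir, net, shuffling=False, seed=None):
    """Hypothetical helper: return the output filename and the sample order."""
    tf_filename = _get_output_filename(output_dir, 'MTCNN', net)
    ordered = list(dataset)  # copy, so the caller's list is left untouched
    if shuffling:
        tf_filename = tf_filename + '_shuffle'
        random.Random(seed).shuffle(ordered)  # seeded shuffle is repeatable
    return tf_filename, ordered


# Made-up sample entries, only for demonstration.
dataset = [{'filename': 'img_%d.jpg' % i, 'label': 0} for i in range(10)]
tf_filename, shuffled = plan_output(
    dataset, '../../DATA/imglists/PNet', 'PNet', shuffling=True, seed=12345454)
plain_filename, unshuffled = plan_output(dataset, '../../DATA/imglists/PNet', 'PNet')
```

Unlike run, which shuffles the list in place, this sketch shuffles a copy so that the original order stays available for comparison.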
2. get_dataset(dir, net='PNet'): returns a list in which every element is a dict with 3 keys: filename (the image path string), label (the sample label), and bbox (a dict holding the face box or the landmarks)
def get_dataset(dir, net='PNet'):
    # Returns a list of dicts with 3 keys: 'filename', 'label' and 'bbox'
    # get file name, label and annotation
    item = 'imglists/PNet/train_%s_landmark.txt' % net  # for PNet: 'imglists/PNet/train_PNet_landmark.txt'
    dataset_dir = os.path.join(dir, item)  # dataset_dir = '../../DATA/imglists/PNet/train_PNet_landmark.txt'
    imagelist = open(dataset_dir, 'r')  # read the merged txt file
    dataset = []
    for line in imagelist.readlines():  # readlines() returns a list
        info = line.strip().split(' ')  # list of string fields
        data_example = dict()
        bbox = dict()
        data_example['filename'] = info[0]
        # print(data_example['filename'])
        data_example['label'] = int(info[1])  # class label of this image
        ...
        if len(info) == 6:  # face detection sample: bounding box only, no landmarks
            bbox['xmin'] = float(info[2])
            bbox['ymin'] = float(info[3])
            bbox['xmax'] = float(info[4])
            bbox['ymax'] = float(info[5])
        if len(info) == 12:  # landmark localization sample: 5 landmark points
            bbox['xlefteye'] = float(info[2])
            ...
            bbox['yrightmouth'] = float(info[11])
        data_example['bbox'] = bbox  # store the bbox dict as an entry of data_example
        dataset.append(data_example)
    return dataset
Training data: [path to image] [cls_label] [bbox_label] [landmark_label]
For pos sample, cls_label=1, bbox_label(calculate), landmark_label=[0,0,0,0,0,0,0,0,0,0].
For part sample, cls_label=-1, bbox_label(calculate), landmark_label=[0,0,0,0,0,0,0,0,0,0].
For landmark sample, cls_label=-2, bbox_label=[0,0,0,0], landmark_label(calculate).
For neg sample, cls_label=0, bbox_label=[0,0,0,0], landmark_label=[0,0,0,0,0,0,0,0,0,0].
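The label format above can be turned into a small line parser. The sketch below is an illustration under stated assumptions, not the repository code: parse_annotation_line is a hypothetical helper, the sample paths are made up, and it assumes that any bbox or landmark field missing from a line defaults to zero, as the format lines above suggest.

```python
BOX_KEYS = ['xmin', 'ymin', 'xmax', 'ymax']
LANDMARK_KEYS = ['xlefteye', 'ylefteye', 'xrighteye', 'yrighteye', 'xnose', 'ynose',
                 'xleftmouth', 'yleftmouth', 'xrightmouth', 'yrightmouth']


def parse_annotation_line(line):
    """Hypothetical helper: parse one annotation line into a dict."""
    info = line.strip().split(' ')
    # Assumption: every field starts at zero and is overwritten when present.
    bbox = {key: 0.0 for key in BOX_KEYS + LANDMARK_KEYS}
    values = [float(v) for v in info[2:]]
    if len(info) == 6:       # path, label, 4 box values
        bbox.update(zip(BOX_KEYS, values))
    elif len(info) == 12:    # path, label, 10 landmark values
        bbox.update(zip(LANDMARK_KEYS, values))
    return {'filename': info[0], 'label': int(info[1]), 'bbox': bbox}


# Made-up example lines for the pos, neg and landmark cases.
pos = parse_annotation_line('12/positive/0.jpg 1 0.1 -0.2 0.05 0.3')
neg = parse_annotation_line('12/negative/0.jpg 0')
lm = parse_annotation_line('12/landmark/0.jpg -2 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0')
```

Giving every sample the same set of keys means later code can build the roi and landmark vectors without checking which sample type it is handling.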
3. _add_to_tfrecord(filename, image_example, tfrecord_writer): reads the image region using the image and the TXT file, and converts it to a TFRecord entry.
filename: the image path and name read from imglists/PNet/train_PNet_landmark.txt
image_example: a dict with 3 keys: filename (the image name string), label (the sample label), and bbox (the face box or the landmarks)
    image_data, height, width = _process_image_withoutcoder(filename)
    example = _convert_to_example_simple(image_example, image_data)
    tfrecord_writer.write(example.SerializeToString())
- _process_image_withoutcoder returns image_data, i.e. the image as string data, together with the initial height and width of the image:
image_data = image.tostring()  # converts the matrix to a byte string, from: https://blog.youkuaiyun.com/nanxiaoting/article/details/8072105
- example.SerializeToString() serializes the map inside the Example into a compact binary string, which saves space.
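Raw bytes carry no shape information, so the height and width returned with image_data are what make it possible to rebuild the array later. Below is a minimal NumPy sketch of that round trip. It uses a synthetic array in place of a real image and calls tobytes(), for which tostring() is a deprecated alias. The 12x12x3 uint8 shape is an assumption chosen for illustration.

```python
import numpy as np

# Synthetic stand-in for an image (assumed shape: 12 x 12 pixels, 3 channels).
height, width, channels = 12, 12, 3
pixels = np.arange(height * width * channels) % 256
image = pixels.astype(np.uint8).reshape(height, width, channels)

# Flatten to raw bytes: the shape and dtype are NOT stored in the result.
image_data = image.tobytes()

# Rebuilding the array needs the dtype and the shape supplied from outside.
restored = np.frombuffer(image_data, dtype=np.uint8).reshape(height, width, channels)
```

Any reader of the resulting tfrecord therefore has to know the image size and dtype in advance, or they have to be stored as extra features.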
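3.1 The _convert_to_example_simple function:
tf.train.Example(): tfrecord is the data format officially recommended by TensorFlow, and it stores data and labels together. A tfrecord file contains tf.train.Example protocol buffers (each protocol buffer holds Features), which lets TensorFlow make better use of memory. Reference: https://blog.youkuaiyun.com/hfutdog/article/details/86244944
The example contains: the image, the label, the ROI (bounding box), and the landmarks.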
def _convert_to_example_simple(image_example, image_buffer):
    """
    Convert an image example to a tf.train.Example.
    :param image_example: dict, an image example
    :param image_buffer: string, the image data returned by _process_image_withoutcoder
    :return:
        Example proto
    """
    # filename = str(image_example['filename'])
    # class label for the whole image
    class_label = image_example['label']
    bbox = image_example['bbox']
    roi = [bbox['xmin'], bbox['ymin'], bbox['xmax'], bbox['ymax']]
    landmark = [bbox['xlefteye'], bbox['ylefteye'], bbox['xrighteye'], bbox['yrighteye'], bbox['xnose'], bbox['ynose'],
                bbox['xleftmouth'], bbox['yleftmouth'], bbox['xrightmouth'], bbox['yrightmouth']]
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': _bytes_feature(image_buffer),
        'image/label': _int64_feature(class_label),
        'image/roi': _float_feature(roi),
        'image/landmark': _float_feature(landmark)
    }))
    return example
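The helpers _bytes_feature, _int64_feature and _float_feature are used above, but their definitions are not shown. A common way to write them wraps the value in the matching tf.train list type. The version below is an assumption about how they behave, not the repository's exact code, and the _as_list helper and the sample values are made up for the demonstration.

```python
import tensorflow as tf


def _as_list(value):
    # Hypothetical helper: the list types below expect a sequence of values.
    return list(value) if isinstance(value, (list, tuple)) else [value]


def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=_as_list(value)))


def _float_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=_as_list(value)))


def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=_as_list(value)))


# Build an Example with made-up values, serialize it, and parse it back.
example = tf.train.Example(features=tf.train.Features(feature={
    'image/encoded': _bytes_feature(b'\x00\x01\x02'),
    'image/label': _int64_feature(1),
    'image/roi': _float_feature([0.1, 0.2, 0.3, 0.4]),
}))
serialized = example.SerializeToString()
restored = tf.train.Example.FromString(serialized)
label = restored.features.feature['image/label'].int64_list.value[0]
```

Floats are stored in 32-bit precision, so values read back from image/roi should be compared with a small tolerance rather than for exact equality.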