Link to my earlier notes on the TensorFlow and GPU setup:
https://blog.youkuaiyun.com/qq_32166779/article/details/84891909
First, create a tensorflow folder in the home directory (~) and download the source code into it:
mkdir ~/tensorflow
cd ~/tensorflow
git clone https://github.com/tensorflow/models.git
Go into ~/tensorflow/models/research and run:
protoc object_detection/protos/*.proto --python_out=.
This prints:
object_detection/protos/ssd.proto:104:3: Expected "required", "optional", or "repeated".
object_detection/protos/ssd.proto:104:12: Expected field name.
object_detection/protos/model.proto: Import "object_detection/protos/ssd.proto" was not found or had errors.
object_detection/protos/model.proto:12:5: "Ssd" is not defined.
The cause is that the installed protoc tool is too old and needs to be updated. Remove the old version before updating:
sudo apt-get remove protobuf-compiler
Check that it has been removed: protoc --version
Once it is gone, download protobuf-all-3.6.1.tar.gz from the releases page.
Link: https://github.com/google/protobuf/releases
Extract the archive and go into the extracted folder:
cd protobuf-3.6.1
./configure --prefix=/usr
make -j15
make check -j15
sudo make install -j15
sudo ldconfig
protoc --version
Then run protoc object_detection/protos/*.proto --python_out=. again
This time it succeeds.
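To double-check that the upgrade took effect before re-running protoc, the version string can be inspected from Python. This is only a minimal sketch: it assumes `protoc --version` prints something like `libprotoc 3.6.1`, the helper names are made up for illustration, and the 3.6.1 threshold is just the version installed above, not an official minimum.

```python
import re
import subprocess

# Assumed threshold: simply the version installed above, not an official minimum.
MIN_VERSION = (3, 6, 1)


def parse_protoc_version(text):
    """Extract (major, minor, patch) from output such as 'libprotoc 3.6.1'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no version number found in: %r" % text)
    # A missing patch component is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def is_new_enough(version, minimum=MIN_VERSION):
    """Tuples compare element by element, so (3, 6, 1) >= (2, 6, 1)."""
    return version >= minimum


def installed_protoc_version():
    """Run `protoc --version`; raises an error if protoc is not on PATH."""
    out = subprocess.check_output(["protoc", "--version"])
    return parse_protoc_version(out.decode())
```

Calling `is_new_enough(installed_protoc_version())` after `sudo ldconfig` should then tell you whether the shell is picking up the new binary.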
Add the environment variables
gedit ~/.bashrc
Add these lines to the file:
# From tensorflow/models/research/
export PYTHONPATH=$PYTHONPATH:~/tensorflow/models/research/
export PYTHONPATH=$PYTHONPATH:~/tensorflow/models/research/slim
source ~/.bashrc
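If the test in the next step fails with import errors, it may help to confirm that both directories really ended up on PYTHONPATH. A rough sketch, assuming the same ~/tensorflow/models/research layout as above; `missing_entries` is a hypothetical helper, not part of the Object Detection API.

```python
import os

# The two directories exported in ~/.bashrc above.
REQUIRED = [
    "~/tensorflow/models/research",
    "~/tensorflow/models/research/slim",
]


def missing_entries(pythonpath, required):
    """Return the required directories that are absent from a PYTHONPATH string."""
    def norm(path):
        # Expand '~' and drop trailing slashes so equivalent paths compare equal.
        return os.path.normpath(os.path.expanduser(path))

    present = {norm(p) for p in pythonpath.split(os.pathsep) if p}
    return [r for r in required if norm(r) not in present]


def check_current_environment():
    """Check the PYTHONPATH of the current process against REQUIRED."""
    return missing_entries(os.environ.get("PYTHONPATH", ""), REQUIRED)
```

An empty list from `check_current_environment()` means both entries are present in the shell that launched Python.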
Test
python object_detection/builders/model_builder_test.py
Training the model:
First, get the training dataset:
http://www.robots.ox.ac.uk/~vgg/data/pets/
Download:
Dataset
Groundtruth data
Set up the directories
cd /home/wmz/Downloads
tar -xvf annotations.tar.gz
tar -xvf images.tar.gz
mkdir /python_code/pet_data (holds the files that were just downloaded and extracted)
cp -r /home/wmz/Downloads/annotations /python_code/pet_data/
cp -r /home/wmz/Downloads/images /python_code/pet_data/
Then create the directories used for training
mkdir /python_code/pet_train
mkdir /python_code/pet_train/data (for the TFRecord data)
mkdir /python_code/pet_train/models (for the model)
mkdir /python_code/pet_train/models/train
mkdir /python_code/pet_train/models/eval
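The same directory layout can also be created in one go from Python, which avoids errors when a parent folder does not exist yet. A small sketch assuming Python 3 (for `exist_ok`); the `make_layout` helper and the `base` argument are my own naming, with `/python_code` being the base used in this post.

```python
import os

# Sub-directories used in this post, relative to a base directory.
SUBDIRS = [
    "pet_data",
    "pet_train/data",
    "pet_train/models/train",
    "pet_train/models/eval",
]


def make_layout(base):
    """Create the directories under `base`; existing ones are left alone."""
    created = []
    for sub in SUBDIRS:
        path = os.path.join(base, *sub.split("/"))
        # makedirs also creates missing parents such as pet_train/models.
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created
```

For the paths in this post the call would be `make_layout("/python_code")`, run with whatever permissions that location needs.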
# Copy ssd_mobilenet_v1_pets.config
cp ~/tensorflow/models/research/object_detection/samples/configs/ssd_mobilenet_v1_pets.config /python_code/pet_train/models
cp ~/tensorflow/models/research/object_detection/data/pet_label_map.pbtxt /python_code/pet_train/data/
Download the pretrained model
Model zoo link:
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
Download ssd_mobilenet_v1_coco_2018_01_28.tar.gz
Create a models folder, put the archive in it, and extract it
mkdir /python_code/models/
cp -r /home/wmz/Downloads/ssd_mobilenet_v1_coco_2018_01_28 /python_code/models/
Step 1: generate the TFRecord data for the pet dataset
Run the command (just adjust the paths to your own):
cd ~/tensorflow/models/research
python object_detection/dataset_tools/create_pet_tf_record.py --label_map_path=object_detection/data/pet_label_map.pbtxt --data_dir=/python_code/pet_data --output_dir=/python_code/pet_train/data
It failed. My guess was that the TensorFlow version was too new.
Commands to uninstall and reinstall:
pip uninstall protobuf
pip uninstall tensorflow-gpu==1.12.0
Download the offline package
Link: https://pypi.org/project/tensorflow-gpu/1.7.0/
(general form, with x.x.0 replaced by the version you want: https://pypi.org/project/tensorflow-gpu/x.x.0/)
pip install tensorflow-gpu
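Failed…
I deleted everything inside the anaconda installation,
then reinstalled anaconda, and this time tensorflow-gpu installed successfully.
I ran the command again and it still failed. It turned out there were stray spaces around the equals signs; once I removed them, damn, it finally worked!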
python object_detection/dataset_tools/create_pet_tf_record.py --label_map_path=object_detection/data/pet_label_map.pbtxt --data_dir=/python_code/pet_data --output_dir=/python_code/pet_train/data
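Since the stray spaces around the equals signs cost me so much time, here is a rough way to spot them before running a long command. It is only a sketch: `find_broken_flags` and `check_command` are hypothetical helpers, and they merely flag suspicious tokens without claiming how the script's flag parser would treat them.

```python
import shlex


def find_broken_flags(argv):
    """Return indexes of tokens that suggest a space crept in around '='."""
    bad = []
    for i, tok in enumerate(argv):
        dangling_flag = tok.startswith("--") and tok.endswith("=")  # '--flag= value'
        orphan_value = tok.startswith("=")                          # '--flag =value' or '--flag = value'
        if dangling_flag or orphan_value:
            bad.append(i)
    return bad


def check_command(command):
    """Split a shell command string and return the suspicious tokens."""
    argv = shlex.split(command)
    return [argv[i] for i in find_broken_flags(argv)]
```

Pasting the full command into `check_command(...)` should return an empty list when every flag is written as `--name=value` with no spaces.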
Edit the pet config file
Set the checkpoint path:
/python_code/models/ssd_mobilenet_v1_coco_2018_01_28/model.ckpt
Set the input path:
/python_code/pet_train/data/#########
Set the label_map_path:
/python_code/pet_train/data/#########
gedit /python_code/pet_train/models/ssd_mobilenet_v1_pets.config
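The edit can also be scripted instead of done by hand in gedit. The sketch below is an assumption-heavy illustration: `set_string_field` is a hypothetical helper that does a plain text substitution on `field: "value"` lines, and the small fragment is illustrative, not the full config. It is shown only for `fine_tune_checkpoint`; it replaces every occurrence of a field, so it is not suitable as-is for fields that appear more than once with different values.

```python
import re


def set_string_field(config_text, field, value):
    """Replace the quoted value of every `field: "..."` entry in config_text."""
    pattern = re.compile(r'(\b%s\s*:\s*)"[^"]*"' % re.escape(field))
    # A function replacement avoids backslash handling in the new value.
    new_text, count = pattern.subn(
        lambda m: m.group(1) + '"%s"' % value, config_text)
    if count == 0:
        raise KeyError("field not found: %s" % field)
    return new_text


# Illustrative fragment only, not the full pipeline config.
SAMPLE = 'train_config {\n  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"\n}\n'

UPDATED = set_string_field(
    SAMPLE,
    "fine_tune_checkpoint",
    "/python_code/models/ssd_mobilenet_v1_coco_2018_01_28/model.ckpt",
)
```

To apply it to the real file you would read the config into a string, call the helper, and write the result back, ideally after keeping a copy of the original.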
Get cocoapi
cd /python_code
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
make
cp -r pycocotools ~/tensorflow/models/research/
Finally, run the training
cd ~/tensorflow/models/research
python object_detection/model_main.py --pipeline_config_path=/python_code/pet_train/models/ssd_mobilenet_v1_pets.config --model_dir=/python_code/pet_train/models/train --num_train_steps=100 --num_eval_steps=10 --alsologtostderr
cp ~/tensorflow/models/research/object_detection/legacy/train.py ~/tensorflow/models/research/object_detection/
python object_detection/train.py --logtostderr --pipeline_config_path=/python_code/pet_train/models/ssd_mobilenet_v1_pets.config --train_dir=/python_code/pet_train/models/train
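If it succeeds, congratulations: the model is in the train folder. If it fails, see below.
Last step: export the model
Create a folder for the exported model
mkdir /python_code/pet_models
Exported model path: /python_code/pet_models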
cd ~/tensorflow/models/research
python object_detection/export_inference_graph.py --input_type=image_tensor --pipeline_config_path=/python_code/pet_train/models/ssd_mobilenet_v1_pets.config --trained_checkpoint_prefix=/python_code/pet_train/models/train/model.ckpt-1597 --output_directory=/python_code/pet_models
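The step number in model.ckpt-1597 depends on your own run, so it has to be looked up in the train folder. A minimal sketch for finding the newest one, assuming the usual TensorFlow 1.x naming where each checkpoint has a `model.ckpt-N.index` file; `latest_checkpoint_step` is a hypothetical helper.

```python
import os
import re


def latest_checkpoint_step(filenames, prefix="model.ckpt"):
    """Return the highest step N among files named '<prefix>-N.index', or None."""
    pattern = re.compile(r"^%s-(\d+)\.index$" % re.escape(prefix))
    steps = [int(m.group(1)) for m in map(pattern.match, filenames) if m]
    return max(steps) if steps else None


def latest_checkpoint_prefix(train_dir, prefix="model.ckpt"):
    """Build the value for --trained_checkpoint_prefix from a training directory."""
    step = latest_checkpoint_step(os.listdir(train_dir), prefix)
    if step is None:
        return None
    return os.path.join(train_dir, "%s-%d" % (prefix, step))
```

With the paths in this post, `latest_checkpoint_prefix("/python_code/pet_train/models/train")` would give the string to pass as `--trained_checkpoint_prefix`.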
Issue roundup: https://github.com/tensorflow/models/issues/4780
GPU failure problem
Run this command before starting:
export CUDA_VISIBLE_DEVICES=0
It still did not work. Judging from the log, it looks like there is not enough GPU memory:
Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1482 MB memory) -> physical GPU (device: 0, name: GeForce 840M, pci bus id: 0000:01:00.0, compute capability: 5.0)
Add the following to model_main.py:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = '0'  # make only the first GPU visible
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.2  # the process may use at most 20% of the GPU's memory
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
sess = tf.Session(config = config)
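The GPU selection above works through the CUDA_VISIBLE_DEVICES environment variable, which takes a comma-separated list of device indices and has to be set before TensorFlow initializes CUDA. A small sketch of that part on its own; `set_visible_gpus` is a hypothetical helper and does not touch TensorFlow itself.

```python
import os


def set_visible_gpus(indices):
    """Set CUDA_VISIBLE_DEVICES from a list of GPU indices and return the value.

    Call this before importing TensorFlow; an empty list hides all GPUs.
    """
    value = ",".join(str(i) for i in indices)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value
```

`set_visible_gpus([0])` has the same effect for the current process as `export CUDA_VISIBLE_DEVICES=0` in the shell.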