1. Project source:
https://github.com/AITTSMD/MTCNN-Tensorflow
2. Install TensorFlow:
Reference article: a step-by-step guide to installing TensorFlow (Windows and Linux versions)
Q1:
(tensorflow) auto406@auto406:~$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.1-cp35-cp35m-linux_x86_64.whl
Traceback (most recent call last):
File "/home/auto406/anaconda3/envs/tensorflow/bin/pip", line 6, in <module>
sys.exit(pip.main())
AttributeError: module 'pip' has no attribute 'main'
auto406@auto406:~$ pip --version
pip 10.0.1 from /home/auto406/anaconda3/lib/python3.6/site-packages/pip (python 3.6)
Check the pip version: pip 10.0 no longer has main(). One option is to downgrade: python -m pip install --upgrade pip==9.0.3
https://zhidao.baidu.com/question/1738175020509032387.html
#!/home/auto406/anaconda3/envs/tensorflow/bin/python
# Edited pip launcher: pip 10 moved main() into pip._internal
if __name__ == '__main__':
    import sys
    import pip
    import pip._internal as pip_new
    sys.exit(pip_new.main())
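Another way around the missing pip.main() is to call pip as a module of the interpreter you actually want to use, because `python -m pip` does not go through the launcher script at all. The sketch below only builds and runs that command; the helper name `pip_command` is hypothetical.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a command line that runs pip through the current interpreter."""
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    # Shows which pip belongs to this interpreter/environment.
    subprocess.run(pip_command("--version"), check=False)
```

Run inside the activated environment, this uses that environment's pip regardless of what the launcher script in bin/ contains.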
Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
Collecting tensorflow-gpu==1.3.0cr2 from https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.3.0cr2-cp35-cp35m-linux_x86_64.whl
Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='storage.googleapis.com', port=443): Read timed out. (read timeout=15)",)': /tensorflow/linux/gpu/tensorflow_gpu-1.3.0cr2-cp35-cp35m-linux_x86_64.whl
Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7f01c11a96a0>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /tensorflow/linux/gpu/tensorflow_gpu-1.3.0cr2-cp35-cp35m-linux_x86_64.whl
(tensorflow)$ pip install tensorflow
>>> import tensorflow as tf
>>> hello = tf.constant('first tensorflow')
>>> sess = tf.Session()
2019-04-07 16:17:42.623778: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2019-04-07 16:17:42.650197: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3600000000 Hz
2019-04-07 16:17:42.651151: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x3b4cbb0 executing computations on platform Host. Devices:
2019-04-07 16:17:42.651210: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
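This message appeared. It means the CPU supports instruction sets such as AVX2 (which can speed up CPU computation), but the installed TensorFlow binary was not compiled to use them. It is informational only; the following setting hides it: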
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
https://blog.youkuaiyun.com/zhaohaibo_/article/details/80573676
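A minimal sketch of how this setting is usually applied, assuming TensorFlow 1.x as used in this post: the variable has to be in the environment before TensorFlow is first imported. Level 0 shows everything, 1 hides INFO, 2 hides INFO and WARNING, 3 hides ERROR as well. The helper name `quiet_tensorflow_logs` is hypothetical.

```python
import os


def quiet_tensorflow_logs(level=2):
    """Set TF_CPP_MIN_LOG_LEVEL; call this before `import tensorflow`."""
    if level not in (0, 1, 2, 3):
        raise ValueError("level must be 0, 1, 2 or 3")
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = str(level)
    return os.environ["TF_CPP_MIN_LOG_LEVEL"]


quiet_tensorflow_logs(2)
# import tensorflow as tf  # import only after the variable is set
```

Level 2 keeps error messages visible, which is useful for the GPU problems that show up later in these notes.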
Q3:
print sess.run('hello')
File "<stdin>", line 1
print sess.run('hello')
^
SyntaxError: invalid syntax
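This error occurs because print is a function in Python 3, so the value must be passed as an argument in parentheses. Note also that hello is the variable, not the string 'hello'.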
print(sess.run(hello))
>>> print(sess.run(hello))
b'first tensorflow'
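The b'...' prefix shows that, under Python 3, the string constant comes back as bytes. A minimal sketch of turning it into text, using a literal in place of sess.run(hello) so it runs without TensorFlow:

```python
# Stand-in for the value returned by sess.run(hello) under Python 3.
raw = b'first tensorflow'

# Decode the bytes into a regular str before printing or comparing.
text = raw.decode('utf-8')
print(text)
```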
The CPU version of TensorFlow is installed successfully!
Next step:
auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/prepare_data$ python gen_12net_data.py
Traceback (most recent call last):
File "gen_12net_data.py", line 3, in <module>
import cv2
ModuleNotFoundError: No module named 'cv2'
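Install the Python bindings for OpenCV:
Inside the tensorflow environment:
$ pip install opencv-python #install OpenCV
$ pip install opencv-contrib-python #install the OpenCV contrib modules
Note: opencv-contrib-python already contains the main modules, and the OpenCV packaging notes recommend installing only one of the two packages.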
(tensorflow) auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/prepare_data$ python gen_12net_data.py
Traceback (most recent call last):
File "gen_12net_data.py", line 7, in <module>
from prepare_data.utils import IoU
ImportError: No module named 'prepare_data'
Fix: the script is run from inside prepare_data, so import utils directly:
#from prepare_data.utils import IoU
from utils import IoU
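An alternative that leaves the original import untouched is to put the repository root on sys.path before the project imports run. This is only a sketch, assuming the script sits one directory below the repository root; `add_repo_root` is a hypothetical helper, not part of the project.

```python
import os
import sys


def add_repo_root(script_path):
    """Insert the parent of the script's directory at the front of sys.path."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    repo_root = os.path.dirname(script_dir)
    if repo_root not in sys.path:
        sys.path.insert(0, repo_root)
    return repo_root


# At the top of a script in prepare_data/, before the project imports:
# add_repo_root(__file__)
# from prepare_data.utils import IoU
```

This avoids editing every import by hand, at the cost of a small path tweak in each entry script.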
(tensorflow) auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/prepare_data$ python gen_hard_example.py
Traceback (most recent call last):
File "gen_hard_example.py", line 13, in <module>
from train_models.MTCNN_config import config
File "../train_models/MTCNN_config.py", line 3, in <module>
from easydict import EasyDict as edict
ImportError: No module named 'easydict'
Install the easydict package in the activated tensorflow environment:
conda install -c chembl easydict
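Both cv2 and easydict were discovered one failed run at a time. A small pre-flight check, sketched here with the standard library only, reports every missing top-level module in one pass. The module list is taken from the errors in these notes, and the helper name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # An empty list means everything needed is importable in this environment.
    print(missing_modules(["cv2", "easydict", "tensorflow", "numpy"]))
```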
Install cuDNN:
Register and download: https://developer.nvidia.com/rdp/cudnn-archive
Check the Ubuntu version:
auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/prepare_data$ cat /proc/version
Linux version 4.15.0-46-generic (buildd@lgw01-amd64-008) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)) #49~16.04.1-Ubuntu SMP Tue Feb 12 17:45:24 UTC 2019
Check the CUDA version:
auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/prepare_data$ cat /usr/local/cuda/version.txt
CUDA Version 9.0.176
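When the version is needed inside a script, for example to decide which cuDNN package or TensorFlow release to pick, the line above can be parsed. This is a sketch that assumes the "CUDA Version X.Y.Z" format shown here; `parse_cuda_version` is a hypothetical helper.

```python
import re


def parse_cuda_version(text):
    """Extract (major, minor, patch) from a line like 'CUDA Version 9.0.176'."""
    match = re.search(r"CUDA Version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no CUDA version found in: %r" % text)
    return tuple(int(part) for part in match.groups())


# Example with the line printed above; on a real machine the text would be
# read from /usr/local/cuda/version.txt.
print(parse_cuda_version("CUDA Version 9.0.176"))
```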
After downloading, install (the download requires a VPN):
auto406@auto406:~/MTCNN$ cd cuDNN/
auto406@auto406:~/MTCNN/cuDNN$ sudo dpkg -i libcudnn7_7.5.0.56-1+cuda9.0_amd64.deb
auto406@auto406:~/MTCNN/cuDNN$ sudo dpkg -i libcudnn7-dev_7.5.0.56-1+cuda9.0_amd64.deb
auto406@auto406:~/MTCNN/cuDNN$ sudo dpkg -i libcudnn7-doc_7.5.0.56-1+cuda9.0_amd64.deb
Test whether the installation succeeded with the following commands:
auto406@auto406:~/MTCNN/cuDNN$ cp -r /usr/src/cudnn_samples_v7/ ~
auto406@auto406:~/MTCNN/cuDNN$ cd ~/cudnn_samples_v7/mnistCUDNN/
auto406@auto406:~/cudnn_samples_v7/mnistCUDNN$ make clean && make
auto406@auto406:~/cudnn_samples_v7/mnistCUDNN$ ./mnistCUDNN
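Reference: https://blog.youkuaiyun.com/davidhopper/article/details/81206673
Install tensorflow-gpu:
Enter the TensorFlow environment and uninstall the newer version (tensorflow-gpu had been installed with a plain pip install tensorflow-gpu before cuDNN was in place):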
(tensorflow-gpu) auto406@auto406:~/cudnn_samples_v7/mnistCUDNN$ pip uninstall tensorflow-gpu
Find the TensorFlow version that matches CUDA 9.0: https://blog.youkuaiyun.com/lifuxian1994/article/details/81103530
Download and install tensorflow-gpu 1.9.0 from the Douban mirror:
(tensorflow-gpu) auto406@auto406:~/cudnn_samples_v7/mnistCUDNN$ pip install tensorflow-gpu==1.9 -i https://pypi.doubanio.com/simple/
Check whether the installation succeeded:
(tensorflow-gpu) auto406@auto406:~/cudnn_samples_v7/mnistCUDNN$ python
Python 3.5.4 |Continuum Analytics, Inc.| (default, Aug 14 2017, 13:26:58)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
- Run prepare_data/gen_12net_data.py to generate training data (Face Detection Part) for PNet.
- Run gen_landmark_aug_12.py to generate training data (Face Landmark Detection Part) for PNet.
- Run gen_imglist_pnet.py to merge two parts of training data.
(tensorflow-gpu) auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/prepare_data$ python gen_imglist_pnet.py
1000282 458013 1128835 250000 750000 250000 250000
(tensorflow-gpu) auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/prepare_data$ python gen_PNet_tfrecords.py
>> 1429500/1429557 images has been converted
Finished converting the MTCNN dataset!
(tensorflow-gpu) auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/prepare_data$ python gen_hard_example.py
Called with argument:
Namespace(batch_size=[2048, 256, 16], epoch=[58, 14, 16], min_face=20, prefix=['../data/MTCNN_model/PNet_landmark/PNet', '../data/MTCNN_model/RNet_landmark/RNet', '../data/MTCNN_model/ONet_landmark/ONet'], shuffle=False, slide_window=False, stride=2, test_mode='PNet', thresh=[0.3, 0.1, 0.7], vis=False)
Test model: PNet
../data/MTCNN_model/PNet_landmark/PNet-58
(1, ?, ?, 3)
load summary for : conv1/add
(1, ?, ?, 10)
load summary for : pool1/MaxPool
(1, ?, ?, 10)
load summary for : conv2/add
(1, ?, ?, 16)
load summary for : conv3/add
(1, ?, ?, 32)
load summary for : conv4_1/Reshape_1
(1, ?, ?, 2)
load summary for : conv4_2/BiasAdd
(1, ?, ?, 4)
load summary for : conv4_3/BiasAdd
(1, ?, ?, 10)
2019-04-08 15:06:06.569916: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2019-04-08 15:06:06.716087: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:65:00.0
totalMemory: 10.91GiB freeMemory: 10.51GiB
2019-04-08 15:06:06.716121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2019-04-08 15:06:06.940480: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-08 15:06:06.940515: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958] 0
2019-04-08 15:06:06.940522: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: N
2019-04-08 15:06:06.940688: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10167 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:65:00.0, compute capability: 6.1)
../data/MTCNN_model/PNet_landmark/PNet-58
restore models' param
==================================
load test data
finish loading
start detecting....
detect_face:img 1
detect_face:img 2
The following code lists the devices TensorFlow can use:
>>> from tensorflow.python.client import device_lib
>>> print(device_lib.list_local_devices())
2019-04-08 15:47:48.711553: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2019-04-08 15:47:48.711631: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-08 15:47:48.711660: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958] 0
2019-04-08 15:47:48.711684: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: N
2019-04-08 15:47:48.711921: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/device:GPU:0 with 10167 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:65:00.0, compute capability: 6.1)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 10620960882628000992
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 10661888000
locality {
  bus_id: 1
  links {
  }
}
incarnation: 9467620951121494053
physical_device_desc: "device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:65:00.0, compute capability: 6.1"
]
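To turn that listing into a yes/no answer, the entries can be filtered on their device_type field, which appears in the output above. The sketch keeps the TensorFlow call in a comment and uses stand-in objects, so the helper can be tried on a machine without a GPU; `gpu_device_names` is a hypothetical name.

```python
from types import SimpleNamespace


def gpu_device_names(devices):
    """Return the names of all entries whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


# With TensorFlow 1.x installed, the real list would come from:
# from tensorflow.python.client import device_lib
# print(gpu_device_names(device_lib.list_local_devices()))

# Stand-in objects mirroring the two entries shown above.
sample = [
    SimpleNamespace(name="/device:CPU:0", device_type="CPU"),
    SimpleNamespace(name="/device:GPU:0", device_type="GPU"),
]
print(gpu_device_names(sample))
```

An empty result would mean TensorFlow sees only the CPU, which is what the CPU-only package reports.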
Check whether the GPU or the CPU version is in use:
>>> import numpy
>>> import tensorflow as tf
>>> a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
>>> b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
>>> c = tf.matmul(a, b)
>>> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
>>> print(sess.run(c))
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2019-04-08 15:54:29.167174: I tensorflow/core/common_runtime/placer.cc:886] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
MatMul_1: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2019-04-08 15:54:29.167234: I tensorflow/core/common_runtime/placer.cc:886] MatMul_1: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2019-04-08 15:54:29.167261: I tensorflow/core/common_runtime/placer.cc:886] a: (Const)/job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2019-04-08 15:54:29.167287: I tensorflow/core/common_runtime/placer.cc:886] b: (Const)/job:localhost/replica:0/task:0/device:GPU:0
a_1: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2019-04-08 15:54:29.167308: I tensorflow/core/common_runtime/placer.cc:886] a_1: (Const)/job:localhost/replica:0/task:0/device:GPU:0
b_1: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2019-04-08 15:54:29.167330: I tensorflow/core/common_runtime/placer.cc:886] b_1: (Const)/job:localhost/replica:0/task:0/device:GPU:0
[[22. 28.]
 [49. 64.]]
Result with the CPU version of TensorFlow:
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:CPU:0
2019-04-08 15:57:43.593532: I tensorflow/core/common_runtime/placer.cc:1059] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:CPU:0
a: (Const): /job:localhost/replica:0/task:0/device:CPU:0
2019-04-08 15:57:43.593592: I tensorflow/core/common_runtime/placer.cc:1059] a: (Const)/job:localhost/replica:0/task:0/device:CPU:0
b: (Const): /job:localhost/replica:0/task:0/device:CPU:0
2019-04-08 15:57:43.593618: I tensorflow/core/common_runtime/placer.cc:1059] b: (Const)/job:localhost/replica:0/task:0/device:CPU:0
[[22. 28.]
 [49. 64.]]
Skip training and test directly:
(tensorflow-gpu) auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/test$ python one_image_test.py
2019-04-08 20:10:15.470449: E tensorflow/core/common_runtime/direct_session.cc:158] Internal: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OUT_OF_MEMORY; total memory reported: 11718230016
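If the goal is only to run this test on the CPU, one option that avoids changing environments is to hide the GPU from CUDA before TensorFlow is imported. This is a sketch, not what these notes actually did: CUDA_VISIBLE_DEVICES is a standard CUDA environment variable, and the helper name `hide_gpus` is hypothetical. The error itself may simply mean that another process is still holding the GPU memory; that is an assumption worth checking with nvidia-smi.

```python
import os


def hide_gpus():
    """Make CUDA report no devices to libraries imported afterwards."""
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    return os.environ["CUDA_VISIBLE_DEVICES"]


hide_gpus()
# import tensorflow as tf  # import only after the variable is set
```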
Switching to the CPU version of TensorFlow avoids this problem:
(tensorflow) auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/test$ python one_image_test.py
Traceback (most recent call last):
File "one_image_test.py", line 29, in <module>
PNet = FcnDetector(P_Net, model_path[0])
File "../Detection/fcn_detector.py", line 35, in __init__
assert readstate, "the params dictionary is not valid"
AssertionError: the params dictionary is not valid
prefix = ['../data/MTCNN_model/PNet_No_landmark/PNet', '../data/MTCNN_model/RNet_landmark/RNet', '../data/MTCNN_model/ONet_landmark/ONet']
The path turned out to be wrong. Create a PNet_No_landmark folder under MTCNN_model and copy the model files from PNet_landmark into it. When running, set the test_mode parameter to PNet.
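The folder copy can also be scripted. The sketch below uses only the standard library and the folder names from the prefix list above; the helper name `copy_model_dir` is hypothetical, and it skips the copy when the target folder already exists.

```python
import os
import shutil


def copy_model_dir(model_root, src_name="PNet_landmark", dst_name="PNet_No_landmark"):
    """Copy model_root/src_name to model_root/dst_name unless it already exists."""
    src = os.path.join(model_root, src_name)
    dst = os.path.join(model_root, dst_name)
    if not os.path.isdir(dst):
        shutil.copytree(src, dst)
    return dst


# Example call, using the relative path from the prefix list above:
# copy_model_dir("../data/MTCNN_model")
```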
Change the location of the test images;
Run one_image_test.py
test_mode = "Pnet"
test_mode = "Onet"
restore models' param
WARNING:tensorflow:From /home/auto406/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Traceback (most recent call last):
File "/home/auto406/MTCNN/MTCNN-Tensorflow-master/test/one_image_test.py", line 50, in <module>
for item in os.listdir(path):
FileNotFoundError: [Errno 2] No such file or directory: '../../DATA/test/lfpw_testImage'
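The path in this error is relative, so it resolves against the current working directory, which can differ between a terminal and an IDE. A small sketch that fails early with the resolved absolute path makes that easier to spot; `list_test_images` is a hypothetical helper, not part of the project.

```python
import os


def list_test_images(path):
    """List a directory, failing early with the resolved absolute path."""
    full_path = os.path.abspath(path)
    if not os.path.isdir(full_path):
        raise FileNotFoundError("test image directory not found: " + full_path)
    return sorted(os.listdir(full_path))


# Example call with the path from the traceback above:
# for item in list_test_images('../../DATA/test/lfpw_testImage'):
#     print(item)
```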
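What Python's assert does:
- The code here uses an assert statement. It is equivalent to "raise if not": when the condition is false it raises AssertionError, signalling an error condition.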
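A minimal illustration of that equivalence, reusing the message from the traceback earlier in these notes. The function names are hypothetical, and the second form is roughly what the first expands to when Python runs without the -O flag.

```python
def check_restore(readstate):
    # Same shape as the assertion in fcn_detector.py quoted above.
    assert readstate, "the params dictionary is not valid"
    return True


def check_restore_explicit(readstate):
    # Explicit "raise if not" form of the same check.
    if not readstate:
        raise AssertionError("the params dictionary is not valid")
    return True
```

Because assert statements are removed under -O, the explicit form is the safer choice for checks that must always run.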
Install the Spyder IDE:
auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/test$ sudo apt install spyder
auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/test$ spyder # open Spyder
How to open an existing project:
project->new project->Existing directory->Location (select the existing project folder)->Create
Using Open Project directly does not work; it fails with "XXX is not a Spyder project!"
Set up Spyder in Anaconda:
(tensorflow) auto406@auto406:~/MTCNN/MTCNN-Tensorflow-master/test$ conda install spyder
Using PyCharm:
file->settings->Project Interpreter->select ~/anaconda3/envs/tensorflow/bin/python (easydict was already installed in this environment)
Run camera_test.py in PyCharm:
Traceback (most recent call last):
File "/home/auto406/MTCNN/MTCNN-Tensorflow-master/test/camera_test.py", line 60, in <module>
for j in range(len(landmarks[i])/2):
TypeError: 'float' object cannot be interpreted as an integer
Cause: in Python 3, / is true division and always returns a float, which range() does not accept. Use floor division // instead.
# for j in range(len(landmarks[i])/2):
for j in range(len(landmarks[i])//2):
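A small sketch of why // is the right operator here, assuming (this layout is an assumption, not confirmed by the notes) that each landmarks[i] is a flat sequence of x, y values. The helper name `landmark_points` is hypothetical.

```python
def landmark_points(flat):
    """Group a flat [x0, y0, x1, y1, ...] sequence into (x, y) pairs."""
    # len(flat) // 2 is an int, so range() accepts it under Python 3.
    return [(flat[2 * j], flat[2 * j + 1]) for j in range(len(flat) // 2)]


print(landmark_points([10.0, 20.0, 30.0, 40.0]))
print(type(4 / 2).__name__, type(4 // 2).__name__)
```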