TensorFlow Troubleshooting Summary

hsq_roy

Article link: https://blog.youkuaiyun.com/o0Roy/article/details/95455958

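This article covers problems encountered while installing and running TensorFlow on Ubuntu 16.04 + TensorFlow 1.14.0 + Python 3.5, such as slow installation, a virtual machine that cannot access the host's GPU, and NumPy errors, and gives a fix for each one.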

0. System environment:

Ubuntu 16.04 + TensorFlow 1.14.0 + Python 3.5

1. Installing TensorFlow is slow (switch the package index)

Install directly with the following commands:

# For CPU
pip3 install tensorflow
# For GPU
pip3 install tensorflow-gpu

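The default package index is hosted outside China, so downloads are slow. Switch to a mirror: create a .pip folder in your home directory and a pip.conf file inside it.

mkdir -p ~/.pip
vim ~/.pip/pip.conf

[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
[install]
trusted-host = pypi.tuna.tsinghua.edu.cn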
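
The trusted-host entry only has an effect when it names the same host as index-url. As a minimal sketch, the file content can be generated so the two always agree; the helper name build_pip_conf is hypothetical and not part of pip.

```python
import configparser
import io
from urllib.parse import urlparse


def build_pip_conf(index_url):
    """Return pip.conf text whose trusted-host matches the host of index_url.

    build_pip_conf is a hypothetical helper for illustration only.
    """
    host = urlparse(index_url).hostname
    config = configparser.ConfigParser()
    config["global"] = {"index-url": index_url}
    config["install"] = {"trusted-host": host}
    buffer = io.StringIO()
    config.write(buffer)  # serialises the sections in INI format
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the text that would go into ~/.pip/pip.conf
    print(build_pip_conf("https://pypi.tuna.tsinghua.edu.cn/simple"))
```

The printed text can be pasted into ~/.pip/pip.conf; passing a different mirror URL yields a matching trusted-host line.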

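2. The virtual machine cannot access the host's GPU (TensorFlow-GPU cannot run)

Explanation: a virtual machine created with VM software gets a virtual graphics adapter, not the physical machine's own GPU. Consider the following workarounds.

Solutions:

1. Install Docker on the physical machine; containers can use the host's hardware directly (for GPU access, see the nvidia-docker note in item 7). (Recommended)

2. Dual-boot.

3. Install the CPU build of TensorFlow. (Not tried yet.)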
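
A quick way to tell whether the current environment sees an NVIDIA GPU at all is to look for the device nodes that the NVIDIA Linux driver creates under /dev (for example /dev/nvidia0 and /dev/nvidiactl). The sketch below assumes that convention; the function name nvidia_device_nodes is hypothetical.

```python
import glob
import os


def nvidia_device_nodes(dev_dir="/dev"):
    """List entries in dev_dir whose names start with 'nvidia', sorted.

    Assumption: the NVIDIA driver exposes GPUs as /dev/nvidia* nodes.
    An empty list suggests no physical NVIDIA GPU is visible here.
    """
    return sorted(glob.glob(os.path.join(dev_dir, "nvidia*")))


if __name__ == "__main__":
    nodes = nvidia_device_nodes()
    if nodes:
        print("NVIDIA device nodes found:", ", ".join(nodes))
    else:
        print("No NVIDIA device nodes found; TensorFlow-GPU is unlikely to work here.")
```

Inside a virtual machine with only a virtual graphics adapter, the list is expected to be empty.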

3. NumPy errors (FutureWarning)

root@48e02d5a30a1:~/python/lx_express# python3 main.py 
/root/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:458: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/root/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:459: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/root/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:460: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/root/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:461: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/root/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:462: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/root/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:465: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.

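Explanation: the installed NumPy version is too new for this TensorFlow version. The messages are FutureWarnings rather than fatal errors, and downgrading NumPy silences them. Run pip show numpy to check the installed version.

pip show numpy  # check the installed version
pip uninstall numpy  # uninstall NumPy
pip install numpy==1.16  # install NumPy 1.16 specifically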
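
The version check can also be scripted. The sketch below compares a version string such as the one printed by pip show numpy against a threshold. The threshold 1.17 is an assumption (the text above only establishes that 1.16 works), and both function names are hypothetical.

```python
def parse_version(version):
    """Turn '1.16.4' into (1, 16, 4); stop at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def triggers_dtype_warning(numpy_version, first_warning_version="1.17"):
    """Return True if numpy_version is at or above the assumed threshold."""
    return parse_version(numpy_version) >= parse_version(first_warning_version)


if __name__ == "__main__":
    for candidate in ("1.16.4", "1.17.0"):
        print(candidate, "needs downgrade:", triggers_dtype_warning(candidate))
```

With NumPy installed, numpy.__version__ can be passed in directly instead of a hard-coded string.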

4. Cannot uninstall 'wrapt'

ERROR: Cannot uninstall 'wrapt'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

Solution: install over the existing packages instead of uninstalling them first.

pip install -U --ignore-installed wrapt enum34 simplejson netaddr

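5. KeyError: 'IteratorFromStringHandleV2'

KeyError: 'IteratorFromStringHandleV2'

Solution:

Code that ran fine locally raised this error after being moved into Docker. The TensorFlow version in the container turned out to be too old; installing the same version as the local environment (1.14.0) solved the problem.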

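6. UnicodeEncodeError: 'ascii' codec can't encode characters in position 159-168: ordinal not in range(128)

UnicodeEncodeError: 'ascii' codec can't encode characters in position 159-168: ordinal not in range(128)

Explanation:

This happens when standard output uses the ASCII encoding and the program prints non-ASCII text. Add the following statements to the code being run so that stdout is written as UTF-8: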

import sys
import codecs
sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())
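
To see what the wrapper does without touching the real stdout, the same codecs.getwriter call can be applied to an in-memory byte stream. This is a minimal sketch; the name utf8_writer is hypothetical.

```python
import codecs
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8."""
    return codecs.getwriter("utf-8")(binary_stream)


# An in-memory byte stream stands in for sys.stdout.detach() here.
raw = io.BytesIO()
writer = utf8_writer(raw)
writer.write("中文 output\n")

# The non-ASCII characters reach the stream as UTF-8 bytes instead of raising.
print(raw.getvalue())
```

Setting the environment variable PYTHONIOENCODING=utf-8 before starting Python is another way to get UTF-8 output without changing the code.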

7. ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory (in a Docker environment)

ImportError: Traceback (most recent call last):
  File "/storage/xuminghong/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/storage/xuminghong/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/storage/xuminghong/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/local/lib/python3.6/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/local/lib/python3.6/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

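Solution:

I was running in a Docker environment. Previously I started the container with plain docker run; after switching to nvidia-docker run, the error above went away.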

8. Configuring CUDA in PyCharm

1. Open PyCharm and go to Run -> Edit Configurations.

2. Set the Environment variables field: LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64 (adjust to your own CUDA installation path).

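9. Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

I had been using Keras and everything ran fine locally, but after deploying to Docker the error in the title appeared. The cause was that TensorFlow tried to allocate a large amount of GPU memory up front (probably more than it needs), and the GPU did not have that much memory available. Adding the following code makes it allocate GPU memory on demand.

import tensorflow as tf  # import it if you have not already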

config = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
sess = tf.Session(config=config)

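10. Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory

In libcudart.so.N, N is the CUDA version whose libraries cannot be found. This error means the CUDA 10.0 path was not found.

Configure the CUDA environment variables on Ubuntu in ~/.bashrc (adjust the version in the path to the CUDA release you have installed):

export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Remember to run source ~/.bashrc afterwards.

Note: if the error occurs inside PyCharm, see item 8.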
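
To confirm that the variable actually points at the missing library, the directories in LD_LIBRARY_PATH can be scanned for the file named in the error. This is a minimal sketch with a hypothetical helper name; it checks only LD_LIBRARY_PATH, whereas the dynamic loader also consults other locations.

```python
import os


def find_shared_library(name, search_path):
    """Return the first directory in a colon-separated path that holds name.

    Returns None when no listed directory contains the file.
    """
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, name)):
            return directory
    return None


if __name__ == "__main__":
    library = "libcudart.so.10.0"
    location = find_shared_library(library, os.environ.get("LD_LIBRARY_PATH", ""))
    if location:
        print("{} found in {}".format(library, location))
    else:
        print("{} not found on LD_LIBRARY_PATH".format(library))
```

Running it from the same shell (or the same PyCharm run configuration) that launches the TensorFlow program shows which environment that process really inherits.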

[Figure: result after a successful change]

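11. ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
ERROR: No matching distribution found for tensorflow

Explanation:

I was installing TensorFlow (and also tried MXNet) on a Bitmain TPU box (model: SE3), and pip install tensorflow kept failing. Some posts online blamed a mismatch between the Python and TensorFlow versions, but following that advice did not solve the problem. It turned out that the CPU architecture is different, so the package cannot be installed from the index with pip directly; an ARM build of the package is needed.

Run arch in the TPU box's terminal to see the CPU architecture.

Typical servers have x86_64 CPUs.

Where I downloaded TensorFlow: https://github.com/lhelontra/tensorflow-on-arm/releases

The version I downloaded: https://github.com/lhelontra/tensorflow-on-arm/releases/download/v1.13.1/tensorflow-1.13.1-cp35-none-linux_aarch64.whl

Finally, install it directly: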

pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorflow-1.13.1-cp35-none-linux_aarch64.whl
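
The last field of a wheel filename is its platform tag, which is why the x86_64 builds on the index are rejected on an aarch64 machine. The sketch below extracts that tag and compares it with the machine name. It is a simplified check, not pip's actual matching logic, and both function names are hypothetical.

```python
import platform


def wheel_platform_tag(wheel_filename):
    """Return the platform tag: the last dash-separated field of the filename."""
    stem = wheel_filename
    if stem.endswith(".whl"):
        stem = stem[:-len(".whl")]
    return stem.split("-")[-1]


def wheel_fits_machine(wheel_filename, machine):
    """Rough check: tag is 'any' or ends with the machine name, e.g. '_aarch64'."""
    tag = wheel_platform_tag(wheel_filename)
    return tag == "any" or tag.endswith("_" + machine)


if __name__ == "__main__":
    wheel = "tensorflow-1.13.1-cp35-none-linux_aarch64.whl"
    machine = platform.machine()  # same kind of value the arch command prints
    print("machine:", machine)
    print("wheel fits this machine:", wheel_fits_machine(wheel, machine))
```

The cp35 field in the same filename is the Python tag, so this wheel also expects CPython 3.5.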