TensorFlow Explained: Opening Chapter

Latest recommended article published on 2024-06-18 16:59:44

License: CC 4.0 BY-SA

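This article takes a close look at TensorFlow, covering the computation graph, tensors, ops, and kernel implementations, and explains how to debug TensorFlow at the C/C++ level. It is aimed at deep learning developers and technology enthusiasts.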

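Overview

Interest in deep learning research keeps growing, and new open-source deep learning frameworks keep appearing. The table below lists some of the mainstream open-source deep learning tools at the time of writing.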

Tool	Maintainer	Supported languages	Supported systems
TensorFlow	Google	C++, Python	Linux, macOS, Windows, Android, iOS
Caffe	Berkeley Vision and Learning Center	C++, Python, MATLAB	Linux, macOS, Windows
Theano	University of Montreal	Python	Linux, macOS, Windows
Torch	Facebook and others	Lua, LuaJIT, C	Linux, macOS, Windows, Android, iOS
MXNet	DMLC	C++, Python, Go, R	Linux, macOS, Windows, Android, iOS
CNTK	MSR	Python, C++	Linux, Windows
Deeplearning4j	Skymind	Java, Scala	Linux, Windows, macOS

This chapter gives an overview of the core concepts in TensorFlow.

TensorFlow

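TensorFlow is a relatively high-level machine learning library. Users can design neural network architectures with it conveniently, without having to write C++ or CUDA code themselves for the sake of performance. The core is written in C++, and in the versions discussed here SWIG (Simplified Wrapper and Interface Generator) is used to expose it to Python. APIs for other languages, such as R, are also available.

TensorFlow supports multiple platforms and multiple languages, ships production-quality code, and is backed by Google's development and maintenance resources. Compared with other Python-based frameworks, it is arguably more mature and more complete.

By design, it describes computation as a computation graph and provides thorough, flexible support for distributed execution.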

Computation graph description

A TensorFlow computation is described by a directed graph, which is composed of a set of nodes. The graph represents a dataflow computation.
An operation has a name and represents an abstract computation (e.g., “matrix multiply”, or “add”).

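A TensorFlow computation is represented as a computation graph, which is a directed graph. Each operation is a node in the graph, and the connections between nodes are edges. A node can have any number of inputs and outputs, and can be regarded as an instance of an operation. The data that flows along the edges is called a tensor, hence the name TensorFlow.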
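To make the node and edge terminology concrete, here is a minimal sketch of a dataflow graph in plain Python. It does not use TensorFlow; the dictionary layout, the `GRAPH` and `OPS` names, and the `evaluate` helper are all hypothetical and only illustrate how a value is produced by following edges back to the nodes that feed it.

```python
import operator

# Hypothetical mini-graph: each node names an op and the nodes its inputs come from.
GRAPH = {
    "a": {"op": "Const", "inputs": [], "value": 2},
    "b": {"op": "Const", "inputs": [], "value": 3},
    "sum": {"op": "Add", "inputs": ["a", "b"]},
    "out": {"op": "Mul", "inputs": ["sum", "b"]},
}

# Abstract op name -> concrete Python function (illustrative only).
OPS = {"Add": operator.add, "Mul": operator.mul}


def evaluate(graph, target, cache=None):
    """Evaluate `target` by first evaluating the nodes on its incoming edges."""
    cache = {} if cache is None else cache
    if target in cache:
        return cache[target]
    node = graph[target]
    if node["op"] == "Const":
        result = node["value"]
    else:
        # The values travelling along the edges play the role of tensors.
        args = [evaluate(graph, name, cache) for name in node["inputs"]]
        result = OPS[node["op"]](*args)
    cache[target] = result
    return result
```

Calling `evaluate(GRAPH, "out")` walks from `out` back to the constants and computes each node once. A real runtime schedules nodes very differently; the sketch only shows the dependency structure.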

The graph is described with protocol buffers. Its definition is as follows:

message GraphDef {
	repeated NodeDef node = 1;
	FunctionDefLibrary library = 2;
	VersionDef versions = 4;
	int32 version = 3 [deprecated = true];
}
message NodeDef {
	string name = 1;
	string op = 2;
	repeated string input = 3;
	string device = 4;
	map<string, AttrValue> attr = 5;
}

A node contains the computation to perform (op), its data inputs (input), and device information (device, CPU or GPU).
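The NodeDef fields can be mirrored in plain Python to see how one node refers to another. The sketch below is not the generated protobuf class: `NodeSketch` and `parse_input` are hypothetical names. It also assumes the input string convention documented in node_def.proto, where `name:index` selects an output of the producing node (index 0 when omitted) and a leading `^` marks a control dependency.

```python
from dataclasses import dataclass, field


@dataclass
class NodeSketch:
    """Plain-Python stand-in for the NodeDef fields (not the real protobuf class)."""
    name: str
    op: str
    input: list = field(default_factory=list)
    device: str = ""
    attr: dict = field(default_factory=dict)


def parse_input(spec):
    """Split an input string into (producer name, output index, is_control)."""
    if spec.startswith("^"):
        # Control dependency: ordering only, no data flows along this edge.
        return spec[1:], None, True
    name, sep, index = spec.partition(":")
    return name, int(index) if sep else 0, False


add = NodeSketch(name="add", op="Add", input=["x", "y:1", "^init"], device="/cpu:0")
parsed = [parse_input(spec) for spec in add.input]
```

Here `parsed` describes three edges into `add`: output 0 of `x`, output 1 of `y`, and a control edge from `init`.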

Tensor

As in other neural network frameworks, a tensor describes a multidimensional array (A tensor simply identifies a multidimensional array or list). Its three main properties are rank, shape, and type (https://www.tensorflow.org/programmers_guide/dims_types).
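The relationship between rank, shape, and type can be sketched with a few lines of plain Python. The helper names and the small `DTYPE_SIZES` table are hypothetical and cover only three element types for illustration.

```python
from functools import reduce
import operator

# Illustrative element sizes in bytes; a real framework supports many more types.
DTYPE_SIZES = {"int32": 4, "float32": 4, "float64": 8}


def rank(shape):
    """Rank is the number of dimensions; a scalar has shape () and rank 0."""
    return len(shape)


def num_elements(shape):
    """Total element count is the product of the dimension sizes."""
    return reduce(operator.mul, shape, 1)


def byte_size(shape, dtype):
    """Bytes needed to hold the elements of a dense tensor."""
    return num_elements(shape) * DTYPE_SIZES[dtype]
```

For example, a shape of `(2, 3, 4)` has rank 3 and 24 elements, while the empty shape `()` is a rank-0 scalar holding a single element.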

TensorFlow's Tensor is built mainly on the open-source Eigen::Tensor, with extensive extensions.
The Eigen header it relies on: (https://github.com/RLovelett/eigen/blob/master/unsupported/Eigen/CXX11/Tensor)

  /// \brief Parse `other` and construct the tensor.
  ///
  /// Returns `true` iff the parsing succeeds. If the parsing fails,
  /// the state of `*this` is unchanged.
  bool FromProto(const TensorProto& other) TF_MUST_USE_RESULT;
  bool FromProto(Allocator* a, const TensorProto& other) TF_MUST_USE_RESULT;

  /// \brief Fills in `proto` with `*this` tensor's content.
  ///
  /// `AsProtoField()` fills in the repeated field for `proto.dtype()`.
  void AsProtoField(TensorProto* proto) const;

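As the excerpt shows, Tensor provides interfaces for serializing to and deserializing from protobuf, which makes it convenient to send tensors over the network.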
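The round trip can be illustrated with a toy binary format built on Python's `struct` module. This is not the TensorProto wire format; the layout (rank, dimensions, then flat int32 values) and the function names are assumptions made for the sketch. The one thing it borrows from the interface above is the contract of `FromProto`: a failed parse is reported to the caller and produces no partial result.

```python
import struct


def to_bytes(shape, values):
    """Encode an int32 tensor as: rank, each dimension, then the flat values."""
    header = struct.pack("<I", len(shape)) + struct.pack("<%dI" % len(shape), *shape)
    return header + struct.pack("<%di" % len(values), *values)


def from_bytes(buf):
    """Decode the toy format; return None if the buffer is malformed."""
    try:
        (rank,) = struct.unpack_from("<I", buf, 0)
        shape = struct.unpack_from("<%dI" % rank, buf, 4)
        count = 1
        for dim in shape:
            count *= dim
        offset = 4 + 4 * rank
        # The payload must hold exactly one int32 per element.
        if len(buf) != offset + 4 * count:
            return None
        values = struct.unpack_from("<%di" % count, buf, offset)
    except struct.error:
        return None
    return tuple(shape), list(values)
```

Encoding a `(2, 2)` tensor and decoding it again returns the same shape and values, while a truncated buffer makes `from_bytes` return `None`.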

Operator and OpKernel

A TensorFlow op mainly consists of the following:

1. op interface

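The op interface describes properties such as the op's inputs and outputs, and registers the op with the TensorFlow runtime. The code below registers an op named ZeroOut, declares that it takes a 32-bit int tensor as input and produces a 32-bit int tensor as output, and supplies a shape function stating that the output tensor has the same shape as the input tensor.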

#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/shape_inference.h"

using namespace tensorflow;

REGISTER_OP("ZeroOut")
    .Input("to_zero: int32")
    .Output("zeroed: int32")
    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
      c->set_output(0, c->input(0));
      return Status::OK();
    });

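2. op kernel implementation

A kernel is implemented by subclassing the OpKernel base class and overriding its Compute method. Compute takes a single OpKernelContext argument, through which both inputs and outputs are managed.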

#include "tensorflow/core/framework/op_kernel.h"

using namespace tensorflow;

class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();

    // Create an output tensor
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output = output_tensor->flat<int32>();

    // Set all but the first element of the output tensor to 0.
    const int N = input.size();
    for (int i = 1; i < N; i++) {
      output(i) = 0;
    }

    // Preserve the first input value if possible.
    if (N > 0) output(0) = input(0);
  }
};

Likewise, the kernel has to be registered with the TensorFlow runtime.

REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);
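The split between op interface and kernel is essentially a two-level registry: one table keyed by op name, and one table keyed by op name plus device. The plain-Python sketch below mimics that idea. It is not how TensorFlow implements registration, and `register_op`, `register_kernel`, `ZeroOutKernel`, and `run` are hypothetical names.

```python
OP_REGISTRY = {}      # op name -> interface description
KERNEL_REGISTRY = {}  # (op name, device) -> kernel class


def register_op(name, inputs, outputs):
    """Record the interface of an op (loosely analogous to REGISTER_OP)."""
    OP_REGISTRY[name] = {"inputs": inputs, "outputs": outputs}


def register_kernel(name, device):
    """Bind a kernel class to an op and device (loosely analogous to REGISTER_KERNEL_BUILDER)."""
    def decorator(cls):
        if name not in OP_REGISTRY:
            raise KeyError("op %r has no registered interface" % name)
        KERNEL_REGISTRY[(name, device)] = cls
        return cls
    return decorator


register_op("ZeroOut", inputs=["to_zero: int32"], outputs=["zeroed: int32"])


@register_kernel("ZeroOut", device="CPU")
class ZeroOutKernel:
    def compute(self, values):
        # Keep the first element and zero the rest, like the C++ kernel.
        if not values:
            return []
        return values[:1] + [0] * (len(values) - 1)


def run(op_name, device, values):
    """Look up the kernel registered for (op, device) and execute it."""
    kernel_cls = KERNEL_REGISTRY[(op_name, device)]
    return kernel_cls().compute(values)
```

Keeping the interface separate from the kernel is what lets one op have several implementations, for example one per device, selected at run time by the lookup key.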

Debugging

The most commonly used TensorFlow API is still the Python one, so to step into the C/C++ code you attach gdb to a running Python process.

Start a Python process and use the os module to get its process ID.

$ python
>>> import tensorflow as tf 
>>> import os
>>> os.getpid()
26880

Once you have the PID of the Python process, attach gdb to it from another terminal. Here a breakpoint is set on the TF_Run function:

$ gdb --pid=26880
(gdb) break TF_Run
(gdb) c

Back in the Python window:

>>> a = tf.constant(1) 
>>> sess = tf.Session() 
>>> sess.run(a)

gdb then stops at the breakpoint in TF_Run:

Breakpoint 1, 0x00007fd5cd8ee1f4 in TF_Run ()
   from /usr/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
(gdb) bt
#0  0x00007fd5cd8ee1f4 in TF_Run () from /usr/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#1  0x00007fd5cd60c9da in tensorflow::TF_Run_wrapper_helper(TF_DeprecatedSession*, char const*, TF_Buffer const*, _object*, tensorflow::gtl::InlinedVector<char const*, 8> const&, tensorflow::gtl::InlinedVector<char const*, 8> const&, TF_Status*, tensorflow::gtl::InlinedVector<_object*, 8>*, TF_Buffer*) ()
   from /usr/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#2  0x00007fd5cd60cdd1 in tensorflow::TF_Run_wrapper(TF_DeprecatedSession*, TF_Buffer const*, _object*, tensorflow::gtl::InlinedVector<char const*, 8> const&, tensorflow::gtl::InlinedVector<char const*, 8> const&, TF_Status*, tensorflow::gtl::InlinedVector<_object*, 8>*, TF_Buffer*) () from /usr/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#3  0x00007fd5cd5d10b1 in _wrap_TF_Run () from /usr/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#4  0x00007fd60998dcf0 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#5  0x00007fd60999003d in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#6  0x00007fd609919978 in function_call () from /lib64/libpython2.7.so.1.0
#7  0x00007fd6098f4a63 in PyObject_Call () from /lib64/libpython2.7.so.1.0
#8  0x00007fd6099886fd in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#9  0x00007fd60999003d in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#10 0x00007fd60998d53c in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#11 0x00007fd60999003d in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#12 0x00007fd60998d53c in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#13 0x00007fd60999003d in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#14 0x00007fd60998d53c in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#15 0x00007fd60999003d in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#16 0x00007fd60998d53c in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#17 0x00007fd60999003d in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#18 0x00007fd609990142 in PyEval_EvalCode () from /lib64/libpython2.7.so.1.0
#19 0x00007fd6099a957f in run_mod () from /lib64/libpython2.7.so.1.0
#20 0x00007fd6099ab630 in PyRun_InteractiveOneFlags () from /lib64/libpython2.7.so.1.0
#21 0x00007fd6099ab81e in PyRun_InteractiveLoopFlags () from /lib64/libpython2.7.so.1.0
#22 0x00007fd6099abeae in PyRun_AnyFileExFlags () from /lib64/libpython2.7.so.1.0
#23 0x00007fd6099bcb7f in Py_Main () from /lib64/libpython2.7.so.1.0
#24 0x00007fd608bd9445 in __libc_start_main () from /lib64/libc.so.6
#25 0x000000000040066e in _start ()
(gdb)
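The attach steps above can be collapsed into a single shell command by passing the gdb commands with `-ex`. The helper below is a small convenience sketch, with `gdb_attach_command` being a hypothetical name. It only builds the command string from a PID and a breakpoint name and does not start gdb itself.

```python
import shlex


def gdb_attach_command(pid, breakpoint="TF_Run"):
    """Build a gdb command line that attaches to `pid`, sets a breakpoint, and continues."""
    args = ["gdb", "--pid=%d" % pid, "-ex", "break %s" % breakpoint, "-ex", "continue"]
    # Quote each argument so the result can be pasted into a shell as-is.
    return " ".join(shlex.quote(arg) for arg in args)
```

In the Python session being debugged, `print(gdb_attach_command(os.getpid()))` (after `import os`) prints a command that can be run from a second terminal. Attaching may require elevated privileges, depending on the system's ptrace settings.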