The Implementation of MatMul, an Important TensorFlow Op (Part 2)

Original

License: CC 4.0 BY-SA

Tags: #TensorFlow #MatMul #gradient

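This article looks at how gradient computation is implemented for the MatMul op in TensorFlow. Through automatic differentiation and the chain rule, the gradient of the loss L with respect to the output y is converted into the gradient with respect to the input x, which gradient descent then uses to update the parameters. The article explains the Python implementation of the MatMul gradient node and gives the matrix-derivative formulas behind it.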

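The previous article covered the forward computation of the MatMul op. The final step of training a model is optimizing its parameters, and the usual method is gradient descent, so every op must implement not only a forward computation node but also a backward node that computes gradients.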
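To make the forward/backward pairing concrete, here is a minimal pure-Python sketch, not TensorFlow's actual implementation. It assumes Y = A·B and uses the standard matrix-calculus results ∂L/∂A = (∂L/∂Y)·Bᵀ and ∂L/∂B = Aᵀ·(∂L/∂Y). The helper names `matmul`, `transpose`, and `matmul_grad` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a forward MatMul and its backward (gradient) step
# written with plain Python lists, so it runs without TensorFlow.

def matmul(a, b):
    """Multiply an m x k matrix by a k x n matrix (lists of rows)."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul_grad(a, b, grad_y):
    """Given dL/dY for Y = A @ B, return (dL/dA, dL/dB)."""
    grad_a = matmul(grad_y, transpose(b))  # dL/dA = dL/dY @ B^T
    grad_b = matmul(transpose(a), grad_y)  # dL/dB = A^T @ dL/dY
    return grad_a, grad_b

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]

# Assumption for this example: L is the sum of all entries of Y,
# so dL/dY is a matrix of ones.
grad_y = [[1.0, 1.0], [1.0, 1.0]]

grad_a, grad_b = matmul_grad(a, b, grad_y)
print(grad_a)  # [[11.0, 15.0], [11.0, 15.0]]
print(grad_b)  # [[4.0, 4.0], [6.0, 6.0]]
```

With L defined as the sum of the entries of Y, each entry of ∂L/∂A is a row sum of B and each entry of ∂L/∂B is a column sum of A, which matches the printed values.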

The official documentation describes the backward node as follows:

Implement the gradient in Python

Given a graph of ops, TensorFlow uses automatic differentiation (backpropagation) to add new ops representing gradients with respect to the existing ops (see Gradient Computation). To make automatic differentiation work for new ops, you must register a gradient function which computes gradients with respect to the ops' inputs given gradients with respect to the ops' outputs. Mathematically, if an op computes y = f(x) the registered gradient op converts gradients ∂L/∂y