Deep Learning Study Notes: Gradient of the Sigmoid Function

License: CC 4.0 BY-SA

Tags: #deep-learning #python

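This article presents the definition of the sigmoid function and its implementation in Python, derives the formula for its derivative, and ends with Python code that computes the sigmoid derivative.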

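Background:

We want the gradient of $s = \sigma(x)$ with respect to $x$.
The sigmoid function is defined as $sigmoid(x) = \sigma(x) = \frac{1}{1+e^{-x}}$
In Python it can be implemented with the numpy module: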

# GRADED FUNCTION: sigmoid

import numpy as np
# this means you can access numpy functions by writing np.function() instead of numpy.function()

def sigmoid(x):
    """
    Compute the sigmoid of x

    Arguments:
    x -- A scalar or numpy array of any size

    Return:
    s -- sigmoid(x)
    """

    # Apply the formula element-wise; np.exp accepts both scalars and arrays
    s = 1 / (1 + np.exp(-x))

    return s
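
As a quick sanity check of the definition, the sketch below evaluates the function at a few points and confirms two properties that follow directly from the formula: $\sigma(0) = 0.5$ and $\sigma(-x) = 1 - \sigma(x)$. The sample inputs are arbitrary and chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid of a scalar or numpy array, same formula as above."""
    return 1 / (1 + np.exp(-x))

# Arbitrary sample points (an assumption for illustration)
x = np.array([-2.0, 0.0, 2.0])
s = sigmoid(x)
print("sigmoid(x) =", s)

# sigmoid(0) is 0.5 because e^0 = 1
print("sigmoid(0) == 0.5:", sigmoid(0.0) == 0.5)

# Symmetry: sigmoid(-x) = 1 - sigmoid(x)
print("symmetry holds:", np.allclose(sigmoid(-x), 1 - sigmoid(x)))
```

Every output lies strictly between 0 and 1 for these inputs, and the function works unchanged on scalars and arrays thanks to numpy broadcasting.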

Computing the derivative

$sigmoid\_derivative(x) = \sigma'(x) = \sigma(x) (1 - \sigma(x))\tag{1}$
How is this formula derived?

$\sigma(x) = \frac{1}{1+e^{-x}}$
Introduce a temporary variable

$t = 1+e^{-x}$, so that $\sigma(x) = t^{-1}$. By the chain rule for composite functions,

$\sigma'(x) = \frac{d(t^{-1})}{dt}\cdot\frac{dt}{dx} = -t^{-2}\cdot(-e^{-x}) = \frac{1}{(1+e^{-x})^{2}}\cdot e^{-x} = \frac{1}{1+e^{-x}}\left(\frac{e^{-x}}{1+e^{-x}}\right) = \frac{1}{1+e^{-x}}\left(\frac{1+e^{-x}-1}{1+e^{-x}}\right) = \frac{1}{1+e^{-x}}\left(1-\frac{1}{1+e^{-x}}\right) = \sigma(x)\cdot(1-\sigma(x))$
which proves equation (1).
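
The closed form can also be checked numerically. The following is a minimal sketch that compares $\sigma(x)(1-\sigma(x))$ against a central finite difference of the sigmoid itself. The helper names `analytic_derivative` and `numerical_derivative`, the step size `h`, and the test grid are assumptions made for this illustration and are not part of the original notes.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def analytic_derivative(x):
    # Closed form from equation (1): sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1 - s)

def numerical_derivative(f, x, h=1e-5):
    # Central difference approximation: (f(x + h) - f(x - h)) / (2h)
    return (f(x + h) - f(x - h)) / (2 * h)

# Arbitrary evaluation grid (an assumption for illustration)
x = np.linspace(-5.0, 5.0, 11)
max_abs_error = np.max(
    np.abs(analytic_derivative(x) - numerical_derivative(sigmoid, x))
)
print("max abs difference:", max_abs_error)
```

The two results should agree up to the truncation and rounding error of the finite difference, which is far smaller than the derivative values themselves on this grid.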

Python implementation

def sigmoid_derivative(x):
    """
    Compute the gradient (also called the slope or derivative) of the sigmoid function with respect to its input x.
    You can store the output of the sigmoid function into variables and then use it to calculate the gradient.

    Arguments:
    x -- A scalar or numpy array

    Return:
    ds -- Your computed gradient.
    """

    # Compute s = sigmoid(x) once, then reuse it in s * (1 - s)
    s = 1 / (1 + np.exp(-x))
    ds = s * (1 - s)

    return ds

# Example: evaluate the gradient at a few points
x = np.array([1, 2, 3])
print("sigmoid_derivative(x) = " + str(sigmoid_derivative(x)))

Output:

sigmoid_derivative(x) = [ 0.19661193 0.10499359 0.04517666]
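
The docstring above suggests storing the sigmoid output and reusing it for the gradient. One possible way to do that is sketched below: the forward value `s` is computed once and the derivative is obtained from it directly, without evaluating the exponential a second time. The helper name `sigmoid_derivative_from_output` is hypothetical and introduced only for this example.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative_from_output(s):
    """Hypothetical helper: gradient given an already-computed s = sigmoid(x)."""
    return s * (1 - s)

x = np.array([1, 2, 3])
s = sigmoid(x)                           # forward value, computed once
ds = sigmoid_derivative_from_output(s)   # reuses s, no second call to np.exp
print("ds =", ds)
```

For the same input `[1, 2, 3]` this yields the same values as the output shown above, since it evaluates the same expression $\sigma(x)(1-\sigma(x))$.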