Activation Function
Intro
In practice, we hardly ever use the step function as the activation function $\phi(z)$.
Activation functions can be divided into two types: saturated and non-saturated activation functions.
Non-saturated activation functions can avoid the vanishing gradient problem and converge faster.
Vanishing gradients arise because the value of a saturated function changes only slightly when $|x|$ is large, so its derivative is close to zero there.
Step function
$\phi'(x) = 0$ everywhere the step function is differentiable, so gradient descent cannot learn through it and we never use it.
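A minimal NumPy sketch (function names are my own, for illustration) of why a zero derivative blocks learning: the chain rule multiplies whatever gradient flows back from the loss by $\phi'(x) = 0$, so nothing ever reaches the weights.

```python
import numpy as np

def step(x):
    # phi(x) = 1 for x > 0, 0 otherwise
    return (x > 0).astype(float)

def step_grad(x):
    # phi'(x) = 0 everywhere (undefined at x = 0, treated as 0 here)
    return np.zeros_like(x)

x = np.array([-2.0, -0.5, 0.3, 4.0])
upstream = np.ones_like(x)       # gradient flowing back from the loss
print(step(x))                   # [0. 0. 1. 1.]
print(upstream * step_grad(x))   # [0. 0. 0. 0.] -> no gradient reaches the weights
```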
Sigmoid Function
$$\phi(x) = \mathrm{sigmoid}(x) = \frac{1}{1+e^{-x}}$$
$$\phi'(x) = \phi(x)\,[1-\phi(x)]$$
The sigmoid function is a saturated activation function: its derivative approaches 0 as $|x|$ grows.
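A minimal NumPy sketch of the two formulas above (function names are illustrative). Evaluating the derivative at a few points shows the saturation: it peaks at 0.25 at $x = 0$ and quickly approaches 0 as $|x|$ grows.

```python
import numpy as np

def sigmoid(x):
    # phi(x) = 1 / (1 + e^{-x})
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # phi'(x) = phi(x) * (1 - phi(x))
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.array([0.0, 2.0, 5.0, 10.0])
print(sigmoid(x))       # approx [0.5     0.881   0.993    0.99995]
print(sigmoid_grad(x))  # approx [0.25    0.105   0.0066   0.000045] -> vanishes for large |x|
```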
tanh (Hyperbolic Tangent Function)
$$\phi(x) = \tanh(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$$
$$\phi'(x) = 1 - \phi^{2}(x)$$
The tanh function is a saturated activation function.
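The same kind of sketch for tanh (NumPy provides np.tanh directly). The derivative $1-\tanh^{2}(x)$ also shrinks toward zero for large $|x|$, although its peak value at $x=0$ is 1 rather than 0.25.

```python
import numpy as np

def tanh_grad(x):
    # phi'(x) = 1 - tanh^2(x)
    return 1.0 - np.tanh(x) ** 2

x = np.array([0.0, 2.0, 5.0, 10.0])
print(np.tanh(x))    # approx [0.     0.964    0.99991   1.0]
print(tanh_grad(x))  # approx [1.     0.0707   0.00018   8.2e-09] -> saturates as well
```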
ReLU (Rectified Linear Unit)
$$\phi(x) = \begin{cases} x & x > 0 \\ 0 & x \leq 0 \end{cases}$$
$$\phi(x) = \max(0, x)$$
$$\phi'(x) = \begin{cases} 1 & x > 0 \\ 0 & x \leq 0 \end{cases}$$
The ReLU function is a non-saturated activation function and is widely used in deep learning.
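A minimal NumPy sketch of ReLU and its derivative (illustrative names). The gradient stays at 1 for every positive input, no matter how large, which is why ReLU does not saturate on the positive side.

```python
import numpy as np

def relu(x):
    # phi(x) = max(0, x)
    return np.maximum(0.0, x)

def relu_grad(x):
    # phi'(x) = 1 for x > 0, 0 for x <= 0
    return (x > 0).astype(float)

x = np.array([-3.0, -0.5, 0.5, 100.0])
print(relu(x))       # [  0.    0.    0.5 100. ]
print(relu_grad(x))  # [0. 0. 1. 1.] -> no saturation for x > 0
```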
Leaky ReLU, PReLU, RReLU
The Leaky ReLU, PReLU, and RReLU functions all have a small positive slope when $x < 0$. They are non-saturated activation functions.
$$\phi(x) = \begin{cases} x & x > 0 \\ \frac{x}{a} & x \leq 0 \end{cases}$$
$$\phi(x) = \max(\text{leak} \cdot x,\ x)$$
$$\phi'(x) = \begin{cases} 1 & x > 0 \\ \frac{1}{a} & x \leq 0 \end{cases}$$
Leaky ReLU
In Leaky ReLU, the leak is a fixed constant.
PReLU (Parametric Leaky ReLU)
In PReLU, the leak is a learnable parameter.
RReLU (Randomized Leaky ReLU)
In RReLU, the leak is a random value sampled from a given range during training; it is fixed to a constant value in the test phase.
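A minimal NumPy sketch of the shared formula (the piecewise form above uses $x/a$, which corresponds to $\text{leak} = 1/a$ in the max form; the default leak of 0.01 here is just an example). Whether the leak is a fixed constant (Leaky ReLU), a learned parameter (PReLU), or randomly sampled during training (RReLU) only changes where its value comes from, not the forward or backward formulas.

```python
import numpy as np

def leaky_relu(x, leak=0.01):
    # phi(x) = x for x > 0, leak * x for x <= 0  (leak = 1/a; 0.01 is an example value)
    return np.where(x > 0, x, leak * x)

def leaky_relu_grad(x, leak=0.01):
    # phi'(x) = 1 for x > 0, leak for x <= 0
    return np.where(x > 0, 1.0, leak)

x = np.array([-4.0, -1.0, 2.0, 5.0])
print(leaky_relu(x))       # [-0.04 -0.01  2.    5.  ]
print(leaky_relu_grad(x))  # [0.01  0.01  1.    1.  ] -> a small but nonzero gradient for x <= 0

# Leaky ReLU: leak is a fixed constant (e.g. 0.01).
# PReLU:      leak is a learnable parameter updated during training.
# RReLU:      leak is drawn from a range during training and fixed at test time.
```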