Why do we need activation functions?

In deep learning, nonlinear activation functions are essential for building multi-layer neural networks. If only linear activation functions are used, then no matter how many layers the network has, the resulting model is functionally equivalent to a single-layer linear model. Nonlinear activation functions such as ReLU, tanh, or sigmoid allow the model to learn richer feature representations.
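The three nonlinearities mentioned above fit in a few lines; a minimal NumPy sketch (the function names are my own):

```python
import numpy as np

def relu(z):
    # ReLU: max(0, z), applied element-wise
    return np.maximum(0.0, z)

def sigmoid(z):
    # Sigmoid: squashes inputs into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # tanh: squashes inputs into (-1, 1)
    return np.tanh(z)

z = np.array([-2.0, 0.0, 2.0])
print(relu(z))      # [0. 0. 2.]
print(sigmoid(0.0)) # 0.5
```

Unlike a linear function, each of these bends its input, which is what lets stacked layers compute something a single layer cannot.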


Compiled from Andrew Ng's Deep Learning course: https://mooc.study.163.com/learn/2001281002?tid=2001392029#/learn/content?type=detail&id=2001702018&cid=2001694026

Why do we need activation functions?

For a deep network with many hidden layers, it turns out that if you use a linear activation function (or, equivalently, no activation function at all), then no matter how many layers your network has, all it is ever doing is computing a linear function of the input, so you might as well not have any hidden layers.

If you use a linear activation in the hidden layer and a sigmoid in the output layer, then the model is no more expressive than standard logistic regression without any hidden layer.

The take-home message is that a linear hidden layer is more or less useless, because the composition of two linear functions is itself a linear function. So unless you throw a non-linearity in there, you are not computing more interesting functions even as you go deeper in the network.
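The composition claim is easy to check numerically; a minimal sketch with two layers that use the identity activation (weights and shapes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "linear" layers: a = W1 x + b1, then y = W2 a + b2
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

x = rng.normal(size=3)
two_layer = W2 @ (W1 @ x + b1) + b2

# The same map collapsed into one layer: W = W2 W1, b = W2 b1 + b2
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

# Both paths compute the identical linear function of x
assert np.allclose(two_layer, one_layer)
```

No matter how many such layers you stack, the whole network folds down to a single `W x + b` in the same way.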

I, the author of this post, think that if you use linear activation functions, your deep neural network will not learn the higher-level features we expect.

The hidden units should not use linear activation functions; they could use ReLU, tanh, leaky ReLU, or something else. The one place where you might use a linear activation function is usually the output layer. Other than that, using a linear activation function in a hidden layer is extremely rare, except for some very special circumstances relating to compression.
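As a sketch of that convention (ReLU in the hidden layer, linear output, as you would want when predicting an unbounded real value; the shapes and names here are illustrative, not from the course):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, W1, b1, W2, b2):
    # Hidden layer uses a nonlinearity...
    a1 = relu(W1 @ x + b1)
    # ...but the output layer is linear, so the prediction
    # can be any real number (e.g. a price in regression)
    return W2 @ a1 + b2

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(5, 3)), np.zeros(5)
W2, b2 = rng.normal(size=(1, 5)), np.zeros(1)
y = forward(rng.normal(size=3), W1, b1, W2, b2)
```

Because the nonlinearity sits in the hidden layer, this network does not collapse into a single linear map, while the linear output keeps the prediction range unrestricted.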
