\qquad
Derivation of the loss function for logistic regression in Sklearn:

\qquad
Assume the labels $y$ take the values 1 and -1, and estimate the model parameters by maximum likelihood. Let $P(Y=1|X_i)=h(X_i^TW+C)=\frac{1}{1+\exp(-(X_i^TW+C))}$; the goal is then to estimate the parameters that maximize the following probability:
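As a quick sanity check of the definition above, here is a minimal Python sketch (the name `sigmoid` is mine, not sklearn's) verifying the identity $1-h(z)=h(-z)$, which the derivation below relies on to handle the $y_i=-1$ case:

```python
import math

def sigmoid(z):
    """Logistic function h(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# The symmetry 1 - h(z) = h(-z) is what lets the {+1, -1} labels
# collapse the two product terms into one in the derivation.
for z in [-2.0, 0.0, 3.5]:
    assert abs((1.0 - sigmoid(z)) - sigmoid(-z)) < 1e-12
```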
Step1:
\qquad $\prod_{i,y_i=1}P(Y=1|X_i)\prod_{i,y_i=-1}P(Y=-1|X_i)$
\qquad $=\prod_{i,y_i=1}P(Y=1|X_i)\prod_{i,y_i=-1}(1-P(Y=1|X_i))$
\qquad $=\prod_{i,y_i=1}h(X_i^TW+C)\prod_{i,y_i=-1}h(-(X_i^TW+C))$
\qquad $=\prod_{i}h(y_i(X_i^TW+C))$
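Step 1's merging of the two label-conditioned products into a single product can be checked numerically; the scores $z_i = X_i^TW+C$ and labels below are made up purely for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy scores z_i = X_i^T W + C and labels y_i in {+1, -1}.
z = [0.7, -1.2, 2.0, 0.3]
y = [1, -1, 1, -1]

# Left-hand side: split the product by label, using 1 - h(z) = h(-z).
split = 1.0
for zi, yi in zip(z, y):
    split *= sigmoid(zi) if yi == 1 else sigmoid(-zi)

# Right-hand side: the merged product over all samples.
merged = math.prod(sigmoid(yi * zi) for zi, yi in zip(z, y))

assert abs(split - merged) < 1e-12
```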
\qquad Taking the logarithm does not affect which parameters are optimal, so the goal becomes finding the parameters that maximize $\sum_{i}\log(h(y_i(X_i^TW+C)))$. To turn this into a loss function, the goal becomes minimizing $-\sum_{i}\log(h(y_i(X_i^TW+C)))$, so:
\qquad $-\sum_{i}\log(h(y_i(X_i^TW+C)))$
\qquad $=-\sum_{i}\log\left(\frac{1}{1+\exp(-y_i(X_i^TW+C))}\right)$
\qquad $=\sum_{i}\log(1+\exp(-y_i(X_i^TW+C)))$
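The final loss can be sketched directly. The toy values below are my own, and this is not sklearn's internal implementation (which also adds a regularization term); the check asserts the equivalence with $-\sum_i\log h(y_i(X_i^TW+C))$ established above:

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def log_loss(z, y):
    """Sum of log(1 + exp(-y_i * z_i)), the loss derived above.

    math.log1p(x) computes log(1 + x) accurately for small x.
    """
    return sum(math.log1p(math.exp(-yi * zi)) for zi, yi in zip(z, y))

# Toy scores z_i = X_i^T W + C and labels y_i in {+1, -1}.
z = [0.7, -1.2, 2.0]
y = [1, -1, 1]

direct = log_loss(z, y)
# Equivalently -sum(log h(y_i z_i)), since log(1/(1+e^-t)) = -log(1+e^-t).
via_sigmoid = -sum(math.log(sigmoid(yi * zi)) for zi, yi in zip(z, y))

assert abs(direct - via_sigmoid) < 1e-12
```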
\qquad Derivation complete! Many thanks to sanshun for the generous support; feel free to visit sanshun's blog, which will be updated over time.