Machine learning is divided into supervised learning, unsupervised learning,
reinforcement learning, and recommender systems.
1. Supervised learning
Regression: predict a continuous value.
Classification: predict a discrete value.
2. Unsupervised learning
The data has no labels; the goal is to find structure in it.
Clustering is the typical example.
Notation:
m: number of training examples
x: input feature
y: output variable
$(x^{(i)}, y^{(i)})$: the $i$-th training example
h: hypothesis, the function that maps x to y
Linear regression
$h_\theta(x) = \theta_0 + \theta_1 x$
Cost function
$J(\theta_0,\theta_1)=\frac{1}{2m}\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$
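As a rough illustration of the cost function, the sketch below evaluates $J(\theta_0,\theta_1)$ in plain Python. The function names `hypothesis` and `cost` and the toy data are made up for this example; they are not from the notes.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = 1/(2m) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Made-up toy data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

perfect_fit = cost(1.0, 2.0, xs, ys)  # parameters of the true line
poor_fit = cost(0.0, 0.0, xs, ys)     # every prediction is 0

print(perfect_fit, poor_fit)
```

The parameters of the true line give a cost of zero, and any other choice gives a larger value, which is why minimizing J is used to fit the line.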
Gradient descent
Update every $\theta_j$ simultaneously (here $j = 0$ and $j = 1$):
$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$
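$\alpha$ is the learning rate.
If $\alpha$ is too small, gradient descent is very slow.
If $\alpha$ is too large, it may fail to converge.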
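To see the effect of the step size, here is a small sketch on a simplified one-parameter cost $J(\theta)=\theta^2$, whose derivative is $2\theta$. This toy cost, the function name `descend`, and the three step sizes are assumptions chosen for illustration only; they are not the linear regression cost above.

```python
def descend(alpha, theta=1.0, steps=20):
    """Run gradient descent on the toy cost J(theta) = theta**2."""
    for _ in range(steps):
        gradient = 2 * theta          # dJ/dtheta
        theta = theta - alpha * gradient
    return theta


slow = descend(alpha=0.01)     # tiny steps: still far from the minimum at 0
good = descend(alpha=0.4)      # reasonable steps: very close to 0
diverged = descend(alpha=1.1)  # steps too large: overshoots and grows

print(slow, good, diverged)
```

With the small step size, theta has barely moved toward 0 after 20 steps; with the large one, each step overshoots the minimum and the magnitude of theta keeps growing.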
repeat until convergence {
$\theta_0:=\theta_0-\alpha\frac{1}{m}\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)$
$\theta_1:=\theta_1-\alpha\frac{1}{m}\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)x^{(i)}$
}
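Update $\theta_0$ and $\theta_1$ simultaneously. In general, gradient descent reaches a local minimum.
A convex function has exactly one minimum, so the local minimum it reaches is the global minimum.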
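The update rule can be sketched in plain Python as follows. The function name `gradient_descent`, the stopping tolerance, the iteration cap, the value of `alpha`, and the toy data are all assumptions for illustration. The loop stops once both partial derivatives are close to zero, which is one possible reading of "repeat until convergence".

```python
def gradient_descent(xs, ys, theta0, theta1, alpha=0.1, tol=1e-9, max_iters=10_000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    for _ in range(max_iters):
        # Prediction errors h(x_i) - y_i under the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        if abs(grad0) < tol and abs(grad1) < tol:
            break
        # Simultaneous update: both gradients were computed from the old
        # theta0 and theta1 before either parameter is changed.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Made-up toy data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Two very different starting points.
fit_a = gradient_descent(xs, ys, theta0=0.0, theta1=0.0)
fit_b = gradient_descent(xs, ys, theta0=10.0, theta1=-5.0)

print(fit_a, fit_b)
```

Because the squared-error cost is convex, both starting points end up at approximately the same parameters, close to the true line's intercept 1 and slope 2.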