Exercise 1
Please complete the sigmoid function to calculate
$$g(z) = \frac{1}{1+e^{-z}}$$
Note that z is not always a single number, but can also be an array of numbers.
- If the input is an array of numbers, we'd like to apply the sigmoid function to each value in the input array.
If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.
# UNQ_C1
# GRADED FUNCTION: sigmoid
def sigmoid(z):
    """
    Compute the sigmoid of z

    Args:
        z (ndarray): A scalar, numpy array of any size.

    Returns:
        g (ndarray): sigmoid(z), with the same shape as z
    """
    ### START CODE HERE ###
    g = 1 / (1 + np.exp(-z))
    # Don't use the built-in math.exp() here: it only handles scalars and cannot
    # operate on arrays or matrices, whereas np.exp() applies element-wise.
    ### END CODE HERE ###

    return g
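A quick sanity check (a sketch, assuming numpy is imported as np as in the lab's setup cell): the same call should work on a scalar and on an array, since np.exp broadcasts element-wise.

import numpy as np

# Scalar input: sigmoid(0) should be exactly 0.5
print(sigmoid(0))        # 0.5

# Array input: the function is applied to each element
z_tmp = np.array([-1, 0, 1, 2])
print(sigmoid(z_tmp))    # [0.26894142 0.5        0.73105858 0.88079708]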
Exercise 2
Please complete the compute_cost function using the equations below.
Recall that for logistic regression, the cost function is of the form
$$J(\mathbf{w},b) = \frac{1}{m}\sum_{i=0}^{m-1} \left[ loss(f_{\mathbf{w},b}(\mathbf{x}^{(i)}), y^{(i)}) \right] \tag{1}$$
where
- $m$ is the number of training examples in the dataset
- $loss(f_{\mathbf{w},b}(\mathbf{x}^{(i)}), y^{(i)})$ is the cost for a single data point, which is:
  $$loss(f_{\mathbf{w},b}(\mathbf{x}^{(i)}), y^{(i)}) = -y^{(i)} \log\left(f_{\mathbf{w},b}\left( \mathbf{x}^{(i)} \right) \right) - \left( 1 - y^{(i)}\right) \log \left( 1 - f_{\mathbf{w},b}\left( \mathbf{x}^{(i)} \right) \right) \tag{2}$$
- $f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ is the model's prediction, while $y^{(i)}$ is the actual label
- $f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = g(\mathbf{w} \cdot \mathbf{x}^{(i)} + b)$, where the function $g$ is the sigmoid function.
  - It might be helpful to first calculate an intermediate variable $z_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w} \cdot \mathbf{x}^{(i)} + b = w_0x^{(i)}_0 + ... + w_{n-1}x^{(i)}_{n-1} + b$, where $n$ is the number of features, before calculating $f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = g(z_{\mathbf{w},b}(\mathbf{x}^{(i)}))$
# UNQ_C2
# GRADED FUNCTION: compute_cost
def compute_cost(X, y, w, b, *argv):
    """
    Computes the cost over all examples

    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,))  target value
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
        *argv : unused, for compatibility with regularized version below

    Returns:
        total_cost : (scalar) cost
    """
    m, n = X.shape

    ### START CODE HERE ###
    total_cost = 0.0  # initialize the same variable name that is returned below
    for i in range(m):
        z = np.dot(X[i], w) + b
        f_wb = sigmoid(z)  # model prediction for example i
        total_cost += -y[i] * np.log(f_wb) - (1 - y[i]) * np.log(1 - f_wb)
    total_cost /= m
    ### END CODE HERE ###

    return total_cost
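The loop above follows equations (1) and (2) one example at a time. For reference, a fully vectorized sketch of the same computation (not the graded solution, just an equivalent form assuming the X, y, w, b shapes documented above):

def compute_cost_vectorized(X, y, w, b):
    # f has shape (m,): the model's prediction for every example at once
    f = sigmoid(X @ w + b)
    # Element-wise loss from equation (2), then the mean over all m examples
    return np.mean(-y * np.log(f) - (1 - y) * np.log(1 - f))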
Exercise 3
Please complete the compute_gradient function to compute $\frac{\partial J(\mathbf{w},b)}{\partial \mathbf{w}}$ and $\frac{\partial J(\mathbf{w},b)}{\partial b}$ from equations (2) and (3) below.
$$\frac{\partial J(\mathbf{w},b)}{\partial b} = \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}) \tag{2}$$
$$\frac{\partial J(\mathbf{w},b)}{\partial w_j} = \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})x_{j}^{(i)} \tag{3}$$
- $m$ is the number of training examples in the dataset
- $f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ is the model's prediction, while $y^{(i)}$ is the actual label
- Note: While this gradient looks identical to the linear regression gradient, the formula is actually different because linear and logistic regression have different definitions of $f_{\mathbf{w},b}(\mathbf{x})$.
As before, you can use the sigmoid function that you implemented above. If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.
# UNQ_C3
# GRADED FUNCTION: compute_gradient
def compute_gradient(X, y, w, b, *argv):
    """
    Computes the gradient for logistic regression

    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,))  target value
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
        *argv : unused, for compatibility with regularized version below

    Returns:
        dj_dw : (ndarray Shape (n,)) The gradient of the cost w.r.t. the parameters w.
        dj_db : (scalar)             The gradient of the cost w.r.t. the parameter b.
    """
    m, n = X.shape
    dj_dw = np.zeros(w.shape)
    dj_db = 0.

    ### START CODE HERE ###
    f = sigmoid(np.dot(X, w) + b)  # predictions f(x^(i)) for all m examples at once
    for i in range(m):
        dj_dw += (f[i] - y[i]) * X[i]
        dj_db += f[i] - y[i]
    dj_dw /= m
    dj_db /= m
    ### END CODE HERE ###

    return dj_db, dj_dw
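Since f is already computed for all examples at once, the remaining per-example loop can also be replaced by matrix operations. A vectorized sketch that is equivalent to the loop above (again, not the graded solution):

def compute_gradient_vectorized(X, y, w, b):
    m = X.shape[0]
    err = sigmoid(X @ w + b) - y   # shape (m,): the term f(x^(i)) - y^(i)
    dj_dw = X.T @ err / m          # shape (n,): equation (3) for every j at once
    dj_db = np.mean(err)           # scalar: equation (2)
    return dj_db, dj_dw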
Exercise 4
Please complete the predict function to produce 1 or 0 predictions given a dataset and learned parameters $\mathbf{w}$ and $b$.
- First you need to compute the prediction from the model $f(\mathbf{x}^{(i)}) = g(\mathbf{w} \cdot \mathbf{x}^{(i)} + b)$ for every example.
  - You've implemented this before in the parts above.
- We interpret the output of the model ($f(\mathbf{x}^{(i)})$) as the probability that $y^{(i)}=1$ given $\mathbf{x}^{(i)}$ and parameterized by $\mathbf{w}$.
- Therefore, to get a final prediction ($y^{(i)}=0$ or $y^{(i)}=1$) from the logistic regression model, you can use the following heuristic:
  - if $f(\mathbf{x}^{(i)}) \geq 0.5$, predict $y^{(i)}=1$
  - if $f(\mathbf{x}^{(i)}) < 0.5$, predict $y^{(i)}=0$
If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.
# UNQ_C4
# GRADED FUNCTION: predict
def predict(X, w, b):
    """
    Predict whether the label is 0 or 1 using learned logistic
    regression parameters w and b

    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model

    Returns:
        p : (ndarray (m,)) The predictions for X using a threshold at 0.5
    """
    # number of training examples
    m, n = X.shape
    p = np.zeros(m)

    ### START CODE HERE ###
    # Compute the model output for every example, then threshold at 0.5
    f = sigmoid(np.dot(X, w) + b)
    for i in range(m):
        p[i] = 1 if f[i] >= 0.5 else 0
    ### END CODE HERE ###

    return p
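A small usage example with made-up inputs (a sketch; X_tmp, w_tmp, b_tmp, and the X_train / y_train names in the comment are illustrative, not part of the graded code):

np.random.seed(1)
X_tmp = np.random.rand(4, 2) - 0.5   # 4 made-up examples, 2 features
w_tmp = np.array([0.3, -0.2])
b_tmp = 0.1
p_tmp = predict(X_tmp, w_tmp, b_tmp)
print(f'Output of predict: shape {p_tmp.shape}, value {p_tmp}')

# On the real training set, accuracy is then just the fraction of matching labels, e.g.
# print('Train Accuracy: %f' % (np.mean(predict(X_train, w, b) == y_train) * 100))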
Exercise 5
Please complete the compute_cost_reg function below to calculate the following regularization term over the elements of $\mathbf{w}$:
$$\frac{\lambda}{2m} \sum_{j=0}^{n-1} w_j^2$$
The starter code then adds this to the cost without regularization (which you computed above in compute_cost) to calculate the cost with regularization.
If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.
# UNQ_C5
def compute_cost_reg(X, y, w, b, lambda_ = 1):
    """
    Computes the cost over all examples

    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,))  target value
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
        lambda_ : (scalar, float) Controls amount of regularization

    Returns:
        total_cost : (scalar) cost
    """
    m, n = X.shape

    # Calls the compute_cost function that you implemented above
    cost_without_reg = compute_cost(X, y, w, b)

    # You need to calculate this value
    reg_cost = 0.

    ### START CODE HERE ###
    reg_cost = (lambda_ / (2 * m)) * np.sum(w ** 2)
    ### END CODE HERE ###

    # Add the regularization cost to get the total cost
    total_cost = cost_without_reg + reg_cost

    return total_cost
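One quick way to convince yourself the regularization term is wired in correctly (a sketch with made-up data): with lambda_ = 0 the regularized cost should equal the unregularized one, and the gap should grow as lambda_ increases.

np.random.seed(1)
X_tmp = np.random.rand(5, 3)
y_tmp = np.array([0, 1, 0, 1, 0])
w_tmp = np.random.rand(3) - 0.5
b_tmp = 0.5

print(compute_cost(X_tmp, y_tmp, w_tmp, b_tmp))
print(compute_cost_reg(X_tmp, y_tmp, w_tmp, b_tmp, lambda_=0))  # same as the line above
print(compute_cost_reg(X_tmp, y_tmp, w_tmp, b_tmp, lambda_=1))  # slightly larger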
Exercise 6
Please complete the compute_gradient_reg function below to calculate the following term:
$$\frac{\lambda}{m} w_j \quad \text{for } j=0...(n-1)$$
The starter code will add this term to the $\frac{\partial J(\mathbf{w},b)}{\partial \mathbf{w}}$ returned from compute_gradient above to get the gradient for the regularized cost function.
If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.
# UNQ_C6
def compute_gradient_reg(X, y, w, b, lambda_ = 1):
    """
    Computes the gradient for logistic regression with regularization

    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,))  target value
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
        lambda_ : (scalar, float) regularization constant

    Returns:
        dj_db : (scalar)             The gradient of the cost w.r.t. the parameter b.
        dj_dw : (ndarray Shape (n,)) The gradient of the cost w.r.t. the parameters w.
    """
    m, n = X.shape
    dj_db, dj_dw = compute_gradient(X, y, w, b)

    ### START CODE HERE ###
    dj_dw += (lambda_ / m) * w
    # dj_db is not regularized: by convention the bias term b is left out of the
    # regularization, and adding it here would make this lab's checks fail.
    ### END CODE HERE ###

    return dj_db, dj_dw
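With all six pieces in place, a minimal batch gradient-descent loop ties them together. This is only a sketch for illustration (the lab supplies its own gradient_descent helper; the alpha, num_iters, and lambda_ values here are assumptions):

def run_gradient_descent(X, y, w_init, b_init, alpha=0.01, num_iters=1000, lambda_=1):
    # Simple batch gradient descent on the regularized logistic cost
    w, b = w_init.copy(), b_init
    for i in range(num_iters):
        dj_db, dj_dw = compute_gradient_reg(X, y, w, b, lambda_)
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
        if i % 100 == 0:
            print(f'Iteration {i:4d}: cost {compute_cost_reg(X, y, w, b, lambda_):.4f}')
    return w, b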
This article shows how to implement, in Python, the sigmoid function, the cost function, the gradient computation, and the regularized cost and gradient updates for a logistic regression model. By computing the sigmoid-activated predictions, the per-example loss, the total cost, and the parameter gradients, it walks through the full training process of a logistic regression model.