Exercise 1
Please complete the sigmoid function to calculate
$$g(z) = \frac{1}{1+e^{-z}}$$
Note that z is not always a single number, but can also be an array of numbers.
- If the input is an array of numbers, we'd like to apply the sigmoid function to each value in the input array.
If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.
# UNQ_C1
# GRADED FUNCTION: sigmoid
def sigmoid(z):
    """
    Compute the sigmoid of z

    Args:
        z (ndarray): A scalar, numpy array of any size.

    Returns:
        g (ndarray): sigmoid(z), with the same shape as z
    """
    ### START CODE HERE ###
    g = 1 / (1 + np.exp(-z))
    # Don't use the built-in math.exp() here: it only handles scalars and cannot
    # operate on arrays or matrices, whereas np.exp() applies element-wise.
    ### END CODE HERE ###

    return g
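A quick sanity check (a sketch, assuming numpy is imported as np as in the lab's setup cell): the same call should work on a scalar and on an array, since np.exp broadcasts element-wise.

import numpy as np

# Scalar input: sigmoid(0) should be exactly 0.5
print(sigmoid(0))        # 0.5

# Array input: the function is applied to each element
z_tmp = np.array([-1, 0, 1, 2])
print(sigmoid(z_tmp))    # [0.26894142 0.5        0.73105858 0.88079708]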
Exercise 2
Please complete the compute_cost function using the equations below.
Recall that for logistic regression, the cost function is of the form
$$J(\mathbf{w},b) = \frac{1}{m}\sum_{i=0}^{m-1} \left[ loss(f_{\mathbf{w},b}(\mathbf{x}^{(i)}), y^{(i)}) \right] \tag{1}$$
where
- $m$ is the number of training examples in the dataset
- $loss(f_{\mathbf{w},b}(\mathbf{x}^{(i)}), y^{(i)})$ is the cost for a single data point, which is:
  $$loss(f_{\mathbf{w},b}(\mathbf{x}^{(i)}), y^{(i)}) = -y^{(i)} \log\left(f_{\mathbf{w},b}\left( \mathbf{x}^{(i)} \right) \right) - \left( 1 - y^{(i)}\right) \log \left( 1 - f_{\mathbf{w},b}\left( \mathbf{x}^{(i)} \right) \right) \tag{2}$$
- $f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ is the model's prediction, while $y^{(i)}$ is the actual label
- $f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = g(\mathbf{w} \cdot \mathbf{x}^{(i)} + b)$, where the function $g$ is the sigmoid function.
  - It might be helpful to first calculate an intermediate variable $z_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w} \cdot \mathbf{x}^{(i)} + b = w_0x^{(i)}_0 + ... + w_{n-1}x^{(i)}_{n-1} + b$, where $n$ is the number of features, before calculating $f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = g(z_{\mathbf{w},b}(\mathbf{x}^{(i)}))$
# UNQ_C2
# GRADED FUNCTION: compute_cost
def compute_cost(X, y, w, b, *argv):
    """
    Computes the cost over all examples

    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,))  target value
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
        *argv : unused, for compatibility with regularized version below

    Returns:
        total_cost : (scalar) cost
    """
    m, n = X.shape

    ### START CODE HERE ###
    total_cost = 0.0  # initialize the same variable name that is returned below
    for i in range(m):
        z = np.dot(X[i], w) + b
        f_wb = sigmoid(z)  # model prediction for example i
        total_cost += -y[i] * np.log(f_wb) - (1 - y[i]) * np.log(1 - f_wb)
    total_cost /= m
    ### END CODE HERE ###

    return total_cost
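The loop above follows equations (1) and (2) one example at a time. For reference, a fully vectorized sketch of the same computation (not the graded solution, just an equivalent form assuming the X, y, w, b shapes documented above):

def compute_cost_vectorized(X, y, w, b):
    # f has shape (m,): the model's prediction for every example at once
    f = sigmoid(X @ w + b)
    # Element-wise loss from equation (2), then the mean over all m examples
    return np.mean(-y * np.log(f) - (1 - y) * np.log(1 - f))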
Exercise 3
Please complete the compute_gradient function to compute $\frac{\partial J(\mathbf{w},b)}{\partial \mathbf{w}}$ and $\frac{\partial J(\mathbf{w},b)}{\partial b}$ from equations (2) and (3) below.
$$\frac{\partial J(\mathbf{w},b)}{\partial b} = \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}) \tag{2}$$
$$\frac{\partial J(\mathbf{w},b)}{\partial w_j} = \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})x_{j}^{(i)} \tag{3}$$
- $m$ is the number of training examples in the dataset
- $f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ is the model's prediction, while $y^{(i)}$ is the actual label
- Note: While this gradient looks identical to the linear regression gradient, the formula is actually different because linear and logistic regression have different definitions of $f_{\mathbf{w},b}(\mathbf{x})$.
As before, you can use the sigmoid function that you implemented above. If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.
# UNQ_C3
# GRADED FUNCTION: compute_gradient
def compute_gradient(X, y, w, b, *argv):
    """
    Computes the gradient for logistic regression

    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,))  target value
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
        *argv : unused, for compatibility with regularized version below

    Returns:
        dj_dw : (ndarray Shape (n,)) The gradient of the cost w.r.t. the parameters w.
        dj_db : (scalar)             The gradient of the cost w.r.t. the parameter b.
    """
    m, n = X.shape
    dj_dw = np.zeros(w.shape)
    dj_db = 0.

    ### START CODE HERE ###
    f = sigmoid(np.dot(X, w) + b)  # predictions f(x^(i)) for all m examples at once
    for i in range(m):
        dj_dw += (f[i] - y[i]) * X[i]
        dj_db += f[i] - y[i]
    dj_dw /= m
    dj_db /= m
    ### END CODE HERE ###

    return dj_db, dj_dw
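Since f is already computed for all examples at once, the remaining per-example loop can also be replaced by matrix operations. A vectorized sketch that is equivalent to the loop above (again, not the graded solution):

def compute_gradient_vectorized(X, y, w, b):
    m = X.shape[0]
    err = sigmoid(X @ w + b) - y   # shape (m,): the term f(x^(i)) - y^(i)
    dj_dw = X.T @ err / m          # shape (n,): equation (3) for every j at once
    dj_db = np.mean(err)           # scalar: equation (2)
    return dj_db, dj_dw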
Exercise 4
Please complete the predict function to produce 1 or 0 predictions given a dataset and learned parameters $\mathbf{w}$ and $b$.
- First you need to compute the prediction from the model $f(\mathbf{x}^{(i)}) = g(\mathbf{w} \cdot \mathbf{x}^{(i)} + b)$ for every example.
  - You've implemented this before in the parts above.
- We interpret the output of the model ($f(\mathbf{x}^{(i)})$) as the probability that $y^{(i)}=1$ given $\mathbf{x}^{(i)}$ and parameterized by $\mathbf{w}$.
- Therefore, to get a final prediction ($y^{(i)}=0$ or $y^{(i)}=1$) from the logistic regression model, you can use the following heuristic:
  - if $f(\mathbf{x}^{(i)}) \geq 0.5$, predict $y^{(i)}=1$
  - if $f(\mathbf{x}^{(i)}) < 0.5$, predict $y^{(i)}=0$
If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.
# UNQ_C4
# GRADED FUNCTION: predict
def predict(X, w, b):
    """
    Predict whether the label is 0 or 1 using learned logistic
    regression parameters w and b

    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model

    Returns:
        p : (ndarray (m,)) The predictions for X using a threshold at 0.5
    """
    # number of training examples
    m, n = X.shape
    p = np.zeros(m)

    ### START CODE HERE ###
    # Compute the model output for every example, then threshold at 0.5
    f = sigmoid(np.dot(X, w) + b)
    for i in range(m):
        p[i] = 1 if f[i] >= 0.5 else 0
    ### END CODE HERE ###

    return p
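A small usage example with made-up inputs (a sketch; X_tmp, w_tmp, b_tmp, and the X_train / y_train names in the comment are illustrative, not part of the graded code):

np.random.seed(1)
X_tmp = np.random.rand(4, 2) - 0.5   # 4 made-up examples, 2 features
w_tmp = np.array([0.3, -0.2])
b_tmp = 0.1
p_tmp = predict(X_tmp, w_tmp, b_tmp)
print(f'Output of predict: shape {p_tmp.shape}, value {p_tmp}')

# On the real training set, accuracy is then just the fraction of matching labels, e.g.
# print('Train Accuracy: %f' % (np.mean(predict(X_train, w, b) == y_train) * 100))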
Exercise 5
Please complete the compute_cost_reg function below to calculate the following regularization term over the elements of $\mathbf{w}$:
$$\frac{\lambda}{2m} \sum_{j=0}^{n-1} w_j^2$$
The starter code then adds this to the cost without regularization (which you computed above in compute_cost) to calculate the cost with regularization.
If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.
# UNQ_C5
def compute_cost_reg(X, y, w, b, lambda_ = 1):
    """
    Computes the cost over all examples

    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,))  target value
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
        lambda_ : (scalar, float) Controls amount of regularization

    Returns:
        total_cost : (scalar) cost
    """
    m, n = X.shape

    # Calls the compute_cost function that you implemented above
    cost_without_reg = compute_cost(X, y, w, b)

    # You need to calculate this value
    reg_cost = 0.

    ### START CODE HERE ###
    reg_cost = (lambda_ / (2 * m)) * np.sum(w ** 2)
    ### END CODE HERE ###

    # Add the regularization cost to get the total cost
    total_cost = cost_without_reg + reg_cost

    return total_cost
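One quick way to convince yourself the regularization term is wired in correctly (a sketch with made-up data): with lambda_ = 0 the regularized cost should equal the unregularized one, and the gap should grow as lambda_ increases.

np.random.seed(1)
X_tmp = np.random.rand(5, 3)
y_tmp = np.array([0, 1, 0, 1, 0])
w_tmp = np.random.rand(3) - 0.5
b_tmp = 0.5

print(compute_cost(X_tmp, y_tmp, w_tmp, b_tmp))
print(compute_cost_reg(X_tmp, y_tmp, w_tmp, b_tmp, lambda_=0))  # same as the line above
print(compute_cost_reg(X_tmp, y_tmp, w_tmp, b_tmp, lambda_=1))  # slightly larger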
Exercise 6
Please complete the compute_gradient_reg function below to calculate the following term:
$$\frac{\lambda}{m} w_j \quad \text{for } j=0...(n-1)$$
The starter code will add this term to the $\frac{\partial J(\mathbf{w},b)}{\partial \mathbf{w}}$ returned from compute_gradient above to get the gradient for the regularized cost function.
If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.
# UNQ_C6
def compute_gradient_reg(X, y, w, b, lambda_ = 1):
    """
    Computes the gradient for logistic regression with regularization

    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,))  target value
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
        lambda_ : (scalar, float) regularization constant

    Returns:
        dj_db : (scalar)             The gradient of the cost w.r.t. the parameter b.
        dj_dw : (ndarray Shape (n,)) The gradient of the cost w.r.t. the parameters w.
    """
    m, n = X.shape
    dj_db, dj_dw = compute_gradient(X, y, w, b)

    ### START CODE HERE ###
    dj_dw += (lambda_ / m) * w
    # dj_db is not regularized: by convention the bias term b is left out of the
    # regularization, and adding it here would make this lab's checks fail.
    ### END CODE HERE ###

    return dj_db, dj_dw
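With all six pieces in place, a minimal batch gradient-descent loop ties them together. This is only a sketch for illustration (the lab supplies its own gradient_descent helper; the alpha, num_iters, and lambda_ values here are assumptions):

def run_gradient_descent(X, y, w_init, b_init, alpha=0.01, num_iters=1000, lambda_=1):
    # Simple batch gradient descent on the regularized logistic cost
    w, b = w_init.copy(), b_init
    for i in range(num_iters):
        dj_db, dj_dw = compute_gradient_reg(X, y, w, b, lambda_)
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
        if i % 100 == 0:
            print(f'Iteration {i:4d}: cost {compute_cost_reg(X, y, w, b, lambda_):.4f}')
    return w, b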
This article shows how to implement, in Python, the sigmoid function, the cost function, the gradient computation, and the regularized cost and gradient updates for a logistic regression model. By computing the sigmoid-activated predictions, the per-example loss, the total cost, and the parameter gradients, it walks through the full training process of a logistic regression model.