Andrew Ng's Machine Learning, Week 3 programming assignment: Practice Lab - Logistic Regression

This post walks through implementing logistic regression in Python: the sigmoid function, the cost function, the gradient computation, and the regularized cost and gradient. By computing the sigmoid-activated predictions, the per-example loss, the total cost, and the parameter gradients, it covers the full set of functions needed to train a logistic regression model.

Exercise 1

Please complete the sigmoid function to calculate

$$g(z) = \frac{1}{1+e^{-z}}$$

Note that

  • z is not always a single number, but can also be an array of numbers.
  • If the input is an array of numbers, we’d like to apply the sigmoid function to each value in the input array.

If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.

# UNQ_C1
# GRADED FUNCTION: sigmoid

def sigmoid(z):
    """
    Compute the sigmoid of z

    Args:
        z (ndarray): A scalar, numpy array of any size.

    Returns:
        g (ndarray): sigmoid(z), with the same shape as z
         
    """
          
    ### START CODE HERE ###
    g = 1 / (1 + np.exp(-z))
    # Use np.exp() rather than math.exp(): math.exp() only handles scalars,
    # while np.exp() applies element-wise to arrays and matrices.
    ### END CODE HERE ###
    return g
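
As a quick sanity check (assuming numpy is already imported as np, as the rest of the notebook expects), the function should work on both scalars and arrays:

import numpy as np

print(sigmoid(0))                        # 0.5 for a scalar input
print(sigmoid(np.array([-1., 0., 1.])))  # roughly [0.2689, 0.5, 0.7311], applied element-wise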

Exercise 2

Please complete the compute_cost function using the equations below.

Recall that for logistic regression, the cost function is of the form

$$J(\mathbf{w},b) = \frac{1}{m}\sum_{i=0}^{m-1} \left[ loss(f_{\mathbf{w},b}(\mathbf{x}^{(i)}), y^{(i)}) \right] \tag{1}$$

where

  • m is the number of training examples in the dataset

  • $loss(f_{\mathbf{w},b}(\mathbf{x}^{(i)}), y^{(i)})$ is the cost for a single data point, which is

    $$loss(f_{\mathbf{w},b}(\mathbf{x}^{(i)}), y^{(i)}) = -y^{(i)} \log\left(f_{\mathbf{w},b}\left( \mathbf{x}^{(i)} \right) \right) - \left( 1 - y^{(i)}\right) \log \left( 1 - f_{\mathbf{w},b}\left( \mathbf{x}^{(i)} \right) \right) \tag{2}$$

  • $f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ is the model's prediction, while $y^{(i)}$ is the actual label

  • $f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = g(\mathbf{w} \cdot \mathbf{x}^{(i)} + b)$, where the function $g$ is the sigmoid function.

    • It might be helpful to first calculate an intermediate variable $z_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w} \cdot \mathbf{x}^{(i)} + b = w_0 x^{(i)}_0 + \dots + w_{n-1} x^{(i)}_{n-1} + b$, where $n$ is the number of features, before calculating $f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = g(z_{\mathbf{w},b}(\mathbf{x}^{(i)}))$

# UNQ_C2
# GRADED FUNCTION: compute_cost
def compute_cost(X, y, w, b, *argv):
    """
    Computes the cost over all examples
    Args:
      X : (ndarray Shape (m,n)) data, m examples by n features
      y : (ndarray Shape (m,))  target value 
      w : (ndarray Shape (n,))  values of parameters of the model      
      b : (scalar)              value of bias parameter of the model
      *argv : unused, for compatibility with regularized version below
    Returns:
      total_cost : (scalar) cost 
    """

    m, n = X.shape
   
    ### START CODE HERE ###
    total_cost = 0.0  # initialize here so it does not conflict with the output code that follows
    for i in range(m):
        z_i = np.dot(X[i], w) + b
        f_i = sigmoid(z_i)
        total_cost += -y[i] * np.log(f_i) - (1 - y[i]) * np.log(1 - f_i)
    total_cost /= m
    ### END CODE HERE ### 

    return total_cost
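
For reference, the same cost can be computed without the explicit loop; this is just an equivalent vectorized sketch (the name compute_cost_vectorized is not part of the lab):

def compute_cost_vectorized(X, y, w, b):
    # f_wb[i] = sigmoid(w . x^(i) + b) for every example, computed in one shot
    f_wb = sigmoid(np.dot(X, w) + b)
    # Mean of the per-example losses from equation (2)
    return np.mean(-y * np.log(f_wb) - (1 - y) * np.log(1 - f_wb))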

Exercise 3

Please complete the compute_gradient function to compute $\frac{\partial J(\mathbf{w},b)}{\partial w}$ and $\frac{\partial J(\mathbf{w},b)}{\partial b}$ from equations (2) and (3) below.

$$\frac{\partial J(\mathbf{w},b)}{\partial b} = \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}) \tag{2}$$
$$\frac{\partial J(\mathbf{w},b)}{\partial w_j} = \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})\, x_{j}^{(i)} \tag{3}$$

  • m is the number of training examples in the dataset

  • $f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ is the model's prediction, while $y^{(i)}$ is the actual label

  • Note: While this gradient looks identical to the linear regression gradient, the formula is actually different because linear and logistic regression have different definitions of $f_{\mathbf{w},b}(\mathbf{x})$.

As before, you can use the sigmoid function that you implemented above and if you get stuck, you can check out the hints presented after the cell below to help you with the implementation.

# UNQ_C3
# GRADED FUNCTION: compute_gradient
def compute_gradient(X, y, w, b, *argv): 
    """
    Computes the gradient for logistic regression 
 
    Args:
      X : (ndarray Shape (m,n)) data, m examples by n features
      y : (ndarray Shape (m,))  target value 
      w : (ndarray Shape (n,))  values of parameters of the model      
      b : (scalar)              value of bias parameter of the model
      *argv : unused, for compatibility with regularized version below
    Returns
      dj_dw : (ndarray Shape (n,)) The gradient of the cost w.r.t. the parameters w. 
      dj_db : (scalar)             The gradient of the cost w.r.t. the parameter b. 
    """
    m, n = X.shape
    dj_dw = np.zeros(w.shape)
    dj_db = 0.

    ### START CODE HERE ### 
    f = sigmoid(np.dot(X, w) + b)  # f[i] is the model prediction f_wb(x^(i)) for every example
    for i in range(m):
        dj_dw += (f[i] - y[i]) * X[i]
        dj_db += (f[i] - y[i])
    dj_dw /= m
    dj_db /= m
    ### END CODE HERE ###

        
    return dj_db, dj_dw
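
For comparison, the same gradients can be written entirely with matrix operations; a minimal equivalent sketch (the name compute_gradient_vectorized is not part of the lab):

def compute_gradient_vectorized(X, y, w, b):
    m = X.shape[0]
    err = sigmoid(np.dot(X, w) + b) - y  # shape (m,): f_wb(x^(i)) - y^(i) for each example
    dj_dw = np.dot(X.T, err) / m         # shape (n,): equation (3)
    dj_db = np.sum(err) / m              # scalar: equation (2)
    return dj_db, dj_dw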

Exercise 4

Please complete the predict function to produce 1 or 0 predictions given a dataset and a learned parameter vector $\mathbf{w}$ and bias $b$.

  • First you need to compute the prediction from the model $f(\mathbf{x}^{(i)}) = g(\mathbf{w} \cdot \mathbf{x}^{(i)} + b)$ for every example

    • You’ve implemented this before in the parts above
  • We interpret the output of the model ($f(\mathbf{x}^{(i)})$) as the probability that $y^{(i)}=1$ given $\mathbf{x}^{(i)}$ and parameterized by $\mathbf{w}$.

  • Therefore, to get a final prediction ($y^{(i)}=0$ or $y^{(i)}=1$) from the logistic regression model, you can use the following heuristic:

    if $f(\mathbf{x}^{(i)}) \geq 0.5$, predict $y^{(i)}=1$

    if $f(\mathbf{x}^{(i)}) < 0.5$, predict $y^{(i)}=0$

If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.

# UNQ_C4
# GRADED FUNCTION: predict

def predict(X, w, b): 
    """
    Predict whether the label is 0 or 1 using learned logistic
    regression parameters w
    
    Args:
      X : (ndarray Shape (m,n)) data, m examples by n features
      w : (ndarray Shape (n,))  values of parameters of the model      
      b : (scalar)              value of bias parameter of the model

    Returns:
      p : (ndarray (m,)) The predictions for X using a threshold at 0.5
    """
    # number of training examples
    m, n = X.shape   
    p = np.zeros(m)
   
    ### START CODE HERE ### 
    # Compute the model prediction for every example at once
    f = sigmoid(np.dot(X, w) + b)
    # Apply the 0.5 threshold to convert probabilities into 0/1 predictions
    for i in range(m):
        if f[i] >= 0.5:
            p[i] = 1
        else:
            p[i] = 0
    ### END CODE HERE ### 
    return p
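
The thresholding loop can also be collapsed into a single comparison; a minimal equivalent sketch, assuming X, w, and b are already defined:

p = (sigmoid(np.dot(X, w) + b) >= 0.5).astype(float)  # boolean comparison cast to 0.0 / 1.0

Accuracy against the true labels y can then be reported with np.mean(p == y) * 100.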

Exercise 5

Please complete the compute_cost_reg function below to calculate the following term for each element in $\mathbf{w}$:

$$\frac{\lambda}{2m} \sum_{j=0}^{n-1} w_j^2$$

The starter code then adds this to the cost without regularization (which you computed above in compute_cost) to calculate the cost with regularization.

If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.

# UNQ_C5
def compute_cost_reg(X, y, w, b, lambda_ = 1):
    """
    Computes the cost over all examples
    Args:
      X : (ndarray Shape (m,n)) data, m examples by n features
      y : (ndarray Shape (m,))  target value 
      w : (ndarray Shape (n,))  values of parameters of the model      
      b : (scalar)              value of bias parameter of the model
      lambda_ : (scalar, float) Controls amount of regularization
    Returns:
      total_cost : (scalar)     cost 
    """

    m, n = X.shape
    
    # Calls the compute_cost function that you implemented above
    cost_without_reg = compute_cost(X, y, w, b) 
    
    # You need to calculate this value
    reg_cost = 0.
    
    ### START CODE HERE ###
    reg_cost=(lambda_/(2*m))*np.sum(w**2)
    
    ### END CODE HERE ### 
    
    # Add the regularization cost to get the total cost
    total_cost = cost_without_reg + reg_cost

    return total_cost
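
A quick way to see the effect of the penalty term; the values below are made up for illustration and are not the lab's test case:

np.random.seed(1)
X_tmp = np.random.rand(5, 3)
y_tmp = np.array([0., 1., 0., 1., 0.])
w_tmp = np.random.rand(3) - 0.5
b_tmp = 0.5
print(compute_cost_reg(X_tmp, y_tmp, w_tmp, b_tmp, lambda_=0.0))  # same as compute_cost
print(compute_cost_reg(X_tmp, y_tmp, w_tmp, b_tmp, lambda_=1.0))  # cost plus (lambda_/(2m)) * sum(w^2)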

Exercise 6

Please complete the compute_gradient_reg function below to calculate the following term:

$$\frac{\lambda}{m} w_j \quad \text{for } j=0...(n-1)$$

The starter code will add this term to the $\frac{\partial J(\mathbf{w},b)}{\partial w}$ returned from compute_gradient above to get the gradient for the regularized cost function.

If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.

# UNQ_C6
def compute_gradient_reg(X, y, w, b, lambda_ = 1): 
    """
    Computes the gradient for logistic regression with regularization
 
    Args:
      X : (ndarray Shape (m,n)) data, m examples by n features
      y : (ndarray Shape (m,))  target value 
      w : (ndarray Shape (n,))  values of parameters of the model      
      b : (scalar)              value of bias parameter of the model
      lambda_ : (scalar,float)  regularization constant
    Returns
      dj_db : (scalar)             The gradient of the cost w.r.t. the parameter b. 
      dj_dw : (ndarray Shape (n,)) The gradient of the cost w.r.t. the parameters w. 

    """
    m, n = X.shape
    
    dj_db, dj_dw = compute_gradient(X, y, w, b)

    ### START CODE HERE ###     
    dj_dw += (lambda_ / m) * w
    # dj_db is not regularized: the bias term b is conventionally excluded from the penalty,
    # and adding a regularization term here would make this lab's checks fail.
    ### END CODE HERE ###         
        
    return dj_db, dj_dw
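
To see how these pieces fit together, here is a simplified sketch of a gradient-descent loop built on the two regularized functions. This is only illustrative, not the lab's training code; alpha, num_iters, and lambda_ are example values, and X, y, w, and b are assumed to be already defined:

alpha = 0.01       # learning rate (example value)
num_iters = 1000   # number of update steps (example value)
for it in range(num_iters):
    dj_db, dj_dw = compute_gradient_reg(X, y, w, b, lambda_=1.0)
    w = w - alpha * dj_dw  # simultaneous update of all weights
    b = b - alpha * dj_db
    if it % 100 == 0:
        print(f"iteration {it}: cost = {compute_cost_reg(X, y, w, b, lambda_=1.0):.4f}")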
