Logistic regression算法介绍和Python实现

最新推荐文章于 2022-05-15 12:11:11 发布

原创最新推荐文章于 2022-05-15 12:11:11 发布 · 317 阅读

0 ·

CC 4.0 BY-SA版权

文章标签：

#Logistic Regression #机器学习 #逻辑回归

本文深入探讨了Logistic回归算法，包括损失函数的数学推导、logistic函数的特性及其在二分类问题上的应用，同时提供了详细的代码实现，展示了算法在实际数据集上的有效性。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

Logistic regression algorithm

一.损失函数
1.理论部分
如果我们有一组观察数据 y = {y1,y2,…,yn}，一组特征x ={x1,x2,…,xn}(其中x1,x2,…,xn为不同的特征向，xm ∈R ^n) 所以x∈R^(n*n)

我们认为y近似为x的线性方程：

H(x) = θ^Tx

其中θ∈R^n,每一个θn为xn特征的系数
我们定义一个损失函数为：
在这里插入图片描述
这个损失函数多了平方是因为凸优化，以方便找到他的最小值，其中1/2为方便之后的计算
然后对这个损失函数对每一个θj 求偏导：

求出偏导后即可直接用梯度上升计算出最大的θj:

2.代码实现

import numpy as np
import random

def gradientDescent(x,y,theta,alpha,m,numliterations):    
    xTrans = x.transpose()
    for i in range(0,numIterations):
        hypothesis = np.dot(x,theta)
        loss = y - hypothesis
        cost = np.sum(loss **2)/(2*m)      #m isn't important. In other books,There is m in the loss function
        gradient = np.dot(xTrans,loss)/m
        theta = theta + alpha*gradient
        if i%1000 == 0:
            print("Iteration %d | Cost: %f" % (i, cost))
    return theta

def genData(numPoints,bias,variance):
'''  return training set   '''
    x = np.zeros((numPoints,2))
    y = np.zeros(numPoints)
    for i in range(0,numPoints):
        x[i][0] = 1
        x[i][1] = i
        y[i] = (i+bias)+random.uniform(0,1)*variance

    return x,y

x,y = genData(100,25,10)
m,n = np.shape(x)
numIterations = 20000   #The number of Iterations
alpha = 0.0005  # learn rate
theta = np.ones(n)
theta = gradientDescent(x,y,theta,alpha,m,numIterations)
print(theta)

二.logistic function(又称sigmoid function)
我是在MIT 18.04的课程学到的这个函数，它的导函数是线性且递增的。
在这里插入图片描述
它是二分类问题上运用比较频繁的激活函数，先看该函数的模型：

logistic function的值域只在（0，1）之间，其次图形中中间陡峭而两边平缓，符合二元分类的样本点特性。因此适合用于处理二分类问题。
三. Logistic regression algorithm
1.首先我们重新假设我们有一个二分类问题，观察的标签y∈[0,1]（y非0及1），我们观察的特征x∈{x1,x2,…,xn}。
由于我们的特征是
我们假设：

在这里插入图片描述
其中；后面的θ假设为一个定值 ,意思是P（y=1|x）是与θ值有关
然后我们能更简洁的写成：

(该公式参考伯努利分布)
因此，我们的损失函数可以改写成：

我们对数化一下上面式子(因为log函数也是一个单调递增的函数，极大方便运算)：
在这里插入图片描述
最后通过对每个θj求偏导，即可得到迭代公式：

虽然与上面的损失函数形式类似，但是要注意这里的y和h(x)∈（0，1）.

2.代码实现
代码部分是先生成两组数据。
x为特征储存的向量及二维平面上是点
x[0]为某点的x坐标
x[1]为该点的y坐标
而y为该点的标签

import numpy as np
import random
import matplotlib.pyplot as plt

#logistic function
def sigmoid(x):
    return 1/(1 + np.exp(-x))

def gradAscent(dataMatln,classLabels):
    dataMatrix = np.mat(dataMatln)
    labelMat = np.mat(classLabels).transpose()
    m,n = np.shape(dataMatrix)
    alpha = 0.001   #learning rate
    maxCycles = 30000    # The number of iterations
    theta = np.ones((n,1))
    for k in range(maxCycles):
        h = sigmoid(np.dot(dataMatln,theta))
        error = (labelMat - h)   #error with labels
        #cost =np.sum(np.dot(error.T,error))/2    #
        theta = theta + alpha*dataMatln.transpose()*error  #Gradient Method
        if k%1000 == 0:
            print("Iteration %d | Error: %f" % (k, np.sum(error)))
    return theta



def genData(numPoints):
    ''' Return training set and labels  '''
    x = np.zeros((numPoints,2))
    y = np.zeros(numPoints)
    count = int(numPoints/2)
    for i in range(0,count):
        x[i][0] = random.uniform(40,100)    
        x[i][1] = random.uniform(0,30)
        y[i] = 1
    for i in range(count,count*2):
        x[i][0] = random.uniform(0,40)
        x[i][1] = random.uniform(30,60)
        y[i] = 0
    return x,y

dataMatrix,classmethod = genData(200)
theta = gradAscent(dataMatrix,classmethod)

#Verify algorithm
Classify =np.dot(theta.T,dataMatrix.T)  # Classify = theta.T * X 
m,n = np.shape(Classify)
Class_One = []
Class_Zero = []

for i in range(0,n-1):                      #if probability is bigger than 0.5, draw this point in blue
    if sigmoid(Classify[0,i]) < 0.5:        #else if probability is smaller than 0.5, draw this point in red
        Class_Zero.append(dataMatrix[i])
    elif sigmoid(Classify[0,i]) >= 0.5:
        Class_One.append(dataMatrix[i])
Class_Zero = np.matrix(Class_Zero)
Class_One = np.matrix(Class_One)

plt.figure(figsize = (7,5))
plt.scatter(np.array(Class_Zero[:,0]),np.array(Class_Zero[:,1]),marker='o',c = 'r')
plt.scatter(np.array(Class_One[:,0]),np.array(Class_One[:,1]),marker='o',c = 'b')
plt.grid(True)
plt.xlabel('X')
plt.ylabel('Y')
plt.title('softmax')
plt.show()