Logistic Regression by Hand in Python and R (Gradient Descent / Newton's Method)

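This post introduces logistic regression for classification problems and shows how to build and train the model in both Python and R. The Python version minimizes the cross-entropy loss with stochastic gradient descent; the R version solves for the maximum-likelihood parameters with Newton's method. In the experiments, the Python model classifies setosa vs. versicolor perfectly on both the training and test splits, and the R model reaches 97% accuracy on versicolor vs. virginica.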

Concepts and applications:

Logistic regression is mainly used for classification. A k-class problem can be handled by turning it into k binary (one-vs-rest) problems.
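The k-binary-problems idea can be sketched as follows. This is a minimal illustration, not code from the original post: the helper names (`fit_binary`, `fit_one_vs_rest`, `predict_one_vs_rest`), the batch gradient descent trainer, and the synthetic three-cluster data are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -100, 100)))

def fit_binary(X, y, lr=0.1, n_iter=500):
    """Batch gradient descent for one binary logistic model (hypothetical helper)."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

def fit_one_vs_rest(X, y, n_classes):
    # One weight vector per class: class c against everything else.
    return np.column_stack(
        [fit_binary(X, (y == c).astype(float)) for c in range(n_classes)]
    )

def predict_one_vs_rest(X, W):
    # Pick the class whose binary model assigns the highest probability.
    return np.argmax(sigmoid(X @ W), axis=1)

# Toy data (assumed): three well-separated 2-D clusters plus a constant column.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 4.0], [4.0, -2.0], [-4.0, -2.0]])
y_toy = np.repeat(np.arange(3), 50)
X_toy = centers[y_toy] + rng.normal(scale=0.5, size=(150, 2))
X_toy = np.column_stack([X_toy, np.ones(len(X_toy))])

W = fit_one_vs_rest(X_toy, y_toy, n_classes=3)
ovr_accuracy = np.mean(predict_one_vs_rest(X_toy, W) == y_toy)
print(ovr_accuracy)
```

Each column of `W` is an independent binary model, so any binary trainer could be substituted for `fit_binary`.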
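Logistic regression applies the logistic (sigmoid) curve to a linear combination of the explanatory variables. The parameters are estimated by maximum likelihood: the objective is the likelihood function (its negative logarithm is the cross-entropy of a Bernoulli distribution), and it is optimized numerically with Newton's method or gradient descent.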
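The estimation problem can be written out explicitly. The following is the standard derivation; the notation ($X$ for the design matrix, $y$ for the 0/1 labels, $w$ for the parameters, $p$ for the fitted probabilities, $S$ for the diagonal weight matrix, $\alpha$ for the step size) is chosen here for illustration and is not taken from the original post.

```latex
\begin{aligned}
p_i &= \sigma(x_i^\top w) = \frac{1}{1 + e^{-x_i^\top w}} \\
\ell(w) &= \sum_{i=1}^{m} \bigl[\, y_i \log p_i + (1 - y_i) \log (1 - p_i) \,\bigr]
  && \text{(log-likelihood; the cross-entropy loss is } -\ell(w)\text{)} \\
\nabla \ell(w) &= X^\top (y - p) \\
\nabla^2 \ell(w) &= -X^\top S X, \qquad S = \operatorname{diag}\bigl(p_i (1 - p_i)\bigr) \\
\text{gradient descent on } -\ell:\quad w &\leftarrow w - \alpha\, X^\top (p - y) \\
\text{Newton's method:}\quad w &\leftarrow w + (X^\top S X)^{-1} X^\top (y - p)
\end{aligned}
```

The gradient descent line corresponds to the update used in the Python section, and the Newton line to the update used in the R section.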

Python implementation

Loading and splitting the data

from sklearn import datasets
from sklearn.model_selection import train_test_split
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Keep only classes 0 and 1 (setosa and versicolor) to get a binary problem
X = X[y != 2]
y = y[y != 2]
xtrain,xtest,ytrain,ytest=train_test_split(
    X, y, test_size=0.3, random_state=42)

Helper functions

Tip: exp(x) grows exponentially and easily overflows a float, so restrict the range of x before exponentiating.

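import numpy as np

def sigmoid(z):
    # Clip z before exponentiating to avoid "RuntimeWarning: overflow encountered in exp"
    return 1 / (1.0 + np.exp(-np.clip(z, -100, 10000)))

def f(x, w):  # x is n*k, w is k*1
    return sigmoid(x @ w)

def predict(x, w):
    # Round the predicted probability to a 0/1 label
    return np.round(f(x, w))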
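Clipping is one way to avoid the overflow. Another common approach, sketched below as an alternative rather than as the post's method, is to split on the sign of z so that only non-positive numbers are ever exponentiated. The name `stable_sigmoid` is made up for this example.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that only exponentiates non-positive numbers (illustrative helper)."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0 use 1 / (1 + exp(-z)); here exp(-z) <= 1.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0 use exp(z) / (1 + exp(z)); here exp(z) < 1.
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

vals = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(vals)
```

Both branches compute the same function; the split only changes which form is evaluated, so no clipping bounds have to be chosen.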

Solving with stochastic gradient descent

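# The loss is the cross-entropy between two Bernoulli distributions, derived from maximum likelihood estimation
def cross_entropy_loss(y_pred, y_label):
    # Second term is (1 - y) * log(1 - p), summed over the samples
    cross_loss = -np.dot(y_label, np.log(y_pred)) - np.dot(1 - y_label, np.log(1 - y_pred))
    return cross_loss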

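def gradient(x, y_label, w):
    # Use the predicted probabilities f(x, w), not the rounded labels from predict()
    y_pred = f(x, w)
    # Reshape the labels to n*1 so the subtraction does not broadcast to n*n
    w_grad = np.matmul(x.T, y_pred - np.reshape(y_label, (-1, 1)))
    return w_grad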
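# Iterate with stochastic gradient descent: one update per sample, 10 passes over the data
def training(x, y_label, alpha):
    dim = x.shape[1]
    w = np.random.rand(dim, 1)
    for i in range(10):
        for index in range(0, len(y_label)):
            x_i = np.array(x[index, :], ndmin=2)  # 1*k row for this sample
            y_pred = f(x_i, w)
            grad = x_i.T @ (y_pred - y_label[index])  # renamed so it does not shadow gradient()
            w -= alpha * grad
    return w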
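The `training` loop visits the samples in the same order on every pass. A common variant, sketched here under assumptions of my own and not part of the original post, reshuffles the order each epoch and records the mean cross-entropy so that convergence can be inspected. The function names, the zero initialization, the step size, and the synthetic data are all illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -100, 100)))

def mean_cross_entropy(p, y, eps=1e-12):
    p = np.clip(p, eps, 1 - eps)  # keep log() finite
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def sgd_shuffled(x, y, alpha=0.01, epochs=20, seed=0):
    """Per-sample SGD with a fresh random order each epoch (illustrative variant)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(x.shape[1])
    history = []
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            # Gradient of the loss for one sample: (p_i - y_i) * x_i
            w -= alpha * (sigmoid(x[i] @ w) - y[i]) * x[i]
        history.append(mean_cross_entropy(sigmoid(x @ w), y))
    return w, history

# Synthetic two-class data (an assumption for the demo, not the iris split).
rng = np.random.default_rng(1)
x_demo = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y_demo = np.repeat([0.0, 1.0], 50)
w_demo, loss_history = sgd_shuffled(x_demo, y_demo)
sgd_accuracy = np.mean((sigmoid(x_demo @ w_demo) >= 0.5) == y_demo)
print(loss_history[0], loss_history[-1], sgd_accuracy)
```

Tracking the loss per epoch also gives a simple way to decide whether the fixed count of 10 passes is enough for a given step size.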

Prediction

w=training(xtrain,ytrain,0.001)
y_train_pred=predict(xtrain,w)
y_test_pred=predict(xtest,w)

Evaluation

from sklearn.metrics import classification_report,confusion_matrix
print(classification_report(ytrain, y_train_pred)) 
print(classification_report(ytest, y_test_pred)) 
print(confusion_matrix(ytrain, y_train_pred)) 
print(confusion_matrix(ytest, y_test_pred))

Output:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        33
           1       1.00      1.00      1.00        37

    accuracy                           1.00        70
   macro avg       1.00      1.00      1.00        70
weighted avg       1.00      1.00      1.00        70

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        17
           1       1.00      1.00      1.00        13

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30

[[33  0]
 [ 0 37]]
[[17  0]
 [ 0 13]]
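In keeping with the by-hand theme, the numbers in these reports can be reproduced without scikit-learn. The sketch below counts a confusion matrix (rows are true classes, columns are predicted classes, the same layout that `confusion_matrix` prints) and derives precision and recall from it. The helper names and the small label arrays are made up for illustration and are not the iris results.

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes=2):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(np.ravel(y_true).astype(int), np.ravel(y_pred).astype(int)):
        cm[t, p] += 1
    return cm

def precision_recall(cm, cls):
    tp = cm[cls, cls]
    precision = tp / cm[:, cls].sum()  # share of predictions of cls that are right
    recall = tp / cm[cls, :].sum()     # share of true cls samples that are found
    return precision, recall

# Small made-up labels, purely for illustration.
y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_pred = np.array([0, 0, 1, 1, 1, 0, 1, 0])
cm = confusion_counts(y_true, y_pred)
prec1, rec1 = precision_recall(cm, cls=1)
print(cm)
print(prec1, rec1)
```

`np.ravel` is used so that an n*1 column of rounded predictions, such as the output of `predict`, is accepted as well as a flat array.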

R implementation

data(iris)
summary(iris)
##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
##  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
##        Species  
##  setosa    :50  
##  versicolor:50  
##  virginica :50  
##                 
##                 
## 
ir<-iris[- which(iris$Species == 'setosa'),]
summary(ir)
##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
##  Min.   :4.900   Min.   :2.000   Min.   :3.000   Min.   :1.000  
##  1st Qu.:5.800   1st Qu.:2.700   1st Qu.:4.375   1st Qu.:1.300  
##  Median :6.300   Median :2.900   Median :4.900   Median :1.600  
##  Mean   :6.262   Mean   :2.872   Mean   :4.906   Mean   :1.676  
##  3rd Qu.:6.700   3rd Qu.:3.025   3rd Qu.:5.525   3rd Qu.:2.000  
##  Max.   :7.900   Max.   :3.800   Max.   :6.900   Max.   :2.500  
##        Species  
##  setosa    : 0  
##  versicolor:50  
##  virginica :50  
##                 
##                 
## 
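ir$Species<-factor(ir$Species, levels = c( 'versicolor', 'virginica'), labels = c(0,1))# levels: the original classes; labels: the new names for those classes
ir<-as.data.frame(lapply(ir,as.numeric))
#ir$Species
# as.numeric() returns the factor codes 1 and 2, so recode 2 to 0 (versicolor = 1, virginica = 0)
ir$Species[ir$Species == 2] <-0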
Standardize the data and select three columns as x
x <- ir[,2:4]
y <- ir$Species
m <-dim(x)[1]
n <- dim(x)[2] + 1
x<-data.frame(scale(x))
x$constant <- 1
x<-as.matrix(x) #100*4
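For readers following along in Python, the same preprocessing (standardize each column, then append a constant column for the intercept) might look like the sketch below. The function name and the demo matrix are assumptions; `ddof=1` is used because R's `scale()` divides by the sample standard deviation.

```python
import numpy as np

def standardize_with_constant(x):
    """Center and scale each column, then append a column of ones."""
    x = np.asarray(x, dtype=float)
    # ddof=1 gives the sample standard deviation, matching R's scale().
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    return np.column_stack([z, np.ones(len(z))])

demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
design = standardize_with_constant(demo)
print(design)
```

Standardizing keeps the columns on a comparable scale, which tends to make the Hessian used in the Newton step better conditioned.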
Parameter estimation routine:
# param:
# {m: number of data rows
#  n: number of dimensions (the features plus the constant column)}

mle<-function(x,y,n,m,max_iter){
  theta =matrix(data=0.001, nrow = n, ncol = 1)#4*1
  thred = 0.001
  iters = 1
  G = matrix(data=0, nrow = n, ncol = 1)
  H =matrix(data=0, nrow = n, ncol = n)
  a=1
  while( (iters<=max_iter) & (a>=thred)){
    print(iters)
    iters = iters + 1
    z=x%*%theta#100*1
    #print(z)
    h =1- 1/(1 + exp(z))#100*1
    dif = y - h#100*1
    G=t(x)%*%dif#4*1 x:4*100
    const_sum = h*(1-h)#100*1
    H=t(x)%*%(c(const_sum) * x)
    theta_pre=theta
    theta = theta +solve(H )%*%G
    a=sum((theta-theta_pre)**2)/sum(theta_pre**2)
    accuracy<-1-sum(abs(round(1- 1/(1 + exp(x%*%theta)))-y))/length(y)
    print('accuracy')
    print(accuracy)
  }
  return(theta)
}
theta=mle(x,y,n,m,100)
## [1] 1
## [1] "accuracy"
## [1] 0.96
## [1] 2
## [1] "accuracy"
## [1] 0.95
## [1] 3
## [1] "accuracy"
## [1] 0.95
## [1] 4
## [1] "accuracy"
## [1] 0.97
## [1] 5
## [1] "accuracy"
## [1] 0.97
## [1] 6
## [1] "accuracy"
## [1] 0.97
## [1] 7
## [1] "accuracy"
## [1] 0.97
## [1] 8
## [1] "accuracy"
## [1] 0.97
## [1] 9
## [1] "accuracy"
## [1] 0.97
print('Estimated parameters:')
## [1] "Estimated parameters:"
print(theta)
##                     [,1]
## Sepal.Width   2.78698357
## Petal.Length -6.50073323
## Petal.Width  -9.10210495
## constant      0.03436432
y_pred=round(1- 1/(1 + exp(x%*%theta)))
print('Results:')
## [1] "Results:"
table(y_pred,y)
##       y
## y_pred  0  1
##      0 49  2
##      1  1 48
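To connect the two halves of the post, the Newton iteration inside `mle` can be sketched in NumPy. This is an illustrative port, not the author's code: the function name `newton_logistic`, the synthetic data, and the tighter tolerance passed in the demo call are assumptions, while the update itself (gradient `t(x) %*% (y - h)`, Hessian `t(x) %*% diag(h*(1-h)) %*% x`, relative-change stopping rule) follows the R routine.

```python
import numpy as np

def newton_logistic(x, y, max_iter=100, tol=1e-3):
    """Newton-Raphson for the logistic log-likelihood, mirroring the update in mle()."""
    theta = np.full(x.shape[1], 0.001)
    for _ in range(max_iter):
        h = 1.0 / (1.0 + np.exp(-(x @ theta)))      # fitted probabilities
        grad = x.T @ (y - h)                        # gradient of the log-likelihood
        hess = x.T @ (x * (h * (1 - h))[:, None])   # X^T diag(h(1-h)) X
        theta_new = theta + np.linalg.solve(hess, grad)
        rel_change = np.sum((theta_new - theta) ** 2) / np.sum(theta ** 2)
        theta = theta_new
        if rel_change < tol:
            break
    return theta

# Synthetic, overlapping two-class data (an assumption; not the iris subset).
rng = np.random.default_rng(0)
n = 200
features = rng.normal(size=(n, 2))
x_demo = np.column_stack([features, np.ones(n)])
true_theta = np.array([1.5, -1.0, 0.3])
prob = 1.0 / (1.0 + np.exp(-(x_demo @ true_theta)))
y_demo = (rng.random(n) < prob).astype(float)

theta_hat = newton_logistic(x_demo, y_demo, tol=1e-12)
h_hat = 1.0 / (1.0 + np.exp(-(x_demo @ theta_hat)))
score_norm = np.linalg.norm(x_demo.T @ (y_demo - h_hat))  # near 0 at the maximum
print(theta_hat, score_norm)
```

`np.linalg.solve(hess, grad)` plays the role of `solve(H) %*% G` without forming the inverse explicitly. Note that Newton's method needs overlapping classes: on perfectly separable data the likelihood has no finite maximum and the Hessian becomes ill-conditioned.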