Andrew Ng's Machine Learning Course in Python (ex2: Logistic Regression)

偷心的小白

Tags: python, algorithms, machine learning, artificial intelligence, data analysis

Article link: https://blog.youkuaiyun.com/m0_46128839/article/details/106844711

Contents

1 Data processing
- 1.1 Reading the data
- 1.2 Plotting the data
2 Implementing logistic regression
3 Regularization

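1 Data processing

The task is to predict whether a student is admitted based on two exam scores. The data is in ex2data1.txt: the first column is the exam 1 score, the second column is the exam 2 score, and the third column indicates whether the student was admitted.

1.1 Reading the data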

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
path = 'C:\\Users\\Inuyasha\\Desktop\\ex2data1.txt'
data = pd.read_csv(path, names = ['Exam1', 'Exam2', 'Admitted'])
data.head()

Output:

def get_x(df):  # extract the features
    ones = pd.DataFrame({'ones': np.ones(len(df))})  # m-by-1 DataFrame of ones (intercept term)
    data = pd.concat([ones, df], axis=1)  # concatenate column-wise
    return np.array(data.iloc[:, :-1])  # returns an ndarray, not a matrix

def get_y(df):  # extract the labels
    return np.array(df.iloc[:, -1])  # df.iloc[:, -1] is the last column of df

def normalize_feature(df):
    return df.apply(lambda column: (column - column.mean()) / column.std())  # feature scaling

x = get_x(data)
y = get_y(data)
theta = np.zeros((3, 1))  # float zeros; an integer dtype would truncate fractional parameter values
y = y.reshape(100, 1)  # reshape returns a new array, so the result must be assigned
print(x.shape)
print(y.shape)

Output:
(100, 3)
(100, 1)
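
The normalize_feature helper is defined above but never called in this part of the post. As a minimal sketch of what it does, the snippet below applies the same formula to a small made-up DataFrame and checks that each column ends up with mean 0 and standard deviation 1. The values in demo are illustrative assumptions and are not taken from ex2data1.txt.

```python
import numpy as np
import pandas as pd

def normalize_feature(df):
    # Standardize each column: subtract its mean, divide by its sample standard deviation
    return df.apply(lambda column: (column - column.mean()) / column.std())

# Hypothetical exam scores, for illustration only
demo = pd.DataFrame({'Exam1': [30.0, 60.0, 90.0],
                     'Exam2': [40.0, 50.0, 60.0]})
scaled = normalize_feature(demo)

print(scaled)
print(scaled.mean().tolist())  # per-column means after scaling
print(scaled.std().tolist())   # per-column standard deviations after scaling
```

Scaling like this is usually applied to the feature columns only, before the column of ones is added, since a constant column has zero standard deviation.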

1.2 Plotting the data

Display the distribution of the loaded data, first as summary statistics and then as a scatter plot.

data.describe()

Output:

positive = data[data['Admitted'].isin([1])]
positive.describe()

Output:

negative = data[data['Admitted'].isin([0])]
negative.describe()

Output:

fig = plt.figure(figsize=(12, 8))
ax = fig.add_subplot(1, 1, 1)
ax.scatter(positive['Exam1'], positive['Exam2'], s=50, c='b', marker='o', label='Admitted')
ax.scatter(negative['Exam1'], negative['Exam2'], s=50, c='r', marker='x', label='Not Admitted')
ax.legend()
ax.set_xlabel('Exam 1 Score')
ax.set_ylabel('Exam 2 Score')
plt.show()

Output:

2 Implementing logistic regression

2.1 The sigmoid function

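g denotes a commonly used logistic function, the S-shaped sigmoid function, defined as:
$$g(z) = \frac{1}{1 + e^{-z}}$$
Combining this with the linear term $\theta^T x$, we obtain the hypothesis function of the logistic regression model:
$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$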

# Implement the sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
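
As a quick check of the function's shape, the following sketch evaluates sigmoid at a few hand-picked points (the input values are arbitrary choices for illustration). It shows that g(0) is 0.5, that the output increases with z, and that g(-z) = 1 - g(z), which is why 0.5 is the natural threshold when turning the output into a class label.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Arbitrary sample inputs, symmetric around zero
z = np.array([-5.0, 0.0, 5.0])
g = sigmoid(z)

print(g)              # values lie strictly between 0 and 1
print(g[0] + g[2])    # symmetry: g(-5) + g(5)
```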

2.2 The cost function

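The cost function for logistic regression is:
$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log\left(h_\theta(x^{(i)})\right) - \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right]$$
where $h_\theta(x) = g(\theta^T x)$ is the hypothesis and $m$ is the number of training examples.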

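def cost(theta, x, y):
    h = sigmoid(x @ theta)  # predicted probabilities, one per training example
    inner = -y * np.log(h) - (1 - y) * np.log(1 - h)
    return np.sum(inner) / len(x)
costValue = cost(theta, x, y)  # arguments must follow the order in the definition: theta, x, y
costValue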

Output:
0.6931471805599457
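
The value 0.6931... is no accident: with every parameter at zero, the sigmoid returns 0.5 for every example, so each term of the cost is -log(0.5) = log(2), whatever the data. The sketch below checks this on a small made-up dataset; x_demo, y_demo and theta0 are hypothetical names and values used only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(theta, x, y):
    h = sigmoid(x @ theta)  # predicted probabilities
    return np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))

# Hypothetical design matrix (intercept column plus two features) and labels
x_demo = np.array([[1.0, 2.0, 3.0],
                   [1.0, 4.0, 1.0],
                   [1.0, 0.5, 2.5],
                   [1.0, 3.0, 3.0]])
y_demo = np.array([[0.0], [1.0], [0.0], [1.0]])
theta0 = np.zeros((3, 1))

# Both printed numbers should agree
print(cost(theta0, x_demo, y_demo), np.log(2))
```

This makes the initial cost a handy sanity check: if it is not log(2) at theta = 0, the shapes of y and theta are probably broadcasting in an unintended way.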

2.3 Gradient descent

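The gradient of the cost function with respect to each parameter $\theta_j$ is:
$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$
where $h_\theta(x) = g(\theta^T x)$ and $m$ is the number of training examples.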
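
As a minimal sketch of how the gradient and a batch gradient descent loop could be written, the snippet below uses a vectorized gradient and a fixed number of update steps. This is not necessarily the author's implementation: the function names gradient and gradient_descent, the learning rate alpha, the iteration count, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(theta, x, y):
    h = sigmoid(x @ theta)
    return np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))

def gradient(theta, x, y):
    # Vectorized form of (1/m) * sum((h - y) * x_j), computed for every j at once
    return x.T @ (sigmoid(x @ theta) - y) / len(x)

def gradient_descent(theta, x, y, alpha=0.1, iterations=1000):
    # alpha and iterations are illustrative choices, not tuned values
    for _ in range(iterations):
        theta = theta - alpha * gradient(theta, x, y)
    return theta

# Hypothetical, non-separable toy data: intercept column plus one feature
x_demo = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0],
                   [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]])
y_demo = np.array([[0.0], [0.0], [1.0], [0.0], [1.0], [1.0]])
theta0 = np.zeros((2, 1))

theta_fit = gradient_descent(theta0, x_demo, y_demo)
# The cost after fitting should be lower than the starting cost
print(cost(theta0, x_demo, y_demo), cost(theta_fit, x_demo, y_demo))
```

On the unscaled exam scores, a fixed learning rate like this may need to be much smaller or the features scaled first, which is one reason to hand the cost and gradient to an optimization routine instead.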
