Multi-class Learning

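A multi-class task can be solved with several binary classifiers. This post introduces three ways of doing so, all of which work by splitting the training set. Applying softmax followed by a cross-entropy loss also solves multi-class tasks, but that approach is not covered here.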

(1) One vs One (OvO)

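Let the dataset be D=\{(x_1, y_1), (x_2, y_2), ..., (x_m, y_m)\}, with y_i \in \{C_1, C_2, ..., C_N\}. The idea of OvO is to build one binary classifier for every pair of classes and then predict by voting. Suppose there are only four classes C_1, C_2, C_3, C_4; the classifiers are then constructed as shown in the following table: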

Positive    Negative    Classifier
C_1         C_2         f_1
C_1         C_3         f_2
C_1         C_4         f_3
C_2         C_3         f_4
C_2         C_4         f_5
C_3         C_4         f_6
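
The pairing in the table above can be enumerated mechanically. The following is a minimal sketch using only the standard library; the string labels are placeholders for the four example classes, not part of the original implementation.

```python
from itertools import combinations

# Placeholder labels for the four example classes.
classes = ["C_1", "C_2", "C_3", "C_4"]

# Each unordered pair of classes gets its own binary classifier.
pairs = list(combinations(classes, 2))
for k, (positive, negative) in enumerate(pairs, start=1):
    print(f"f_{k}: positive={positive}, negative={negative}")

# The number of pairs matches N(N-1)/2.
n = len(classes)
print(len(pairs), n * (n - 1) // 2)
```

Which class of a pair is called "positive" is only a naming convention; the classifier simply separates the two classes.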

 

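In total, N(N-1)/2 binary classifiers are built. For a new test sample x_t, feed it to each of these classifiers and take the class with the most votes as the prediction. In the table below, each classifier predicts the test sample; C_2 receives the most votes (three), so the sample is predicted as C_2: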

Classifier    Predicted class    Final prediction
f_1           C_2                C_2
f_2           C_1
f_3           C_4
f_4           C_2
f_5           C_2
f_6           C_3
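
The vote count in the table above can be reproduced in a few lines. This is a small sketch, assuming the six predictions are already available as strings; the dictionary `votes` is an illustrative stand-in for real classifier outputs.

```python
from collections import Counter

# Predictions of the six pairwise classifiers, copied from the table above.
votes = {
    "f_1": "C_2", "f_2": "C_1", "f_3": "C_4",
    "f_4": "C_2", "f_5": "C_2", "f_6": "C_3",
}

# Count the votes per class and take the class with the most votes.
tally = Counter(votes.values())
predicted, n_votes = tally.most_common(1)[0]
print(predicted, n_votes)
```

Ties are possible in OvO voting. `most_common` simply returns the first class it encountered among the tied ones, so a real implementation may want an explicit tie-breaking rule.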

(2) One vs Rest (OvR)

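The dataset is the same as in (1). The idea of OvR is to treat each class in turn as the positive class and all remaining classes together as the negative class, which yields N binary classifiers. When predicting a new test sample, ideally exactly one of these N classifiers predicts it as positive, and the class of that classifier is the prediction. If several classifiers predict positive, the usual remedy is to pick the one with the highest prediction confidence. Again assuming only 4 classes, OvR looks like this: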

Positive (+)    Negative (-)       Classifier
C_1             C_2, C_3, C_4      f_1
C_2             C_1, C_3, C_4      f_2
C_3             C_1, C_2, C_4      f_3
C_4             C_1, C_2, C_3      f_4

 

When predicting a test sample, the results look like this:

Classifier    Prediction    Final prediction
f_1           -             C_2
f_2           +
f_3           -
f_4           -
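
The OvR decision rule can be sketched as follows. The confidence scores and the 0.5 threshold are made-up values for illustration (for example, each classifier's predicted probability for its positive class); they are chosen so that only f_2 fires, as in the table above.

```python
# Hypothetical positive-class confidence of f_1..f_4, keyed by the class
# each classifier treats as positive.
scores = {"C_1": 0.10, "C_2": 0.85, "C_3": 0.30, "C_4": 0.05}
threshold = 0.5  # assumed cut-off between "-" and "+"

# Classes whose classifier predicts "+".
positives = [c for c, s in scores.items() if s >= threshold]

# Taking the most confident classifier also covers the cases where
# zero or several classifiers predict "+".
predicted = max(scores, key=scores.get)
print(positives, predicted)
```

Picking the highest score gives the same answer as the table when exactly one classifier is positive, and still returns a class in the ambiguous cases.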

(3) Many vs Many (MvM)

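The idea of MvM is to split the training set M times. Each split picks some classes as positive and the remaining classes as negative, and one binary classifier is trained on it, which gives M binary classifiers \{f_1, f_2, ..., f_M\}. For a class C_i in the original data, running each of the M classifiers on it yields a string of predictions [+1, -1, -1, +1, ...], called its code. Doing this for all N classes gives a coding matrix of shape (N, M). Again assume the original data has only 4 classes and 5 splits were made, giving 5 binary classifiers \{f_1, f_2, ..., f_5\}; the coding matrix is then: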

       f_1   f_2   f_3   f_4   f_5
C_1    -1    +1    -1    -1    +1
C_2    +1    +1    -1    +1    -1
C_3    -1    -1    -1    +1    +1
C_4    +1    +1    +1    -1    -1

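For a test sample, feeding it to the 5 binary classifiers gives a code such as [+1, +1, -1, +1, +1]. Compute the distance between this code and the code of each class (Euclidean or Hamming distance can be used; Euclidean distance is used here), and take the class with the smallest distance as the final prediction. For [+1, +1, -1, +1, +1], the Euclidean distances are: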

Class    Distance
C_1      2.83
C_2      2.00
C_3      2.83
C_4      3.46

Clearly, the code of class C_2 is the closest, so the test sample is predicted as class C_2.
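
The distances in the table can be checked with NumPy. This is a minimal sketch that hard-codes the example coding matrix and test code from above; it also computes the Hamming distance mentioned as an alternative.

```python
import numpy as np

# Coding matrix from the example: one row per class, one column per classifier.
coding_matrix = np.array([
    [-1, +1, -1, -1, +1],  # C_1
    [+1, +1, -1, +1, -1],  # C_2
    [-1, -1, -1, +1, +1],  # C_3
    [+1, +1, +1, -1, -1],  # C_4
])
classes = ["C_1", "C_2", "C_3", "C_4"]
test_code = np.array([+1, +1, -1, +1, +1])

# Euclidean distance between the test code and every class code.
euclidean = np.sqrt(((coding_matrix - test_code) ** 2).sum(axis=1))
# Hamming distance: number of positions where the codes disagree.
hamming = (coding_matrix != test_code).sum(axis=1)

predicted = classes[int(np.argmin(euclidean))]
print(np.round(euclidean, 2), hamming, predicted)
```

Because every entry is +1 or -1, each disagreement contributes exactly 4 to the squared Euclidean distance, so the two distances always rank the classes in the same order.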

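For the three methods above, the implementation below uses handwritten digit recognition as the example, with LogisticRegression as the binary classifier: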


# coding: utf-8

# In[36]:

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
import numpy as np
from numpy import random as rd
import pandas as pd
import warnings
warnings.filterwarnings('ignore')


# In[37]:

x, y = load_digits(return_X_y=True)


# In[38]:

x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=42, test_size=0.3)


# In[39]:

class Multi_class(object):
    
    def __init__(self, x_train, x_test, y_train, y_test):
        self.x_train = x_train
        self.x_test = x_test
        self.y_train = y_train
        self.y_test = y_test
        self.class_unique = np.unique(self.y_train)
    
    def OvO(self):
        model_lst = []
        for i in range(len(self.class_unique) - 1):
            # i indexes the positive class
            for j in range(i + 1, len(self.class_unique)):
                # j indexes the negative class
                select_index_positive = self.y_train == self.class_unique[i]
                select_index_negative = self.y_train == self.class_unique[j]
                y_train_ = np.concatenate([self.y_train[select_index_positive], self.y_train[select_index_negative]])
                x_train_ = np.concatenate([self.x_train[select_index_positive], self.x_train[select_index_negative]], axis=0)
                cls = LogisticRegression()
                cls.fit(x_train_, y_train_)
                model_lst.append(cls)
        return model_lst
    
    def OvR(self):
        model_lst = []
        for c in self.class_unique:
            # Keep label c for the positive class and relabel every other sample as -1
            select_index_positive = self.y_train == c
            y_train_ = np.where(select_index_positive, c, -1)
            cls = LogisticRegression()
            cls.fit(self.x_train, y_train_)
            model_lst.append(cls)
        return model_lst
        
    def MvM(self):
        # Make 30 random splits of the classes into positive and negative halves
        model_lst = []
        coding_matrix = []
        half_class_counts = len(self.class_unique) // 2
        for i in range(30):
            class_unique_after_shuffle = rd.permutation(self.class_unique)
            positive_labels = class_unique_after_shuffle[:half_class_counts]
            negative_labels = class_unique_after_shuffle[half_class_counts:]
            y_train_ = []
            x_train_ = []
            for pl, nl in zip(positive_labels, negative_labels):
                y_train_.extend(np.sum((self.y_train == pl).astype(np.int8)) * [1])
                y_train_.extend(np.sum((self.y_train == nl).astype(np.int8)) * [-1])
                x_train_.append(self.x_train[self.y_train == pl])
                x_train_.append(self.x_train[self.y_train == nl])
            x_train_ = np.concatenate(x_train_, axis=0)
            cls = LogisticRegression()
            cls.fit(x_train_, y_train_)
            model_lst.append(cls)
        # Code of class c under each model: the majority prediction over c's training samples
        for c in self.class_unique:
            label = []
            select_index = self.y_train == c
            x_train_ = self.x_train[select_index]
            for model in model_lst:
                predict_result = model.predict(x_train_)
                if np.sum(predict_result) > 0:
                    label.append(1)
                else:
                    label.append(-1)
            coding_matrix.append(label)
        coding_matrix = np.array(coding_matrix)
        return coding_matrix, model_lst
        
    def test_OvO(self, model_lst):
        predict_label_of_every_model = []
        predict_label = []
        for model in model_lst:
            predict_label_of_every_model.append(model.predict(self.x_test))
        predict_label_of_every_model = pd.DataFrame(predict_label_of_every_model)
        for i in range(self.x_test.shape[0]):
            counts = predict_label_of_every_model[i].value_counts()
            counts = counts.sort_values(ascending=False)
            predict_label.append(list(counts.index)[0])
        # np.int was removed from NumPy; the mean of a boolean array is already the accuracy
        accur = np.mean(np.array(predict_label) == self.y_test)
        print("OvO test accuracy: %.2f%%" % (accur * 100))
    
    def test_OvR(self, model_lst):
        # Score each test sample with every classifier and pick the class whose
        # classifier is most confident. decision_function scores the positive
        # class here because c is the larger of the two labels (-1 and c).
        scores = np.array([model.decision_function(self.x_test) for model in model_lst])
        predict_label = self.class_unique[np.argmax(scores, axis=0)]
        accur = np.mean(predict_label == self.y_test)
        print("OvR test accuracy: %.2f%%" % (accur * 100))
        
    def test_MvM(self, coding_matrix, model_lst):
        predict_label = []
        predict_result = []
        for model in model_lst:
            result = model.predict(self.x_test)
            predict_result.append(result)
        predict_result = np.array(predict_result).T
        for result in predict_result:
            predict_label.append(self.class_unique[np.argmin(np.sqrt(np.sum(np.square(result - coding_matrix), axis=1)).ravel())])
        accur = np.mean(np.array(predict_label) == self.y_test)
        print("MvM test accuracy: %.2f%%" % (accur * 100))
        
    def test(self):
        print("#############OvO################")
        OvO_model_lst = self.OvO()
        self.test_OvO(OvO_model_lst)
        print("#############OvR################")
        OvR_model_lst = self.OvR()
        self.test_OvR(OvR_model_lst)
        print("#############MvM################")
        coding_mat, MvM_model_lst = self.MvM()
        self.test_MvM(coding_mat, MvM_model_lst)


# In[40]:

mc = Multi_class(x_train, x_test, y_train, y_test)
mc.test()
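
As a cross-check on the hand-written class above, scikit-learn ships wrappers for the same three strategies in `sklearn.multiclass`. The sketch below is not the implementation used in this post: `OutputCodeClassifier` draws a random codebook directly (an error-correcting output code scheme, one common form of MvM) instead of deriving the codes from majority predictions, and `max_iter=5000`, `code_size=3` and `random_state=0` are arbitrary choices made here.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import (
    OneVsOneClassifier,
    OneVsRestClassifier,
    OutputCodeClassifier,
)

x, y = load_digits(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, random_state=42, test_size=0.3
)

# The same binary base learner is cloned for every sub-problem.
base = LogisticRegression(max_iter=5000)
wrappers = {
    "OvO": OneVsOneClassifier(base),
    "OvR": OneVsRestClassifier(base),
    # code_size=3 means 3 * n_classes = 30 binary classifiers.
    "MvM (ECOC)": OutputCodeClassifier(base, code_size=3, random_state=0),
}

accuracies = {}
for name, clf in wrappers.items():
    clf.fit(x_train, y_train)
    accuracies[name] = clf.score(x_test, y_test)
    print(f"{name} test accuracy: {accuracies[name]:.2%}")
```

The numbers will not match the hand-written version exactly, since the solver settings and the MvM coding scheme differ, but they give a reference point for each strategy.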

