Algorithm Overview
Adaboost is an algorithm that combines weak classifiers into a strong one. Here the weak classifier is a DecisionStump; the algorithm iteratively produces a sequence of DecisionStump classifiers and then combines them with appropriate weights. Note that Adaboost represents positive and negative samples as +1 and -1, not 0 and 1.
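For example, if the labels arrive encoded as 0/1, they first need to be remapped to -1/+1. A minimal sketch (the array names here are purely illustrative):

import numpy as np

# Illustrative 0/1 labels remapped to the -1/+1 encoding Adaboost expects
y01 = np.array([0, 1, 1, 0, 1])
y = np.where(y01 == 0, -1, 1)   # array([-1,  1,  1, -1,  1])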
Training Process
1. Initialize the sample weight vector $W$ (uniformly, $w_i = 1/N$ for each of the $N$ samples).
2. Fit the DecisionStump classifier that is optimal under the current sample weights.
3. Compute the current weighted misclassification rate $\epsilon_t$.
4. Compute the current classifier's weight $\alpha = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}$.
5. Update the sample weights: each misclassified sample's weight becomes $w_i = w_i \times \frac{1-\epsilon_t}{\epsilon_t}$, then normalize the weight vector $W$ (a numeric sketch of steps 4-5 follows this list).
6. Repeat steps 2-5 until the desired number of classifiers is reached.
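A rough numeric sketch of steps 4-5, assuming a weighted error of $\epsilon_t = 0.2$ on a 5-sample toy case (all numbers are made up for illustration):

import numpy as np

eps = 0.2                              # assumed weighted misclassification rate
alpha = 0.5 * np.log((1 - eps) / eps)  # step 4: 0.5 * ln(4) ≈ 0.693

w = np.full(5, 0.2)                    # uniform weights over 5 samples
misclassified = np.array([False, True, False, False, False])
w[misclassified] *= (1 - eps) / eps    # step 5: misclassified weight 0.2 -> 0.8
w /= np.sum(w)                         # after normalizing, the misclassified sample carries weight 0.5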
Prediction Process
1. Compute each classifier's prediction for the sample.
2. Combine the predictions into a weighted sum using each classifier's weight $\alpha$, and take its sign as the final prediction (a tiny sketch of this vote follows).
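A tiny sketch of the weighted vote (the alphas and stump outputs are made-up numbers):

import numpy as np

alphas = np.array([0.7, 0.4, 0.3])            # illustrative classifier weights
stump_preds = np.array([1, -1, 1])            # each stump's output for one sample
print(np.sign(np.sum(alphas * stump_preds)))  # 0.7 - 0.4 + 0.3 = 0.6 -> 1.0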
Code
import math
import numpy as np


# Decision stump used as weak classifier in Adaboost
class DecisionStump():
    def __init__(self):
        # Determines whether the sample is labeled -1 or 1 given the threshold
        self.polarity = 1
        # The index of the feature used to make the classification
        self.feature_index = None
        # The threshold value that the feature is compared against
        self.threshold = None
        # The weight of this stump in the final ensemble vote
        self.alpha = None


class Adaboost():
    """Boosting method that uses a number of weak classifiers in
    ensemble to make a strong classifier. This implementation uses decision
    stumps, which is a one level Decision Tree.

    Parameters:
    -----------
    n_clf: int
        The number of weak classifiers that will be used.
    """
    def __init__(self, n_clf=5):
        self.n_clf = n_clf
        # List of weak classifiers
        self.clfs = []

    def fit(self, X, y):
        n_samples, n_features = np.shape(X)

        # Initialize weights to 1/N
        w = np.full(n_samples, (1 / n_samples))

        # Iterate through classifiers
        for _ in range(self.n_clf):
            clf = DecisionStump()
            # Minimum error given for using a certain feature value threshold
            # for predicting sample label
            min_error = 1
            # Iterate through every unique feature value and see what value
            # makes the best threshold for predicting y
            for feature_i in range(n_features):
                feature_values = np.expand_dims(X[:, feature_i], axis=1)
                unique_values = np.unique(feature_values)
                # Try every unique feature value as threshold
                for threshold in unique_values:
                    p = 1
                    # Set all predictions to '1' initially
                    prediction = np.ones(np.shape(y))
                    # Label the samples whose values are below threshold as '-1'
                    prediction[X[:, feature_i] < threshold] = -1
                    # Error = sum of weights of misclassified samples
                    error = sum(w[y != prediction])

                    if error > 0.5:
                        # E.g. error = 0.8 => (1 - error) = 0.2
                        # We flip the error and polarity
                        error = 1 - error
                        p = -1

                    # If this threshold resulted in the smallest error we save the
                    # configuration
                    if error < min_error:
                        clf.polarity = p
                        clf.threshold = threshold
                        clf.feature_index = feature_i
                        min_error = error

            # Calculate the alpha which is used to update the sample weights
            # and is an approximation of this classifier's proficiency
            clf.alpha = 0.5 * math.log((1.0 - min_error) / (min_error + 1e-10))
            # Set all predictions to '1' initially
            predictions = np.ones(np.shape(y))
            # The indices where the sample values are below threshold
            negative_idx = (clf.polarity * X[:, clf.feature_index] < clf.polarity * clf.threshold)
            # Label those as '-1'
            predictions[negative_idx] = -1
            # Calculate new weights:
            # misclassified samples get larger weights, correctly classified smaller
            w = np.multiply(w, np.exp(-clf.alpha * np.multiply(y, predictions)))
            # Normalize to one
            w /= np.sum(w)
            print("w value:", w)

            # Save classifier
            self.clfs.append(clf)

    def predict(self, X):
        n_samples = np.shape(X)[0]
        y_pred = np.zeros((n_samples, 1))
        # For each classifier => label the samples
        for clf in self.clfs:
            # Set all predictions to '1' initially
            predictions = np.ones((n_samples, 1))
            # The indices where the sample values are below threshold
            negative_idx = (clf.polarity * X[:, clf.feature_index] < clf.polarity * clf.threshold)
            # Label those as '-1'
            predictions[negative_idx] = -1
            # Add column of predictions weighted by the classifier's alpha
            # (alpha indicative of the classifier's proficiency)
            y_pred = np.concatenate((y_pred, clf.alpha * predictions), axis=1)
        # Sum weighted predictions and return sign of prediction sum
        y_pred = np.sign(np.sum(y_pred, axis=1))
        return y_pred
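A minimal usage sketch of the classes above on a hand-made, linearly separable toy set (the data is invented purely for illustration; labels are already encoded as -1/+1):

X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

model = Adaboost(n_clf=5)
model.fit(X, y)
print(model.predict(X))  # on this separable toy set the training labels are recovered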