<think>We are going to implement the `predict` function for a Linear Support Vector Machine (SVM) classifier.
The LinearSVM model has been trained and has a weight matrix `W` (of shape D x C, where D is the feature dimension and C is the number of classes).
Given a set of input examples (each as a vector of D features), we want to predict the class label for each example.
The prediction for a single example is done by computing the score for each class:
scores = x_i.dot(W) # a vector of C scores
Then, we predict the class with the highest score.
However, note that in the SVM formulation, we don't have a separate bias term if we have included the bias in the weight matrix by appending a constant feature to each input (like adding a 1 to the feature vector). If we did not, then we would have a separate bias vector. But in the provided context, we assume the weight matrix `W` includes the bias if one was used.
Steps for the predict function:
1. Compute the scores for all inputs and all classes: `scores = X.dot(W)`
- X: shape (N, D), W: shape (D, C) -> scores: shape (N, C)
2. For each input (each row in the scores matrix), find the index (class) with the maximum score.
Implementation:
Let's write the function accordingly.</think>

### Implementing LinearSVM.predict
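In a support vector machine (SVM), prediction comes down to computing each sample's per-class scores from the learned weight matrix and choosing the class with the highest score. A complete implementation with explanations follows: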
```python
import numpy as np


class LinearSVM:
    def __init__(self, W=None):
        """
        Initialize the linear SVM.

        :param W: weight matrix of shape (D, C), where
                  D = number of features + 1 (the extra row holds the bias)
                  C = number of classes
        """
        self.W = W

    def predict(self, X):
        """
        Predict class labels with the trained weight matrix.

        :param X: input data of shape (N, D).
                  Note: X must already include the bias column (last column all ones).
        :return: predicted labels of shape (N,)
        """
        if self.W is None:
            raise ValueError("Model not trained. Weights are None.")
        # Score matrix: X (N, D) times W (D, C) -> scores (N, C)
        scores = np.dot(X, self.W)
        # Index of the highest-scoring class for each sample
        y_pred = np.argmax(scores, axis=1)
        return y_pred

    def train(self, X_train, y_train, learning_rate=1e-3, reg=1e-4, num_iters=1000):
        """
        Training function (simplified).

        :param X_train: training data of shape (N, D)
        :param y_train: training labels of shape (N,)
        :param learning_rate: learning rate
        :param reg: regularization strength
        :param num_iters: number of iterations
        """
        num_train, dim = X_train.shape
        num_classes = np.max(y_train) + 1
        # Initialize the weight matrix (bias row included)
        self.W = 0.001 * np.random.randn(dim, num_classes)
        # Simplified gradient-descent loop
        for i in range(num_iters):
            # Forward pass: compute the scores
            scores = np.dot(X_train, self.W)
            # Loss and gradient (implementation omitted here, so W is never updated)
            # loss, dW = self.svm_loss(scores, y_train, reg)
            # Weight update
            # self.W -= learning_rate * dW
        return self


# Usage example
if __name__ == "__main__":
    # Build synthetic data
    np.random.seed(42)
    N, D, C = 100, 10, 3  # 100 samples, 10 columns (9 features + 1 bias), 3 classes

    # Generate the data, including the bias column
    X = np.random.randn(N, D - 1)          # features
    X = np.hstack((X, np.ones((N, 1))))    # append the all-ones bias column
    y = np.random.randint(0, C, N)         # random labels

    # Create and "train" the model (simplified)
    svm = LinearSVM()
    svm.train(X, y)

    # Predict
    predictions = svm.predict(X)

    # Compute the accuracy
    accuracy = np.mean(predictions == y)
    print(f"Prediction accuracy: {accuracy:.2%}")

    # Show a few predictions
    print("\nPredictions for the first 5 samples:")
    for i in range(5):
        print(f"Sample {i+1}: true label={y[i]}, predicted label={predictions[i]}")
```
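The training loop above leaves the loss and gradient step commented out. As a rough sketch of what could fill that gap, the block below implements a vectorized multiclass hinge loss with L2 regularization and a plain gradient-descent loop. The names `svm_loss` and `fit`, the margin `delta=1.0`, and the signature taking `W` and `X` (rather than precomputed scores, since the gradient needs `X`) are all assumptions made for illustration, not part of the original class.

```python
import numpy as np


def svm_loss(W, X, y, reg, delta=1.0):
    """Multiclass hinge loss and its gradient w.r.t. W (hypothetical helper).

    W: (D, C) weights, X: (N, D) inputs with bias column, y: (N,) integer labels.
    """
    num_train = X.shape[0]
    rows = np.arange(num_train)
    scores = X.dot(W)                                  # (N, C)
    correct = scores[rows, y][:, None]                 # (N, 1) score of the true class
    margins = np.maximum(0.0, scores - correct + delta)
    margins[rows, y] = 0.0                             # the true class adds no loss
    loss = margins.sum() / num_train + reg * np.sum(W * W)

    # Each violated margin adds x_i to column j and subtracts x_i from column y_i
    mask = (margins > 0).astype(float)
    mask[rows, y] = -mask.sum(axis=1)
    dW = X.T.dot(mask) / num_train + 2.0 * reg * W
    return loss, dW


def fit(X, y, num_classes, learning_rate=1e-1, reg=1e-4, num_iters=200, seed=0):
    """Full-batch gradient descent on svm_loss; returns weights and loss history."""
    rng = np.random.default_rng(seed)
    W = 0.001 * rng.standard_normal((X.shape[1], num_classes))
    losses = []
    for _ in range(num_iters):
        loss, dW = svm_loss(W, X, y, reg)
        losses.append(loss)
        W -= learning_rate * dW
    return W, losses
```

The hyperparameter defaults here are placeholders; in practice the learning rate and regularization strength would be tuned on a validation set.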
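#### Key code explained:
1. **Score computation**:
```python
scores = np.dot(X, self.W)
```
- Matrix multiplication yields every sample's score for every class
- Shapes: (N, D) × (D, C) → (N, C)
2. **Prediction decision**:
```python
y_pred = np.argmax(scores, axis=1)
```
- `axis=1` takes the index of the maximum along each row (one row per sample)
- Returns the predicted class index for each sample
3. **Bias handling**:
```python
X = np.hstack((X, np.ones((N, 1))))
```
- Appends an all-ones column to the end of the feature matrix as the bias term
- Lets the last row of the weight matrix act as the learned bias parameters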
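To make the bias handling concrete, here is a small check that appending a column of ones to `X` and stacking the bias vector as the last row of `W` gives the same scores as the explicit form `X W + b`. The sizes and the names `X_raw`, `W_feat`, `b`, `X_aug`, and `W_aug` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, F, C = 4, 3, 2                       # 4 samples, 3 raw features, 2 classes
X_raw = rng.standard_normal((N, F))
W_feat = rng.standard_normal((F, C))    # weights for the raw features
b = rng.standard_normal(C)              # separate bias vector

# Explicit-bias form: scores = X W + b
scores_explicit = X_raw @ W_feat + b

# Bias-trick form: append a ones column to X and stack b as the last row of W
X_aug = np.hstack([X_raw, np.ones((N, 1))])   # (N, F + 1)
W_aug = np.vstack([W_feat, b])                # (F + 1, C)
scores_trick = X_aug @ W_aug

print(np.allclose(scores_explicit, scores_trick))  # True
```

Because the two forms agree, `predict` only needs a single matrix product as long as callers remember to append the ones column.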
#### Prediction flow visualized:
```mermaid
graph LR
A[Input samples] --> B[Feature matrix X]
B --> C[Weight matrix W]
C --> D[Matrix product X·W]
D --> E[Score matrix]
E --> F[Argmax over each row]
F --> G[Predicted labels]
```
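The same flow can be traced by hand on a tiny example. The numbers below are made up for illustration: two samples with two features plus the bias column, and a weight matrix for two classes.

```python
import numpy as np

# Two samples with two features each, plus the trailing bias column of ones
X = np.array([[1.0,  2.0, 1.0],
              [3.0, -1.0, 1.0]])

# Illustrative weights for two classes; the last row plays the role of the bias
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.5]])

scores = X @ W                      # row 0: [1.0, 2.5], row 1: [3.0, -0.5]
y_pred = np.argmax(scores, axis=1)  # one label per row

print(y_pred)  # [1 0]
```

The first sample scores higher for class 1 (2.5 versus 1.0) and the second for class 0 (3.0 versus -0.5), which is exactly what the row-wise argmax reports.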
#### Advanced optimization tips:
1. **Batched prediction**:
```python
def predict_batch(self, X_batch, batch_size=32):
    """Predict in batches so a large dataset does not exhaust memory."""
    num_samples = X_batch.shape[0]
    predictions = np.empty(num_samples, dtype=int)
    for start in range(0, num_samples, batch_size):
        end = min(start + batch_size, num_samples)
        batch = X_batch[start:end]
        scores = np.dot(batch, self.W)
        predictions[start:end] = np.argmax(scores, axis=1)
    return predictions
```
2. **Confidence output**:
```python
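def predict_with_confidence(self, X):
    """Return predictions together with a heuristic confidence value."""
    scores = np.dot(X, self.W)
    y_pred = np.argmax(scores, axis=1)
    max_scores = np.max(scores, axis=1)
    min_scores = np.min(scores, axis=1)
    # Per-sample score spread, normalized by the spread over the whole batch
    score_range = np.max(scores) - np.min(scores)
    if score_range > 0:
        confidence = (max_scores - min_scores) / score_range
    else:
        # All scores are identical, so no class is preferred
        confidence = np.zeros_like(max_scores)
    return y_pred, confidence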
```
3. **Decision boundary visualization**:
```python
import matplotlib.pyplot as plt

def visualize_decision_boundary(self, X, y, feature_indices=(0, 1)):
    """Plot the decision regions over two features.

    Assumes the model was trained on exactly two features plus the bias
    column, i.e. self.W has shape (3, C); otherwise predict() will fail
    with a shape mismatch on the grid points.
    """
    f0, f1 = feature_indices
    plt.figure(figsize=(10, 8))
    # Build the grid of points
    x_min, x_max = X[:, f0].min() - 1, X[:, f0].max() + 1
    y_min, y_max = X[:, f1].min() - 1, X[:, f1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
                         np.arange(y_min, y_max, 0.02))
    # Assemble the grid data (with the bias column appended)
    grid_points = np.c_[xx.ravel(), yy.ravel(), np.ones(xx.ravel().shape[0])]
    # Predict and plot
    Z = self.predict(grid_points)
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.4)
    plt.scatter(X[:, f0], X[:, f1], c=y, s=20, edgecolor='k')
    plt.title('SVM Decision Boundary')
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.show()
```
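The confidence value in tip 2 depends on the score range of the whole batch, so the same sample can receive a different confidence in a different batch. One alternative heuristic, swapped in here purely as an illustration, is the gap between the highest and second-highest score of each sample. The function name `predict_with_margin` is hypothetical, and the sketch assumes at least two classes.

```python
import numpy as np


def predict_with_margin(W, X):
    """Return predicted labels and the top-two score gap per sample.

    W: (D, C) weights with C >= 2, X: (N, D) inputs including the bias column.
    """
    scores = X @ W
    y_pred = np.argmax(scores, axis=1)
    # After sorting each row ascending, the last two columns are
    # the second-best and the best score
    top_two = np.sort(scores, axis=1)[:, -2:]
    margin = top_two[:, 1] - top_two[:, 0]
    return y_pred, margin
```

A larger gap means the winning class is further ahead of its closest competitor. The value is a raw score difference, not a probability, so it is only comparable between samples scored by the same weight matrix.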