PatchShuffle Regularization

Contributions:

It proposes a new regularization method that reduces overfitting while making neural networks more robust.

The method performs local permutations inside images and feature maps. This leaves the overall feature information intact while injecting new variance into training, which improves the model's robustness and helps prevent overfitting.

Overfitting

Definition: an overfitted model has adapted to the noise in the training data instead of capturing the underlying variables hidden in it.

Causes of overfitting:

1. Too many parameters relative to the information the training data can provide, so the model is free to latch onto irrelevant local details.

2. Too little training data, which leaves the trained model poorly robust.

Intuition: if parts of the input image change locally but the overall structure of the image is preserved, the model should still work. Think of a mosaicked (pixelated) photo: despite the blur, people can still guess the correct class of the image.

Advantages of PatchShuffle:

  1. It costs almost no extra memory or time, and it can be plugged into all kinds of neural networks without changing the learning strategy.
  2. It complements existing regularization methods. On four representative classification datasets, combining PatchShuffle with other regularizers further improves classification accuracy.
  3. It improves the robustness of CNNs to noise.

Note: salt-and-pepper noise covers two kinds of noise: salt noise (salt = white, gray level 255) and pepper noise (pepper = black, gray level 0). The former is high-gray-level noise, the latter low-gray-level noise. The two usually appear together, showing up in the image as scattered black and white specks.
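For reference, this kind of noise is easy to generate when testing robustness. A minimal NumPy sketch (the function name and the noise fraction `amount` are my own illustrative choices, not values from the paper):

```python
import numpy as np

def add_salt_and_pepper(img, amount=0.05, rng=None):
    """Flip a random fraction `amount` of pixels in a uint8 grayscale
    image to pepper (0) or salt (255), half each on average."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = img.copy()
    mask = rng.random(img.shape)
    noisy[mask < amount / 2] = 0          # pepper = black (low gray level)
    noisy[mask > 1 - amount / 2] = 255    # salt = white (high gray level)
    return noisy
```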

PatchShuffle Regularization

Let $X$ denote the original image and $T(\cdot)$ the PatchShuffle transformation.

Let $r$ be a Bernoulli random variable: with probability $p$, $r = 1$; with probability $1 - p$, $r = 0$.
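The gating formula itself appeared as an image in the original post and is missing here; a plausible reconstruction from the definitions above, writing $\hat{X}$ for the input actually fed to the network, is

$$\hat{X} = (1 - r)\,X + r\,T(X), \qquad r \sim \mathrm{Bernoulli}(p).$$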

The image is divided into non-overlapping patches of size $n \times n$.

Equation (3) spells out the permutation itself, which can be reconstructed as $\tilde{x}_{ij} = p_{ij}\, x_{ij}\, p_{ij}'$: left-multiplying by $p_{ij}$ permutes the rows of the patch $x_{ij}$, and likewise right-multiplying by $p_{ij}'$ at the end permutes its columns.

Note: all $n^2$ pixels inside a patch take part in the permutation. Picture $n^2$ apples being placed into $n^2$ slots and it becomes clear. A runnable sketch follows.
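A minimal NumPy sketch of the per-patch row/column shuffle just described (the function name `patch_shuffle` and its interface are mine, and the image dimensions are assumed divisible by $n$):

```python
import numpy as np

def patch_shuffle(x, n, rng=None):
    """Independently permute the rows and columns of every non-overlapping
    n x n patch of a 2-D array x (height and width divisible by n)."""
    rng = np.random.default_rng() if rng is None else rng
    out = x.copy()
    h, w = x.shape
    for i in range(0, h, n):
        for j in range(0, w, n):
            patch = out[i:i + n, j:j + n]
            # patch[rows] plays the role of left-multiplying by p_ij
            # (row permutation); [:, cols] that of right-multiplying
            # by p_ij' (column permutation).
            rows, cols = rng.permutation(n), rng.permutation(n)
            out[i:i + n, j:j + n] = patch[rows][:, cols]
    return out
```

Note that the row/column scheme realizes $(n!)^2$ arrangements per patch, a structured subset of all $(n^2)!$ possible pixel orderings.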

Besides applying PatchShuffle to images, we can also apply it to feature maps.

From the full set of feature maps, one feature map is chosen at random for PS (PatchShuffle) processing. For low- and mid-level features, most of the spatial structure is preserved, so PS is used on those layers to regularize training; on high-level convolutional layers, PS encourages neighboring pixels to share weights (weight sharing).
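A sketch of this feature-map variant (the (C, H, W) tensor layout and the helper name are my own assumptions):

```python
import numpy as np

def shuffle_random_feature_map(fmaps, n, rng=None):
    """Pick one channel of a (C, H, W) activation tensor at random and
    patch-shuffle it in place, leaving the other channels untouched."""
    rng = np.random.default_rng() if rng is None else rng
    fm = fmaps[rng.integers(fmaps.shape[0])]  # view of the chosen channel
    h, w = fm.shape
    for i in range(0, h, n):
        for j in range(0, w, n):
            rows, cols = rng.permutation(n), rng.permutation(n)
            fm[i:i + n, j:j + n] = fm[i:i + n, j:j + n][rows][:, cols]
    return fmaps
```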

Training and Inference

The value of $\gamma$ controls what share of the training inputs PatchShuffle is applied to, as sketched below.
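In code, this gating could look as follows (a sketch; `shuffle_fn` would be, e.g., the `patch_shuffle` helper above, and the per-image Bernoulli draw realizes the $\gamma$ proportion):

```python
import numpy as np

def maybe_shuffle_batch(batch, n, gamma, shuffle_fn, rng=None):
    """Apply shuffle_fn to each image of a (B, H, W) batch
    independently with probability gamma."""
    rng = np.random.default_rng() if rng is None else rng
    out = batch.copy()
    for k in range(batch.shape[0]):
        if rng.random() < gamma:  # r ~ Bernoulli(gamma)
            out[k] = shuffle_fn(out[k], n, rng)
    return out
```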

Expectation of the loss function:
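The expected-loss equation is likewise missing from the text. Assuming $\gamma$ here is the Bernoulli parameter of $r$ (the role $p$ played above), a plausible form, with $\ell$ the per-sample loss and $W$ the network weights, is

$$\mathbb{E}_{r}\big[\ell(W; \hat{X}, y)\big] = (1 - \gamma)\,\ell(W; X, y) + \gamma\,\mathbb{E}_{T}\big[\ell(W; T(X), y)\big].$$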

No matter how a feature map is transformed during training, everything ultimately has to come back to the original feature map, and this consistency always holds.

A concrete example
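The original post illustrates this with figures that are not reproduced in the text; the following tiny worked example (my own) serves the same purpose: a 4×4 "image" split into four 2×2 patches, each shuffled independently.

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.arange(16).reshape(4, 4)   # 4x4 "image" with distinct pixel values
out = x.copy()
n = 2                             # 2x2 non-overlapping patches
for i in range(0, 4, n):
    for j in range(0, 4, n):
        patch = out[i:i + n, j:j + n]
        rows, cols = rng.permutation(n), rng.permutation(n)
        out[i:i + n, j:j + n] = patch[rows][:, cols]
print(x)
print(out)  # every pixel stays inside its own 2x2 patch
```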

