Deconstructing Denoising Diffusion Models for Self-Supervised Learning: A Detailed Walkthrough

Paper: Deconstructing Denoising Diffusion Models for Self-Supervised Learning

Link to the original paper: https://arxiv.org/html/2401.14404v1
This post is a detailed reading of Kaiming He's recent work, with some of my own thoughts woven in. It introduces the work in detail from the following four aspects:
(Figure: overview of the four aspects covered in this post)

A Recap of Earlier Work


AE

To introduce diffusion models clearly and completely, we should start from the autoencoder. An autoencoder (AE) consists of an encoder and a decoder: the encoder compresses an image into a code, and the decoder reconstructs the original image from that code. The training objective is simple: the reconstruction should match the original image as closely as possible, i.e., the mean squared error between the two should be as small as possible. No labels are needed, so the whole training process is self-supervised.
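
To make this concrete, here is a minimal autoencoder sketch in PyTorch; the flattened 784-dimensional input and the layer sizes are illustrative assumptions, not from the paper:

```python
import torch
import torch.nn as nn

# A minimal autoencoder: encode to a low-dimensional code, decode back.
encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))

x = torch.rand(16, 784)                  # a batch of flattened images
x_hat = decoder(encoder(x))              # encode, then decode
loss = nn.functional.mse_loss(x_hat, x)  # reconstruction error (MSE)
loss.backward()                          # no labels needed: self-supervised
```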

VAE

However, an AE is a poor generator and tends to overfit: its decoder only recognizes the compressed codes that the encoder produces from training images, not randomly sampled codes, so it cannot be used for image generation. The variational autoencoder (VAE) is a representative attempt to fix this. Its first change is to make the encoder output a distribution rather than a single deterministic code, so the VAE can no longer memorize the training set and must instead capture regularities in the data. Its second change is an extra training objective that pushes the encoder's output distribution toward the standard normal distribution, i.e., a constraint that keeps the learned distribution close to the standard normal. The VAE loss therefore has two parts: the reconstruction error between the original and reconstructed images, and the divergence between the encoder's output distribution and the standard normal.
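
A minimal PyTorch sketch of these two changes (dimensions as in the AE sketch above; the KL term is the standard closed form for Gaussians):

```python
import torch
import torch.nn as nn

enc = nn.Linear(784, 64)                               # encoder trunk (illustrative)
to_mu, to_logvar = nn.Linear(64, 32), nn.Linear(64, 32)
dec = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))

x = torch.rand(16, 784)
h = torch.relu(enc(x))
mu, logvar = to_mu(h), to_logvar(h)                    # output is now a distribution
z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterized sample
x_hat = dec(z)

recon = nn.functional.mse_loss(x_hat, x)
# Closed-form KL( N(mu, sigma^2) || N(0, I) ): the second loss term.
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon + kl
```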

VQVAE

But a new problem arises: the images a VAE generates are still rather blurry. Many models have tried to improve on the VAE, and one line of work is VQVAE. Because the distribution produced by the VAE's encoder is hard to learn, VQVAE replaces it with a codebook, which can be understood as a set of cluster centers (typically 8192 of them). Features are thus quantized to discrete codebook entries rather than being random samples from a distribution, which makes optimization considerably easier.
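
The quantization step itself is a nearest-neighbor lookup into the codebook. A minimal sketch (the 8192-entry codebook matches the size mentioned above; the straight-through gradient trick from the VQ-VAE paper is only noted in a comment):

```python
import torch

codebook = torch.randn(8192, 32)    # 8192 learnable "cluster centers"
z_e = torch.randn(16, 32)           # continuous encoder outputs

# Snap each vector to its nearest codebook entry (L2 distance).
dists = torch.cdist(z_e, codebook)  # (16, 8192) pairwise distances
indices = dists.argmin(dim=1)       # discrete code index for each vector
z_q = codebook[indices]             # quantized features fed to the decoder
# (Training uses a straight-through estimator so gradients flow past argmin.)
```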

VQGAN

VQGAN is an improved VQVAE. It introduces a perceptual loss and a GAN into the image-compression model, and replaces the model that generates the compressed codes with a more powerful Transformer. With these changes, VQGAN can generate high-quality, high-resolution images. Moreover, by feeding extra conditions (such as semantic segmentation maps or text) into the Transformer, VQGAN supports conditional image generation. (In hindsight, researchers had long been trading controllability against diversity in image generation, without finding a method that delivers both.)
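
Schematically, the generator's objective becomes a sum of reconstruction, perceptual, and adversarial terms. A toy sketch (the tiny linear networks and the 0.8 weight are illustrative stand-ins; a real VQGAN uses a convolutional PatchGAN discriminator and an LPIPS/VGG feature network):

```python
import torch
import torch.nn as nn

disc = nn.Sequential(nn.Linear(784, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))
feat = nn.Sequential(nn.Linear(784, 64), nn.LeakyReLU(0.2))  # stand-in feature net

x = torch.rand(16, 784)      # real images
x_hat = torch.rand(16, 784)  # reconstructions from the VQ autoencoder

recon = nn.functional.mse_loss(x_hat, x)                # pixel-space reconstruction
percep = nn.functional.mse_loss(feat(x_hat), feat(x))   # distance in feature space
g_adv = -disc(x_hat).mean()                             # fool the discriminator
loss_g = recon + percep + 0.8 * g_adv                   # weights are illustrative
```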

DDPM

The other line of work is today's protagonist, the diffusion model. The VAE's weakness most likely comes from having too few constraints: both its encoder and decoder are neural networks, and a neural network is a black box, so we cannot easily constrain its intermediate steps. We can only impose constraints on the encoder's output (some normal distribution) and the decoder's output (the reconstructed image).
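
The diffusion model, by contrast, replaces the learned encoder with a fixed noising schedule, so every intermediate state is constrained by a known closed form. A minimal sketch of that forward step (the linear beta schedule is an illustrative choice):

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)          # fixed noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal fraction

x0 = torch.rand(16, 784)                       # clean images
t = torch.randint(0, T, (16,))                 # a random timestep per sample
eps = torch.randn_like(x0)                     # Gaussian noise

# Forward process in closed form: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps
a = alpha_bar[t].unsqueeze(1)
x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps
# A network is then trained to predict eps from (x_t, t), so every step is constrained.
```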

### Self-Ensemble Concept

In machine learning, self-ensemble refers to a technique in which multiple models are created from variations or augmentations of the training data. Collectively, these models make predictions that can be more robust than those of any single model. The ensemble is typically built from different snapshots of the same neural network at various stages of its training.

The core idea is to exploit the diverse perspectives of these model instances, each trained on a slightly altered dataset produced by transformations such as noise injection or dropout applied to the original inputs. This diversity improves generalization and reduces the variance of the predictions.

#### Applications in Machine Learning

One prominent application is semi-supervised learning, where only a few labeled examples are available alongside abundant unlabeled ones. Consistency-regularization methods such as Mean Teacher (MT), Temporal Ensembling (TE), and Virtual Adversarial Training (VAT) keep performance stable even under such scarce supervision. Another use is unsupervised domain adaptation, which transfers knowledge from a source domain rich in annotated samples to a target domain that shares similar characteristics but lacks labels, despite the distributional shift between the two. Self-ensembles have also been employed to improve adversarial robustness against attacks crafted to exploit the vulnerabilities of deep networks.

A sketch of the idea (the `apply_transformation` helper and the scikit-learn-style `fit`/`predict` interface are illustrative assumptions):

```python
import numpy as np
from copy import deepcopy

def apply_transformation(X, noise_scale=0.1):
    # Perturb the inputs (here, additive Gaussian noise) so each
    # ensemble member sees a slightly different view of the data.
    return X + np.random.normal(scale=noise_scale, size=X.shape)

def create_self_ensemble(model, X_train, y_train=None, n_members=5):
    # Train several copies of `model`, each on a perturbed copy of the data.
    ensembles = []
    for _ in range(n_members):
        modified_X = apply_transformation(X_train.copy())
        member = deepcopy(model)  # fresh, untrained copy per member
        if y_train is not None:
            member.fit(modified_X, y_train)  # supervised case
        else:
            member.fit(modified_X)           # unsupervised case
        ensembles.append(member)
    return ensembles

def predict_with_self_ensemble(ensembles, X_test):
    # Average the members' predictions to reduce their variance.
    all_predictions = [member.predict(X_test) for member in ensembles]
    return np.mean(all_predictions, axis=0)
```
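
A hypothetical toy usage with a scikit-learn estimator, on random data:

```python
from sklearn.linear_model import Ridge
import numpy as np

X = np.random.randn(100, 8)
y = X @ np.random.randn(8) + 0.1 * np.random.randn(100)

members = create_self_ensemble(Ridge(), X, y)
preds = predict_with_self_ensemble(members, X[:5])
print(preds.shape)  # (5,)
```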