Understanding Autoencoders
From the paper "Unsupervised speech representation learning using WaveNet autoencoders"
"Autoencoders: networks which are tasked with reconstructing their inputs. Autoencoders use an encoding network to extract a latent representation, which is then passed through a decoding network to recover the original data.
Ideally, the latent representation preserves the salient features of the original data, while being easier to analyze and work with, e.g. by disentangling different factors of variation in the data, and discarding spurious patterns (noise). These desirable qualities are typically obtained through a judicious application of regularization techniques and constraints or bottlenecks (we use the two terms interchangeably). The representation learned by an autoencoder is thus subject to two competing forces. On the one hand, it should provide the decoder with information necessary for perfect reconstruction and thus capture in the latents as much of the input data characteristics as possible. On the other hand, the constraints force some information to be discarded, preventing the latent representation from being trivial to invert, e.g. by exactly passing through the input. Thus the bottleneck is necessary to force the network to learn a non-trivial data transformation."
An autoencoder is a network whose task is to reconstruct its input. It uses an encoding network to extract a latent representation, which is then passed to a decoding network to recover the original data.
Ideally, the latent representation preserves the salient features of the original data, and those preserved features are easier to analyze and work with, for example by disentangling the different factors of variation in the data and discarding spurious patterns (noise). These desirable qualities are typically obtained through judicious use of regularization techniques and constraints, or bottlenecks. The representation learned by an autoencoder is therefore shaped by two competing forces. On the one hand, it should give the decoder the information needed for perfect reconstruction, and so should capture as many characteristics of the input data as possible. On the other hand, the constraints force some information to be discarded, preventing the latent representation from being trivial to invert, for example by simply passing the input through unchanged. The constraint, or bottleneck, is therefore essential: it forces the network to learn a non-trivial transformation of the data.
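The encode-bottleneck-decode structure described above can be sketched in a few lines. The following is a minimal toy example in plain numpy (NOT the WaveNet autoencoder from the paper): a linear encoder projects 8-dimensional inputs down to a 2-dimensional bottleneck, and a linear decoder tries to reconstruct the input from that latent code. All sizes, the learning rate, and the function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

X = rng.normal(size=(16, 8))           # toy data: 16 samples, 8 dims

W_enc = rng.normal(size=(8, 2)) * 0.1  # encoder weights
W_dec = rng.normal(size=(2, 8)) * 0.1  # decoder weights

def encode(x):
    return x @ W_enc                   # latent z: the 2-dim bottleneck

def decode(z):
    return z @ W_dec                   # reconstruction x_hat

def reconstruction_loss():
    return float(np.mean((decode(encode(X)) - X) ** 2))

loss_before = reconstruction_loss()

# Plain gradient descent on the squared reconstruction error
# (constant factors are folded into the learning rate).
for _ in range(200):
    Z = encode(X)
    err = decode(Z) - X                        # (16, 8)
    grad_dec = Z.T @ err / len(X)              # gradient w.r.t. W_dec
    grad_enc = X.T @ (err @ W_dec.T) / len(X)  # gradient w.r.t. W_enc
    W_dec -= 0.1 * grad_dec
    W_enc -= 0.1 * grad_enc

loss_after = reconstruction_loss()
print(encode(X).shape)   # (16, 2): 8 input dims squeezed to 2
print(loss_before, loss_after)
```

The 2-dimensional latent is what makes the transformation non-trivial: the network cannot simply pass the 8-dimensional input through unchanged, so training must keep the directions of variation that matter most for reconstruction and discard the rest.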