This specification relates to processing image data through the layers of neural networks to generate outputs.
Neural networks are machine learning models that employ one or more layers of nonlinear units to predict an output for a received input. Some neural networks include one or more hidden layers in addition to an output layer. The output of each hidden layer is used as input to the next layer in the network, i.e., the next hidden layer or the output layer. Each layer of the network generates an output from a received input in accordance with current values of a respective set of parameters.
Variational auto encoders can auto encode input images, i.e., generate output images that are reconstructions of input images provided to the auto encoder. Variational auto encoders typically include an encoder neural network and a generative neural network. Generally, the encoder neural network and the generative neural network in a given variational auto encoder are trained jointly to generate high quality reconstructions of input images.
Once trained, the encoder neural network in the variational auto encoder is configured to receive an input image and to process the input image to generate outputs that define values of latent variables that represent features of the input image. For example, for a given latent variable, a corresponding output of the encoder neural network can parameterize a distribution from which the latent variable is sampled or can be used to generate the parameters of the distribution, e.g., by applying a linear transformation to the output. Some encoder neural networks are recurrent neural networks that generate outputs that define values of latent variables at each of multiple time steps, some are deep neural networks that generate outputs that define values of latent variables at each of multiple layers, and some are deep recurrent neural networks that are both recurrent and deep.
Once trained, the generative neural network is configured to generate a reconstruction of the input image conditioned on the latent variables defined by the outputs of the encoder neural network. That is, the generative neural network receives the values of the latent variables as input and generates a reconstruction of the input image using the received values. As with the encoder neural network, some generative neural networks are recurrent, some generative neural networks are deep, and some generative neural networks are deep recurrent neural networks.