Generative Adversarial Network (GAN)

A generative adversarial network (GAN) is a machine learning (ML) model in which two neural networks compete with each other to become more accurate in their predictions. GANs typically run unsupervised and use a competitive zero-sum game framework to learn.

The two neural networks that make up a GAN are referred to as the generator and the discriminator. For image data, the generator is typically a deconvolutional (transposed convolutional) neural network and the discriminator is a convolutional neural network. The goal of the generator is to artificially manufacture outputs that could easily be mistaken for real data. The goal of the discriminator is to identify which of the inputs it receives have been artificially created.

Essentially, GANs create their own training data. As the feedback loop between the adversarial networks continues, the generator will begin to produce higher-quality output and the discriminator will become better at flagging data that has been artificially created.
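The feedback loop can be illustrated with a deliberately tiny example. The sketch below is not taken from any of the cited sources: it assumes one-dimensional "real" data drawn from a Gaussian with mean 4, a linear generator G(z) = a * z + b, and a logistic discriminator D(x) = sigmoid(w * x + c), with gradients written out by hand. All names and hyperparameters are hypothetical, and there is no guarantee that such a toy converges.

```python
import math
import random


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    """Alternate one discriminator update and one generator update per step."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters: G(z) = a * z + b
    w, c = 0.1, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 1.0)  # assumed "real" data distribution
        z = rng.gauss(0.0, 1.0)       # latent input to the generator
        x_fake = a * z + b

        # Discriminator step: minimise -log D(x_real) - log(1 - D(x_fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        grad_w = (d_real - 1.0) * x_real + d_fake * x_fake
        grad_c = (d_real - 1.0) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: minimise -log D(G(z)) so fakes look more "real".
        d_fake = sigmoid(w * x_fake + c)
        grad_x = (d_fake - 1.0) * w  # derivative of -log D(x) w.r.t. x
        a -= lr * grad_x * z
        b -= lr * grad_x
    return a, b, w, c


if __name__ == "__main__":
    a, b, w, c = train_toy_gan()
    print("generator: a=%.3f b=%.3f" % (a, b))
    print("discriminator: w=%.3f c=%.3f" % (w, c))
```

Each pass updates the discriminator on one real and one generated sample, then updates the generator using the discriminator's verdict on that generated sample, which is the adversarial feedback loop in miniature.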

What Are Generative Adversarial Networks?

Generative Adversarial Networks, or GANs, are a deep-learning-based generative model.

More generally, GANs are a model architecture for training a generative model, and it is most common to use deep learning models in this architecture.

The GAN architecture was first described in the 2014 paper by Ian Goodfellow, et al. titled “Generative Adversarial Networks.”

A standardized approach called Deep Convolutional Generative Adversarial Networks, or DCGAN, that led to more stable models was later formalized by Alec Radford, et al. in the 2015 paper titled “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks.”

Most GANs today are at least loosely based on the DCGAN architecture …

NIPS 2016 Tutorial: Generative Adversarial Networks, 2016.

The GAN model architecture involves two sub-models: a generator model for generating new examples and a discriminator model for classifying whether generated examples are real, from the domain, or fake, generated by the generator model.

  • Generator. Model that is used to generate new plausible examples from the problem domain.
  • Discriminator. Model that is used to classify examples as real (from the domain) or fake (generated).

Generative adversarial networks are based on a game theoretic scenario in which the generator network must compete against an adversary. The generator network directly produces samples. Its adversary, the discriminator network, attempts to distinguish between samples drawn from the training data and samples drawn from the generator.

— Page 699, Deep Learning, 2016.
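The game-theoretic scenario in the quote above is commonly written as the minimax objective from the 2014 GAN paper. The notation is introduced here for illustration and does not appear elsewhere in this article: D(x) is the discriminator's estimated probability that x is real, G(z) is the generator's output for a latent vector z, p_data is the data distribution, and p_z is the distribution the latent vectors are drawn from.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to make V large by scoring real samples near 1 and generated samples near 0, while the generator tries to make V small by producing samples the discriminator scores near 1.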

The Generator Model

The generator model takes a fixed-length random vector as input and generates a sample in the domain.

The vector is drawn randomly from a Gaussian distribution and is used to seed the generative process. After training, points in this multidimensional vector space will correspond to points in the problem domain, forming a compressed representation of the data distribution.
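As a minimal sketch of this sampling step, the snippet below draws fixed-length vectors from a standard Gaussian using only the Python standard library. The function name sample_latent_points and the choice of 100 dimensions are assumptions made for illustration, not values given in the text.

```python
import random


def sample_latent_points(latent_dim, n_samples, seed=None):
    """Draw n_samples vectors of length latent_dim from a standard Gaussian."""
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        for _ in range(n_samples)
    ]


# Each inner list is one latent vector that could be fed to a generator.
batch = sample_latent_points(latent_dim=100, n_samples=4, seed=0)
print(len(batch), len(batch[0]))
```

Passing a seed makes the draw repeatable, which is useful when comparing generator outputs for the same latent points at different stages of training.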

This vector space is referred to as a latent space, or a vector space comprised of latent variables. Latent variables, or hidden variables, are those variables that are important for a domain but are not directly observable.

A latent variable is a random variable that we cannot observe directly.

— Page 67, Deep Learning, 2016.

We often refer to latent variables, or a latent space, as a projection or compression of a data distribution. That is, a latent space provides a compressed, high-level representation of the observed raw data, such as the input data distribution. In the case of GANs, the generator model assigns meaning to points in a chosen latent space, such that new points drawn from the latent space can be provided to the generator model as input and used to generate new and different output examples.

Machine-learning models can learn the statistical latent space of images, music, and stories, and they can then sample from this space, creating new artworks with characteristics similar to those the model has seen in its training data.

— Page 270, Deep Learning with Python, 2017.

After training, the generator model is kept and used to generate new samples.

The Discriminator Model

The discriminator model takes an example from the domain as input (real or generated) and predicts a binary class label of real or fake (generated).

The real examples come from the training dataset. The generated examples are output by the generator model.

The discriminator is a normal (and well understood) classification model.
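To make the "normal classification model" point concrete, here is a minimal sketch that treats the discriminator as a single logistic unit. A real discriminator would be a deep network; the function names, the linear form, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import math


def discriminator(example, weights, bias):
    """Return the predicted probability that `example` is real."""
    score = sum(w * x for w, x in zip(weights, example)) + bias
    # Numerically stable logistic function.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def predict_label(probability, threshold=0.5):
    """Turn a probability into the binary class label described above."""
    return "real" if probability >= threshold else "fake"


# Hypothetical two-feature example with hand-picked parameters.
p = discriminator([0.0, 0.0], weights=[1.0, 1.0], bias=0.0)
print(p, predict_label(p))
```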

After the training process, the discriminator model is discarded as we are interested in the generator.

Sometimes, the discriminator can be repurposed, as it has learned to effectively extract features from examples in the problem domain. Some or all of its feature extraction layers can be used in transfer learning applications using the same or similar input data.

We propose that one way to build good image representations is by training Generative Adversarial Networks (GANs), and later reusing parts of the generator and discriminator networks as feature extractors for supervised tasks.

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015.

How GANs work

The first step in establishing a GAN is to identify the desired end output and gather an initial training dataset that matches it. Random input is then fed into the generator, which is trained until it acquires basic accuracy in producing outputs.

After this, the generated images are fed into the discriminator along with actual data points from the original dataset. The discriminator evaluates each input and returns a probability between 0 and 1 to represent its authenticity (1 corresponds to real and 0 to fake). These values are then checked for success and fed back to both networks, and the process is repeated until the desired outcome is reached.
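One common way to turn the discriminator's probabilities into a training signal is binary cross-entropy. The sketch below assumes that choice, and it assumes the widely used "non-saturating" generator loss, in which the generator is rewarded when its fakes are scored as real. Neither is specified in the text above, and the function names are hypothetical.

```python
import math


def binary_cross_entropy(probabilities, target):
    """Mean binary cross-entropy of predictions against one target label."""
    eps = 1e-12  # keep log() away from 0
    total = 0.0
    for p in probabilities:
        p = min(max(p, eps), 1.0 - eps)
        total += -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    return total / len(probabilities)


def discriminator_loss(real_probs, fake_probs):
    """Low when real inputs score near 1 and generated inputs score near 0."""
    return (binary_cross_entropy(real_probs, 1.0)
            + binary_cross_entropy(fake_probs, 0.0))


def generator_loss(fake_probs):
    """Low when the discriminator scores generated inputs near 1."""
    return binary_cross_entropy(fake_probs, 1.0)


# Hypothetical discriminator outputs for two real and two generated images.
real_probs = [0.9, 0.8]
fake_probs = [0.1, 0.3]
print(discriminator_loss(real_probs, fake_probs), generator_loss(fake_probs))
```

With these example scores the discriminator is doing well, so its loss is small while the generator's loss is large; as the generator improves, the balance shifts the other way.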

Popular use cases for GANs

GANs are becoming a popular ML model for online retail sales because of their ability to understand and recreate visual content with increasing accuracy. Use cases include:

  • Filling in images from an outline.
  • Generating a realistic image from text.
  • Producing photorealistic depictions of product prototypes.
  • Converting black and white imagery into color.

In video production, GANs can be used to:

  • Model patterns of human behavior and movement within a frame.
  • Predict subsequent video frames.
  • Create deepfake videos.
