Introduction
Imagine a world where fashion designers never run out of new ideas and every outfit we wear is a work of art. Sounds interesting, right? We can move closer to this with the help of Generative Adversarial Networks (GANs). GANs have blurred the line between reality and imagination. They are like a genie in a bottle that grants our creative wishes. We can even picture a sun on the Earth with the help of GANs, which is not possible in real life. Back in 2014, Ian Goodfellow and his colleagues introduced this framework. They aimed to address a challenge of unsupervised learning, where the model learns from unlabelled data and generates new samples. GANs have changed a number of industries with their capacity to produce fascinating and lifelike content, and the fashion industry is among those embracing this potential. Now we will explore the potential of GANs and understand how they work.
Learning Objectives
In this article, you will learn:
- What Generative Adversarial Networks (GANs) are and how they work
- The role of GANs in the fields of ML and AI
- Some challenges of using GANs and their future potential
- The power and potential of GANs in the fashion industry
- How to implement a GAN on the Fashion MNIST dataset
- Finally, the implementation of GANs on the MNIST fashion dataset
This article was published as a part of the Data Science Blogathon.
Table of contents
- Introduction
- Generative Adversarial Networks(GANs)
- Role of GANs in Machine Learning and Artificial Intelligence
- Challenges and Limitations
- Future Potential
- Fashion MNIST Dataset
- Applications of GANs in the Fashion Industry
- Implementation of the Fashion MNIST dataset
- Frequently Asked Questions
- Conclusion
- Frequently Asked Questions
Generative Adversarial Networks(GANs)
Generative Adversarial Networks are a class of machine learning models used for generating new, realistic data. They can produce highly realistic images, videos, and more. A GAN consists of two neural networks: a Generator and a Discriminator.
Generator
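The generator is a neural network that learns to create data from random noise. In image GANs it is often built from transposed-convolution (sometimes called deconvolutional) layers, although the implementation later in this article uses only fully connected layers. Its goal is to produce samples that the discriminator cannot tell apart from real data, so it always tries to fool the discriminator.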
Discriminator
The discriminator is a neural network, often a convolutional one in image GANs, that tries to correctly classify samples as real or fake. It takes both real data and fake data produced by the generator and learns to distinguish between them. For each input image, the discriminator outputs a score between 0 and 1. A score close to 0 indicates the image is judged fake, and a score close to 1 indicates it is judged real.
Adversarial Training
In adversarial training, the generator produces fake data and the discriminator tries to identify it correctly. Each training step has two stages: discriminator training and generator training. The goal of the generator is to produce data that is indistinguishable from real data, and the goal of the discriminator is to tell real and fake data apart. Ideally, training settles at a point where the discriminator can no longer separate the two and its scores hover around 0.5. Both networks are trained using backpropagation: the error from each loss is propagated back through the network, and the weights are updated accordingly.
Training of GAN typically has the following steps:
- Define the problem statement
- Choose the architecture
- Train Discriminator on real data
- Generate fake samples with the Generator
- Train the Discriminator on fake data
- Train Generator with the output of the Discriminator
- Iterate and refine
Loss Function
The loss function used in GANs has two components because the architecture has two networks. The generator’s loss is based on how well it can generate realistic data that the discriminator cannot distinguish from real data. In other words, the generator tries to make the discriminator misclassify its samples as real. The discriminator’s loss, on the other hand, is based on how well it can classify real and fake samples. It tries to minimize misclassification.
During training, the generator and discriminator are updated alternately, and each tries to minimize its own loss. The generator reduces its loss by producing samples the discriminator scores as real, and the discriminator reduces its loss by classifying fake and real samples accurately. This process continues until the GAN reaches the desired level of convergence.
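To make the two losses concrete, here is a minimal NumPy sketch, assuming binary cross-entropy (the loss used in the implementation later in this article). The helper name `binary_cross_entropy` and the example scores are made up for illustration and are not taken from any library.

```python
import numpy as np

def binary_cross_entropy(targets, scores, eps=1e-7):
    """Mean binary cross-entropy between 0/1 targets and scores in (0, 1)."""
    scores = np.clip(scores, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(targets * np.log(scores)
                          + (1.0 - targets) * np.log(1.0 - scores)))

# Hypothetical discriminator scores for a batch of 4 real and 4 fake images.
scores_real = np.array([0.9, 0.8, 0.7, 0.95])
scores_fake = np.array([0.1, 0.3, 0.2, 0.4])

# Discriminator: real images should score 1, fake images should score 0.
d_loss = 0.5 * (binary_cross_entropy(np.ones(4), scores_real)
                + binary_cross_entropy(np.zeros(4), scores_fake))

# Generator: it wants its fake images to be scored as real (target 1).
g_loss = binary_cross_entropy(np.ones(4), scores_fake)

print(f"discriminator loss: {d_loss:.3f}, generator loss: {g_loss:.3f}")
```

In this made-up batch the discriminator is doing well, so its loss is small while the generator’s loss is large. As the generator improves, the fake scores move toward 1 and the balance shifts.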
Role of GANs in Machine Learning and Artificial Intelligence
Due to their ability to generate new, realistic data, GANs have become increasingly important in machine learning and artificial intelligence. They have a wide variety of applications, such as video generation, image generation, and text-to-image synthesis, and these are changing many industries. Let’s see some reasons why GANs are important in this field.
- Data Generation: Data is the most important ingredient for building models, and we need large amounts of it to train better ones. Sometimes data is scarce or expensive to collect. In such cases, GANs can be used to generate additional data from the existing data.
- Data Privacy: Sometimes we need data for training models, but using it may affect the privacy of individuals. In such cases, we can use GANs to create synthetic data that resembles the original and train the models on it instead.
- Realistic Simulations: GANs enable the creation of realistic simulations of real-world situations, which can be used to build and test machine learning models. For instance, since testing robots in the real world can be risky or expensive, simulated environments can be used to test them first.
- Adversarial Attacks: GANs can be used to create adversarial examples that test the robustness of machine learning models. This helps identify vulnerabilities, develop better models, and improve security.
- Creative Applications: GANs can be used in creative applications of AI. They can help create games, music, artwork, films, animations, photographs, and much more. They can also produce original writing, such as stories and poems.
As research on GANs continues, we can expect many more advances from this technology in the future.
Challenges and Limitations
Even though GANs have shown their ability to generate realistic and diverse data, they still have some challenges and limitations that need to be considered. Let’s see some of them.
- GANs depend heavily on their training data. They generate data similar to what they were trained on, so if the training data is limited in diversity, the generated data will also be limited in diversity and quality.
- GANs are difficult to train because they are highly sensitive to the network architecture and the choice of hyperparameters. Training is prone to instability, as the generator and the discriminator can get stuck in a cycle of mutual deception. This leads to poor convergence and poor-quality samples.
- If the generator finds a few kinds of samples that reliably fool the discriminator, it may keep producing only those. This problem, known as mode collapse, yields samples that are highly similar to each other, and the generator fails to cover the full range of possibilities in the dataset.
- Training GANs can be computationally expensive, especially when working with large datasets and complex architectures.
- One of the most concerning challenges of GANs is the impact that realistic fake data can have on society. It may lead to privacy concerns, bias, or misuse. For example, GANs can generate fake images or videos, leading to misinformation and fraud.
Future Potential
Despite these challenges and limitations, GANs have a potentially bright future. Numerous industries, including healthcare, finance, and entertainment, are expected to change as a result of GANs.
- One potential development is generative medicine. GANs could generate personalized medical images and support treatment planning, which could help doctors develop more effective treatments for their patients.
- GANs could be used to create highly realistic virtual reality environments, which have many applications, such as entertainment.
- Using GANs, we can create more realistic simulated environments for testing autonomous vehicles, helping us develop safer and more effective self-driving cars.
- GANs are not limited to image-related tasks. They can also be used in Natural Language Processing (NLP) tasks, such as text generation and translation. They could generate contextually relevant text, which is important for building virtual assistants and chatbots.
- GANs could be very helpful for architects. They could generate new designs for buildings or other structures, helping architects and designers create more innovative designs.
- GANs could also be used for scientific research, as they can generate data that mimics real-world phenomena. They can create synthetic data for testing and validation in scientific investigations, help with drug development and molecular design, and simulate complex physical processes.
- GANs could also be used in crime investigation. For example, images of suspects could be generated from the information known about them, which may lead to faster and more successful investigations.
Fashion MNIST Dataset
Fashion MNIST is a popular dataset used in machine learning for various purposes. It is a drop-in replacement for the original MNIST dataset, which contains handwritten digits from 0 to 9. Instead of digits, Fashion MNIST contains images of various fashion items. The dataset has 70,000 images, of which 60,000 are training images and 10,000 are testing images. Each image is greyscale and 28 x 28 pixels. The Fashion MNIST dataset has 10 classes of fashion items. They are:
- T-shirt/top
- Dress
- Coat
- Pullover
- Shirt
- Trouser
- Bag
- Sandal
- Sneaker
- Ankle Boot
This dataset was originally created for developing machine learning models for classification, and it is widely used as a benchmark for evaluating machine learning algorithms. It is easy to access and can be downloaded from various sources, including the TensorFlow and PyTorch libraries. Compared to the original MNIST digits dataset, it is more challenging, because models must distinguish between fashion products that may have similar shapes or patterns. This makes it suitable for testing the robustness of various algorithms.
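The labels in Fashion MNIST are stored as integers from 0 to 9 rather than as names. The short sketch below maps each integer to its class name in the dataset’s label order; the names `CLASS_NAMES` and `label_to_name` are chosen here for illustration and are not part of any library.

```python
# Label indices 0-9 in the order used by the Fashion MNIST dataset.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_to_name(label):
    """Return the class name for an integer label in the range 0-9."""
    if not 0 <= label < len(CLASS_NAMES):
        raise ValueError(f"label must be in 0-9, got {label}")
    return CLASS_NAMES[label]

print(label_to_name(9))  # Ankle boot
```

The GAN built in this article ignores the labels, but a lookup like this is handy when inspecting real training images next to generated ones.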
Applications of GANs in the Fashion Industry
The fashion industry has undergone a significant transition because of GANs, which have enabled new forms of creativity. GANs are changing the way we design, produce, and experience fashion. Let’s see some real-world applications of Generative Adversarial Networks (GANs) in the fashion industry.
- Fashion Design and Generation: GANs are capable of generating new designs and new fashion concepts. This helps designers create innovative and attractive styles. A wide range of combinations, patterns, and colors can be explored by using GANs. For instance, H&M, a clothing retailer, has reportedly used GANs to develop fresh outfits for their products.
- Virtual Try-on: Virtual try-on is a virtual trial room. Here, GANs can generate realistic images of customers wearing selected garments, so customers can see how they would look without physically wearing them.
- Fashion Forecasting: GANs are also used for forecasting. They can generate candidate future fashion trends, which helps fashion brands design new styles and keep up with trends.
- Fabric and Texture Synthesis: GANs help designers generate high-resolution fabric textures by experimenting with various materials and patterns virtually rather than physically. This saves time and resources and supports innovative design processes.
Implementation of the Fashion MNIST dataset
We will now use a Generative Adversarial Network (GAN) to generate fashion samples using the Fashion MNIST dataset. Start by importing all the necessary libraries.
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras.layers import (
    BatchNormalization,
    Dense,
    Flatten,
    Input,
    LeakyReLU,
    Reshape,
)
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import Adam
Next, we load the dataset. Fashion MNIST is built into TensorFlow, so we can load it directly using tf.keras. The dataset is mostly used for classification tasks and, as discussed earlier, contains 28 x 28 greyscale images. load_data() returns separate training and testing splits, each with images and labels. A GAN only needs images, so we keep the training images and discard the labels and the test split.
The loaded pixel values range from 0 to 255, and we normalize them to the range -1 to 1. Normalization usually improves the stability and convergence of deep learning models during training, and this range also matches the tanh output of the generator defined below. Finally, we add an extra dimension to the data array so that each image has the shape (28, 28, 1) expected by the discriminator. The full array is then a 4D tensor whose axes represent the batch size, height, width, and number of channels.
# Load fashion dataset
(X_train, _), (_, _) = tf.keras.datasets.fashion_mnist.load_data()
X_train = X_train / 127.5 - 1.
X_train = np.expand_dims(X_train, axis=3)
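As a quick sanity check of the two preprocessing steps, the sketch below applies them to a small stand-in array instead of the real dataset. The array `dummy_batch` is invented for illustration; only the scaling and `np.expand_dims` calls mirror the code above.

```python
import numpy as np

# Stand-in for a batch of two 28 x 28 greyscale images with pixel values 0-255.
dummy_batch = np.zeros((2, 28, 28), dtype=np.uint8)
dummy_batch[0, :, :] = 255    # an all-white image
dummy_batch[1, :14, :] = 128  # top half grey, bottom half black

scaled = dummy_batch / 127.5 - 1.0       # map [0, 255] to [-1, 1]
scaled = np.expand_dims(scaled, axis=3)  # add the channel dimension

print(scaled.shape)                # (2, 28, 28, 1)
print(scaled.min(), scaled.max())  # -1.0 1.0
```

A pixel value of 0 maps to -1 and 255 maps to 1, so the real images end up in the same range as the generator’s tanh output.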
Next, set the dimensions for the generator and discriminator. gen_input_dim is the size of the noise vector fed to the generator, and img_shape is the shape of the images the generator produces. Here it is 28 x 28 with a single channel, since the images are greyscale.
gen_input_dim = 100
img_shape = (28, 28, 1)
Define Generator Model
Now we will define the generator model. The function takes a single argument, the input dimension, and uses the Keras Sequential API to build the model. It has three fully connected hidden layers, each followed by a LeakyReLU activation and batch normalization. The output layer is a fully connected layer with a tanh activation, which produces one value between -1 and 1 per pixel, and its output is reshaped to img_shape. Finally, the function returns a Keras Model object that takes a noise vector as input and gives a generated image as output.
def build_generator(input_dim):
    model = Sequential()
    # Three hidden blocks: Dense -> LeakyReLU -> BatchNormalization
    model.add(Dense(256, input_dim=input_dim))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Dense(512))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Dense(1024))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(momentum=0.8))
    # Output layer: one tanh value per pixel, reshaped to an image
    model.add(Dense(np.prod(img_shape), activation='tanh'))
    model.add(Reshape(img_shape))
    # Wrap the stack in a model: noise vector in, image out
    noise = Input(shape=(input_dim,))
    img = model(noise)
    return Model(noise, img)
Define Discriminator Model
The next step is to build the discriminator. It is simpler than the generator: it flattens the input image, passes it through two fully connected hidden layers with LeakyReLU activations, and ends with a single unit with a sigmoid activation. The function returns a Keras Model object that takes an image as input and outputs the probability that the image is real.
def build_discriminator(img_shape):
    model = Sequential()
    # Flatten the 28 x 28 x 1 image into a vector of 784 values
    model.add(Flatten(input_shape=img_shape))
    model.add(Dense(512))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dense(256))
    model.add(LeakyReLU(alpha=0.2))
    # Single sigmoid unit: probability that the image is real
    model.add(Dense(1, activation='sigmoid'))
    img = Input(shape=img_shape)
    validity = model(img)
    return Model(img, validity)
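To get a feel for the size of the two networks, the following sketch counts their parameters by hand in plain Python. The helper names `dense_params` and `batchnorm_params` are invented for this illustration. The totals should agree with what `generator.summary()` and `discriminator.summary()` report, assuming the layer sizes shown above.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def batchnorm_params(n_features):
    """Gamma, beta, moving mean and moving variance per feature."""
    return 4 * n_features

latent_dim = 100        # gen_input_dim in the article
n_pixels = 28 * 28 * 1  # np.prod(img_shape)

generator_params = (
    dense_params(latent_dim, 256) + batchnorm_params(256)
    + dense_params(256, 512) + batchnorm_params(512)
    + dense_params(512, 1024) + batchnorm_params(1024)
    + dense_params(1024, n_pixels)
)

discriminator_params = (
    dense_params(n_pixels, 512)
    + dense_params(512, 256)
    + dense_params(256, 1)
)

print(generator_params)      # 1493520
print(discriminator_params)  # 533505
```

The generator is almost three times larger than the discriminator, mostly because of its final layer, which maps 1024 features to 784 pixels.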
Compile Models
Now we have to compile the models. We use binary cross-entropy loss and the Adam optimizer, with the learning rate set to 0.0002 and Adam’s beta_1 parameter (the decay rate of the first-moment estimate) set to 0.5. The discriminator is built and compiled with the binary cross-entropy loss function, which is commonly used for binary classification tasks. Accuracy is also tracked as a metric to evaluate the discriminator.
Similarly, the generator is built with build_generator. We do not compile the generator on its own as we do for the discriminator, because it is trained in an adversarial manner through a combined model. z is an input layer representing the random noise for the generator. The generator takes z as input and produces img as output. The discriminator’s weights are frozen within the combined model, so only the generator is updated when the combined model is trained. The generator’s output is fed to the discriminator, which produces validity, the discriminator’s estimate of how likely the generated image is to be real. The combined model is then created with z as input and validity as output, and it is used to train the generator.
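# Adam with learning rate 0.0002 and beta_1 = 0.5. Each model gets its own
# optimizer instance, since newer Keras versions may not allow one optimizer
# to be shared between models with different weights.
discriminator = build_discriminator(img_shape)
discriminator.compile(loss='binary_crossentropy',
                      optimizer=Adam(0.0002, 0.5),
                      metrics=['accuracy'])

generator = build_generator(gen_input_dim)

# Combined model: noise -> generator -> frozen discriminator -> validity
z = Input(shape=(gen_input_dim,))
img = generator(z)
discriminator.trainable = False
validity = discriminator(img)
combined = Model(z, validity)
combined.compile(loss='binary_crossentropy',
                 optimizer=Adam(0.0002, 0.5))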
Training
It’s time to train our GAN. The loop runs for the number of iterations given by epochs. Note that each "epoch" here is a single batch update, not a full pass over the dataset. In each iteration, a random batch of real images is drawn from the training set, and a batch of fake images is produced by passing noise through the generator.
The discriminator is trained on the real images (with target 1) and on the fake images (with target 0), and the two losses are averaged. The generator is then trained through the combined model on fresh noise with target 1, since it wants its images to be classified as real. We have set sample_interval to 1000, so the losses are printed every 1000 iterations.
# Train GAN
epochs = 5000
batch_size = 32
sample_interval = 1000
d_losses = []
g_losses = []
for epoch in range(epochs):
    # Pick a random batch of real images
    idx = np.random.randint(0, X_train.shape[0], batch_size)
    real_images = X_train[idx]
    # Train discriminator
    noise = np.random.normal(0, 1, (batch_size, gen_input_dim))
    fake_images = generator.predict(noise)
    d_loss_real = discriminator.train_on_batch(real_images, np.ones((batch_size, 1)))
    d_loss_fake = discriminator.train_on_batch(fake_images, np.zeros((batch_size, 1)))
    # Average [loss, accuracy] over the real and fake batches
    d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
    d_losses.append(d_loss[0])
    # Train generator
    noise = np.random.normal(0, 1, (batch_size, gen_input_dim))
    g_loss = combined.train_on_batch(noise, np.ones((batch_size, 1)))
    g_losses.append(g_loss)
    # Print progress
    if epoch % sample_interval == 0:
        print(f"Epoch {epoch}, Discriminator loss: {d_loss[0]}, Generator loss: {g_loss}")
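The training loop stores every loss in d_losses and g_losses, but per-batch GAN losses tend to be noisy. One option, sketched below, is to smooth them with a moving average before plotting. The helper `moving_average` and the synthetic `noisy_losses` array are assumptions made for this illustration; in practice you would pass in the lists collected during training.

```python
import numpy as np

def moving_average(values, window=100):
    """Smooth a 1-D sequence with a simple sliding-window mean."""
    values = np.asarray(values, dtype=float)
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

# Stand-in for the d_losses / g_losses lists filled by the training loop.
rng = np.random.default_rng(0)
noisy_losses = 0.7 + 0.1 * rng.standard_normal(5000)

smoothed = moving_average(noisy_losses, window=100)
print(len(smoothed))  # 4901
```

The smoothed curves can then be drawn with `plt.plot(moving_average(d_losses))` and `plt.plot(moving_average(g_losses))` to see whether either network is overpowering the other.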
Generate Sample Images
Now let’s look at some generated samples. We draw 50 noise vectors, pass them through the generator, and plot the results with matplotlib in a grid of 5 rows and 10 columns. With enough training, the generated samples start to resemble the items in the training dataset. Training for more iterations generally gives better-quality samples.
# Generate sample images
r, c = 5, 10
noise = np.random.normal(0, 1, (r * c, gen_input_dim))
gen_imgs = generator.predict(noise)
# Rescale images from [-1, 1] to [0, 1]
gen_imgs = 0.5 * gen_imgs + 0.5
# Plot images
fig, axs = plt.subplots(r, c)
cnt = 0
for i in range(r):
    for j in range(c):
        axs[i, j].imshow(gen_imgs[cnt, :, :, 0], cmap='gray')
        axs[i, j].axis('off')
        cnt += 1
plt.show()
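If you would rather keep the samples as a single image than as 50 separate subplots, the grid can also be assembled directly in NumPy. This is a small sketch under the assumption that the images have the shape (N, 28, 28, 1) used in this article. The function `tile_images` and the random `dummy_imgs` array are invented for illustration.

```python
import numpy as np

def tile_images(images, rows, cols):
    """Arrange rows * cols images of shape (H, W, 1) into one (rows*H, cols*W) array."""
    n, height, width, channels = images.shape
    if n != rows * cols or channels != 1:
        raise ValueError("expected rows * cols single-channel images")
    grid = images[..., 0].reshape(rows, cols, height, width)
    grid = grid.transpose(0, 2, 1, 3)  # (rows, H, cols, W)
    return grid.reshape(rows * height, cols * width)

# Stand-in for generator output: 50 random images with values in [-1, 1].
rng = np.random.default_rng(0)
dummy_imgs = rng.uniform(-1.0, 1.0, size=(50, 28, 28, 1))
dummy_imgs = 0.5 * dummy_imgs + 0.5  # rescale to [0, 1], as in the article

sheet = tile_images(dummy_imgs, rows=5, cols=10)
print(sheet.shape)  # (140, 280)
```

The resulting array can be shown with `plt.imshow(sheet, cmap='gray')` or written to disk with `plt.imsave`, which makes it easy to compare samples from different stages of training.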
Frequently Asked Questions
A. GANs (Generative Adversarial Networks) are poised to revolutionize various aspects of our lives in the future. They have the potential to enhance creative fields, such as art and design, by generating realistic and novel content. Additionally, GANs can improve industries like healthcare and manufacturing through realistic simulations and synthetic data generation. They may also raise ethical concerns regarding the authenticity of content and data privacy.
A. GANs (Generative Adversarial Networks) are a powerful tool in AI. They are used for generating realistic and high-quality synthetic data, which can be used for training machine learning models. GANs have applications in computer vision, natural language processing, and data synthesis. They enable the creation of realistic images, videos, and audio, as well as the generation of text, making them valuable for various AI tasks, research, and creative endeavors.
Conclusion
Generative Adversarial Networks (GANs) are a popular choice for many applications because of their unique architecture, their training process, and their ability to generate data. As with any technology, GANs have some challenges and limitations, and researchers are working to reduce them and build better GANs. Overall, we have learned about the power and potential of GANs and how they work. We have also built a GAN to generate fashion samples using the Fashion MNIST dataset.
- GANs are powerful tools for generating new data samples for a variety of applications. As discussed in this article, they can change many industries, and fashion is one of them.
- There are different types of GANs, which differ in the kind of data they generate and in their features. For example, DCGANs are used for generating images, conditional GANs are used for tasks such as image-to-image translation, and StyleGAN is used for style-controlled image generation.
- One notable advantage of GANs is that synthetic data can help ease data scarcity when training and building machine learning models.
- GANs open up a great deal of creative potential for the future of artificial intelligence and machine learning. Let’s see what they will make possible.
Hope you found this article useful. Connect with me on LinkedIn.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
Frequently Asked Questions
A. GANs, or Generative Adversarial Networks, generate synthetic data that closely resembles real data. They have applications in various fields, including image generation, video synthesis, text generation, and data augmentation.
A. There are several types of GANs, including Conditional GANs (cGANs) that generate outputs based on specific conditions, CycleGANs that learn mappings between two domains, and Progressive GANs that generate images of increasing quality.
A. GANs have two main components: generator and discriminator networks. The generator generates synthetic data, while the discriminator distinguishes between real and fake data. Both networks are trained simultaneously in a competitive fashion.
A. The advantage of GANs is their ability to generate realistic and diverse synthetic data. They can capture complex patterns and generate new samples that exhibit similar characteristics to the training data. GANs have broad applications in various creative and data-driven domains.
A. GANs can be challenging to train and stabilize. They are sensitive to hyperparameters and may suffer from mode collapse, where the generator fails to explore the entire data distribution. Evaluating GAN performance objectively is also a challenge, making it difficult to assess the quality of generated samples.