
Efficient, Simplistic Training Pipelines for GANs in the Cloud with Paperspace

Generative adversarial networks — GANs for short — are making waves in the world of machine learning. Yann LeCun, a legend in the deep learning community, said in a Quora post “[GANs are] the most interesting idea in the last 10 years in [machine learning].”

[Related Article: GANs explained. Generative Adversarial Networks applied to Generating Images]

GANs (and, more generally, neural networks) can be confusing at first. But developers have created lots of great frameworks for training pre-configured models efficiently. We’ll examine a package built by Hyeonwoo Kang at Catholic University of Korea that wraps PyTorch implementations for ten different types of GANs in an easy-to-use interface. We’ll also look at how to extend the package in order to run it in the cloud, particularly using Paperspace. 

Dawn of the Kernel

The most popular APIs for neural networks enable developers to write implementations in very few lines of code. Still, the development process can be painstaking — especially for novices. Even for professionals, rewriting the same neural network in separate scripts is a waste of time and should be avoided.
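
For a sense of scale, here's a bare-bones generator network in PyTorch. This is an illustrative sketch, not the architecture Kang's package actually uses:

import torch.nn as nn

# Toy MLP generator: maps a 62-dimensional noise vector to a 28x28 image
generator = nn.Sequential(
    nn.Linear(62, 128),
    nn.ReLU(),
    nn.Linear(128, 28 * 28),
    nn.Tanh(),  # outputs in [-1, 1] to match normalized pixel values
)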

For this reason, distributing pre-configured models is helpful to the community, and it upholds reproducibility and open-source values. Much of the machine learning ecosystem is ahead of the curve here, but it can still be surprisingly hard to find well-packaged, pre-configured neural networks.

Fortunately, Kang’s package provides a simple interface for training pre-built GANs. Developers can train any of the models on a wide variety of inputs using the following command format:

$ python main.py --dataset <collection> --gan_type <TYPE> --epoch <int> --batch_size <int>

Not too bad, especially since the datasets are downloaded on-the-fly and the models adapt to the input dimensions automatically.
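
Under the hood, that on-the-fly download is the standard torchvision pattern. Here's a minimal sketch using the common torchvision API, not the package's exact loader code:

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# download=True fetches Fashion-MNIST on the first run and caches it locally
dataset = datasets.FashionMNIST(
    root="data/fashion-mnist",
    train=True,
    download=True,
    transform=transforms.ToTensor(),
)
loader = DataLoader(dataset, batch_size=64, shuffle=True)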

For our example, to run BEGAN against the Fashion-MNIST dataset:

$ python main.py --gan_type BEGAN --dataset fashion-mnist --epoch 50 --batch_size 64 --result_dir /storage/<storage location>/results/gan/

In addition to outputting the model’s performance metrics and results, the script saves the model to disk for later use.
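
The exact save logic lives inside the package, but it follows the standard PyTorch checkpointing pattern, which looks roughly like this (the file path is illustrative):

import torch

# Persist the trained generator's weights for later sampling,
# assuming `generator` is an nn.Module like the sketch earlier
torch.save(generator.state_dict(), "models/gan/BEGAN_G.pkl")

# Later: rebuild the same architecture, then load the weights back in
generator.load_state_dict(torch.load("models/gan/BEGAN_G.pkl"))
generator.eval()  # switch to inference mode before generating samples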

That’s all well and good, but training a GAN on a local machine can be prohibitively expensive: the process racks up hours of compute time, on top of the up-front cost of a capable video card. Fortunately, cloud solutions provide an alternative.

Training in the Cloud

Cloud infrastructure for machine learning has proliferated in the past few years. The expansion has made heavy lifting accessible to people who can’t afford a GPU-equipped desktop of their own.

One of my favorite cloud platforms is Paperspace, which offers a product called Gradient designed specifically for machine learning tasks. Paperspace is far from the only option, but it’s easy to start with. We’ll use it for our example.

Paperspace Gradient works by ingesting a Git repository and running a predefined hook command inside a Docker container based on a user-specified image. The job runs on bare-metal hardware, which the user can choose based on resource requirements and monetary cost. For this example, we extend Kang’s repository slightly by adding a run.sh file that sets up our environment and calls our Python code. The one I created looks like this:

#!/bin/bash

# install the Python dependencies the training script needs
pip install imageio scipy numpy matplotlib

# create persistent output directories under Paperspace's /storage mount
mkdir -p /storage/${MY_DRIVE}/models/gan/
mkdir -p /storage/${MY_DRIVE}/datasets/fashion-mnist/
mkdir -p /storage/${MY_DRIVE}/results/gan/

# train
python main.py --gan_type BEGAN --dataset fashion-mnist --epoch 50 --batch_size 64 --result_dir /storage/${MY_DRIVE}/results/gan/

${MY_DRIVE} is an environment variable naming your directory under /storage, the persistent storage Paperspace reserves for you. It can be accessed from any container and any virtual machine you create on the platform.
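
The variable needs to be set before the script uses it; one simple option is to export it near the top of run.sh itself:

export MY_DRIVE=gan-experiments  # hypothetical folder name; substitute your own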

To run this, we simply pass in the location of our Git repository, the name of our Docker image (in this case, pytorch/pytorch), and the command we want to run: bash run.sh.
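
Submitting the job from the paperspace CLI looks roughly like this. It's a sketch based on the jobs interface available at the time, so check paperspace jobs create --help for the exact flags:

$ paperspace jobs create \
    --container pytorch/pytorch \
    --machineType P4000 \
    --command "bash run.sh" \
    --workspace https://github.com/<your-fork>/pytorch-generative-model-collections

The --workspace URL is a placeholder; point it at your own fork containing run.sh.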

Once launched, the job takes about 18 minutes to run on Paperspace’s P4000 platform, one of their bottom-tier systems. At $0.51 an hour, those 18 minutes come to roughly $0.15, so creating our GAN is dirt cheap.

The Results

The final results of the GAN look pretty good, too!

[Image: sample images generated by the trained GAN]

[Related Article: 6 Unique GANs Use Cases]

By the end of training, the log reports the following loss and balance values:

Epoch: [50] [ 900/ 937] D_loss: 0.01793195, G_loss: 0.26628155, M: 0.26628155, k: 0.92347056
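
Here D_loss and G_loss are the discriminator and generator losses, while M and k are BEGAN-specific quantities: M is the paper's convergence measure and k is the learned term balancing real and generated samples. A sketch of the updates from the BEGAN paper (Berthelot et al., 2017), with illustrative variable names and default values:

def began_updates(k, L_real, L_fake, gamma=0.75, lambda_k=0.001):
    # L_real: autoencoder loss on real images; L_fake: loss on generated images
    # k_{t+1} = k_t + lambda_k * (gamma * L_real - L_fake), clipped to [0, 1]
    k = min(max(k + lambda_k * (gamma * L_real - L_fake), 0.0), 1.0)
    # convergence measure reported as M in the training log
    M = L_real + abs(gamma * L_real - L_fake)
    return k, M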

This is the direction deep learning is increasingly heading as it becomes more accessible to non-experts.


This blog only scratches the surface. At ODSC West 2018, Seth Weidman will give a talk on the latest and greatest developments in GAN research. Weidman is a senior data scientist at Metis, a company specializing in accelerated programs for professionals looking to build their data science chops. In the past year, he has dedicated his time to understanding neural networks and establishing himself as an expert in the field.
