
Amplifying Deep Learning: A Dive into Data Augmentation Strategies

Introduction

In deep learning, where training data is often scarce, data augmentation has become very important. Simple methods such as rotating or flipping images help a model learn better, but as datasets grow more complicated, these static transformations start to show their limits. That’s where dynamic data augmentation steps in: it offers new and effective methods that help the model cope with complex datasets.

Newer methods like Cutmix, Mixup and Cutout create augmented samples dynamically during training, which makes complicated datasets easier to handle and addresses the limitations of static data augmentation. This blog looks into the details of data augmentation in deep learning: its importance, its techniques and its practical implications.


Learning Objectives

  • Understanding the Basics: Learn the fundamental concept of dynamic data augmentation in the context of deep learning.
  • Explore Adaptive Techniques: Learn different dynamic data augmentation techniques and how they improve model training.
  • Implementation in Deep Learning: Follow practical steps to implement data augmentation in a deep learning framework, with step-by-step code snippets.
  • Applications in Deep Learning: See how data augmentation is applied in deep learning.

This article was published as a part of the Data Science Blogathon.

Understanding The Basics

In the area of deep learning, where computers learn to do smart things, there’s a challenge we face – sometimes, we don’t have many examples for them to learn from. That’s where the idea of data augmentation comes in. Think of it like this: if we want a computer to identify cats, we can show it lots of pictures of cats. But what if we don’t have a mountain of cat photos? Here’s where data augmentation steps up. We take the pictures we have and give them a little twist – maybe flip them, rotate them, or zoom in a bit. It’s like creating new learning moments for the model, helping it become better at noticing things, even with not-so-many examples.

Now, let’s talk about something even better called dynamic data augmentation. It’s like an upgrade for our deep learning models. Instead of applying the same fixed tricks all the time, dynamic data augmentation creates new variations while the machine is learning. So, in simple words, it’s a smarter way to teach our machines, making them better learners.

Augmentation techniques

Audio Data Augmentation

1.  Noise injection: Add Gaussian or other random noise to the audio data to improve model performance.

2.  Shifting: Shift the audio left (fast forward) or right by a random number of seconds.

3.  Changing the speed: Stretch the time series by a fixed rate.

4.  Changing the pitch: Randomly change the pitch of the audio.
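The first three audio techniques can be sketched with plain NumPy on a raw waveform. This is only an illustration under stated assumptions: the function names, the noise level, the shift range and the speed rate are made up for this example, and the test signal is a synthetic sine wave rather than real audio.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(signal, noise_level=0.005):
    """Noise injection: add zero-mean Gaussian noise scaled by noise_level."""
    return signal + noise_level * rng.standard_normal(signal.shape)

def time_shift(signal, sample_rate, max_shift_seconds=0.5):
    """Shifting: move the waveform by a random offset and pad with silence."""
    max_shift = int(sample_rate * max_shift_seconds)
    shift = int(rng.integers(-max_shift, max_shift + 1))
    shifted = np.roll(signal, shift)
    if shift > 0:
        shifted[:shift] = 0.0   # shifted right: silence at the start
    elif shift < 0:
        shifted[shift:] = 0.0   # shifted left: silence at the end
    return shifted

def change_speed(signal, rate=1.25):
    """Changing the speed: resample by linear interpolation (rate > 1 is faster).
    This naive resampling also changes the pitch."""
    new_length = int(round(len(signal) / rate))
    old_positions = np.arange(len(signal))
    new_positions = np.linspace(0, len(signal) - 1, new_length)
    return np.interp(new_positions, old_positions, signal)

sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)   # one second of a 440 Hz sine wave

noisy = add_noise(tone)
shifted = time_shift(tone, sample_rate)
faster = change_speed(tone, rate=1.25)
```

Changing the pitch without changing the duration needs more signal processing than fits here; in practice an audio library is usually used for that step.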

Text Data Augmentation

1.  Word or sentence shuffling: Randomly change the position of a word or sentence.

2.  Word replacement: Replace words with synonyms.

3.  Syntax-tree manipulation: Paraphrase the sentence using the same words.

4.  Random word insertion: Inserts words at random.

5.  Random word deletion: Deletes words at random.
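Several of the text techniques above can be sketched with the Python standard library alone. The synonym table, the filler vocabulary and the function names below are hypothetical and exist only for illustration; a real pipeline would typically use a proper thesaurus or a language model for replacements.

```python
import random

# Hypothetical toy synonym table, for illustration only.
SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "cheerful"]}

def replace_synonyms(words, rng):
    """Word replacement: swap every word that has an entry in SYNONYMS."""
    return [rng.choice(SYNONYMS[w]) if w in SYNONYMS else w for w in words]

def random_swap(words, rng):
    """Word shuffling: exchange the positions of two randomly chosen words."""
    words = list(words)
    if len(words) >= 2:
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_insertion(words, vocabulary, rng):
    """Random word insertion: insert one vocabulary word at a random position."""
    words = list(words)
    words.insert(rng.randrange(len(words) + 1), rng.choice(vocabulary))
    return words

def random_deletion(words, p, rng):
    """Random word deletion: drop each word with probability p, keep at least one."""
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]

rng = random.Random(42)
sentence = "the quick dog is happy today".split()

replaced = replace_synonyms(sentence, rng)
swapped = random_swap(sentence, rng)
inserted = random_insertion(sentence, ["really", "very"], rng)
deleted = random_deletion(sentence, 0.2, rng)
```

Note that shuffling, insertion and deletion can change the meaning of a sentence, so the probabilities are usually kept small.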

Image Augmentation

1.  Geometric transformations: Randomly flip, crop, rotate, stretch, and zoom images. You need to be careful about applying multiple transformations on the same images, as this can reduce model performance.

2.  Color space transformations: Randomly change RGB color channels, contrast, and brightness.

3.  Kernel filters: Randomly change the sharpness or blurring of the image.

4.  Random erasing: Delete some part of the initial image.

5.  Mixing images: Blending and mixing multiple images.
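As a rough illustration of the geometric and color space transformations in this list, here is a minimal NumPy sketch. It assumes images are stored as height x width x channels arrays scaled to [0, 1]; the function names and parameter values are made up for this example and are not taken from any library.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_horizontal_flip(image, p=0.5):
    """Geometric: mirror the image left to right with probability p."""
    return image[:, ::-1, :] if rng.random() < p else image

def random_crop(image, crop_height, crop_width):
    """Geometric: cut out a randomly placed window of the given size."""
    height, width = image.shape[:2]
    top = int(rng.integers(0, height - crop_height + 1))
    left = int(rng.integers(0, width - crop_width + 1))
    return image[top:top + crop_height, left:left + crop_width, :]

def random_brightness(image, max_delta=0.2):
    """Color space: add a random offset to all pixels, then clip to [0, 1]."""
    delta = rng.uniform(-max_delta, max_delta)
    return np.clip(image + delta, 0.0, 1.0)

image = rng.random((64, 64, 3))   # stand-in for an RGB image scaled to [0, 1]
augmented = random_brightness(random_crop(random_horizontal_flip(image), 56, 56))
```

Because every call draws fresh random numbers, the same source image yields a different augmented sample each time it is seen during training.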

Image Augmentation

Now let’s see Image Augmentation Methods in more detail.

Classical image augmentation techniques for convolutional neural networks in computer vision are scaling, cropping, flipping, and rotating an image.

Effective image augmentation techniques beyond the classical ones include:

1.  Cutout

2.  Mixup

3.  Cutmix


Cutout

Cutout was introduced in a paper called “Improved regularization of convolutional neural networks with cutout” by DeVries & Taylor in 2017. The main idea behind Cutout image augmentation is to randomly remove a square region of pixels in an input image during training. 

This technique prevents the model from depending too heavily on specific features, forcing it to take the entire input into account. It acts as a regularization method: it introduces noise and makes the model more robust to unimportant patterns. Cutout is simple yet useful, especially when the dataset is prone to overfitting.
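To make the idea concrete, here is a minimal from-scratch sketch of Cutout in NumPy. It is an illustration only, not the code from the paper: the function name, the `hole_size` parameter and the choice to clip the square at the image border are assumptions made for this example.

```python
import numpy as np

def cutout(image, hole_size, rng):
    """Zero out one square region centred on a randomly chosen pixel.

    The square is clipped at the image border, so the masked area can be
    smaller than hole_size x hole_size when the centre lies near an edge.
    """
    height, width = image.shape[:2]
    cy = int(rng.integers(0, height))
    cx = int(rng.integers(0, width))
    top = max(cy - hole_size // 2, 0)
    bottom = min(cy + hole_size // 2, height)
    left = max(cx - hole_size // 2, 0)
    right = min(cx + hole_size // 2, width)

    result = image.copy()                      # leave the input untouched
    result[top:bottom, left:right, ...] = 0.0
    return result

rng = np.random.default_rng(0)
image = np.ones((32, 32, 3))                   # stand-in for a real image
masked = cutout(image, hole_size=16, rng=rng)
```

The label of the image is left unchanged, which is the main difference from Mixup and Cutmix.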

Implementation in Python with PyTorch

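import albumentations as A
from albumentations.pytorch import ToTensorV2

# Cutout via the Albumentations CoarseDropout transform.
# The argument names below follow older Albumentations releases; newer
# releases rename several of them, so check the installed version.
transforms_cutout = A.Compose([
    A.Resize(256, 256), 
    A.CoarseDropout(max_holes = 1, # Maximum number of regions to zero out. (default: 8)
                    max_height = 128, # Maximum height of the hole. (default: 8) 
                    max_width = 128, # Maximum width of the hole. (default: 8) 
                    min_holes=None, 
                    min_height=None, 
                    min_width=None, 
                    fill_value=0, # value for dropped pixels.
                    mask_fill_value=None, # fill value for dropped pixels in mask. 
                    always_apply=False, 
                    p=0.5 # probability of applying the transform to an image
                   ),
    ToTensorV2(),
])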

[Figure: sample batch after Cutout augmentation]

Mixup

Mixup was introduced in a paper called “mixup: Beyond empirical risk minimization” by Zhang, Cisse, Dauphin, & Lopez-Paz also in 2017.

Mixup deals with overfitting in a different way. It linearly interpolates between pairs of training samples, both in the input features and in the corresponding labels. This smooth interpolation creates new samples and reduces the risk of the model memorizing specific examples. Mixup is particularly useful when a dataset lacks diversity, helping the model generalize better to unseen data.
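Written out, the interpolation described above takes two samples $(x_i, y_i)$ and $(x_j, y_j)$ with one-hot labels and a mixing weight $\lambda$ drawn from a Beta distribution with parameter $\alpha$:

```latex
\begin{aligned}
\lambda   &\sim \mathrm{Beta}(\alpha, \alpha), \qquad \lambda \in [0, 1] \\
\tilde{x} &= \lambda \, x_i + (1 - \lambda) \, x_j \\
\tilde{y} &= \lambda \, y_i + (1 - \lambda) \, y_j
\end{aligned}
```

For example, with $\lambda = 0.7$, a cat image labelled $y_i = (1, 0)$ and a dog image labelled $y_j = (0, 1)$, the mixed image is 70% cat and 30% dog pixel by pixel, and its label becomes $\tilde{y} = (0.7, 0.3)$.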

Implementation in Python with PyTorch

The mixup() function applies Mixup to a full batch. The pairs are generated by shuffling the batch and selecting one image from the original batch and one from the shuffled batch.

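import numpy as np
import torch

def mixup(data, targets, alpha):
    # Pair each sample with a randomly chosen partner from the same batch
    indices = torch.randperm(data.size(0))
    shuffled_data = data[indices]
    shuffled_targets = targets[indices]

    # Mixing weight drawn from Beta(alpha, alpha)
    lam = np.random.beta(alpha, alpha)
    new_data = data * lam + shuffled_data * (1 - lam)
    # Keep both label sets and lam so the loss can be weighted accordingly
    new_targets = [targets, shuffled_targets, lam]
    return new_data, new_targets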

In addition to the function that augments the images and labels, we must modify the loss function with a custom mixup_criterion() function. This function computes the loss for each of the two sets of labels and combines them, weighted by lam.

import torch.nn as nn

def mixup_criterion(preds, targets):
    # targets is the [targets, shuffled_targets, lam] list returned by mixup()
    targets1, targets2, lam = targets[0], targets[1], targets[2]
    criterion = nn.CrossEntropyLoss()
    return lam * criterion(preds, targets1) + (1 - lam) * criterion(preds, targets2)
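One way to see why a weighted sum of two losses is appropriate: cross-entropy is linear in the target distribution, so weighting the two hard-label losses by lam gives the same value as a single cross-entropy against the interpolated soft label. The NumPy check below illustrates this on random numbers; the helper functions are written for this example and assume mean reduction over the batch.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the class dimension."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def cross_entropy(logits, soft_targets):
    """Mean cross-entropy between predictions and (possibly soft) targets."""
    return float(-(soft_targets * log_softmax(logits)).sum(axis=1).mean())

rng = np.random.default_rng(0)
num_classes = 4
logits = rng.standard_normal((8, num_classes))
labels_a = rng.integers(0, num_classes, size=8)   # original labels
labels_b = rng.integers(0, num_classes, size=8)   # labels of the shuffled batch
lam = 0.3

one_hot = np.eye(num_classes)
mixed_targets = lam * one_hot[labels_a] + (1 - lam) * one_hot[labels_b]

# Loss against the interpolated label ...
loss_soft = cross_entropy(logits, mixed_targets)
# ... equals the lam-weighted sum of the two hard-label losses.
loss_split = (lam * cross_entropy(logits, one_hot[labels_a])
              + (1 - lam) * cross_entropy(logits, one_hot[labels_b]))
```

The two values agree up to floating-point rounding, which is why the labels never need to be converted to one-hot vectors in the training code.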

The mixup() and mixup_criterion() functions are not applied in the PyTorch Dataset but in the training code, as shown below.

Since the augmentation is applied to the full batch, we will also add a variable p_mixup that controls the portion of batches that will be augmented. For example, p_mixup = 0.5 would apply Mixup augmentation to about 50% of the batches in an epoch.

for epoch in range(NUM_EPOCHS):
    # Train
    model.train()

    # Define any variables for metrics

    for samples, labels in train_dataloader:
        samples, labels = samples.to(device), labels.to(device)

        # Normalize
        samples = samples / 255

        # Apply Mixup augmentation
        p = np.random.rand()
        if p < p_mixup:
            samples, labels = mixup(samples, labels, 0.8)

        # Zero the parameter gradients
        ...

        with torch.set_grad_enabled(True):
            # Forward: Get model outputs and calculate loss
            output = model(samples)

            # Apply Mixup criterion
            if p < p_mixup:
                loss = mixup_criterion(output, labels)
            else:
                loss = criterion(output, labels)

[Figure: sample batch after Mixup augmentation]

Cutmix

Cutmix was introduced in a paper called “Cutmix: Regularization strategy to train strong classifiers with localizable features” by Yun, Han, Oh, Chun, Choe & Yoo in 2019.

Cutmix is an augmentation technique that cuts patches from one image and pastes them onto another to create a new training sample, with the labels mixed in proportion to the patch area. This not only introduces variety but also makes the model learn from regions of multiple images at the same time. By mixing different contexts, Cutmix provides a more challenging training environment and improves the model’s robustness to the variation found in real-world data.

Implementation in Python with PyTorch

The implementation for Cutmix is similar to the implementation of Mixup. First, you will also need a custom function cutmix() that applies the image augmentation.

import numpy as np
import torch

def cutmix(data, targets, alpha):
    # Pair each sample with a randomly chosen partner from the same batch
    indices = torch.randperm(data.size(0))
    shuffled_data = data[indices]
    shuffled_targets = targets[indices]

    lam = np.random.beta(alpha, alpha)
    bbx1, bby1, bbx2, bby2 = rand_bbox(data.size(), lam)
    # Paste the patch from the partner image (note: this modifies data in place)
    data[:, :, bbx1:bbx2, bby1:bby2] = shuffled_data[:, :, bbx1:bbx2, bby1:bby2]
    # adjust lambda to exactly match pixel ratio
    lam = 1 - ((bbx2 - bbx1) * (bby2 - bby1) / (data.size()[-1] * data.size()[-2]))

    new_targets = [targets, shuffled_targets, lam]
    return data, new_targets


def rand_bbox(size, lam):
    # size is (batch, channels, dim 2, dim 3); W and H name the two spatial
    # dimensions in the same order in which cutmix() indexes them
    W = size[2]
    H = size[3]
    # The patch covers a fraction (1 - lam) of the image area
    cut_rat = np.sqrt(1. - lam)
    cut_w = int(W * cut_rat)
    cut_h = int(H * cut_rat)

    # uniform
    cx = np.random.randint(W)
    cy = np.random.randint(H)

    # Clip the box so that it stays inside the image
    bbx1 = np.clip(cx - cut_w // 2, 0, W)
    bby1 = np.clip(cy - cut_h // 2, 0, H)
    bbx2 = np.clip(cx + cut_w // 2, 0, W)
    bby2 = np.clip(cy + cut_h // 2, 0, H)

    return bbx1, bby1, bbx2, bby2
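The reason lam is recomputed after the box is drawn becomes clearer with concrete numbers. The sketch below repeats the box arithmetic of rand_bbox() with a fixed centre instead of a random one; the helper names box_for_centre() and adjusted_lam() are made up for this example.

```python
import math

def box_for_centre(width, height, lam, cx, cy):
    """Same box arithmetic as rand_bbox(), but with a given centre (cx, cy)."""
    cut_ratio = math.sqrt(1.0 - lam)
    cut_w = int(width * cut_ratio)
    cut_h = int(height * cut_ratio)
    x1 = max(cx - cut_w // 2, 0)
    y1 = max(cy - cut_h // 2, 0)
    x2 = min(cx + cut_w // 2, width)
    y2 = min(cy + cut_h // 2, height)
    return x1, y1, x2, y2

def adjusted_lam(box, width, height):
    """Share of the image that still belongs to the original sample."""
    x1, y1, x2, y2 = box
    return 1 - (x2 - x1) * (y2 - y1) / (width * height)

# A 32 x 32 image with lam = 0.75: the patch should cover 25% of the area.
centre_box = box_for_centre(32, 32, 0.75, cx=16, cy=16)
corner_box = box_for_centre(32, 32, 0.75, cx=2, cy=30)

print(centre_box, adjusted_lam(centre_box, 32, 32))   # (8, 8, 24, 24) 0.75
print(corner_box, adjusted_lam(corner_box, 32, 32))   # (0, 22, 10, 32) 0.90234375
```

When the box is centred, the pasted area matches the sampled lam exactly. Near a corner the box is clipped, the pasted patch is smaller, and the original image keeps about 90% of its pixels, so the label weight has to follow.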

The rest is the same as for Mixup:

1.  Define a cutmix_criterion() function to handle the custom loss (see the implementation of mixup_criterion())

2.  Define a variable p_cutmix to control the portion of batches that will be augmented (see p_mixup)

3.  Apply cutmix() and cutmix_criterion() according to p_cutmix in the training code.
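The three steps could look roughly like the sketch below. It is an assumption-laden illustration rather than the article's training code: compute_loss(), its augment_fn argument (which stands in for the cutmix() function) and the default values alpha=1.0 and p_cutmix=0.5 are all hypothetical names and settings chosen for this example.

```python
import numpy as np
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Step 1: custom loss, mirroring mixup_criterion()
def cutmix_criterion(preds, targets):
    targets1, targets2, lam = targets[0], targets[1], targets[2]
    return lam * criterion(preds, targets1) + (1 - lam) * criterion(preds, targets2)

# Steps 2 and 3: augment a fraction p_cutmix of the batches and pick the
# matching loss. augment_fn is expected to behave like cutmix(): it takes
# (data, targets, alpha) and returns (data, [targets, shuffled_targets, lam]).
def compute_loss(model, samples, labels, augment_fn, alpha=1.0, p_cutmix=0.5):
    if np.random.rand() < p_cutmix:
        samples, mixed_targets = augment_fn(samples, labels, alpha)
        return cutmix_criterion(model(samples), mixed_targets)
    return criterion(model(samples), labels)
```

Inside the batch loop, a call such as loss = compute_loss(model, samples, labels, cutmix) would then replace the explicit if/else used in the Mixup example.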

[Figure: sample batch after Cutmix augmentation]

Comparison of Data Augmentation Strategies

[Figure: comparison of data augmentation strategies]

Applications of Data Augmentation

1. Medical Imaging

Geometric and other transformations can help you train robust and accurate machine-learning models on medical images. For example, in pneumonia classification, you can use random cropping, zooming, stretching, and color space transformation to improve model performance. However, you need to be careful with certain augmentations, as they can have the opposite effect. For example, random rotation and reflection along the x-axis are not recommended for X-ray imaging datasets.

2. Autonomous Vehicles

In autonomous driving scenarios, data augmentation is crucial to train models to identify objects, pedestrians, and road conditions in different environments. This includes simulating changes in weather, lighting, and road types.


3. Natural Language Processing (NLP)

Data augmentation techniques such as paraphrasing and word replacement are applied in NLP tasks like text classification and sentiment analysis. This helps improve the model’s ability to understand and generalize across different forms of language expression.


Conclusion

In this article, we looked at data augmentation as an important step towards better model performance. We covered augmentation strategies ranging from traditional techniques like geometric and color space transformations to more advanced methods like Cutout, Cutmix and Mixup, along with their practical implementation. We also discussed the applications of data augmentation.

This article gave only a brief overview of data augmentation. Implementing it is a step towards smarter, more adaptable models in the ever-evolving world of deep learning. Bringing these techniques into training paves the way for models that succeed on unseen real-world data, an important step towards robust and intelligent machine learning.

Key Takeaways

  • We learned the basics of data augmentation and why it matters for overcoming overfitting and data shortage.
  • Data augmentation is an important step towards improving a model’s performance.
  • We learned different advanced image augmentation methods such as Cutmix, Cutout and Mixup and their practical implementation using PyTorch in Python.
  • We discussed applications of data augmentation in different real-world areas.

Frequently Asked Questions

Q1. Why is Data Augmentation crucial in deep learning?

A. Data Augmentation is important because it helps overcome limitations in training data, improves model generalization, and reduces overfitting by providing a more varied set of augmented examples for learning.

Q2. What are some foundational techniques in Data Augmentation?

A. Foundational techniques include geometric transformations (rotation, scaling) and color space transformations, which lay the groundwork for more advanced methods by introducing variability into the dataset.

Q3.  How does CutMix differ from traditional augmentation methods?

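A. Cutmix combines two images by cutting a patch from one and pasting it onto the other, and it mixes their labels accordingly, which can improve model performance. Traditional methods like flipping or rotating, by contrast, transform a single image and leave its label unchanged.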

Q4.  Can Data Augmentation be applied beyond image-based tasks?

A. Yes, Data Augmentation extends beyond images and finds application in different domains. It is employed in natural language processing for text augmentation, in speech recognition for audio data manipulation, and in other fields.

Q5.  What challenges may arise when implementing Data Augmentation in certain applications?

A. While Data Augmentation is powerful, challenges include the risk of removing important features, especially in small datasets, and the need for careful thought when applying certain techniques in specific contexts. It’s important to fit the approach to the characteristics of the dataset and the requirements of the task.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Jayesh Minigi

30 Jan 2024
