Pruning is an older concept in deep learning, dating back to Yann LeCun’s 1990 paper Optimal Brain Damage. It has recently seen a surge of renewed interest, becoming an increasingly important tool for data scientists. Most of that attention is driven by the ability to deploy significantly smaller and faster models, while minimally affecting (and in some cases improving) metrics such as accuracy.
Pruning is the process of removing weight connections from a network to increase inference speed and decrease model storage size. In general, neural networks are heavily overparameterized; pruning a network can be thought of as removing unused parameters from this overparameterized network.
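To make this concrete, below is a minimal sketch of unstructured magnitude pruning in PyTorch: the lowest-magnitude fraction of weights in a tensor is zeroed out. The function name and interface here are our own for illustration; libraries such as torch.nn.utils.prune provide equivalent built-ins.

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return weight
    # The k-th smallest absolute value becomes the pruning threshold.
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = (weight.abs() > threshold).to(weight.dtype)
    return weight * mask
```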
Fundamentally, pruning acts as an architecture search within the network. At low levels of sparsity (~40%), the pruned model will typically generalize slightly better than the baseline, as pruning acts as a regularizer. At higher levels, the pruned model will match the baseline. Pushed further still, the model will begin to generalize worse than the baseline, but with much better inference performance. For example, a well-pruned ResNet-50 model can nearly match baseline accuracy on ImageNet at 90% sparsity (that is, 90% of the weights in the model are zero).
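When reproducing sparsity figures like these, it helps to measure them directly. A small sketch, counting the fraction of weight parameters that are exactly zero (the helper name is ours):

```python
import torch

def model_sparsity(model: torch.nn.Module) -> float:
    """Fraction of weight parameters that are exactly zero."""
    total, zeros = 0, 0
    for name, param in model.named_parameters():
        if "weight" in name:  # skip biases, norm parameters, etc.
            total += param.numel()
            zeros += (param == 0).sum().item()
    return zeros / total
```

For a model pruned to the level described above, `model_sparsity(model)` should report roughly 0.9.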
There are numerous algorithms and hyperparameters to choose from when pruning, which can make it difficult to know where to start or what to fix when things go wrong. In this eBook, we provide an overview of best practices for pruning a model, to make the process easier and more likely to succeed. Additionally, we include an in-depth walkthrough of gradual magnitude pruning (the pruning algorithm we, and the research community, have found to work best) and its associated hyperparameters.
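As a preview of what gradual magnitude pruning looks like in practice, the sketch below implements the cubic sparsity schedule of Zhu & Gupta (2017), a common choice for GMP: the target sparsity ramps from an initial to a final value over a window of training steps. The function name and default values are illustrative, not a fixed API.

```python
def gmp_sparsity(step: int,
                 start_step: int,
                 end_step: int,
                 initial_sparsity: float = 0.0,
                 final_sparsity: float = 0.9) -> float:
    """Target sparsity at a training step under a cubic ramp (Zhu & Gupta, 2017).
    Before start_step nothing is pruned; after end_step sparsity is held at its
    final value."""
    if step <= start_step:
        return initial_sparsity
    if step >= end_step:
        return final_sparsity
    progress = (step - start_step) / (end_step - start_step)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** 3
```

At each scheduled pruning update during training, one would recompute the masks at the current target, e.g. `magnitude_prune(layer.weight.data, gmp_sparsity(step, start_step, end_step))`, then keep training so the remaining weights can recover.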
For an in-depth overview of pruning, download the Pruning for Success eBook here.
Neural Magic: No-Hardware AI, or shattering the hardware barriers holding back the field of machine learning. Neural Magic is making the power of deep learning simple, accessible, and affordable for anyone. As a part of this next great unlock of machine learning, data scientists will ask bigger questions, challenge norms, and unleash a new class of AI applications that live at the edge of imagination.