To address the resource-intensive nature of LLMs, Microsoft introduces SliceGPT, a novel sparsification technique designed to compress models without sacrificing performance. Let’s examine the details of this new approach.
What is Special About SliceGPT?
Microsoft’s latest offering, SliceGPT, is an innovative solution for compressing large language models. The technique removes up to 25% of model parameters, including embeddings, while maintaining strong zero-shot task performance. A reduction of this size is a substantial step forward in optimizing compute and memory resources.
Performance Validation on Diverse Models
A paper released on Hugging Face highlights SliceGPT’s prowess by showcasing its application to the LLAMA2-70B, OPT 66B, and Phi-2 models. The results are remarkable: the sliced models retain 99%, 99%, and 90% of the dense models’ zero-shot task performance, respectively. This demonstrates the versatility of SliceGPT across a range of language models.
Running Faster on Fewer GPUs
One of SliceGPT’s notable advantages is inference efficiency. Sliced models run faster and fit on fewer GPUs. According to the paper, sliced LLAMA2-70B needs only 64% of the dense model’s total inference compute on 24GB consumer GPUs, and 66% on 40GB A100 GPUs. This translates to faster execution without any additional code optimization.
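A back-of-the-envelope sketch helps build intuition for why cutting the embedding dimension cuts compute. This is purely illustrative arithmetic under simplifying assumptions (square vs. rectangular weight matrices, no attention/KV-cache effects); the paper’s 64%/66% figures come from real benchmarks, not this formula:

```python
# Illustrative only: how slicing ~25% of the embedding dimension
# translates into reduced matrix-multiply compute.
keep = 0.75  # fraction of the embedding dimension kept after slicing

# A d x d weight multiply costs O(d^2), so slicing both dimensions
# scales compute by keep^2; a matrix with only one sliced dimension
# (e.g. a d x 4d MLP projection) scales by keep^1.
cost_square = keep * keep  # both dims sliced
cost_rect = keep           # one dim sliced

print(round(cost_square, 4))  # 0.5625
print(cost_rect)              # 0.75
```

The true end-to-end figure lands between these extremes (plus fixed costs), which is consistent with the 64–66% reported in the paper.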
Core Concept Behind SliceGPT
SliceGPT’s core idea is computational invariance in transformer networks: certain orthogonal transformations can be applied to a transformer’s weights without changing the model’s output. Exploiting this property, SliceGPT replaces each weight matrix with a smaller, dense counterpart, shrinking the network’s embedding dimension and, with it, the memory and compute requirements of the pre-trained model.
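The rotate-then-slice idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper’s actual method: `slice_weight` is a hypothetical helper, the orthogonal matrix `Q` is random here (the paper derives it from the statistics of layer activations, e.g. via PCA), and the 25% ratio is just the headline figure from the announcement:

```python
import numpy as np

def slice_weight(W, Q, keep):
    """Rotate a weight matrix into a new orthogonal basis, then drop
    the trailing embedding dimensions (hypothetical helper)."""
    W_rot = Q.T @ W          # invariance step: orthogonal change of basis
    return W_rot[:keep, :]   # slice: keep only the first `keep` rows

d, d_out = 8, 8
rng = np.random.default_rng(0)
W = rng.standard_normal((d, d_out))

# Any orthogonal Q leaves the network's function unchanged before slicing;
# a random orthogonal matrix stands in for the activation-derived basis.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))

keep = int(d * 0.75)              # slice off 25% of the embedding dimension
W_small = slice_weight(W, Q, keep)
print(W_small.shape)              # (6, 8)
```

The accuracy loss then depends on how well the discarded directions were chosen, which is where the real method’s activation statistics come in.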
The Future Path Paved by SliceGPT
Microsoft’s SliceGPT addresses the challenges posed by resource-intensive language models and lays the groundwork for future advancements. With its open-source code available on GitHub, SliceGPT invites collaboration and exploration to further reduce the computational footprint of pre-trained models.
Our Say
As we witness the emergence of SliceGPT, we envision a future where large language models can coexist with optimized resource utilization. Microsoft’s strides in computational invariance pave the way for a more efficient and sustainable era of AI. SliceGPT is a testament to innovation, offering a glimpse into the evolving landscape of language model compression.
You can explore this paper here.