PTQ and QAT: Best Practices for Performing Hybrid and Selective Quantization

While quantization can address poor runtime performance, memory and model size constraints, and hardware limitations, it comes with its own challenges. Some architectures are difficult to quantize, accuracy can degrade, and calibration, a required step in INT8 quantization, introduces complications of its own.
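
To make the calibration step concrete, here is a minimal PTQ sketch in eager-mode PyTorch; the toy model and batch count are illustrative assumptions. Calibration runs representative data through the prepared model so observers can record activation ranges, which are then used to pick INT8 scales and zero-points.

```python
import torch
import torch.nn as nn

# Toy model (illustrative): QuantStub/DeQuantStub mark where tensors
# enter and leave the quantized region of the graph.
model = nn.Sequential(
    torch.quantization.QuantStub(),
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    torch.quantization.DeQuantStub(),
)
model.eval()

# Attach a default INT8 config and insert observers.
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
prepared = torch.quantization.prepare(model)

# Calibration: run representative batches so observers record activation ranges.
with torch.no_grad():
    for _ in range(8):  # a handful of representative batches is typical
        prepared(torch.randn(4, 3, 32, 32))

# Convert to an actual INT8 model using the calibrated ranges.
quantized = torch.quantization.convert(prepared)
```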

This is where hybrid and selective quantization come in. Unlike naïve quantization, which applies the same quantization method to every layer of the network, hybrid and selective quantization add extra steps that deliver better speed without the usual hit to accuracy. You can apply both approaches during post-training quantization (PTQ) and quantization-aware training (QAT).
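
As a rough illustration of the selective idea, in eager-mode PyTorch you can exclude accuracy-sensitive layers from quantization simply by clearing their qconfig before calibration. The model below and its "classifier" submodule are hypothetical stand-ins for whichever layers prove sensitive in practice.

```python
import torch
import torch.nn as nn

# Hypothetical model with a backbone and a quantization-sensitive classifier head.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.dequant = torch.quantization.DeQuantStub()
        self.classifier = nn.Linear(16 * 32 * 32, 10)  # kept in FP32 below

    def forward(self, x):
        x = self.quant(x)
        x = self.backbone(x)
        x = self.dequant(x)  # leave the quantized region before the head
        return self.classifier(x.flatten(1))

model = Net().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
model.classifier.qconfig = None  # selective: skip the sensitive layer
prepared = torch.quantization.prepare(model)
# ... calibrate as usual, then torch.quantization.convert(prepared)
```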

PTQ is a quantization technique in which the model is quantized after it has been trained. QAT then fine-tunes the PTQ model, training it further with quantization in mind.
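
Here is a minimal QAT sketch along the same lines; `model`, `train_loader`, and `loss_fn` are assumed to exist. Fake-quantization ops are inserted so the fine-tuning loop sees INT8 rounding effects and the weights learn to compensate.

```python
import torch

# QAT sketch: assumes `model`, `train_loader`, and `loss_fn` are defined elsewhere.
model.train()
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
prepared = torch.quantization.prepare_qat(model)  # inserts fake-quant ops

# Short fine-tuning pass; QAT typically needs far fewer epochs than training
# from scratch, since it only adapts the weights to quantization noise.
optimizer = torch.optim.SGD(prepared.parameters(), lr=1e-4)
for images, targets in train_loader:
    optimizer.zero_grad()
    loss = loss_fn(prepared(images), targets)
    loss.backward()
    optimizer.step()

prepared.eval()
quantized = torch.quantization.convert(prepared)  # final INT8 model
```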

Here are rules of thumb for applying hybrid and selective quantization during PTQ and QAT:

[Image: table of rules of thumb for applying hybrid and selective quantization during PTQ and QAT]

Now consider the results on the STDC semantic segmentation model on Pascal VOC. Throughput, model size, and latency are close across naïve quantization, selective PTQ, and selective QAT. But look at the accuracy: with selective PTQ there was only a very small decrease, and with selective QAT the accuracy actually improved.

[Image: benchmark results comparing naïve quantization, selective PTQ, and selective QAT on the STDC model, Pascal VOC]

You can easily do hybrid and selective PTQ and QAT using SuperGradients, our open-source library for training PyTorch-based computer vision models.
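
For a sense of what this looks like, here is a rough sketch using SuperGradients' SelectiveQuantizer utility. The import path, model name, and argument names below are assumptions based on the library's quantization docs and may differ across versions, so treat this as a starting point and check the documentation for the exact API.

```python
# Assumed API: SelectiveQuantizer from SuperGradients' quantization utilities.
# Import path, model name, and arguments are assumptions; verify against the docs.
from super_gradients.training import models
from super_gradients.training.utils.quantization.selective_quantization_utils import SelectiveQuantizer

model = models.get("stdc1_seg50", pretrained_weights="cityscapes")  # hypothetical example

quantizer = SelectiveQuantizer(
    default_per_channel_quant_weights=True,  # per-channel INT8 weights
    default_learn_amax=False,                # static activation ranges
)
quantizer.quantize_module(model)  # replaces supported layers with quantized versions
# ... calibrate or fine-tune, then export and benchmark as usual
```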

To take a deeper dive into quantization and learn how you can improve your model’s speed without reducing its accuracy, watch the webinar or read the ultimate guide.


Get ahead with the latest deep learning content

  1. GPT-4 is here! It adds the ability to process image inputs and much longer text.
  2. Microsoft introduces Visual ChatGPT. It incorporates different visual foundation models so users can interact with ChatGPT using images.
  3. Train to 94% on CIFAR-10 in under 10 seconds on a single A100, the current world record, or to ~95.77% in ~188 seconds.
  4. A playbook for systematically maximizing the performance of deep learning models. Put differently: a long list of interrelated hyperparameters whose tuning has to be a continuous process.
  5. Generative AI drives demand for performance tools. Along with the growth of chatbots and LLMs comes the need to improve the performance of these large and complex AI models.
  6. A new open-source version of ChatGPT. OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically. Here's another one called OpenChatKit.


Catch Deci at NVIDIA GTC 2023

Join our CEO & Co-Founder, Yonatan Geifman, at his GTC session, “How to Accelerate NLP Performance on GPU with Neural Architecture Search.” He'll take a deep dive into NLP inference performance optimization, covering the main challenges to address and the tools and best practices to adopt to achieve the best possible results without sacrificing the model's accuracy. Register here.


Can you solve this riddle? Comment your answer below.

[Image: this month's riddle]

Don't forget to catch next month's newsletter to get the answer to the riddle.


Enjoyed these deep learning tips? Help us make our newsletter bigger and better by sharing it with your colleagues and friends!
