DKM: Differentiable K-Means Clustering Layer for Neural Network Compression
DKM casts K-means clustering as an attention problem, which enables joint optimisation of the DNN parameters and the clustering centroids. Unlike prior approaches that rely on additional regularisers and parameters, DKM-based compression keeps the original loss function and model architecture fixed.
DNNs, or deep neural networks, have shown extraordinary, even superhuman, capability on many cognitive tasks. An uncompressed, fully-trained DNN is ordinarily used for inference on the server side, but the user experience can be enhanced by running inference on the device itself.
On-device inference reduces latency and keeps the user's data on the device. However, many such platforms are battery-powered and therefore power-constrained, so the DNN needs to be power-efficient. These tight power constraints limit the available computing budget and also demand a small storage overhead.
What are the solutions for making a DNN efficient enough for on-device inference?
There are multiple approaches to this problem. One method that has been shown to deliver a high DNN compression ratio is weight clustering: the weights are grouped into a small number of clusters, typically with the popular K-means algorithm, so that all weights in a cluster share a single value. After clustering, the model only needs to store a per-weight index, in 2 bits, 4 bits, and so on depending on the number of clusters, plus a small lookup table that maps each integer index to its centroid value, instead of storing full floating-point weights. A sketch of this idea follows.
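To make this concrete, here is a minimal sketch, not the DKM implementation, of compressing a single weight tensor with ordinary K-means: the weights are clustered into a 16-entry codebook, and only 4-bit indices plus the small lookup table need to be stored. It assumes NumPy and scikit-learn are available; the function names are illustrative.

```python
# Minimal sketch: compress one weight tensor by clustering its values into a
# small codebook with ordinary K-means, then storing per-weight indices plus a
# lookup table of centroids. Not the DKM method itself.
import numpy as np
from sklearn.cluster import KMeans

def compress_layer(weights: np.ndarray, n_clusters: int = 16):
    """Cluster the flattened weights; return (indices, codebook)."""
    flat = weights.reshape(-1, 1).astype(np.float32)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(flat)
    indices = km.labels_.astype(np.uint8)   # 4-bit codes suffice for 16 clusters
    codebook = km.cluster_centers_.ravel()  # one float per cluster
    return indices, codebook

def decompress_layer(indices, codebook, shape):
    """Rebuild an approximate weight tensor from indices + lookup table."""
    return codebook[indices].reshape(shape)

if __name__ == "__main__":
    w = np.random.randn(256, 128).astype(np.float32)
    idx, book = compress_layer(w, n_clusters=16)
    w_hat = decompress_layer(idx, book, w.shape)
    # 32-bit floats -> 4-bit indices plus a tiny 16-entry table.
    bits_before = w.size * 32
    bits_after = w.size * 4 + book.size * 32
    print(f"approx. compression ratio: {bits_before / bits_after:.1f}x")
```

With 16 clusters the indices are 4 bits each, so the weight storage shrinks by roughly 8x relative to 32-bit floats, at the cost of replacing each weight with its nearest centroid.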
How can a compact DNN architecture be combined with K-means clustering?
Ideally, a compact DNN architecture combined with weight-clustering would provide the best solution for efficient on-device inference. In practice, however, existing model compression approaches cannot compress such models very far.
When the DNN is already a small architecture such as MobileNet, we can presume there is little redundancy left in the model itself, so conventional compression yields only modest gains. This limitation comes from the way weight-clustering is done with a conventional K-means algorithm: neither the weight-cluster assignments nor the weight updates are fully optimised by training on the target task. The fundamental difficulty in using K-means clustering for weight-sharing is that both the weights (the observations, in clustering terms) and the centroids are free to move during training, which makes the problem much harder than ordinary K-means clustering over fixed observations.
Differentiable K-means clustering addresses this by enabling train-time weight-clustering for model compression. Casting the cluster assignment as attention lets K-means clustering serve as a generic, differentiable DNN layer, and the resulting compression shows state-of-the-art results on both computer vision and natural language processing (NLP) tasks. That is how K-means clustering is used for neural network compression; a minimal sketch of the idea follows.
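Below is a minimal, hedged sketch of the core idea behind a differentiable K-means layer, written in PyTorch; it is not the paper's implementation. Each weight is softly assigned to every centroid via a softmax over negative distances (an attention map), the weights are rebuilt as attention-weighted centroids, and gradients then flow to both the weights and the centroids. The class name `DifferentiableKMeans` and the temperature `tau` are assumptions for illustration.

```python
# Sketch of an attention-based (DKM-style) differentiable K-means layer.
# Soft cluster assignment via softmax over negative distances keeps the whole
# operation differentiable, so weights and centroids train jointly on the task.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferentiableKMeans(nn.Module):
    def __init__(self, n_clusters: int = 16, tau: float = 1e-2):
        super().__init__()
        self.n_clusters = n_clusters
        self.tau = tau  # softmax temperature: smaller -> closer to a hard assignment

    def forward(self, weights: torch.Tensor, centroids: torch.Tensor) -> torch.Tensor:
        flat = weights.reshape(-1, 1)                        # (N, 1) observations
        dist = torch.cdist(flat, centroids.reshape(-1, 1))   # (N, K) distances |w_i - c_k|
        attn = F.softmax(-dist / self.tau, dim=-1)           # soft cluster assignment
        soft_w = attn @ centroids.reshape(-1, 1)             # attention-weighted centroids
        return soft_w.reshape(weights.shape)

if __name__ == "__main__":
    torch.manual_seed(0)
    w = torch.randn(64, 32, requires_grad=True)
    c = torch.randn(16, requires_grad=True)     # centroids are trainable too
    dkm = DifferentiableKMeans(n_clusters=16)
    w_clustered = dkm(w, c)
    loss = (w_clustered ** 2).mean()            # stand-in for the original task loss
    loss.backward()                             # gradients reach both w and c
    print(w.grad.shape, c.grad.shape)
```

In a real training setup the clustered weights would replace the original ones inside the layers being compressed, and the temperature would typically be kept small (or annealed) so the soft assignment approaches the hard assignment used at inference time.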
In this regard, E2E Networks has some exciting solutions for you. In particular, E2E Auto Scale and the E2E Linux Smart Dedicated 3rd Generation solutions can support K-means-based compression of your DNN workloads and help optimise their performance.