Role of Sparse matrix in machine learning
Deepak Kumar
Why read this?
Large sparse matrices are common in general, and especially in applied machine learning: in data that contains counts, in data encodings that map categories to counts, and in whole subfields of machine learning such as natural language processing.
Technical explanation
A sparse matrix is a matrix in which most of the elements are zero. Storing and operating on it as if it were dense incurs significant space and time overheads.
There is no strict definition of how many elements must be zero for a matrix to be considered sparse, but a common criterion is that the number of non-zero elements is roughly equal to the number of rows or columns.
The sparsity of a matrix can be quantified with a score: the number of zero values in the matrix divided by the total number of elements in the matrix.
sparsity = number of zero elements / total number of elements
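As a minimal sketch of this score (using NumPy and a made-up 3×6 matrix), the count of zero elements divided by the total element count can be computed directly:

```python
import numpy as np

# Hypothetical example: a small matrix where most entries are zero
A = np.array([
    [1, 0, 0, 1, 0, 0],
    [0, 0, 2, 0, 0, 1],
    [0, 0, 0, 2, 0, 0],
])

# sparsity = number of zero elements / total number of elements
sparsity = 1.0 - np.count_nonzero(A) / A.size
print(f"sparsity = {sparsity:.2f}")  # 13 zeros out of 18 elements -> ~0.72
```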
Origin of sparse matrix in ML
While they occur naturally in some data collection processes, more often they arise when applying certain data transformation techniques such as one-hot encoding of categorical variables, count encoding, and TF-IDF encoding of text, as in the sketch below.
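As an illustration (a sketch assuming scikit-learn; the toy corpus is made up), text vectorisation already returns its result as a SciPy sparse matrix, because most words never appear in most documents:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus: each document uses only a few words
# out of the whole vocabulary, so the count matrix is naturally sparse.
corpus = [
    "sparse matrices save memory",
    "machine learning loves sparse data",
    "dense matrices waste memory on zeros",
]

X = CountVectorizer().fit_transform(corpus)  # returns a SciPy sparse (CSR) matrix
print(X.shape, X.nnz)  # (3, vocabulary size) and the number of stored non-zeros
```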
Efficient data structures
There are many data structures for storing sparse matrices, such as the coordinate (COO), compressed sparse row (CSR), and compressed sparse column (CSC) formats. MATLAB stores sparse matrices in compressed sparse column format.
It is usually best to use a library for sparse matrix calculations, since libraries provide efficient implementations that are suitable for most cases.
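A minimal sketch with SciPy (the small dense matrix is made up) shows what the compressed sparse row/column formats store, and how arithmetic only touches the stored elements:

```python
import numpy as np
from scipy import sparse

# Dense matrix with mostly zero entries
dense = np.array([
    [0, 0, 3, 0],
    [4, 0, 0, 0],
    [0, 0, 0, 5],
])

csr = sparse.csr_matrix(dense)  # compressed sparse row: fast row slicing and matrix-vector products
csc = sparse.csc_matrix(dense)  # compressed sparse column: the format MATLAB uses internally

print(csr.data)     # stored non-zero values: [3 4 5]
print(csr.indices)  # column index of each stored value
print(csr.indptr)   # row pointer array

# Arithmetic stays in sparse form and only visits stored elements
y = csr @ np.array([1, 2, 3, 4])
print(y)  # [ 9  4 20]
```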
Use in machine learning
In recommender systems, we typically work with very sparse matrices: the item universe is very large, while a single user typically interacts with only a tiny subset of it.
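A sketch of how such a user-item matrix is typically built (the interaction triples and catalogue size are hypothetical), using SciPy's COO format for construction and CSR for computation:

```python
import numpy as np
from scipy import sparse

# Hypothetical interaction log: (user_id, item_id, rating) triples.
# Each user touches only a handful of items out of a large catalogue.
users   = np.array([0, 0, 1, 2, 2])
items   = np.array([10, 42, 7, 42, 99])
ratings = np.array([5.0, 3.0, 4.0, 2.0, 5.0])

# COO is convenient for construction; convert to CSR for computation.
interactions = sparse.coo_matrix(
    (ratings, (users, items)), shape=(3, 100)
).tocsr()

print(interactions.shape)  # (3, 100) -> 300 cells in total
print(interactions.nnz)    # only 5 of them are stored
```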
In image processing, a sparse matrix is a way of representing an image as a matrix in which most of the elements are zero; only the non-zero pixel values are stored, which avoids unnecessary processing of zero-valued pixels. This paper talks about it in detail.
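As a rough sketch (the mostly-black image below is synthetic), storing such an image in a sparse format means only the non-zero pixels are kept and operated on:

```python
import numpy as np
from scipy import sparse

# Hypothetical mostly-black grayscale image: only a few bright pixels.
image = np.zeros((100, 100), dtype=np.uint8)
image[10, 20] = 255
image[50, 60] = 128
image[75, 5]  = 64

sparse_image = sparse.csr_matrix(image)

# Only the 3 non-zero pixels are stored and processed, not all 10,000 cells.
print(sparse_image.nnz)                # 3
print(sparse_image.multiply(0.5).nnz)  # scaling touches only the stored pixels
```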
Relevance in neural networks
Sparse networks, that is, neural networks in which a large subset of the model parameters are zero, have emerged as one of the leading approaches for reducing model parameter count.
It has been shown empirically that deep neural networks can achieve state-of-the-art results under high levels of sparsity (Han et al., 2015; Louizos et al., 2017; Gale et al., 2019), and this property has been leveraged to significantly reduce the parameter footprint and inference complexity (Kalchbrenner et al., 2018) of densely connected neural networks. This paper talks about it in detail. This is another relevant article.
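A minimal sketch of magnitude-based pruning, the simplest way such sparse networks are obtained (in the spirit of Han et al., 2015), assuming plain NumPy weights rather than any particular framework:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until the target sparsity is reached."""
    k = int(sparsity * weights.size)                      # number of weights to remove
    threshold = np.sort(np.abs(weights), axis=None)[k]    # k-th smallest magnitude
    return np.where(np.abs(weights) < threshold, 0.0, weights)

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256))          # hypothetical dense weight matrix

W_sparse = magnitude_prune(W, sparsity=0.9)  # keep only the largest ~10% of weights
print(1.0 - np.count_nonzero(W_sparse) / W_sparse.size)  # ~0.9
```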
For transformer NLP models, memory and time complexity are a big issue. OpenAI used sparse factorisation of the attention matrices, which reduced the network complexity from O(N²) to O(N√N). This allows OpenAI to "model sequences with tens of thousands of elements using hundreds of layers," compared to other networks that can only handle sequences of "a few thousand elements."
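To get a feel for the difference (an illustrative back-of-the-envelope calculation, not OpenAI's code), for a hypothetical sequence of 16,384 elements the dense attention cost is about 128 times the factorised one:

```python
import math

N = 16_384                       # hypothetical sequence length
dense_cost  = N ** 2             # full attention: O(N^2)
sparse_cost = N * math.isqrt(N)  # factorised attention: O(N * sqrt(N))

print(dense_cost // sparse_cost)  # dense attention does ~128x more work here
```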
References
Thanks to these helping hands
https://machinelearningmastery.com/sparse-matrices-for-machine-learning/
https://dziganto.github.io/Sparse-Matrices-For-Efficient-Machine-Learning/
https://en.wikipedia.org/wiki/Sparse_matrix
https://www.quora.com/What-is-the-fastest-sparse-matrix-data-structure-for-speeding-up-multiplication-of-a-1000-x-1000-there-about-double-valued-matrix
https://towardsdatascience.com/why-we-use-sparse-matrices-for-recommender-systems-2ccc9ab698a4
https://arxiv.org/pdf/1906.10732.pdf
https://analyticsindiamag.com/rigl-google-algorithm-neural-networks/
https://www.ijert.org/research/image-interpolation-using-sparse-matrix-representation-IJERTV4IS020434.pdf
https://www.researchgate.net/publication/258499564_Sparse_Matrix_Factorization
https://www.infoq.com/news/2019/05/openai-sparse-transformers/
https://images.app.goo.gl/5senKK5qmieqzVNcA
https://matteding.github.io/2019/04/25/sparse-matrices/
https://images.app.goo.gl/oV7jkjPbPvxYtJCY6