5 More arXiv Deep Learning Papers, Explained

By Matthew Mayo.


arXiv, maintained by Cornell University, is a popular open-access repository for academic paper preprints. It is an outlet for cutting-edge research in numerous scientific fields, including machine learning. Mirroring the current general trend in academia, much of the recently posted machine learning research is deep learning related.

Hugo Larochelle, PhD, is a Université de Sherbrooke machine learning professor (on leave), Twitter research scientist, noted neural network researcher, and deep learning aficionado. Since late summer 2015, he has been drafting and publicly sharing notes on arXiv machine learning papers that he has taken an interest in.

A previous KDnuggets article outlined and explained a selection of 5 arXiv machine learning papers that Hugo has read and shared notes on. To help us better understand new research, this article presents and summarizes 5 additional arXiv papers, and shares excerpts from Hugo's notes to provide additional perspective and critique. Links to all original papers, abstracts, and explanatory notes are also included. It is hoped that having top deep learning papers explained by a noted expert in the field will make some of the more complex aspects of the science more approachable.

1. Infinite Dimensional Word Embeddings

Authors: Eric Nalisnick, Sachin Ravi
Date posted to arXiv: 17 Nov 2015

Abstract (excerpt):
We describe a method for learning word embeddings with stochastic dimensionality. Our Infinite Skip-Gram (iSG) model specifies an energy-based joint distribution over a word vector, a context vector, and their dimensionality. By employing the same techniques used to make the Infinite Restricted Boltzmann Machine (Cote & Larochelle, 2015) tractable, we define vector dimensionality over a countably infinite domain, allowing vectors to grow as needed during training.

Hugo's Two Cents (excerpt):

This is a quite original use of our "infinite dimensions" trick we introduced in the iRBM. It wasn't entirely "plug and play" either, and the authors had to be smart in the approximations they proposed for training the iSG.

The qualitative results showing how the conditional on the number of dimensions contain information about polysemy are really neat! One assumption behind distributed word embeddings is that they should be able to represent the multiple meanings of words using different dimensions, so it's nice to see that this is exactly what is being learned here.

I think the only thing missing in this paper are comparisons with regular skipgram and perhaps other word embeddings methods on a specific task or on a word similarity task. In v2 of this paper, the authors do mention they are working on such results, so I'm looking forward to seeing those!
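
To make the idea of a "conditional on the number of dimensions" more concrete, here is a minimal sketch in Python. It is not the authors' iSG implementation: the energy function, the `dim_penalty` parameter, and the function name `dimension_posterior` are all assumptions made for illustration, and the infinite sum over dimensionalities is simply truncated at the current vector length rather than handled with the iRBM technique the paper relies on. The sketch only shows how a distribution over the dimensionality z can depend on which context vector a word is paired with.

```python
import math


def dimension_posterior(w, c, dim_penalty=0.5):
    """Return a toy p(z | w, c) for z = 1..len(w).

    Assumed (hypothetical) energy: E(w, c, z) = sum over the first z
    dimensions of (dim_penalty - w_i * c_i). This is an illustrative
    stand-in, not the energy defined in the paper.
    """
    if len(w) != len(c):
        raise ValueError("w and c must have the same length")
    if not w:
        raise ValueError("vectors must have at least one dimension")

    # Cumulative energy of using only the first z dimensions.
    energies = []
    running = 0.0
    for w_i, c_i in zip(w, c):
        running += dim_penalty - w_i * c_i
        energies.append(running)

    # Normalize exp(-E) over z, shifting by the minimum energy for stability.
    lowest = min(energies)
    weights = [math.exp(-(e - lowest)) for e in energies]
    total = sum(weights)
    return [x / total for x in weights]


# One hypothetical word vector paired with two different context vectors.
word = [1.0, 0.2, 2.0]
context_a = [1.0, 0.1, -2.0]  # disagrees with the word on the third dimension
context_b = [1.0, 0.1, 2.0]   # agrees with the word on the third dimension

print(dimension_posterior(word, context_a))
print(dimension_posterior(word, context_b))
```

In this toy setup, the first context puts most of its probability mass on a single dimension, while the second prefers all three. That loosely mirrors the observation in the notes above: the distribution over dimensionality changes with the context a word appears in, which is where information about multiple meanings can show up.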

Read the full post on KDnuggets: 5 More arXiv Deep Learning Papers, Explained

https://www.kdnuggets.com/2016/01/more-arxiv-deep-learning-papers-explained.html
