One Model To Learn Them All
Diego Marinho de Oliveira
Gen-AI Search, RecSys | ex-SEEK, AI Lead, Data Scientist Manager and ML Engineer Specialist
"Abstract Deep learning yields great results across many fields, from speech recognition, image classification, to translation. But for each problem, getting a deep model to work well involves research into the architecture and a long period of tuning. We present a single model that yields good results on a number of problems spanning multiple domains. In particular, this single model is trained concurrently on ImageNet, multiple translation tasks, image captioning (COCO dataset), a speech recognition corpus, and an English parsing task. Our model architecture incorporates building blocks from multiple domains. It contains convolutional layers, an attention mechanism, and sparsely-gated layers. Each of these computational blocks is crucial for a subset of the tasks we train on. Interestingly, even if a block is not crucial for a task, we observe that adding it never hurts performance and in most cases improves it on all tasks. We also show that tasks with less data benefit largely from joint training with other tasks, while performance on large tasks degrades only slightly if at all."
Authors: Lukasz Kaiser, Aidan N. Gomez, Noam Shazeer, Ashish Vaswani, Niki Parmar, Llion Jones, Jakob Uszkoreit
Read the full paper at https://bit.ly/2tRKJ8z
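For readers skimming the abstract, here is a minimal, illustrative sketch (not the authors' actual MultiModel and not code from the paper) of how the three ingredient types it names, convolutional layers, an attention mechanism, and a sparsely-gated mixture-of-experts layer, can be composed into a single block. All class names, layer sizes, and the simple top-1 routing rule below are my own assumptions for illustration, written against PyTorch.

# Illustrative sketch only: combines convolution, self-attention, and a
# sparsely-gated mixture-of-experts layer in one block. Sizes and routing
# are assumptions, not values or details taken from the paper.
import torch
import torch.nn as nn


class SparselyGatedMoE(nn.Module):
    """Simplified sparsely-gated layer: each position is routed to one expert."""

    def __init__(self, d_model: int, num_experts: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(num_experts))
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, x):                        # x: (batch, seq, d_model)
        gate_logits = self.gate(x)               # (batch, seq, num_experts)
        top1 = gate_logits.argmax(dim=-1)        # pick one expert per position
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = (top1 == i).unsqueeze(-1)     # route only matching positions
            out = out + mask.to(x.dtype) * expert(x)
        return out


class MixedBlock(nn.Module):
    """Convolution, then self-attention, then the sparsely-gated layer."""

    def __init__(self, d_model: int = 64, num_heads: int = 4):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3, padding=1)
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.moe = SparselyGatedMoE(d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):                        # x: (batch, seq, d_model)
        x = x + self.conv(x.transpose(1, 2)).transpose(1, 2)
        attn_out, _ = self.attn(x, x, x)
        x = x + attn_out
        x = x + self.moe(x)
        return self.norm(x)


if __name__ == "__main__":
    block = MixedBlock()
    tokens = torch.randn(2, 10, 64)              # (batch, seq, d_model)
    print(block(tokens).shape)                   # torch.Size([2, 10, 64])

The point of the sketch is only to make the abstract's claim concrete: blocks from different domains can live in one model, and a block that is not essential for a given task simply contributes little for that task rather than hurting it.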
Software Engineer | Ecommerce | Delivering great scalable and performant cloud applications.
7 years ago: Looks nice. Hope there is a way to keep teaching this kid/model over time and build up a continuously running model.
Enterprise Data Architect
7 years ago: This part sounds intriguing: "especially since our model shows transfer learning from tasks with a large amount of available data to ones where the data is limited." Did you find this across any particular types of data sets, e.g. more homogeneous or more heterogeneous ones?