Uber's Ludwig is cool - but not a game changer for deep learning
Uber just released Ludwig, a "Code-Free Deep Learning Toolbox". I spent a few hours reading through the documentation and examples. It is a pretty cool framework, especially if your organization has made a bet on TensorFlow (which, btw, we don't currently recommend), a framework that is notoriously hard to work with. But, like any machine learning framework, it isn't a game changer.
What I like about Ludwig
Uber has done a really nice job documenting this framework and presenting examples. If you've taken online courses from Andrew Ng or Jeremy Howard, you'll see familiar examples like text or image classification. These are all basic building blocks that can serve as the foundation of any modern AI application.
I agree with the value proposition in Uber's own press release:
"Ludwig is unique in its ability to help make deep learning easier to understand for non-experts and enable faster model improvement iteration cycles for experienced machine learning developers and researchers alike. By using Ludwig, experts and researchers can simplify the prototyping process and streamline data processing so that they can focus on developing deep learning architectures rather than data wrangling."
TensorFlow (primarily) uses static graphs while libraries like PyTorch use dynamic graphs. Dynamic graphs are easier to work with and have a lower learning curve for most data scientists. Any developer with basic data science and Python skills can begin contributing to a PyTorch application; TensorFlow takes more expertise and months of study.
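Here is a tiny illustration of the gap, assuming TensorFlow 1.x-style APIs (the current release when Ludwig shipped); it isn't Ludwig code, just the developer-experience contrast between static and dynamic graphs.

```python
# TensorFlow 1.x: define a static graph up front, then execute it in a session.
import tensorflow as tf

a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
c = a * b  # adds a node to the graph; nothing is computed yet

with tf.Session() as sess:
    print(sess.run(c, feed_dict={a: 3.0, b: 4.0}))  # 12.0

# PyTorch: operations run eagerly like ordinary Python, so you can inspect
# intermediate values with print() or a debugger as you go.
import torch

x = torch.tensor(3.0)
y = torch.tensor(4.0)
print(x * y)  # tensor(12.)
```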
The chief benefit of Ludwig is that it overcomes many challenges of working with TensorFlow—not that it provides any unique capability.
Moreover, the team seems committed to keeping the library current. I did a quick site search on BERT (a cutting-edge NLP language representation model) and they do mention it in the FAQ:
“We will prioritize new features depending on the feedback of the community, but we are already planning to add … additional text and sequence encoders (attention, co-attention, hierarchical attention, Transformer, ELMo and BERT).”
But Ludwig won't solve your primary deep learning challenges
All of us want to "democratize" AI so that more organizations can begin adopting this amazing technology. Unfortunately, frameworks like Ludwig will only help improve your efficiency.
The final stages of data transformation and model training are only a small fraction of the work required to build AI systems.
Based on our metrics from client projects, a framework like Ludwig would improve efficiency by about 3%.
A 3% efficiency boost isn't nothing. If you've already got a team of developers building deep learning models and an infrastructure like Uber's Michelangelo, it can help accelerate your innovation.
But most of your time on your first couple of machine learning projects will be spent doing the following:
- Figuring out what you want to do with deep learning.
- Figuring out what data you have and trying to get it out of source systems.
- Getting the right team in place and getting a workflow which allows you to rapidly iterate.
- Analyzing your data and building initial models.
- Analyzing model results.
- Figuring out where and how to deploy your first models.
- Building your data processing pipeline.
- Getting your machine learning models integrated into your devops infrastructure.
I could go on and on, but you get the idea:
Training deep learning models is only a small fraction of the work. 90% of the effort will be spent by smart people analyzing data and model results while trying to figure out how to deploy them.
Bravo, Uber
Thanks, Uber, for taking the step to open-source this framework. The interface and examples are top-notch.