Solving the ML App interoperability issue with Deep Learning Frameworks - 2/2
In part 1 of this blog, the need for app interoperability, the offerings in the market today, and their shortcomings were discussed. In the remainder of this blog, I will argue that Deep Learning Frameworks (such as TensorFlow and its higher-level abstraction Keras) provide an alternative. Before getting into the details, let us state the larger goals - what does the proposed solution need to offer?
- separate data and data flows from model spec
- human readability
- scalability
- low barrier to translation
- low barrier to adoption
We also need to look at the problem not just from a technology PoV (schemas, frameworks, readers, writers, languages, etc.), but also from a math/science PoV, and even from the perspective of market trends. We raise the following questions:
- Is there an oracle of sorts, like a universal algorithm, that subsumes all other algorithms?
- Is there an engineering framework that can realize this?
- Is there enough business interest in it to make it commercially viable?
The answer is an obvious no (largely due to the first question above), but it allows us to search for a larger class of models, and a corresponding enabler. We know we cannot find an exact solution, but is there an approximate one? Deep Learning is a strong contender.
Deep Learning is a super-algorithm. Its expressive, representational power comes from its modular, lego-block architecture. Essentially, we see modeling (elicitation and inference) not as a monolithic block, but as the coming together of many reusable, atomic building blocks: layers, activations, losses, regularizers, initializers, optimizers, and so on - each block comes with its own set of options. As a result, the collective number of modeling possibilities is enormous. Sampling the models/algorithms that can be expressed via DNNs (unless otherwise mentioned, the equivalence may not be exact, but holds in principle) shows the possibility landscape (a small sketch follows the list below):
- Many regression and classification models, exactly (with different loss functions)
- SVMs and Kernel machines (with combinations of activation functions, layer freezing, loss functions)
- Matrix Factorizations and other bilinear regressions (via graph compositions and merging)
- non-linear State Space Models (RNNs)
- Markov Random Fields, Graphical Models (RBMs)
- Data Reduction Techniques (Auto Encoders)
- Variational Bayes and Deep Generative Models
- Many more
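To make the lego-block idea concrete, here is a minimal sketch (using standalone Keras with a TensorFlow backend; the toy data, shapes, and hyperparameters are illustrative assumptions) of how a classic model, logistic regression, is expressed as a one-layer network, and how swapping blocks changes the model family.

```python
# A minimal sketch: logistic regression expressed as a one-layer Keras network.
# The toy data, shapes, and hyperparameters are illustrative assumptions.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

X = np.random.rand(100, 20)                   # 100 samples, 20 features
y = np.random.randint(0, 2, size=(100, 1))    # binary labels

model = Sequential()
model.add(Dense(1, activation='sigmoid', input_shape=(20,)))  # = logistic regression
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=5, verbose=0)

# Swapping the blocks changes the model family:
#   activation='linear' + loss='mse'  -> linear regression
#   extra Dense layers + 'relu'       -> a standard feed-forward network
```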
The same DNN framework can handle a variety of data (a sketch follows the list below):
- multiple inputs - multiple outputs
- variable input size and variable output size
- tensors, text, speech, image, video and their combinations
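As an illustration of the multi-input / multi-output point above, here is a rough sketch using the Keras functional API; the input names, shapes, and vocabulary size are assumptions made up for the example.

```python
# A rough sketch: one model, two inputs (text + numeric), two outputs.
from keras.models import Model
from keras.layers import Input, Dense, Embedding, LSTM, concatenate

text_in = Input(shape=(100,), name='text')            # sequence of token ids
num_in  = Input(shape=(5,), name='numeric_features')  # small numeric vector

x = Embedding(input_dim=10000, output_dim=32)(text_in)
x = LSTM(16)(x)
merged = concatenate([x, num_in])

class_out = Dense(1, activation='sigmoid', name='class')(merged)  # classification head
value_out = Dense(1, activation='linear', name='value')(merged)   # regression head

model = Model(inputs=[text_in, num_in], outputs=[class_out, value_out])
model.compile(optimizer='adam',
              loss={'class': 'binary_crossentropy', 'value': 'mse'})
```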
Let us also understand the limitations. While not every algorithm may have a representation via DNNs, my bet is that 80% of the useful model landscape can be addressed with one DNN framework. A popular misconception is that DNNs need lots of data; DNNs can work with both small and big data. Together with Prof. Lord and Dr. Ali, I am working on some exciting problems in this area. Quite a bit of work needs to be done to address such gaps. There is a lot more (to be elaborated in subsequent blogs), but let us move to the next point.
Deep Learning Frameworks are an engineering reality. Google released TensorFlow, Facebook released PyTorch, and many other established labs have released their tools (Theano, MXNet, Caffe). That means Deep Learning can become commoditized sooner rather than later.
In more concrete terms, let us take Keras with a TensorFlow backend as a reference and see how it realizes the stated goals.
- Keras is an abstraction on top of TensorFlow and Theano, allowing rapid experimentation without getting too deep into atomic computations. Keras's author, François Chollet, announced support for MXNet as a backend last year. MXNet is in incubation at Apache, and Google is backing Keras. This lowers the barrier to adoption.
- Keras models can be saved in different formats - JSON and YAML (even better). It is one shot, many birds: models are readable by machines and humans alike, and due to the expressiveness of DNNs, many models can be subsumed. Many models - a single exchange format (see the serialization sketch after this list).
- Content and form are separated. DNN weights (which can be long and boring) are saved separately from the structure. I like to inspect the structure first and investigate the details later. Tools like TensorBoard, though they leave a lot to be desired, help in that direction. Data scientists would be happy.
- DNNs are fundamentally computational graphs. That means all models within a DNN framework can be stored as graphs (as a data structure). A rich, well-established, well-studied, mature data structure like the graph brings lots of benefits. For example, I can immediately check whether a model has valid syntax - essentially, programmatically, whether the graph is a DAG or not (see the graph sketch after this list). The model is a rich data structure, not an arbitrarily imposed syntax.
- TensorFlow can work against any infrastructure - CPUs, GPUs, GPU farms, CPU farms, and even embedded systems. Essentially, write code once, scale anyhow. [Update:] In March 2018, TensorFlow Hub was announced. It pitches the same ideas.
- Writing a translator between DNN frameworks is easier than doing it between algorithm implementations. Better yet, can we define a standard for DNNs? It is hard, but it can be done with a few developers - the whole community is not needed. Adoption asymmetry is prevented. [Update:] Sowmya pointed out that the Open Neural Network Exchange (ONNX) is one such format. Happy to note that translations between different frameworks are already supported (see the ONNX sketch after this list).
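Here is a minimal sketch of the serialization and content-vs-form points from the list above: the architecture is exported as human-readable JSON (YAML works the same way via to_yaml), while the weights live in a separate file. The file names and the toy model are illustrative assumptions.

```python
# A minimal sketch: structure saved as readable JSON, weights kept in a separate file.
from keras.models import Sequential, model_from_json
from keras.layers import Dense

model = Sequential([Dense(4, activation='relu', input_shape=(8,)),
                    Dense(1, activation='sigmoid')])

arch_json = model.to_json()              # structure only: readable by humans and machines
with open('model.json', 'w') as f:
    f.write(arch_json)
model.save_weights('model_weights.h5')   # the long, boring weights, kept separate

# Later, possibly in a different application:
with open('model.json') as f:
    restored = model_from_json(f.read())
restored.load_weights('model_weights.h5')
```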
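And a rough sketch of the model-as-graph point: read the layer connectivity out of the Keras config and check programmatically that it forms a DAG. The config layout differs between Keras versions, so treat the parsing below as an assumption rather than a stable API; networkx is used purely for the graph check.

```python
# A rough sketch: treat a functional-API Keras model as a graph and verify it is a DAG.
import networkx as nx
from keras.models import Model
from keras.layers import Input, Dense

inp = Input(shape=(8,), name='inp')
hidden = Dense(4, activation='relu', name='hidden')(inp)
out = Dense(1, activation='sigmoid', name='out')(hidden)
model = Model(inputs=inp, outputs=out)

def model_to_digraph(m):
    # Build a directed graph from the layer connectivity in the model config.
    # Assumption: each layer entry carries its name and its inbound layer names;
    # this layout varies across Keras versions.
    g = nx.DiGraph()
    for layer in m.get_config()['layers']:
        g.add_node(layer['name'])
        for node in layer.get('inbound_nodes', []):
            for inbound in node:                      # inbound = [layer_name, ...]
                g.add_edge(inbound[0], layer['name'])
    return g

print(nx.is_directed_acyclic_graph(model_to_digraph(model)))  # True for a valid model
```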
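Finally, a heavily hedged sketch of the ONNX point: exporting a Keras model so that other frameworks and runtimes can import it. The converter shown (tf2onnx) and its API are assumptions that change across versions; other converters exist, so check the documentation of whichever one you use.

```python
# A hedged sketch: export a Keras model to the ONNX exchange format.
import tf2onnx                 # pip install tf2onnx (assumed converter)
from tensorflow import keras   # tf2onnx expects a tf.keras model

model = keras.Sequential([keras.layers.Dense(4, activation='relu', input_shape=(8,)),
                          keras.layers.Dense(1, activation='sigmoid')])

# Convert the model and write an .onnx file that ONNX-compatible tools can load.
onnx_model, _ = tf2onnx.convert.from_keras(model, output_path='model.onnx')
```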
Deep Learning Frameworks can ride on the AI wave. Organizations big and small are betting on riding the AI wave; they have a commercial incentive to gain an early-mover advantage and/or capture or hold on to their market share. NVIDIA and IBM, for example, are training thousands of people on Deep Learning technologies. MOOC vendors such as Udacity and Coursera are offering Deep Learning and Machine Learning courses, and they are big hits - essentially tapping into the market potential and workforce aspirations. There is both supply and demand.
So it seems that DNNs and Deep Learning Frameworks, together, can address interoperability. This does not prevent the co-existence of other interoperable algorithms; DNNs can simply be a nice default to work with.
Overall, I think Deep Learning Frameworks have the potential to commoditize and democratize machine learning applications at large. Interoperability is one tiny but important aspect, which they can address with the right application, as argued above.