Deep Learning Mechanisms in Applications
Alfred David
Tech Innovation Alchemist | AI-to-Blockchain Strategist | Building World-Class Engineering Teams | Future-First Leader
Deep learning is being raved about everywhere right now. For those of us unfamiliar with AI and its terminology: deep learning is a branch of machine learning, which in turn is a subset of AI. Deep learning itself is nothing new; researchers and application developers have simply found effective new ways to apply its concepts to long-standing, important problems, yielding results that are at times unexpected but state of the art all the same. This has driven its sudden new-found popularity.
I recently came across a very interesting visualization of how deep learning startups have taken over the trending novelty factor that iOS/Android application startups once enjoyed, not only in the acquisition space but also in hiring.
Why is deep learning making such a splash now? The underlying research dates back some 40 years, and for many years the field seemed to be at a dead end; suddenly everybody is talking about it as if it had just been discovered by a stroke of genius.
Deep learning gained a lot of the limelight when some companies started playing with its concepts for computer vision and were able to generate many improved results tackling newer problems in that realm.
So what is the zing that deep learning brings to the table? As I see it, there are three clever tricks that deep learning leverages far better than other computational techniques.
These so-called clever tricks turn complex problems into simpler ones by breaking them into smaller logical chunks:
- Variational Methods - which formulate intractable problems as approximate convex optimization problems, and then apply well-understood optimization algorithms which yield good performance and often have fast parallel and streaming variants.
- Distant supervision - self-training, or weak supervision: starting with an insufficient dataset and incrementally bootstrapping your way to enough data for supervised learning. With these methods you may or may not have some labeled data, you definitely have a bunch of unlabeled data, and you have a ‘function’ (read: a dirty yet clever hack) that assigns noisy labels to the unlabeled data. Once you have lots of data with noisy labels, you have turned the problem into vanilla supervised learning (a minimal sketch follows this list).
- Transfer learning - applying knowledge learned on one problem to a different but related problem. Transfer learning is especially exciting because we can learn from a data-rich domain with a totally different feature space and data distribution, and apply those learnings to bootstrap another domain where we have much less data to work with.
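To make distant supervision concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a prescribed recipe: the sentiment task, the keyword heuristic, and the scikit-learn classifier are all stand-ins. The point is only that a crude labeling function turns unlabeled text into noisy training data for an ordinary supervised model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A pool of unlabeled text: in a real project this would be large and cheap to collect.
unlabeled_reviews = [
    "great product, works perfectly",
    "terrible quality, broke after a day",
    "awesome value for the price",
    "awful support, total waste of money",
]

def noisy_label(text):
    """The 'dirty yet clever hack': keyword matching stands in for real labels."""
    positive = ("great", "awesome", "perfect")
    negative = ("terrible", "awful", "waste")
    score = sum(w in text for w in positive) - sum(w in text for w in negative)
    return 1 if score >= 0 else 0  # 1 = positive, 0 = negative (noisy!)

# Weakly label the pool, then treat the result as vanilla supervised learning.
noisy_labels = [noisy_label(t) for t in unlabeled_reviews]
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(unlabeled_reviews, noisy_labels)

print(model.predict(["great support", "total waste of money"]))
```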
The Real-World Problem Space
For many real-world problems, it is unfortunately rather expensive to get well-labeled training data. To elaborate on this issue, let’s consider two hypothetical cases:
- Medical vision: if we want to build a system that detects lymph nodes in Computed Tomography (CT) images of the human body, we need annotated images in which the lymph nodes are labeled. This is a rather time-consuming task, as the images are 3D and very small structures have to be recognized. Assuming a radiologist earns $100/hour and can carefully annotate 4 images per hour, we incur a cost of $25 per image, or $250k for 10,000 labeled images. Considering that several physicians need to label the same image to get the diagnosis close to 100% correct, acquiring a dataset for this medical task would easily exceed that $250k.
- Credit scoring: if we want to build a system that makes credit decisions, we need to know who is likely to default, so we can train a machine learning system to recognize them beforehand. Unfortunately, you only know for sure that somebody defaults when it happens. A naive strategy would therefore be to give loans of, say, $10k to everyone; but then every person who defaults costs us $10k. This puts a very expensive price tag on each labeled data point.
Obviously, there are tricks to lower these costs, but the overall message stands: labeled data for real-world problems can be expensive to obtain. Application builders are therefore using pre-training and fine-tuning to offset the costs illustrated above.
Pre-training: use cheap, large datasets from a related domain. These can be obtained as pre-trained models (e.g. ImageNet models, model zoos), or public databases can be leveraged to build models; in the absence of both, crawling tools such as Scrapy or ParseHub can be used to extract data from the web.
Fine-tuning: well-labeled data is hard to come by, expensive to generate because it requires human annotation, and usually in short supply. Finding a large, weakly labeled dataset that needs no expensive human annotation reduces the cost: a neural network can be pre-trained on the weakly labeled data and then fine-tuned on the smaller, well-labeled set. This yields a performance boost compared to training on the small dataset alone.
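As a rough sketch of this pre-train-then-fine-tune recipe, the Python/Keras snippet below loads an ImageNet-pre-trained backbone, freezes it, and trains only a small new classification head on the scarce well-labeled data. The five-class target task and the `small_labeled_ds` dataset are assumptions for illustration, not part of the original write-up.

```python
import tensorflow as tf

# Reuse cheap pre-training: an ImageNet-trained backbone (MobileNetV2 chosen arbitrarily).
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pre-trained features

# Fine-tune only a small new head on the expensive, well-labeled data.
num_classes = 5  # assumption: a 5-class target task
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# `small_labeled_ds` is an assumed tf.data.Dataset of (image, label) batches.
# model.fit(small_labeled_ds, epochs=5)

# Optionally unfreeze the top of the backbone afterwards and continue training
# with a low learning rate for a further boost.
```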
Today deep learning has risen like a phoenix across the application space, thanks to these factors:
1. Training Deep Networks - Researchers finally figured out how to train very deep networks. It had been assumed that more backpropagation layers meant better, faster, smarter models, yet networks with many layers often stubbornly refused to train. This set back research on neural networks and the field almost went into hibernation, but subsequent work on deep networks has allowed very deep networks to finally realize their potential and bring the field back to life.
2. Large labeled datasets were created. Large networks need lots of data to train effectively. In the last few years, more and more datasets have been opened to the public: ImageNet is one of the biggest, with over a million images and over 1,000 object categories (amusingly, about 120 of which are breeds of dogs). Datasets for speech, video, human poses, and much else have also been published by universities and companies alike, while the proliferation of smartphones (along with their cameras and sensors) provides incomprehensibly large datasets for the tech giants.
3. Easy-to-use frameworks - Deep learning frameworks like TensorFlow, Torch, Theano, Keras, and Caffe have taken over the basic jobs, freeing researchers to spend more time on interesting problems and less time reinventing the wheel (a tiny example follows this list). Newer frameworks such as H2O, Neon, DMLC MXNet, Chainer, etc. are also coming into the picture, performing specific tasks more efficiently and making use of cloud platforms, which gives researchers a wide choice of tools and frameworks targeted at the problem they are trying to solve. On the application side, mobile and web frameworks have rapidly built plumbing interfaces for consuming deep learning models, making prototyping and end-to-end solutions faster and more viable. This matters because it considerably reduces the go-to-market timeframe for solutions that employ deep learning in the background.
4. Cheaper, faster processing power - Large networks can take an intimidating amount of computing power to crunch their numbers. With cloud technologies maturing and public clouds becoming more accepted in the enterprise, researchers have gained the confidence to use large amounts of compute power on tap in the cloud. That compute comes not only from CPU nodes but also from GPU nodes, and from combinations of the two.
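To show how little boilerplate these frameworks leave to the researcher, here is a minimal Keras example; the tiny architecture and the MNIST dataset are arbitrary illustrative choices, but defining and training a network really does come down to a handful of lines.

```python
# A modern framework handles the plumbing: a small image classifier in a few lines.
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1, validation_data=(x_test, y_test))
```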
Where does this go from here?
I think the time is ripe for deep learning to become a commodity in its own right, much as mobility and IoT did in earlier trend cycles. Products are being designed specifically to leverage deep learning to solve business problems. Just as GPUs accelerate video games and scientific work such as protein folding, deep learning gains similar boosts: most deep learning frameworks now utilize GPUs, and some companies like Cerebras are going even further, using programmable chips or creating special-purpose hardware whose sole job is to train neural nets. Existing web applications such as browsers have also evolved to the point where neural network computation can run in the browser, for example with MIL WebDNN, speeding up computation compared to conventional frameworks. It utilizes WebGPU, which is currently supported only in Safari, but other browsers should soon catch up and make it more widely available. This raises the possibility that we could soon work on neural networks entirely from an iPad (a handheld device), eliminating much of the complexity and letting developers focus on building and solving more complex real-world problems in real time.
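As a small illustration of how transparently frameworks expose this hardware, the snippet below (TensorFlow assumed) lists the accelerators the framework can see and runs a computation on whichever device is available; the same code works unchanged on CPU or GPU.

```python
import tensorflow as tf

# See which accelerators the framework can use; no code changes are needed per device.
gpus = tf.config.list_physical_devices("GPU")
print("GPUs visible to TensorFlow:", gpus)

# Place a computation explicitly; frameworks otherwise pick a device automatically.
with tf.device("/GPU:0" if gpus else "/CPU:0"):
    x = tf.random.normal((1024, 1024))
    y = tf.matmul(x, x)  # this matrix multiply runs on the chosen device
print(y.shape)
```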
As computing power continues to get cheaper and smaller, networks that currently require supercomputers may soon fit in your HoloLens, smartwatch, AirPods, or whatever other wearable computing devices we use in the future, and life-size AR and VR content would become intrinsically woven into our day-to-day technology.
To illustrate pre-training and fine-tuning, I've built a small iOS mobile application using TensorFlow and incorporating an ImageNet model. The application harnesses the device camera to identify the object in view and suggest plausible labels from the graph generated using the ImageNet model and reinforcement learning.
The source can be found here.
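The app itself is written against TensorFlow's mobile tooling, so the snippet below is not its actual source; it is only a rough Python sketch of the same idea, labeling an image (a hypothetical saved camera frame) with an ImageNet-pre-trained model.

```python
import numpy as np
import tensorflow as tf

# An ImageNet-pre-trained classifier (MobileNetV2 chosen arbitrarily for illustration).
model = tf.keras.applications.MobileNetV2(weights="imagenet")

def plausible_labels(image_path, top=3):
    """Return the top ImageNet labels for one image file (the path is hypothetical)."""
    img = tf.keras.utils.load_img(image_path, target_size=(224, 224))
    x = np.expand_dims(tf.keras.utils.img_to_array(img), axis=0)
    x = tf.keras.applications.mobilenet_v2.preprocess_input(x)
    preds = model.predict(x)
    return tf.keras.applications.mobilenet_v2.decode_predictions(preds, top=top)[0]

# Example usage with a hypothetical saved camera frame:
# print(plausible_labels("frame_from_camera.jpg"))
```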
A good write-up Alfy, with real-life examples. Are you seeing any particular trend in the adoption of TPUs (Tensor Processing Units) while customized chips for NNs are being developed?
Tech Innovation Alchemist | AI-to-Blockchain Strategist | Building World-Class Engineering Teams | Future-First Leader
But if you were running a startup and you wanted such labeled life-sciences data, what would you do? You still have to get that massive exercise underway of roping in doctors to label it and then collating it. If you did it with experts from only one region, there would be a regional bias; and why would a doctor do the labeling for free? He will charge consultation prices. That would add to your immediate burn rate.
Senior Manager | AI & Data | Ex Deloitte | UAE Golden Visa Holder
I understand that training datasets are expensive to get; however, it is a one-time cost. Further, since the system trains itself on live data, the past datasets can also help a great deal, solving the underlying data problem. I am not sure about the life-sciences example you provided; I am speaking only from the BFS standpoint. Very interesting article, Alfy