Reusable, Efficient AI - AI Meets Engineering or Simply call it Engineered!!
Google Pathways: https://blog.google/technology/ai/introducing-pathways-next-generation-ai-architecture/


Engineering is the application of science and maths to solve problems. While scientists and inventors come up with innovations, it is engineers who apply these discoveries to the real world.

Have you ever wondered why AI/deep learning does not look like software engineering, but rather like a science? It has been integrated into software engineering processes for a decade, yet for every new problem the models have to be recreated and retrained. There is no concept of reusability, nor of breaking a problem down into simpler, manageable building blocks.

A model trained to forecast stock prices cannot be reused to forecast housing prices without retraining on a large data set. Yet forecasting a price follows the same economic models, which a human being would readily re-apply from stock prices to housing. In fact, in the real world, if housing prices go down, stock brokers will immediately mark down the housing stocks.

That seems to have caught attention at Google and elsewhere. Google is developing an AI architecture, Pathways, that aims to solve the following three problems.

  • Today's AI models are typically trained to do only one thing. Pathways will enable us to train a single model to do thousands or millions of things.
  • Today's models mostly focus on one sense. Pathways will enable multiple senses. For instance, Pathways could enable multimodal models that encompass vision, auditory, and language understanding simultaneously in a single model.
  • Today's models are dense and inefficient. Pathways will make them sparse and efficient: given an input, the model can route through and activate only a few paths, unlike the current state of the art, which activates the entire model. For example, vision activates specific paths, while vision and auditory tasks share some paths.

Behind the scenes, the techniques being employed are 'sparse activation', 'meta-learning' and 'few-shot learning'. One of the many implementations of meta-learning uses a meta-learner based on an LSTM.

The idea behind meta-learning is to create a learner (the model) that can be reused and whose parameters are already initialised when a model for a new class (never seen before by the model) has to be created, rather than the traditional approach where the parameters of every new model are initialised randomly for every new problem. So, for instance, if you already have a meta-learner that can output parameter weights for a model that has learnt to identify certain classes of images, those parameters can be fed into a learner (a new model), and with very few images of the new class the learner can learn and adjust the parameters of both the meta-learner and the learner itself.

For every new class of training, the meta-learner takes the loss function of the learner as input and its own output from the previous time step as state, and produces a new state and new output parameters. This process continues for each new class of image identification, eventually building a meta-learner trained to learn any class of images.
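To make that loop concrete, here is a minimal PyTorch sketch of an LSTM-based meta-learner in the spirit described above. It is an illustration only, not the actual paper or Pathways implementation: the class name `LSTMMetaLearner`, the two-feature input of (gradient, loss) per parameter, and the additive parameter update are all assumptions made for clarity.

```python
import torch
import torch.nn as nn

class LSTMMetaLearner(nn.Module):
    """Hypothetical sketch of an LSTM-based meta-learner.

    At each step it consumes the learner's current loss and gradients
    and emits updated parameters for the learner, carrying its own
    (hidden, cell) state across steps and tasks.
    """
    def __init__(self, num_learner_params: int, hidden_size: int = 64):
        super().__init__()
        # Input per parameter and step: [gradient, loss] -> 2 features.
        self.cell = nn.LSTMCell(input_size=2, hidden_size=hidden_size)
        self.to_param_update = nn.Linear(hidden_size, 1)
        self.num_params = num_learner_params

    def forward(self, loss, grads, params, state=None):
        # Per-parameter input: its gradient plus the (shared) scalar loss.
        inp = torch.stack([grads, loss.expand_as(grads)], dim=1)   # (P, 2)
        if state is None:
            h = inp.new_zeros(self.num_params, self.cell.hidden_size)
            c = inp.new_zeros(self.num_params, self.cell.hidden_size)
        else:
            h, c = state
        h, c = self.cell(inp, (h, c))
        # Proposed additive update for each learner parameter.
        new_params = params + self.to_param_update(h).squeeze(1)
        return new_params, (h, c)
```

In a full training loop one would sample few-shot tasks (new image classes), run the learner with the parameters the meta-learner emits, compute the loss on a held-out set for that class, and backpropagate through the meta-learner so that its initialisations improve from task to task.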


This technique also requires very little labelled data to train new classes, because the meta-learner has already initialised the parameters and learning does not start from random parameters.

More info here https://www.youtube.com/watch?v=Kk1I0i6SGzY&list=LL&index=1

Sparse activation allows only a few network paths to be activated based on the input data. For example, a class of tree images may require different activation than a class of flower images, but both can still learn from the same meta-learner with a few shots. The idea is to be efficient with the amount of compute being used by not activating the entire network.
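As a rough illustration of sparse activation, here is a small top-k mixture-of-experts layer in PyTorch, similar in spirit to the routing used in Switch Transformer style models (linked below). The names and sizes (`SparseMoELayer`, 8 experts, top-2 routing) are arbitrary choices for the sketch, not Pathways internals.

```python
import torch
import torch.nn as nn

class SparseMoELayer(nn.Module):
    """Illustrative top-k mixture-of-experts layer (hypothetical names/sizes).

    A small router scores every expert per input and only the top-k
    experts are run, so most of the network stays inactive for any
    given example -- the essence of sparse activation.
    """
    def __init__(self, dim: int = 128, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
             for _ in range(num_experts)]
        )
        self.top_k = top_k

    def forward(self, x):                       # x: (batch, dim)
        scores = self.router(x)                 # (batch, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)
        outputs = []
        for b in range(x.size(0)):
            # Only the selected experts are actually evaluated for this example.
            row = sum(w * self.experts[int(e)](x[b])
                      for w, e in zip(weights[b], idx[b]))
            outputs.append(row)
        return torch.stack(outputs)
```

The point is that compute per example grows with k, not with the total number of experts, so the model can be made much larger without a proportional increase in cost.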


More info on the Switch Transformer and sparse activation for a trillion-parameter model: https://www.dl.reviews/2021/02/10/switch-transformers/

Taking the example of a meta-learner for images: combining a CNN with an LSTM for image classification or video analysis, so that the state of the current output is remembered and updated at each time-step or frame, is a tried-and-tested approach. The same technique can be employed for meta-learning as described above.
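Below is a minimal sketch of that CNN + LSTM pattern, assuming per-frame image inputs; the class name and layer sizes are illustrative only, not a reference architecture.

```python
import torch
import torch.nn as nn

class CNNLSTMClassifier(nn.Module):
    """Sketch of the familiar CNN + LSTM pattern the text refers to.

    A small CNN extracts features per frame/image and an LSTM carries
    state across time-steps; all layer sizes here are arbitrary.
    """
    def __init__(self, num_classes: int = 10, hidden_size: int = 128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),            # -> (batch*time, 32, 1, 1)
        )
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden_size,
                            batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, frames):                  # frames: (batch, time, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1))  # fold time into batch
        feats = feats.flatten(1).view(b, t, 32) # (batch, time, 32)
        out, _ = self.lstm(feats)               # state carried across frames
        return self.head(out[:, -1])            # classify from last time-step
```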

From an engineering-management perspective, going by the Google blog on Pathways, why did it take two decades for technology companies to realise reusability and efficiency in AI, both of which are engineering traits?

Did we, as an industry, always treat AI as research (science) and never focus on engineering it? What is the cost implication of failing to engineer a solution? How many duplicate models did the industry create, and how much money was spent on creating labelled data-sets? And how much compute did we waste in the absence of sparse activation?
