Reusable, Efficient AI - AI Meets Engineering, or Simply Call It Engineered!!
Dhruvin Desai
Lateral Thinker | Engineering Leader | Walmart | Tesco | Jio | Boeing | Avaya
Engineering is the application of science and maths to solve problems. While scientists and inventors come up with innovations, it is engineers who apply these discoveries to the real world.
Ever wondered why AI/deep learning does not look like software engineering, but rather like a science? It has been integrated into software engineering processes for a decade, yet for every new problem the models have to be recreated and retrained. There is no concept of reusability, nor of breaking a problem down into simpler, manageable building blocks.
A model trained to forecast stock prices cannot be reused to forecast housing prices without retraining on another large data set. Yet forecasting a price follows the same economic models, which a human being could readily re-apply from stocks to housing. In fact, in the real world, if housing prices go down, stock traders will immediately mark down the housing stocks.
That now seems to have caught attention at Google and elsewhere: Google is developing an AI framework, Pathways, that aims to solve exactly these problems - the lack of reusability, the need for large labelled data sets, and inefficient use of compute.
Behind the scenes, the techniques being employed are 'sparse activation', 'meta-learning', and 'few-shot learning'. One of the many implementations of meta-learning uses a meta-learner based on an LSTM.
The idea behind meta-learning is to create a learner (the model) that can be reused, with parameters already initialised when a model for a new class (one never seen before) has to be created - rather than the traditional approach, where parameters are initialised randomly for every new problem. For instance, if you already have a meta-learner that can output parameter weights for a model that learnt to identify certain classes of images, those parameters can be fed into a learner (a new model), and with very few images of a new class the learner can learn and adjust both its own parameters and those of the meta-learner.
For every new class of training, the meta-learner takes the learner's loss function as input and its own output from the previous time step as state, and produces a new state and new output parameters. This process continues for each new image class, eventually building a meta-learner trained to learn any class of images.
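To make that loop concrete, here is a toy numpy sketch in the spirit of the LSTM meta-learner described above: the learner's parameter plays the role of the LSTM cell state, a forget gate decides how much of the old parameter to keep, and an input gate acts as a learnt, per-step learning rate. The gate weights, biases, and the tiny one-parameter regression task are all illustrative assumptions; in the real method the gate weights are themselves meta-trained across many tasks.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class LSTMMetaLearner:
    """Toy LSTM-style meta-learner: the learner's parameter theta is
    treated as the cell state, updated by forget and input gates that
    read the learner's current gradient and loss."""

    def __init__(self, seed=0):
        rng = np.random.default_rng(seed)
        # Gate weights over the input [gradient, loss]. In the real
        # method these would be meta-trained; here they are illustrative.
        self.Wf = rng.normal(0, 0.01, size=2)
        self.bf = 4.0   # bias the forget gate toward keeping old theta
        self.Wi = rng.normal(0, 0.01, size=2)
        self.bi = -1.0  # bias the input gate toward a small step size

    def step(self, theta, grad, loss):
        x = np.array([grad, loss])
        f = sigmoid(self.Wf @ x + self.bf)  # forget gate
        i = sigmoid(self.Wi @ x + self.bi)  # input gate (learnt step size)
        # Cell-state update: theta_t = f * theta_{t-1} - i * grad
        return f * theta - i * grad

def train_few_shot(meta, theta0, xs, ys, steps=50):
    """Fit a 1-D linear learner y = theta * x from a handful of
    examples, letting the meta-learner produce each parameter update."""
    theta = theta0
    for _ in range(steps):
        pred = theta * xs
        loss = np.mean((pred - ys) ** 2)
        grad = np.mean(2 * (pred - ys) * xs)
        theta = meta.step(theta, grad, loss)
    return theta

# Few-shot task: only 4 labelled points from the true model y = 3x.
xs = np.array([0.5, 1.0, 1.5, 2.0])
ys = 3.0 * xs
theta = train_few_shot(LSTMMetaLearner(), theta0=0.0, xs=xs, ys=ys)
print(round(float(theta), 2))  # theta ends up near the true value 3.0
```

The point of the sketch is the shape of the update, not the numbers: instead of a hand-tuned learning rate, the step size and the amount of old state to keep are outputs of a gated network that could itself be trained.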
This technique also requires very little labelled data to train new classes, since the meta-learner has already initialised the parameters and learning does not start from random values.
More info here: https://www.youtube.com/watch?v=Kk1I0i6SGzY&list=LL&index=1
Sparse activation allows only a few network paths to be activated based on the input data. For example, images of trees may require a different activation path than images of flowers, yet both can still learn from the same meta-learner with a few shots. The idea is to be efficient with the amount of compute utilised by not activating the entire network.
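Here is a hedged toy sketch of the idea (not the actual Switch Transformer code): a router sends each token to exactly one expert feed-forward network, so only a fraction of the layer's weights are touched per input. All dimensions, weight initialisations, and the FLOP counters are made up for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class SwitchLayer:
    """Toy top-1 routing layer: one small feed-forward 'expert' per
    route, with only the chosen expert's weights used for each token."""

    def __init__(self, d_model=8, d_ff=16, num_experts=4, seed=0):
        rng = np.random.default_rng(seed)
        self.router = rng.normal(0, 0.1, size=(d_model, num_experts))
        self.W1 = rng.normal(0, 0.1, size=(num_experts, d_model, d_ff))
        self.W2 = rng.normal(0, 0.1, size=(num_experts, d_ff, d_model))
        self.weights_dense = 0   # counters to show the compute saving
        self.weights_sparse = 0

    def forward(self, tokens):
        probs = softmax(tokens @ self.router)      # routing probabilities
        out = np.zeros_like(tokens)
        for t, tok in enumerate(tokens):
            e = int(np.argmax(probs[t]))           # top-1 expert choice
            h = np.maximum(tok @ self.W1[e], 0)    # chosen expert's ReLU FFN
            # Scale by the router probability so routing stays differentiable
            out[t] = probs[t, e] * (h @ self.W2[e])
            self.weights_sparse += self.W1[e].size + self.W2[e].size
            self.weights_dense += self.W1.size + self.W2.size  # if all ran
        return out

tokens = np.random.default_rng(1).normal(size=(10, 8))  # 10 toy tokens
layer = SwitchLayer()
out = layer.forward(tokens)
print(layer.weights_dense // layer.weights_sparse)  # 4
```

With 4 experts, each token touches exactly a quarter of the expert weights, which is why parameter counts can grow enormously while per-token compute stays roughly flat.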
More info on the Switch Transformer and sparse activation for a trillion-parameter model: https://www.dl.reviews/2021/02/10/switch-transformers/
Staying with the example of a meta-learner for images: integrating a CNN with an LSTM for image classification or video analysis, remembering and updating the state of the current output at each time step or frame, is a tried and tested technique. The same approach can be employed for meta-learning, as described above.
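A minimal sketch of that CNN+LSTM pattern, with a random linear projection standing in for a real CNN (all names and sizes here are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class FrameLSTM:
    """Per-frame features (here a stand-in linear projection instead of
    a real CNN) are fed into an LSTM cell whose state accumulates a
    summary of everything seen so far."""

    def __init__(self, feat_dim=16, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        self.cnn = rng.normal(0, 0.1, size=(64, feat_dim))  # fake "CNN"
        n = feat_dim + hidden
        # One weight matrix per LSTM gate: forget, input, output, candidate.
        self.Wf, self.Wi, self.Wo, self.Wc = (
            rng.normal(0, 0.1, size=(n, hidden)) for _ in range(4))
        self.hidden = hidden

    def run(self, frames):
        h = np.zeros(self.hidden)
        c = np.zeros(self.hidden)
        for frame in frames:                 # one time step per frame
            feat = frame @ self.cnn          # per-frame "CNN" features
            x = np.concatenate([feat, h])
            f, i, o = (sigmoid(x @ W) for W in (self.Wf, self.Wi, self.Wo))
            c = f * c + i * np.tanh(x @ self.Wc)  # update cell state
            h = o * np.tanh(c)                    # running video summary
        return h

frames = np.random.default_rng(1).normal(size=(5, 64))  # 5 flattened frames
video_embedding = FrameLSTM().run(frames)
print(video_embedding.shape)  # (8,)
```

The final hidden vector is what a downstream classifier would consume; the same carry-state-forward structure is what the LSTM meta-learner reuses, with parameters and losses in place of frames.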
From an engineering management perspective, going by Google's blog on Pathways: why did it take two decades for technology companies to realise reusability and efficiency in AI, both of which are engineering traits?
Did we as an industry always treat AI as research (science) and never focus on engineering it? What is the cost implication of failing to engineer a solution? How many duplicate models did the industry create, how much money was spent on creating labelled data sets, and how much compute did we burn for want of sparse activation?