The 'I' stands for Inference
Dylan Fernandes
Technical Project Manager | Solutions Consultant | Presales Engineer | Systems Architect | Operations & Delivery | Business Analyst | Data Specialist
If there's one key takeaway I learned from IBM's release of the IC922 this year, it's the difference between training a model and using a model to make inferences. And if that didn't make your head spin, you're already off to a solid start.
Artificial Intelligence (AI) is the "flavor of the decade" this time, folks, and has been for some time. AI workloads show great promise and are steadily rising as organisations think up new and innovative ways to take advantage of the overabundance of data at their disposal, whether collected over the years or from the deluge yet to come.
All theories of the Terminator coming to life aside, however, the top use cases today remain the following:
- Classification (e.g. identifying cats vs. dogs)
- Prediction (e.g. how likely a bank transaction is to be fraudulent)
- Automation (e.g. when x occurs, do y)
And all of these are based on "learnings" picked up from datasets fed into an algorithm developed by someone, somewhere (Schwarzenegger in the not-too-distant future, perhaps).
Anyone even remotely involved in AI will know it primarily comprises three segments (see image below). Data moves from various sources into a machine, where it is used to train, develop, and fine-tune a model; that model is then used in the real world to make judgments, predictions, classifications (you name it) based on new data collected in real time.
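To make that flow concrete, here's a minimal sketch in Python using scikit-learn, borrowing the fraud example from the list above. Everything in it is made up purely for illustration (the dataset is synthetic and the model choice is arbitrary), but the three segments map cleanly onto it:

```python
# A toy end-to-end pass through the three segments: Data -> Training -> Inference.
# Illustrative only: the dataset, features, and model choice are all invented.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# --- Data stage: collect and stage historical examples ---
# Two made-up features per transaction (say, amount and hour of day),
# with a 0/1 label for "was this fraudulent?".
rng = np.random.default_rng(seed=42)
X = rng.normal(size=(1000, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 1).astype(int)  # synthetic labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# --- Training stage: the compute-heavy part, often repeated to tune accuracy ---
model = LogisticRegression()
model.fit(X_train, y_train)
print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")

# --- Inference stage: score fresh, never-before-seen data in real time ---
new_transaction = np.array([[1.8, 0.3]])
fraud_probability = model.predict_proba(new_transaction)[0, 1]
print(f"probability of fraud: {fraud_probability:.2f}")
```

Notice that the expensive part (the fit) happens once, up front, while the prediction call can then be run on every new transaction as it arrives.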
While the announcement of IBM's AC922 (link below) received a lot of praise owing to its use of powerful GPUs specialized for training models quickly on large volumes of data (the Training stage), it left a big hole in terms of where to keep this data for quick access before or during training (the Data stage), and in how to use the trained model afterwards to make inferences on fresh new data (the Inference stage). Filling that hole was one of the biggest reasons for the release of the IC922.
Training and inferencing, while similar, have quite different requirements in terms of hardware and performance:
On the one hand, training a model requires large volumes of data to teach it (often over multiple runs to enhance accuracy), is compute intensive, and requires the use of accelerators like GPUs. In addition, advanced I/O capabilities that allow high bandwidth and low latency, along with server scalability, are key to this phase of the machine learning process.
Inferencing, however, deals with throwing fresh new data at an already trained model that is ready, or nearly ready, for production. The accelerators here don't need to be as powerful, leaving room for hardware that is more economical in energy and cost. As in the training phase (although the emphasis is even greater here), advanced I/O capabilities and server scalability are vital, providing the agility to size the system so it keeps up with business requirements.
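As a rough illustration of why the two hardware profiles differ, here's a sketch in PyTorch (again purely illustrative; the network, layer sizes, and hyperparameters are all made up). Training repeats forward passes, gradient computations, and weight updates across the whole dataset, while inference is a single forward pass with no gradients at all:

```python
# Contrasting the two phases on the same tiny network (PyTorch; all
# sizes and hyperparameters below are arbitrary, for illustration only).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

# --- Training: forward pass + loss + backward pass + weight update,
# repeated over many epochs. This is where heavyweight GPUs earn their keep.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
inputs = torch.randn(64, 16)          # a made-up batch of training data
labels = torch.randint(0, 2, (64,))   # made-up labels
for epoch in range(10):               # multiple runs to enhance accuracy
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()                   # gradient computation: the expensive bit
    optimizer.step()

# --- Inference: a single forward pass on fresh data, with no gradients
# kept, so far less memory and compute; modest accelerators often suffice.
model.eval()
with torch.no_grad():
    fresh_sample = torch.randn(1, 16)
    prediction = model(fresh_sample).argmax(dim=1)
    print(f"predicted class: {prediction.item()}")
```

That asymmetry is exactly what makes it sensible to build one class of machine for training and a leaner, more energy-efficient one for inference.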
What did you infer this week in the bustling world of AI?
Disclaimer: These views are my own, based on my own learning in the field of AI. Feedback is always welcome.
Stay curious :)