Leveraging IBM Power10 for AI
IBM pioneered on-processor AI acceleration with the Matrix Math Accelerator (MMA) engines built into the IBM Power10 chip. An inferencing model is a model that, having been trained to recognise patterns of interest, is applied to new data to extract insights. The MMA engines let the Power10 platform run inferencing faster than comparable architectures without spending a single extra watt on added GPUs.
The Power10 chip extracts insights from data faster than competing chip architectures while consuming far less energy than GPU-based systems, and that is what makes it optimised for AI.
Inferencing requires far less compute power than training an artificial intelligence (AI) model. It is therefore entirely possible, and more energy efficient, to run inference without any extra hardware accelerators (such as GPUs), even on edge devices. AI inferencing models commonly run on smartphones and similar devices using only the CPU; many of the picture and face filters in social media apps are AI inferencing models. A rough, hardware-agnostic sketch of this compute gap follows below.
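To make the gap concrete, the PyTorch sketch below times a bare forward (inference) pass against a full training step (forward, backward and weight update) on the same model. The model shape, batch size and repetition count are arbitrary choices for illustration; on most machines the inference loop finishes in a fraction of the training time.

```python
# Rough, hardware-agnostic illustration that an inference pass costs far
# less compute than a training step on the same model. Model shape, batch
# size and repetition count are arbitrary choices for the demo.
import time

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
batch = torch.randn(64, 512)
target = torch.randn(64, 512)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

def timed(step, reps=100):
    start = time.perf_counter()
    for _ in range(reps):
        step()
    return time.perf_counter() - start

def inference_step():
    # Forward pass only, no gradient bookkeeping.
    with torch.no_grad():
        model(batch)

def training_step():
    # Forward pass + backward pass + weight update.
    optimizer.zero_grad()
    loss_fn(model(batch), target).backward()
    optimizer.step()

print(f"100 inference steps: {timed(inference_step):.3f}s")
print(f"100 training steps:  {timed(training_step):.3f}s")
```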
Leveraging IBM Power10 for AI, especially for inferencing, requires no extra effort from AI DevOps teams. Data science libraries such as OpenBLAS, libATen, Eigen and MLAS, to name a few, are already optimised to make use of the MMA engines.
So, AI frameworks that build on these libraries, such as PyTorch, TensorFlow and ONNX Runtime, benefit from the on-chip acceleration out of the box. These optimised libraries are available through the RocketCE channel on anaconda.org.
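As a minimal sketch of what "no extra effort" means in practice: the PyTorch code below is ordinary CPU inference with nothing Power10-specific in it. Assuming the framework was installed from the RocketCE channel (for example, conda install -c rocketce pytorch-cpu; exact package names may vary), the matrix multiplications inside the model are dispatched to the MMA engines transparently. The model here is a small stand-in, not a real trained network.

```python
# Ordinary PyTorch CPU inference: nothing in this code is Power10-specific.
# With a PyTorch build that links an MMA-enabled maths library (such as one
# installed from the RocketCE conda channel), the matrix multiplications in
# the linear layers are routed to the MMA engines automatically.
import torch
import torch.nn as nn

# A small stand-in classifier; a real trained model would be loaded instead.
model = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
model.eval()  # inference mode: disables dropout and batch-norm updates

batch = torch.randn(32, 128)  # 32 samples with 128 features each

with torch.no_grad():  # gradients are not needed for inferencing
    predictions = model(batch).argmax(dim=1)

print(predictions)  # predicted class index for each of the 32 samples
```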
IBM Power10 can also speed up inferencing by using reduced-precision data. Instead of feeding the inference model 32-bit floating-point values, one can feed it 16-bit floating-point values, for example, moving twice as much data through the processor in the same time. For some models this works with no meaningful loss of accuracy in the inference results.
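Below is an illustrative PyTorch sketch of the idea, using bfloat16 (one common 16-bit floating-point format; torch.float16 is another) via CPU autocast. The model is again a stand-in, and accuracy under reduced precision should always be validated per model.

```python
# Sketch of reduced-precision inferencing in PyTorch. Halving the width of
# each value lets the processor move twice as much data per operation.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()
batch = torch.randn(32, 128)

# 32-bit baseline for comparison.
with torch.no_grad():
    preds_fp32 = model(batch).argmax(dim=1)

# Same model, with matmuls autocast to 16-bit bfloat16 on the CPU.
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    preds_bf16 = model(batch).argmax(dim=1)

# Check whether reduced precision changed any predictions for this batch.
print(torch.equal(preds_fp32, preds_bf16))
```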
Inferencing is the last stage of the AI DevOps cycle. The IBM Power10 platform was designed to be AI-optimised, helping clients extract insights from data more cost-effectively: it uses less energy and reduces the need for extra accelerators.
Message me to learn more about how IBM Power10 is enhancing the speed of AI inferencing.