Driving a Holistic AI-at-Scale Approach and Building an AI Supercomputer
Shalaka Verma
Technical Executive Leadership | Quantum Computing | Presales| Startup Advisor
Microsoft recently developed a single system for OpenAI with more than 285,000 CPU cores, 10,000 GPUs, and 400 gigabits per second of network connectivity for each GPU server.
According to recent work in the AI research community, certain tasks are better learned and performed by a single large model, which can then be applied more generically across a broader set of capabilities. Examples include learning not just a language but also its nuances, or summarising long speeches.
Developing and maturing the ability to build large generic models is an important step in the evolution of AI into a platform rather than a point solution. Making this broader set of capabilities, and the models that provide them, generally available lets developers and data scientists stay focused on the specific solution at hand and go deeper into specialised data.
“If we do it right, we might be able to evolve AI into a form of work that taps into our uniquely human capabilities and restores our humanity. The ultimate paradox is that this technology may become a powerful catalyst that we need to reclaim our humanity.”
John Hagel
While this is an impressive development, clients who do not need a dedicated supercomputer are also covered: Azure AI makes the same set of AI accelerators and networking available in Azure. Some of the large Azure Turing models will also be open sourced soon, along with training recipes, and will be available via Azure Machine Learning.
Along with this, Microsoft announced a new version of DeepSpeed, which reduces the amount of compute power needed to train large models. Specifically, this version lets people train models more than 15 times larger and 10 times faster than they could without DeepSpeed on the same infrastructure.
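To give a feel for why a library like DeepSpeed can fit much larger models on the same hardware, here is a minimal arithmetic sketch. It is not from the announcement: it assumes the memory accounting published for ZeRO, the optimizer in DeepSpeed, under mixed-precision Adam training (2 bytes per parameter for fp16 weights, 2 for fp16 gradients, 12 for optimizer states). The function name `model_state_gb`, the 7.5-billion-parameter model, and the 64-GPU count are hypothetical values chosen for illustration.

```python
# Illustrative arithmetic only: per-GPU memory for model states under
# mixed-precision Adam, with and without ZeRO-style partitioning.
# The byte counts are an assumption taken from the ZeRO accounting,
# not figures stated in the article.

def model_state_gb(params: float, gpus: int, stage: int) -> float:
    """Return per-GPU model-state memory in GB (1 GB = 1e9 bytes).

    stage 0: plain data parallelism, every GPU holds everything
    stage 1: optimizer states are partitioned across GPUs
    stage 2: gradients are partitioned as well
    stage 3: parameters are partitioned as well
    """
    weights = 2 * params   # fp16 parameters
    grads = 2 * params     # fp16 gradients
    optim = 12 * params    # fp32 weight copy + Adam momentum + Adam variance
    if stage >= 1:
        optim /= gpus
    if stage >= 2:
        grads /= gpus
    if stage >= 3:
        weights /= gpus
    return (weights + grads + optim) / 1e9


if __name__ == "__main__":
    params, gpus = 7.5e9, 64  # hypothetical model size and GPU count
    for stage in range(4):
        gb = model_state_gb(params, gpus, stage)
        print(f"stage {stage}: {gb:.1f} GB per GPU")
```

Under these assumptions, the per-GPU footprint of the model states shrinks as more of them are partitioned, which is one way the same infrastructure can hold a larger model. Activations and temporary buffers are ignored here, so real numbers will differ.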
Details of this announcement can be read here