Meta collaborates with NVIDIA to build the world's largest AI Supercomputer for AI Research and Production


Meta has unveiled the AI Research SuperCluster (RSC), which it claims is one of the world's fastest AI supercomputers. RSC will analyze text, images, and video together across hundreds of languages, helping Meta develop better AI models.


While introducing RSC, Mark Zuckerberg said: “Meta has developed what we believe is the world’s fastest AI supercomputer. We’re calling it RSC for AI Research SuperCluster. The experiences we’re building for the metaverse require enormous compute power (quintillions of operations/second!) and RSC will enable new AI models that can learn from trillions of examples, understand hundreds of languages, and more.”


RSC will play an important role in the Metaverse

To fully realize the benefits of self-supervised learning and transformer-based models, increasingly large, complex, and adaptable models must be trained. Speech recognition must work well even in challenging scenarios with heavy background noise, and NLP models must understand more languages and dialects.


According to Meta, RSC can more quickly train models that use multimodal signals to determine whether an action, sound, or image is harmful or benign. Meta also stated that, as the foundation for the metaverse is laid, RSC will grow even larger with enhanced capabilities. Meta researchers have already used RSC to train large models in NLP and computer vision.
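As a rough illustration of the idea only (not Meta's actual architecture), the PyTorch sketch below fuses precomputed text, audio, and image embeddings into a single representation and scores it as harmful or benign; the layer sizes, names, and fusion scheme are assumptions chosen for the example.

```python
# Minimal sketch of multimodal harmful/benign classification, assuming
# precomputed text, audio, and image embeddings. Layer sizes and names
# are illustrative, not Meta's actual model.
import torch
import torch.nn as nn


class MultimodalClassifier(nn.Module):
    def __init__(self, text_dim=768, audio_dim=512, image_dim=1024, hidden=256):
        super().__init__()
        # Project each modality into a shared hidden space.
        self.text_proj = nn.Linear(text_dim, hidden)
        self.audio_proj = nn.Linear(audio_dim, hidden)
        self.image_proj = nn.Linear(image_dim, hidden)
        # Fuse the modalities and emit a single harmful-vs-benign logit.
        self.head = nn.Sequential(
            nn.Linear(hidden * 3, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, text_emb, audio_emb, image_emb):
        fused = torch.cat(
            [self.text_proj(text_emb),
             self.audio_proj(audio_emb),
             self.image_proj(image_emb)],
            dim=-1,
        )
        return self.head(fused)  # raw logit; apply sigmoid for a probability


model = MultimodalClassifier()
logit = model(torch.randn(1, 768), torch.randn(1, 512), torch.randn(1, 1024))
print(torch.sigmoid(logit))  # probability that the content is harmful
```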


NVIDIA-Powered Research Infrastructure

Meta collaborated with NVIDIA to construct the AI Research SuperCluster. RSC uses 760 NVIDIA DGX A100 systems as its compute nodes, for a total of 6,080 NVIDIA A100 GPUs connected over an NVIDIA Quantum 200Gb/s InfiniBand network, delivering 1,895 petaflops of TF32 performance. Penguin Computing is the NVIDIA Partner Network delivery partner for RSC.
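Those figures are internally consistent: each DGX A100 holds eight A100 GPUs, and assuming the commonly quoted ~312 TFLOPS of TF32 Tensor Core throughput per A100 (a peak figure with sparsity, so an optimistic assumption), the totals line up as the quick check below shows.

```python
# Quick sanity check of the published RSC phase-1 numbers.
# Assumption: ~312 TFLOPS of TF32 throughput per A100 (the figure NVIDIA
# quotes with sparsity enabled); real achievable throughput is lower.
DGX_A100_SYSTEMS = 760
GPUS_PER_DGX = 8          # each DGX A100 holds eight A100 GPUs
TF32_TFLOPS_PER_GPU = 312

total_gpus = DGX_A100_SYSTEMS * GPUS_PER_DGX
total_petaflops = total_gpus * TF32_TFLOPS_PER_GPU / 1_000

print(total_gpus)        # 6080 GPUs
print(total_petaflops)   # ~1897 petaflops, matching the quoted 1,895 PF
```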


With its Altus systems, Penguin also provided Meta with managed services and AI-optimized infrastructure, including 46 petabytes of cache storage. Pure Storage FlashBlade and FlashArray//C supply the scalable all-flash storage needed to scale RSC.


Early Meta benchmarks show that RSC trains large NLP models three times faster and runs computer vision jobs twenty times faster than Meta's previous research cluster. In a second phase later this year, RSC will expand to 16,000 GPUs, at which point Meta anticipates roughly five exaflops of mixed-precision AI performance.
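The five-exaflop target follows from the same back-of-the-envelope arithmetic, assuming ~312 TFLOPS of peak FP16/BF16 Tensor Core throughput per A100; delivered performance on real workloads would be lower.

```python
# Back-of-the-envelope check of the phase-2 target.
# Assumption: ~312 TFLOPS of dense FP16/BF16 Tensor Core peak per A100.
PHASE2_GPUS = 16_000
MIXED_PRECISION_TFLOPS_PER_GPU = 312

total_exaflops = PHASE2_GPUS * MIXED_PRECISION_TFLOPS_PER_GPU / 1_000_000
print(total_exaflops)  # ~4.99 exaflops, in line with the stated five exaflops
```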


Privacy and Security

According to Meta, RSC was designed with privacy and security as top priorities.

  • RSC is separated from the rest of the internet. It has no direct inbound or outbound connections, and all traffic is routed through Meta's production data centers.
  • The entire data path from the storage systems to the GPUs is encrypted end to end. Meta says it has the tools and processes in place to regularly verify that these requirements are met.
  • Before data is imported to RSC, it goes through a privacy review to confirm it has been properly anonymized, and it is then encrypted before being used to train AI models. The decryption keys are deleted regularly, so older data is no longer accessible; a generic sketch of this pattern follows the list.
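As a generic illustration of that last point only (not Meta's actual pipeline), the sketch below uses the Python `cryptography` package to encrypt an anonymized record and then make it unreadable by discarding the key, a pattern often called crypto-shredding.

```python
# Illustrative sketch of key deletion ("crypto-shredding"), not Meta's pipeline.
# Requires the `cryptography` package: pip install cryptography
from cryptography.fernet import Fernet, InvalidToken

# Encrypt an already anonymized training record before it is stored.
key = Fernet.generate_key()
ciphertext = Fernet(key).encrypt(b"anonymized training example")

# While the key exists, a training job can read the record.
print(Fernet(key).decrypt(ciphertext))

# Deleting the key makes every record encrypted under it unreadable:
# the ciphertext may remain on disk, but it can no longer be decrypted.
del key
try:
    Fernet(Fernet.generate_key()).decrypt(ciphertext)  # any other key fails
except InvalidToken:
    print("older data is no longer accessible")
```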



Follow us for more Tech news and Web3 updates!

