Hugging Face partners with NVIDIA to democratise AI inference

Hugging Face has joined forces with NVIDIA to bring inference-as-a-service capabilities to one of the world’s largest AI communities. This collaboration, announced at the SIGGRAPH conference, will provide Hugging Face’s four million developers with streamlined access to NVIDIA-accelerated inference on popular AI models.

The new service enables developers to swiftly deploy leading large language models, including the Llama 3 family and Mistral AI models, with optimisation from NVIDIA NIM microservices running on NVIDIA DGX Cloud.

For Enterprise Hub users, the offering includes serverless inference, promising increased flexibility, minimal infrastructure overhead, and optimised performance through NVIDIA NIM. This service complements the existing Train on DGX Cloud AI training service available on Hugging Face, creating a comprehensive ecosystem for AI development and deployment.

The new tools are designed to address the challenges faced by developers navigating the growing landscape of open-source models.

By providing a centralised hub for model comparison and experimentation, Hugging Face and NVIDIA are lowering the barriers to entry for cutting-edge AI development. Accessibility is a key focus, with the new features available through simple “Train” and “Deploy” drop-down menus on Hugging Face model cards, allowing users to get started with minimal friction.

At the heart of this offering is NVIDIA NIM, a collection of AI microservices that includes both NVIDIA AI foundation models and open-source community models. These microservices are optimised for inference using industry-standard APIs, offering significant improvements in token processing efficiency – a critical factor in language model performance.
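To make "industry-standard APIs" concrete: NIM endpoints expose an OpenAI-compatible chat-completions interface. The sketch below builds such a request body; the endpoint URL is a placeholder and the model identifier is an assumption for illustration, not a guaranteed deployment name.

```python
import json

# Placeholder host; a real NIM deployment exposes /v1/chat/completions
NIM_URL = "https://your-nim-host/v1/chat/completions"

def chat_payload(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-compatible chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "stream": False,
    }

# Model id shown for illustration; check the catalogue for the exact name
payload = chat_payload("meta/llama3-70b-instruct",
                       "Summarise NIM in one sentence.")
print(json.dumps(payload, indent=2))

# A real call would then be, e.g.:
#   requests.post(NIM_URL,
#                 headers={"Authorization": f"Bearer {API_KEY}"},
#                 json=payload)
```

Because the request shape matches the OpenAI schema, existing client libraries and tooling can usually be pointed at a NIM endpoint by changing only the base URL and key.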

When accessed as a NIM, models like the 70-billion-parameter version of Llama 3 can achieve up to 5x higher throughput compared to off-the-shelf deployment on NVIDIA H100 Tensor Core GPU-powered systems. This performance boost translates to faster, more robust results for developers, potentially accelerating the development cycle of AI applications.
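To put the quoted 5x figure in concrete terms, here is a back-of-the-envelope calculation. The baseline tokens-per-second rate is an illustrative assumption, not a published benchmark:

```python
def generation_time(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds to generate num_tokens at a given decode throughput."""
    return num_tokens / tokens_per_second

baseline_tps = 100.0          # assumed off-the-shelf decode rate (illustrative)
nim_tps = baseline_tps * 5.0  # "up to 5x higher throughput" as a NIM

tokens = 2_000                # e.g. one long-form response
t_baseline = generation_time(tokens, baseline_tps)  # 20.0 seconds
t_nim = generation_time(tokens, nim_tps)            # 4.0 seconds
print(f"baseline: {t_baseline}s, NIM: {t_nim}s")
```

At any assumed baseline rate, a 5x throughput gain cuts per-response latency to a fifth, which compounds quickly across the many iterations of a development cycle.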

Underpinning this service is NVIDIA DGX Cloud, a platform purpose-built for generative AI. It offers developers scalable GPU resources that support every stage of AI development, from prototype to production, without the need for long-term infrastructure commitments. This flexibility is particularly valuable for developers and organisations looking to experiment with AI without significant upfront investments.

As AI continues to evolve and find new applications across industries, tools that simplify development and deployment will play a crucial role in driving adoption. This collaboration between NVIDIA and Hugging Face empowers developers with the resources they need to push the boundaries of what’s possible with AI.

Jonathan Bennion

Manager and Instructor, Applied Machine Learning at The Objective AI

3 months

Alex Galert please consider posting only when it adds substantial value to do so #AIHype #Checkthispostin2years

Jing Xie

Spot Instance Surfer | GPU Optimizer

4 months

Definitely see the benefits of this partnership. Curious at a price point of $36,999 per instance per month for DGX Cloud, are the productivity benefits worth the cost for those using it?

Khrystyna Prytula

Sales Manager, Certified English Teacher

4 months

This partnership is a game-changer! Simplifying AI deployment will empower developers to innovate faster with powerful models like Llama 3.

Svitlana Medvedyk

Head of Sales and Marketing Department

4 months

Awesome collaboration! This will totally streamline AI projects!

Dora Vanourek

Coach in Chris Donnelly's The Creator Accelerator | xIBM Consulting | xPwC | Certified Executive Coach | #1 creator

4 months

Sounds great! Thank you for sharing this update, Alex Galert
