Hugging Face partners with NVIDIA to democratise AI inference
Alex Galert
Hugging Face has joined forces with NVIDIA to bring inference-as-a-service capabilities to one of the world’s largest AI communities. This collaboration, announced at the SIGGRAPH conference, will provide Hugging Face’s four million developers with streamlined access to NVIDIA-accelerated inference on popular AI models.
The new service enables developers to swiftly deploy leading large language models, including the Llama 3 family and Mistral AI models, with optimisation from NVIDIA NIM microservices running on NVIDIA DGX Cloud.
For Enterprise Hub users, the offering includes serverless inference, promising increased flexibility, minimal infrastructure overhead, and optimised performance through NVIDIA NIM. This service complements the existing Train on DGX Cloud AI training service available on Hugging Face, creating a comprehensive ecosystem for AI development and deployment.
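As a rough sketch of what "serverless inference" looks like from a developer's side, the snippet below builds an authenticated request against Hugging Face's hosted inference API. This is not code from the announcement: the model ID is illustrative, and the actual HTTP call is left commented out since it requires a valid access token.

```python
# Sketch: querying a model through Hugging Face serverless inference.
# The model ID below is illustrative; HF_TOKEN must hold a real access token.
import json
import os
import urllib.request

API_BASE = "https://api-inference.huggingface.co/models"

def build_request(model: str, prompt: str, token: str) -> urllib.request.Request:
    """Build an authenticated POST request for the serverless inference API."""
    return urllib.request.Request(
        f"{API_BASE}/{model}",
        data=json.dumps({"inputs": prompt}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__":
    token = os.environ.get("HF_TOKEN", "hf_xxx")  # placeholder token
    req = build_request("mistralai/Mistral-7B-Instruct-v0.3", "Hello!", token)
    print(req.full_url)
    # Uncomment with a valid token to actually run inference:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

The point of the "serverless" framing is that nothing beyond this request is the developer's problem: no GPU provisioning, scaling, or model loading.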
The new tools are designed to address the challenges faced by developers navigating the growing landscape of open-source models.
By providing a centralised hub for model comparison and experimentation, Hugging Face and NVIDIA are lowering the barriers to entry for cutting-edge AI development. Accessibility is a key focus, with the new features available through simple “Train” and “Deploy” drop-down menus on Hugging Face model cards, allowing users to get started with minimal friction.
At the heart of this offering is NVIDIA NIM, a collection of AI microservices that includes both NVIDIA AI foundation models and open-source community models. These microservices are optimised for inference using industry-standard APIs, offering significant improvements in token processing efficiency – a critical factor in language model performance.
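To make "industry-standard APIs" concrete: NIM microservices expose an OpenAI-compatible chat-completions interface, so a request body looks like the sketch below. The endpoint URL and model ID are assumptions for illustration (a locally hosted NIM is assumed), not values taken from the announcement, and the network call is left commented out.

```python
# Sketch of a chat-completion request in the OpenAI-compatible format
# exposed by NVIDIA NIM microservices. Endpoint URL and model ID are
# illustrative assumptions for a locally running NIM instance.
import json
import urllib.request

NIM_URL = "http://localhost:8000/v1/chat/completions"  # assumed local endpoint

def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Assemble the JSON body for an OpenAI-style /v1/chat/completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send(body: dict) -> dict:
    """POST the request to the NIM endpoint (requires a running service)."""
    req = urllib.request.Request(
        NIM_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    body = build_chat_request("meta/llama3-70b-instruct", "Hello!")
    print(json.dumps(body, indent=2))
    # send(body)  # uncomment with a NIM instance running
```

Because the schema matches the OpenAI API, existing client code can usually be pointed at a NIM endpoint by changing only the base URL and model name.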
When accessed as a NIM, models like the 70-billion-parameter version of Llama 3 can achieve up to 5x higher throughput compared to off-the-shelf deployment on NVIDIA H100 Tensor Core GPU-powered systems. This performance boost translates to faster, more robust results for developers, potentially accelerating the development cycle of AI applications.
Underpinning this service is NVIDIA DGX Cloud, a platform purpose-built for generative AI. It offers developers scalable GPU resources that support every stage of AI development, from prototype to production, without the need for long-term infrastructure commitments. This flexibility is particularly valuable for developers and organisations looking to experiment with AI without significant upfront investments.
As AI continues to evolve and find new applications across industries, tools that simplify development and deployment will play a crucial role in driving adoption. This collaboration between NVIDIA and Hugging Face empowers developers with the resources they need to push the boundaries of what’s possible with AI.