Docker AI Catalog: The Future of Curated AI Models and LLMs
Turja N Chaudhuri
Global Lead, Platform Success, EY Fabric | Views are my own
TL;DR
Context
AI, and specifically GenAI, is progressing at a tremendous pace; it is almost impossible to keep track of, even for someone dedicated full-time to the ecosystem. AI newsletters are brimming with details about the next new LLMs, SLMs, AI orchestrators, AI gateways, and so on.
This innovation in AI is extremely promising, but it presents a new challenge to most enterprises: how to process all this information and decide on the correct course of action.
Enterprises cannot afford to experiment with every single LLM or library that is out there. They need someone to guide them on what is trusted, secure, and approved for use within the context of an enterprise ecosystem.
The AI Conundrum: Too Much to Choose From
As part of the Internal Platform Adoption team at EY, when I speak to my enterprise customers, most of the questions around GenAI revolve around topics like:
"Is this LLM allowed to be used by my company?"
"Can I download LLMs from Hugging Face? Is that allowed?"
"How can I be sure this LLM does not have any vulnerabilities? Can I use my client's data with it?"
Obviously, every solution team will need to do its own due diligence on what is secure or allowed within its usage context, but enterprises also need some generic guidance to help teams distinguish what is approved from what is blocked within the confines of the enterprise itself.
Basically, they need someone to filter this ever-increasing catalog of AI models, LLMs, SLMs, AI orchestrators, AI gateways, and more, and tell them which ones they can safely use.
The Need for Curation within the AI Ecosystem
The AI ecosystem is now too fragmented for any single person to navigate. Beyond individuals, entire portfolios and teams are having a difficult time identifying what to use and what to avoid.
In the vast and growing landscape of AI, not all AI models are created equal. Businesses often struggle to identify which AI models meet their needs, ensure compatibility with existing systems, and maintain ethical and reliable performance.
Curated AI models solve this problem by offering pre-vetted, high-quality options that have already been screened for security, compatibility, and reliability.
From my perspective, this is where AI Catalogs have a huge role to play.
They offer a single pane of glass within an enterprise context, letting consumers choose from a curated subset of models and LLMs that are vetted to be more secure and compliant than others. This provides the confidence that engagement teams need to get started on a piece of work.
Docker AI Catalog
Docker is a trusted name in the industry, and Docker Hub has long been the gold standard for teams to find Docker images and Helm charts for a given use case. With the addition of the GenAI catalog, teams can also search a curated list of offerings that they can then use confidently across the entire GenAI application development lifecycle.
Docker AI Catalog is a curated repository of AI and machine learning (ML) models, and other associated artifacts, designed to simplify the process of integrating advanced AI capabilities into applications.
Let's take an example:
Say an enterprise user wants to start using Llama by deploying it on their company's private cluster, but they don't know where to get started, what the latest secure LLM image is, and so on.
They can come to Docker Hub and search for Llama.
Once they find what they are looking for, they can get more details:
As you can see, there is a lot to unpack here:
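The search step in the walkthrough above can also be done programmatically. The sketch below queries Docker Hub's public v2 search API; the endpoint path and the `repo_name`/`short_description` response fields are based on the publicly documented Hub API, but treat them as assumptions rather than a guaranteed contract:

```python
import json
import urllib.parse
import urllib.request

HUB_SEARCH = "https://hub.docker.com/v2/search/repositories/"

def build_search_url(query: str, page_size: int = 10) -> str:
    """Build a Docker Hub v2 search URL for the given query string."""
    params = urllib.parse.urlencode({"query": query, "page_size": page_size})
    return f"{HUB_SEARCH}?{params}"

def search_hub(query: str) -> list:
    """Fetch matching repositories from Docker Hub (requires network access)."""
    with urllib.request.urlopen(build_search_url(query)) as resp:
        payload = json.load(resp)
    return payload.get("results", [])

if __name__ == "__main__":
    # Mirrors the manual "search for Llama" step on the Docker Hub homepage.
    for repo in search_hub("llama"):
        print(repo.get("repo_name"), "-", (repo.get("short_description") or "")[:60])
```

In practice most users will simply use the Docker Hub UI, but this kind of query is handy if a platform team wants to surface the curated catalog inside its own internal portal.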
Docker AI Catalog: Different Artifact Types
Docker has been in the business of managing, verifying, and offering containerized workloads for a long time, and Docker Hub remains the de facto image repository for most of us.
So, with the GenAI ecosystem in play, it's only natural for Docker to put skin in the game and leverage its existing investments in this space to provide a similar experience for GenAI assets.
But it's not only LLM model images that Docker provides a standard consumption pattern for; you can also search for and consume the other artifacts involved in building a GenAI stack, such as orchestrators, vector databases, and AI gateways.
So, essentially, it's a one-stop catalog: a curated, verified collection of the components one might need to build a GenAI stack, from the models, to the orchestration layer, to the eventual deployment engine.
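To make that concrete, a GenAI stack assembled from catalog artifacts might be wired together with Docker Compose along these lines. This is a minimal sketch: the image names (`ai/llama3.2`, `example/vector-db`, `example/genai-app`) and ports are hypothetical placeholders, not specific catalog entries:

```yaml
services:
  llm:
    image: ai/llama3.2              # placeholder for a curated model image from the catalog
    ports:
      - "11434:11434"
  vector-db:
    image: example/vector-db:latest # placeholder for a catalog-listed vector database
    ports:
      - "6333:6333"
  app:
    image: example/genai-app:latest # your application / orchestration layer
    depends_on:
      - llm
      - vector-db
```

The point is less the specific services and more that every layer of the stack can, in principle, be sourced from the same curated, verified catalog.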
Where Can I Find the Docker GenAI Catalog?
You can access the Docker GenAI Catalog right from the homepage of Docker Hub (https://hub.docker.com/):
Recently, it has also been integrated into Docker Desktop, so you no longer need to navigate out to Docker Hub; you can find the GenAI-related artifacts within the Desktop UI itself.
Conclusion
Docker, Inc. has been around for a long time now, as have Docker Hub and its other associated offerings.
Until now, they provided an easy-to-understand way of searching for and consuming container artifacts like Docker images and Helm charts. Now, Docker has taken it up a notch and extended that framework to support GenAI assets (LLM models, orchestrators, etc.) as well, following the same consumption pattern.
By offering curated AI models and LLMs in a user-friendly, secure, and scalable format, the Docker AI Catalog paves the way for businesses to harness the full potential of artificial intelligence. As the industry continues to embrace AI, platforms like it will undoubtedly become integral to the next wave of innovation.
Anybody who has spent time in an enterprise setting knows how difficult it is to agree on anything. Most teams just want to consume the latest and greatest but don't want to invest effort in researching the different options. For such cases, having a trusted repository like the Docker AI Catalog makes a real difference.
If the enterprise already has an enterprise agreement with Docker, Inc., it becomes even less of a discussion. In those cases, the security team has already vetted Docker and approved its usage within the firm, so using the catalog simply becomes an extension of the same process.
In any case, when it comes to AI, the gap in the industry today is not a technology gap but a trust gap.
With so many companies coming up with the next big orchestrator technology, vector database, LLM, or SLM, nobody really knows whom they can trust with their client data and whom they cannot. Any enterprise-grade company that tries to close this gap by bringing authenticity to the consumption process will always have my support.