Generative AI Platform Offerings, Compute Services, Inference Endpoints & Pre-packaged offerings in AWS Marketplace (Part 4)

In this article, I am sharing the Platform Offerings, Compute Services, Inference Endpoints & Pre-packaged offerings for Generative AI from partners in AWS Marketplace. This is the 4th in a series of articles on 'Classifying Generative AI Partner Offerings in AWS Marketplace'. Previous articles are here: Part 1: Classification, Part 2: Foundation Models & Vector Databases, Part 3: LLMOps, Observability & Monitoring, and Security & Privacy tools.

Generative AI Platform offerings (Count: 14)

These offerings provide platform tools for building Generative AI applications, ranging from end-to-end platforms to low-code/no-code development platforms.

Amazon Machine Image / EC2

Nvidia’s Nvidia AI Enterprise is an end-to-end, cloud-native software platform that accelerates data science pipelines and streamlines development and deployment of production-grade AI applications, including generative AI.

Intel’s Intel® Distribution of OpenVINO™ Toolkit is an open-source toolkit for optimizing and deploying AI inference. With this toolkit, developers can quickly build applications and solutions that address a variety of tasks, including emulation of human vision, Generative AI, LLMs, automatic speech recognition, natural language processing, and recommendation systems.

Ai Bloks’ llmware for Enterprise-Grade LLM Applications is a unified, open source, extensible framework for LLM-based application patterns including Retrieval Augmented Generation (RAG).

ZS’s Max.AI Platform is a low-code/no-code platform for generative AI that enables creation of agents at scale.

ecosystem.ai Platform is a low-code environment which uses a combination of AI and behavioral science to select the best campaigns, products, messages, and offers for customers in real time. It sets up effective recommendations in minutes using their default Dynamic Recommenders and uses Generative AI to enhance customer interactions.

SaaS

Fireworks AI is a generative AI platform as a service, optimized for rapid product iteration on top of gen AI as well as for minimizing the cost to serve.

Stratio’s Stratio Generative AI Data Fabric is a platform containing end-to-end data fabric capabilities enhanced with generative AI technologies. The product can be divided into four main high-level components:

  • Generative AI interface to access private enterprise trusted data
  • Autodiscovery and data virtualization
  • Data Governance and Business Data Layer
  • Analytics and MLOps

Arcee.ai's Arcee SLM Adaptation System is built on the premise that the future doesn't revolve around a single, all-encompassing model; instead, it's a world comprising millions, if not billions, of smaller, specialized LLMs known as SLMs. Each SLM is meticulously tailored to customer data for its specific tasks and use cases. The system takes customer data through multiple layers of domain adaptation, starting with domain-adaptive pretraining, then aligning the model for your task and use case, and ending with retrieval augmented generation, all integrated into a unified system.

IBM Software’s IBM watsonx Orchestrate, powered by LLMs, includes pre-built skills that use natural language processing to draw from a catalog of basic and advanced skills to execute customer requests, in context and in the right order, without the need for any specialized training or developer experience. It connects to various apps and tools to work seamlessly across Salesforce, Workday, Outlook, Gmail and other tools to accomplish tasks in a simple, no-code interface. Additionally, watsonx Orchestrate can consume automations from UiPath, Boomi, MuleSoft, Dell, and many other sources.

FriendliAI’s Friendli Dedicated Endpoints is a SaaS service for deploying generative AI models. Friendli Engine cuts LLM inference serving costs by 40-80% while providing low-latency, high-throughput LLM serving.

Reconify’s Reconify is an analytics and optimization platform for Generative AI - featuring deeper insights into prompts and responses and tools to take action to improve response effectiveness.

Vectara GenAI Platform is an end-to-end platform for product builders to embed powerful generative AI features into their applications with extraordinary results. Built on a solid hybrid-search core, Vectara delivers the shortest path to a correct answer-action through a safe, secure, and trusted entry point.

Saturn Cloud is an award-winning ML platform with 75,000+ users, built to make AI/ML and LLMs easy and secure within the enterprise.

TrustPortal’s TrustPortal for AWS Generative AI is a no-code, SaaS-based platform that intelligently orchestrates any type of 'software robots' (RPA), API 'MiniBots', digital, GenAI and other AI tools, and people in real time. This allows Generative AI to easily initiate complex, real-time automated processes across any number of old or new corporate systems, while keeping staff and customers fully "in the loop".

Compute Services, Inference Endpoints & Pre-packaged offerings (Count: 17)

These offerings provide compute services, inference endpoints and pre-packaged open-source foundation model offerings with various options.

SaaS

OctoML’s OctoAI is a compute service to run, tune, and scale generative AI models. With OctoAI, developers get the simplicity and reliability of closed-source API endpoint services for generative AI, with the flexibility to select and run their choice of models.

Amazon SageMaker

MK1’s MK1 Flywheel - Llama2-Chat-7B serves Llama2-Chat-7B at blazing-fast speeds with MK1 Flywheel. Using MK1 Flywheel within SageMaker provides a robust, private LLM deployment well suited to medium- to low-volume applications.

MK1’s MK1 Flywheel - Mistral-Instruct-7B serves Mistral-Instruct-7B at blazing-fast speeds with MK1 Flywheel. Using MK1 Flywheel within SageMaker provides a robust, private LLM deployment well suited to medium- to low-volume applications.

MK1’s MK1 Flywheel - Bring Your Own Model serves your own LLMs at blazing-fast speeds with MK1 Flywheel. Using MK1 Flywheel within SageMaker provides a robust, private LLM deployment well suited to medium- to low-volume applications.

MK1’s MK1 Flywheel - Llama2-Chat-13B serves Llama2-Chat-13B at blazing-fast speeds with MK1 Flywheel. Using MK1 Flywheel within SageMaker provides a robust, private LLM deployment well suited to medium- to low-volume applications.
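All four MK1 listings deploy as SageMaker real-time endpoints, so invocation follows the standard SageMaker runtime pattern once you have subscribed and deployed. The sketch below builds a request body and shows the boto3 call; the endpoint name and payload field names are illustrative assumptions — check the listing's usage instructions for the exact schema each container expects.

```python
import json


def build_invoke_payload(prompt, max_tokens=256, temperature=0.7):
    """Build a JSON request body for a SageMaker real-time LLM endpoint.

    The field names here are assumptions for illustration; the actual
    schema is defined by the model container you deploy.
    """
    return json.dumps({
        "text": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    })


# Sending the request requires AWS credentials and a deployed endpoint:
#
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   response = runtime.invoke_endpoint(
#       EndpointName="mk1-flywheel-llama2-chat-7b",  # hypothetical name
#       ContentType="application/json",
#       Body=build_invoke_payload("Summarize this document."),
#   )
#   print(response["Body"].read().decode())
```

Separating payload construction from the `invoke_endpoint` call keeps the schema testable without live AWS access.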

Amazon Machine Image / CloudFormation Template

CyberWorx’s AnythingLLM-Private-Deployment lets you deploy your own privately hosted, moderately hardened Retrieval Augmented Generation (RAG) solution using AnythingLLM and ChromaDB.

Meetrix.io’s LLaMa 2 Meta AI 7B: OpenAI & API Compatible is tailored for the 7-billion-parameter pretrained generative text model in the LLaMa 2 collection. This Amazon Machine Image is easily deployable without DevOps hassle and exposes the model through an OpenAI-compatible API, so developers can use familiar OpenAI tooling for advanced text generation.

Meetrix.io’s LLaMa 2 Meta AI 13B: OpenAI & API Compatible is tailored for the 13-billion-parameter pretrained generative text model in the LLaMa 2 collection. This Amazon Machine Image is likewise easily deployable without DevOps hassle and exposes the model through an OpenAI-compatible API.

Meetrix.io’s LLaMa 2 Meta AI 70B: OpenAI & API Compatible Step into the forefront of large language models (LLMs) mastery with unprecedented depth and precision. This Amazon Machine Image is fortified by an unparalleled 70 billion parameters, tapping into a colossal pretrained dataset that sets the bar higher than ever before. With the sheer magnitude of the 70B model's data foundation, customers are not just getting results; they are achieving superior insights, intricate context comprehension, and unparalleled text generation finesse.

Meetrix.io’s Mistral AI 7B Instruct v0.2: OpenAI & API Compatible is a compact powerhouse of AI capabilities. Surpassing Llama 2 13B across all benchmarks, Mistral 7B boasts natural coding abilities and an impressive 8k sequence length. Tailored for versatility, it deploys effortlessly on AWS. To streamline customers’ AWS deployment, Meetrix.io offers a specialized AMI product, ensuring seamless integration. It is fully OpenAI-compatible, comes primed with an API, and is designed for ARM64.

Meetrix.io’s Mixtral AI 8x7b Instruct v0.1: OpenAI API Compatible Mixtral 8x7B Instruct v0.1 is an innovative sparse mixture-of-experts model released by Mistral AI, designed for the developer community. This open-weight model, licensed under Apache 2.0, stands out with its high quality and 6x faster inference compared to Llama 2 70B. With capabilities such as handling a context of 32k tokens, multilingual support (English, French, Italian, German, and Spanish), strong performance in code generation, and the ability to be fine-tuned for instruction-following tasks, Mixtral 8x7B introduces a new frontier in open models by employing sparse architectures to optimize cost and latency.
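"OpenAI & API Compatible" in the Meetrix listings above means the AMI exposes the model behind the same HTTP interface as OpenAI's chat API, so existing OpenAI client code can be pointed at the self-hosted instance. A minimal stdlib sketch, assuming a hypothetical endpoint address and model name (substitute the values from your deployed AMI):

```python
import json
import urllib.request

# Hypothetical address of a deployed Meetrix AMI; replace with your instance.
BASE_URL = "http://localhost:8000/v1"


def chat_completion_request(model, user_message):
    """Build an OpenAI-style /chat/completions request for a self-hosted model."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = chat_completion_request("llama-2-7b", "What is RAG?")
# urllib.request.urlopen(req) would send it to the running instance.
```

The same request shape works for the 7B, 13B, 70B, and Mixtral listings; only the model name and host change.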

Apps4Rent is a leading provider of hosted software applications for business, serving over 10,000 clients in more than 90 countries. Over the last decade, Apps4Rent has forged ahead in offering hosted applications such as Exchange, SharePoint, Dynamics CRM, Project Server, Virtual Desktop, and Virtual Server, augmented with 24/7/365 support.

Llama2-13B-chat on Ubuntu 20.04 LTS with maintenance support by Apps4Rent is a pre-configured image, ready to use right out of the box. Customers can simply deploy the image to an Ubuntu 20.04 LTS server and the chat application will be up and running in minutes.

Llama2-13B-chat & Gradio on Ubuntu 20.04 by Apps4Rent is the same pre-configured chat image bundled with a Gradio web interface.

Llama2-7B-chat and Gradio on Ubuntu 20.04 by Apps4Rent is a great option for businesses that are looking to deploy a chat application on a budget. It is easy to use and secure, and it includes all of the necessary software for the chat application.

Mistral-7B on Ubuntu 20.04 LTS with maintenance support by Apps4Rent is a pre-configured image of the open-source Mistral-7B language model, packaged by Apps4Rent for Ubuntu 20.04 LTS to simplify deployment and ongoing maintenance.

spaCy, spacy-llm, Codellama_13b on Ubuntu22 with support by Apps4Rent is a repackaged open-source software product wherein additional charges apply for technical support and maintenance by Apps4Rent. spaCy on AWS brings NLP superpowers in Python, ready to deploy in seconds; CodeLlama-13B is an open-source code-generation model for AI-based projects; and spacy-llm integrates LLMs into structured NLP pipelines.

spaCy, spacy-llm, Codellama_13b on Ubuntu20 with support by Apps4Rent is the same repackaged stack on Ubuntu 20.04, with additional charges for technical support and maintenance by Apps4Rent. Get started with spaCy, the leading Python library for Natural Language Processing (NLP), instantly in your AWS environment, with no manual setup or dependency management, alongside the open-source CodeLlama-13B model.

My key observations from these platform offerings, compute services, inference endpoints & pre-packaged offerings are:

  • Inference endpoints for low-, medium-, and large-scale deployments are available in AWS Marketplace.
  • There are multiple options to choose from based on throughput, price, and latency considerations.
  • SageMaker offerings price the software separately from real-time inference or batch transform usage.
  • Popular models like Llama 2 from Meta and the Mistral models, and even Bring Your Own Model (BYOM), are supported on these inference and app hosting services.

Note: This list is current based on my review as of February 3rd. If there are any Generative AI Platform tools, compute services or inference endpoints I am missing, feel free to mention that and I will make an update.

