Build RAG applications using only APIs with Postman!

Welcome to the AI in 5 newsletter with Clarifai!

Every week we bring you new models, tools, and tips to build production-ready AI!

This week, we bring you:

  • New Multimodal LLM: MiniCPM-Llama3-V 2.5
  • Clarifai + LiteLLM Integration - Now you can call all the LLM APIs using the OpenAI format
  • Build RAG applications with APIs using Postman
  • May Wrap-Up: All the latest models wrapped and available in the community
  • AI tip of the week - Bulk label your data at upload time

MiniCPM-Llama3-V 2.5!

MiniCPM-Llama3-V 2.5 is a high-performance, efficient 8B parameter multimodal model excelling in OCR, multilingual support, and multimodal tasks.

Here are some key capabilities of the model:

Leading Performance: Achieves an average score of 65.1 on OpenCompass, outperforming larger proprietary models such as GPT-4V-1106, Gemini Pro, Claude 3, and Qwen-VL-Max.

Strong OCR Capabilities: Excels at OCR tasks, handling images with high pixel counts and varied aspect ratios, and scores over 700 on OCRBench.

Multilingual Support: Extends coverage to over 30 languages, leveraging Llama 3's multilingual strengths and VisCPM's cross-lingual generalization. The model is now available on the platform.

Try out the model and access it with an API here.
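As a minimal sketch of calling a multimodal model like this through the Clarifai API, the snippet below pairs an image URL with a text prompt in one predict request. The user/app/model IDs and the response path are assumptions for illustration; check the model's page on the platform for the real values.

```python
import json
import urllib.request

# Hypothetical IDs -- copy the real ones from the model's page on the platform.
USER_ID = "openbmb"
APP_ID = "miniCPM"
MODEL_ID = "MiniCPM-Llama3-V-2_5"
PAT = "YOUR_CLARIFAI_PAT"  # personal access token from your security settings


def build_payload(image_url: str, prompt: str) -> dict:
    """Build a predict request body pairing an image with a text prompt."""
    return {
        "inputs": [
            {
                "data": {
                    "image": {"url": image_url},
                    "text": {"raw": prompt},
                }
            }
        ]
    }


def predict(image_url: str, prompt: str) -> str:
    """POST the payload to the model's predict endpoint and return the text output."""
    url = (f"https://api.clarifai.com/v2/users/{USER_ID}"
           f"/apps/{APP_ID}/models/{MODEL_ID}/outputs")
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(image_url, prompt)).encode(),
        headers={"Authorization": f"Key {PAT}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    # Assumed response shape: the generated text sits on the first output.
    return body["outputs"][0]["data"]["text"]["raw"]
```

With a valid PAT, `predict("https://…/receipt.png", "Transcribe the text in this image.")` would exercise the model's OCR strengths described above.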

Clarifai + LiteLLM

LiteLLM is a library for seamless integration with various LLM providers' APIs that standardizes interactions through the OpenAI API format. This integration allows you to call various open-source and third-party LLMs available on the Clarifai Platform using the same input/output format.
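A rough sketch of what that looks like in practice: the messages use the standard OpenAI chat schema, and only the `model` string routes the call to Clarifai. The `clarifai/…` model route shown here is an assumption; consult LiteLLM's provider documentation for the exact string for your model.

```python
# pip install litellm
import os


def openai_format_messages(prompt: str) -> list:
    """LiteLLM accepts the same chat message schema as the OpenAI API."""
    return [{"role": "user", "content": prompt}]


def ask_clarifai_llm(prompt: str) -> str:
    # Imported lazily so the sketch loads even without the package installed.
    from litellm import completion

    response = completion(
        model="clarifai/mistralai.completion.mistral-7B-Instruct",  # hypothetical route
        messages=openai_format_messages(prompt),
        api_key=os.environ["CLARIFAI_PAT"],
    )
    # LiteLLM mirrors the OpenAI response shape regardless of the backing provider.
    return response.choices[0].message.content
```

Swapping providers then comes down to changing the `model` string; the message format and response handling stay identical.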

Check out the guide here to get started.

Build RAG with Clarifai using Postman!

You can build RAG applications on the Clarifai platform with ease using just the APIs.

The steps include:

1. Create a Clarifai app with “Text” as the base workflow.

2. Create a RAG prompter operator model and then create the model version.

3. Create a RAG Prompter Workflow along with a text-to-text node and specify your preferred LLM.

4. Upload your data to the app created above using the Python SDK. Check out the guide here on how to upload the data.

5. Make API predictions to the created RAG prompter workflow using Postman.

Check out the public Postman collection below that will walk you through these steps: RAG using Postman
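Step 5 can be sketched outside Postman too: the request below mirrors what the collection sends, POSTing the user query to the workflow's results endpoint. The user, app, and workflow IDs and the response path are placeholders; substitute the ones from steps 1-3.

```python
import json
import urllib.request

# Placeholder IDs -- replace with your own user, app, and workflow from steps 1-3.
USER_ID = "your-user-id"
APP_ID = "rag-app"
WORKFLOW_ID = "rag-prompter-workflow"
PAT = "YOUR_CLARIFAI_PAT"


def build_rag_request(query: str) -> dict:
    """The workflow takes the user query as raw text; the RAG prompter node
    retrieves matching chunks and the text-to-text node generates the answer."""
    return {"inputs": [{"data": {"text": {"raw": query}}}]}


def rag_predict(query: str) -> str:
    """Equivalent of the Postman request: POST the query to the workflow endpoint."""
    url = (f"https://api.clarifai.com/v2/users/{USER_ID}"
           f"/apps/{APP_ID}/workflows/{WORKFLOW_ID}/results")
    req = urllib.request.Request(
        url,
        data=json.dumps(build_rag_request(query)).encode(),
        headers={"Authorization": f"Key {PAT}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    # Assumed response shape: the generated answer is on the last node's text output.
    return body["results"][0]["outputs"][-1]["data"]["text"]["raw"]
```

The same request works from Postman, curl, or any HTTP client, which is the point of the API-only workflow.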

May Wrap-up: Latest Models Integrated into the Community

  • GPT-4o (omni): GPT-4o is a multimodal AI model from OpenAI that excels in processing and generating text, audio, and images. The model offers rapid response times and improved performance across languages and tasks. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation. The model also matches GPT-4 Turbo performance on English text and code, with significant improvement in non-English languages, while being much faster and 50% cheaper in the API.
  • Gemini-1.5-Flash: Gemini 1.5 Flash is a cost-efficient, high-speed foundational LLM optimized for multimodal tasks, ideal for applications requiring rapid processing and scalability. The model is lighter-weight than 1.5 Pro and designed to be fast and efficient for serving at scale. Additionally, the model excels at summarization, chat applications, image and video captioning, data extraction from long documents and tables, and more.
  • Snowflake-Arctic Instruct: A new cost-effective, enterprise-focused open LLM that excels in SQL, coding, and instruction-following from Snowflake! The model is designed to be both cost-effective and powerful, providing state-of-the-art performance for enterprise applications.
  • CogVLM-Chat: An advanced open-source visual language model that can process and understand both visual and textual data. Compared to BLIP-2, Otter, and various LLaMA variants, CogVLM-Chat not only achieves higher scores but also shows comprehensive strength in handling adversarial and complex scenarios.
  • Qwen-VL-Chat: A high-performing Large Vision Language Model (LVLM) by Alibaba Cloud for text-image dialogue tasks, excelling in zero-shot captioning, VQA, and referring expression comprehension while supporting multilingual dialogue.
  • WizardLM-2-8x22B: A state-of-the-art open-source LLM, excelling in complex tasks like chat, reasoning, and multilingual understanding, and competing closely with leading proprietary models.
  • Mixtral-8x22B-instruct: The latest and largest mixture-of-experts (MoE) LLM from Mistral AI, combining eight 22B-parameter experts. The model comes with multilingual support and can handle a context of 32k tokens.
  • Qwen1.5-110B-Chat: A 110-billion-parameter model that demonstrates competitive performance in base language tasks, significant improvements in chatbot evaluations, and multilingual capabilities.
  • DeepSeek-V2-Chat: A high-performing, cost-effective 236-billion-parameter MoE LLM excelling in diverse tasks such as chat, code generation, and math reasoning.

AI tip of the week:

Bulk Labeling at Upload Time!

You can label your inputs as they are uploaded with bulk labeling. To do so, go to your application's page and select the Inputs option in the left sidebar.

Next, upload your inputs. Once they are uploaded, you can easily add labels by selecting the inputs you want to label by clicking the checkmark. You can even select multiple inputs by holding down the "shift" key.

Then, label them all in bulk by clicking the Label as button, which saves you a ton of time.

Check out the complete guide here.

Want to learn more from Clarifai? “Subscribe” to make sure you don’t miss the latest news, tutorials, educational materials, and tips. Thanks for reading!
