Latest Updates: FREE Llama 3.2 Multimodal & FLUX.1 [schnell], NVIDIA H200s, and Enterprise Platform
Hey there!
Welcome to the first issue of Together We Build, a handpicked selection of news, product launches, novel research, and AI tools from Together AI, aimed at everyone interested in keeping up with the latest developments in generative AI and LLMs.
These last few weeks were packed with massive launches for vision and image models and major updates to our product, research, and tools.
To ensure you don't miss our future updates, subscribe to the LinkedIn newsletter.
New Models
Llama 3.2 Multimodal - With free endpoint
We partnered with Meta to launch support for the new Llama 3.2 vision models and the Llama Stack API. You can access our Llama 3.2 11B & 90B Turbo endpoints to get top speed and accuracy for vision tasks like image captioning, visual question answering, and image-text retrieval.
Even more exciting, we added a completely FREE endpoint for Llama 3.2 11B Vision!
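If you want to try the vision endpoint, here is a minimal sketch of what a request body looks like against Together's OpenAI-compatible chat completions API. The exact free-tier model identifier ("meta-llama/Llama-Vision-Free") is an assumption, so check the model catalog for the current name:

```python
import json

# Together's OpenAI-compatible chat completions endpoint.
API_URL = "https://api.together.xyz/v1/chat/completions"

def build_caption_request(image_url: str, prompt: str = "Describe this image.") -> dict:
    """Build the JSON body for a vision (image + text) chat request."""
    return {
        # NOTE: model id is an assumption; verify against the model catalog.
        "model": "meta-llama/Llama-Vision-Free",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": 256,
    }

payload = build_caption_request("https://example.com/cat.jpg")
print(json.dumps(payload, indent=2))
# To send it, POST this payload with an "Authorization: Bearer <TOGETHER_API_KEY>"
# header, e.g. requests.post(API_URL, headers=headers, json=payload).
```

The same payload shape works for the 11B and 90B Turbo endpoints by swapping the model id.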
FLUX1.1 [pro] and FLUX.1 [schnell] - With free endpoint
We're thrilled to provide state-of-the-art image generation with the latest FLUX models! We added an endpoint for the powerful FLUX1.1 [pro] model, plus two endpoints for FLUX.1 [schnell]: a turbo endpoint (with the fastest performance) and a completely FREE endpoint you can use to experiment with open-source image generation at no cost.
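As a rough sketch of using the image endpoints: requests go to an OpenAI-style images API. The endpoint path and the free model id ("black-forest-labs/FLUX.1-schnell-Free") are assumptions here, so verify them in the API reference:

```python
import json

# Together's image generation endpoint (OpenAI-style images API).
API_URL = "https://api.together.xyz/v1/images/generations"

def build_image_request(prompt: str, width: int = 1024,
                        height: int = 768, steps: int = 4) -> dict:
    """Build the JSON body for a text-to-image request."""
    return {
        # NOTE: model id is an assumption; check the model catalog.
        "model": "black-forest-labs/FLUX.1-schnell-Free",
        "prompt": prompt,
        "width": width,
        "height": height,
        # [schnell] is a distilled model, so very few denoising steps suffice.
        "steps": steps,
        "n": 1,
    }

payload = build_image_request("a watercolor fox in a snowy forest")
print(json.dumps(payload, indent=2))
# POST with an "Authorization: Bearer <TOGETHER_API_KEY>" header to generate.
```

Swapping the model id to the [pro] or turbo [schnell] variant uses the same request shape.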
Qwen-2.5-7B & 72B
We also welcomed the latest large language model family from Alibaba Cloud! The 72B model rivals the capabilities of much larger models like Llama 3.1 405B and surpasses Qwen2, with higher scores on MMLU and HumanEval (85+) and on MATH (80+).
Product Updates
GPU Clusters with NVIDIA H200 and the Together Kernel Collection
Our GPU clusters took a huge leap in performance! Now we offer NVIDIA H200 Tensor Core GPUs equipped with our custom-built Together Kernel Collection, an optimized kernel stack. The result? Up to 24% speedup for operators used frequently in training, and up to 75% speedup for the fundamental operation used in FP8 inference (compared against PyTorch implementations). Fewer GPU hours = cost efficiencies = faster time to market.
Together Enterprise Platform
We launched the Together Enterprise Platform to empower organizations to manage their entire generative AI lifecycle: training, fine-tuning, and running inference on any model, in any environment. We deliver 2-3x faster inference and up to 50% lower operational costs on your existing cloud (AWS, Azure, GCP, OCI) or on-premise infrastructure.
Analytics Dashboard
You know how "your AI is only as good as your data"? Well, with our new Together Analytics (beta), we now show your usage over time, including requests, latency, and tokens per minute (TPM).
Improved Reliability & New Status Page
You've been asking for it, we shipped it! Our reliability scores have been soaring after our recent updates and fixes. We also introduced a handy new status page so you can keep track of the uptime of our different models and services.
New AI Apps
Napkins
To inspire you to build with Llama 3.2 vision models, we launched napkins.dev — a tool where you can upload a screenshot of a simple site/design and get the code in seconds. 100% free and open source!
Blinkshot
We put the FLUX models to the test with blinkshot.io — a tool that generates images as you type. This really shows the performance of our FLUX.1 [schnell] turbo endpoint. Blink once and you might miss it!
Product Descriptions
And to really showcase the flexibility of Llama 3.2 vision models, we also built a demo app where you can upload product images and get descriptions in multiple languages for your e-commerce shop.
Featured Content & Research
Linearizing LLMs with LoLCATs
Meet LoLCATs (Low-rank Linear Conversion via Attention Transfer) — new work from our research team on linearizing LLMs. LoLCATs converts existing Transformers like Llama and Mistral into state-of-the-art subquadratic variants. Now for the same cost as a LoRA finetune!
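To see why linearizing matters, here is a toy NumPy sketch of the underlying linear-attention idea (this is an illustration, not the LoLCATs method, which learns its feature maps via attention transfer): with a feature map applied to queries and keys, attention can be computed as phi(Q) @ (phi(K)^T @ V), avoiding the n x n attention matrix entirely, so cost grows linearly rather than quadratically with sequence length.

```python
import numpy as np

def feature_map(x):
    # Simple positive feature map (ELU + 1), a common stand-in;
    # LoLCATs learns its feature maps, so this is purely illustrative.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Subquadratic attention: phi(Q) @ (phi(K)^T @ V), normalized per query."""
    Qf, Kf = feature_map(Q), feature_map(K)   # (n, d) each
    kv = Kf.T @ V                             # (d, d) summary -- no n x n matrix
    z = Qf @ Kf.sum(axis=0)                   # (n,) per-query normalizers
    return (Qf @ kv) / z[:, None]

def quadratic_reference(Q, K, V):
    """The same formula computed with the explicit n x n kernel matrix."""
    A = feature_map(Q) @ feature_map(K).T     # (n, n)
    return (A / A.sum(axis=1, keepdims=True)) @ V

rng = np.random.default_rng(0)
n, d = 128, 16
Q, K, V = rng.normal(size=(3, n, d))
# Both paths compute the same result; only the cost differs.
assert np.allclose(linear_attention(Q, K, V), quadratic_reference(Q, K, V))
```

The (d, d) summary `kv` is what makes the linear path cheap: it replaces the (n, n) attention matrix, which is exactly the structure that lets linearized models run in subquadratic time.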
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
Our research team cooked up a thrilling read on a novel method for distilling Transformers into linear RNN architectures, focusing on the Mamba model. The results were impressive, surpassing open-source hybrid models trained on trillions of tokens.
Build Multimodal Document RAG with Llama 3.2 Vision and ColQwen2
RAG is the talk of the town, but many are just getting started with building RAG apps. So we rolled up our sleeves and ran a practical webinar where we discussed how you can perform RAG over complex PDF documents using ColPali, Llama 3.2 Vision, and ColQwen2.
Together Talks
It's hard to keep up with the growth of AI. So we are bringing the brightest minds to the virtual stage to dive into AI's biggest questions and opportunities. Check out our first episode with our founders, Percy Liang and Vipul Ved Prakash, and the second episode with our Chief Scientist, Tri Dao, and best-selling author and VP of AI & Open Source at Voltron Data, Chip Huyen.
Community Spotlight
Pika 1.5
Huge congrats to our friends at Pika for launching Pika 1.5, a powerful model you can use to create stunning footage, longer clips, and jaw-dropping moves with unreal Pikaffects. Fully trained on Together GPU Clusters!
Llama 3.1 in the wild
Quick shoutout to @KevIsDev for creating a modified version of the bolt.new repo using our Llama 3.1 endpoint. Great source of inspiration for other builders out there!
Always stay in the know—subscribe to the LinkedIn newsletter to receive our future updates.