Latest Updates: FREE Llama 3.2 Multimodal & FLUX.1 [schnell], NVIDIA H200s, and Enterprise Platform

Hey there!

Welcome to the first issue of Together We Build, a handpicked selection of news, product launches, novel research, and AI tools from Together AI, aimed at everyone interested in keeping up with the latest developments in generative AI and LLMs.

These last few weeks were packed with massive launches for vision and image models and major updates to our product, research, and tools.

To ensure you don't miss our future updates, subscribe to the LinkedIn newsletter.


New Models

Llama 3.2 Multimodal - With free endpoint

We partnered with Meta to launch support for the new Llama 3.2 vision models and Llama Stack API. You can access our Llama 3.2 11B & 90B Turbo endpoints to get ultimate speed and accuracy for vision tasks like image captioning, visual question answering, and image-text retrieval.

Even more exciting, we added a completely FREE endpoint for Llama 3.2 11B Vision!

Try Llama 3.2 11B Vision for free →
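If you'd rather call the endpoint from code than from the playground, here is a minimal sketch using the Together Python SDK (pip install together); the model identifier below is an assumption, so check the models page for the exact string.

# Minimal sketch: querying the free Llama 3.2 11B Vision endpoint with the
# Together Python SDK. The model identifier is an assumption; verify it on
# the models page before running.
import os
from together import Together

client = Together(api_key=os.environ["TOGETHER_API_KEY"])

response = client.chat.completions.create(
    model="meta-llama/Llama-Vision-Free",  # assumed name of the free endpoint
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},  # placeholder image URL
            ],
        }
    ],
)
print(response.choices[0].message.content)

The same call pattern works with the 11B and 90B Turbo endpoints; only the model string changes.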


FLUX1.1 [pro] and FLUX.1 [schnell] - With free endpoint

We're thrilled to provide state-of-the-art image generation with the latest FLUX models! We added an endpoint for the powerful FLUX1.1 [pro] model, plus two endpoints for FLUX.1 [schnell]: a turbo endpoint (with the fastest performance) and a completely FREE endpoint you can use to experiment with open-source image generation at no cost.

Explore FLUX models →
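For image generation the pattern is similar; here is a hedged sketch against the free FLUX.1 [schnell] endpoint, where the model identifier and the response fields are assumptions to verify against the docs.

# Rough sketch: generating an image with the free FLUX.1 [schnell] endpoint.
# The model identifier and response fields are assumptions; check the docs.
import base64
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

result = client.images.generate(
    model="black-forest-labs/FLUX.1-schnell-Free",  # assumed name of the free endpoint
    prompt="A watercolor fox in a misty forest at dawn",
    width=1024,
    height=768,
    steps=4,  # [schnell] is designed to produce good images in very few steps
    n=1,
)

# Assuming the image comes back base64-encoded, decode and save it locally.
image_bytes = base64.b64decode(result.data[0].b64_json)
with open("fox.png", "wb") as f:
    f.write(image_bytes)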


Qwen-2.5-7B & 72B

We also welcomed the latest powerful large language model family from Alibaba Cloud! The 72B model rivals the capabilities of much larger models like Llama 3.1 405B and surpasses Qwen2 with higher benchmark scores: 85+ on MMLU and HumanEval, and 80+ on MATH.

Try our 72B endpoint →
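Switching to the new model from code is just a matter of changing the model string; here is a quick sketch with the Together Python SDK (the exact identifier is an assumption, so confirm it on the models page).

# Quick sketch: chatting with Qwen 2.5 72B Instruct via the Together Python SDK.
# The model identifier is an assumption; confirm the exact string in the docs.
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-72B-Instruct-Turbo",  # assumed identifier
    messages=[
        {"role": "system", "content": "You are a concise coding assistant."},
        {"role": "user", "content": "Write a Python one-liner that reverses a string."},
    ],
    max_tokens=128,
    temperature=0.7,
)
print(response.choices[0].message.content)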


Product Updates

GPU Clusters with NVIDIA H200 and the Together Kernel Collection

Our GPU clusters took a huge leap in performance! Now we offer NVIDIA H200 Tensor Core GPUs equipped with our custom-built Together Kernel Collection, an optimized kernel stack. The result? Up to 24% speedup for operators used frequently in training, and up to 75% speedup for the fundamental operation used in FP8 inference (compared against PyTorch implementations). Fewer GPU hours = cost efficiencies = faster time to market.

Read more →


Together Enterprise Platform

We launched the Together Enterprise Platform to empower organizations to manage their entire generative AI lifecycle: training, fine-tuning, and running inference on any model, in any environment. We deliver 2-3x faster inference and up to 50% lower operational costs on your existing cloud (AWS, Azure, GCP, OCI) or on-premise infrastructure.

Read more →


Analytics Dashboard

You know how they say "your AI is only as good as your data"? Well, with our new Together Analytics dashboard (beta), we now show your usage over time, including requests, latency, and TPM.

View your analytics →


Improved Reliability & New Status Page

You've been asking for it, we shipped it! Our reliability scores have been soaring after our recent updates and fixes. We also introduced a handy new status page so you can keep track of the uptime of our different models and services.

View status →


New AI Apps

Napkins

To inspire you to build with Llama 3.2 vision models, we launched napkins.dev — a tool where you can upload a screenshot of a simple site/design and get the code in seconds. 100% free and open source!

Try it out →


Blinkshot

We put the FLUX models to the test with blinkshot.io — a tool that generates images as you type. This really shows the performance of our FLUX.1 [schnell] turbo endpoint. Blink once and you might miss it!

Try it out →


Product Descriptions

And to really showcase the flexibility of Llama 3.2 vision models, we also built a demo app where you can upload product images and get descriptions in multiple languages for your e-commerce shop.

Try it out →


Featured Content & Research

Linearizing LLMs with LoLCATs

Meet LoLCATs (Low-rank Linear Conversion via Attention Transfer) — new work from our research team on linearizing LLMs. LoLCATs converts existing Transformers like Llama and Mistral into state-of-the-art subquadratic variants. Now for the same cost as a LoRA finetune!
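For readers new to the term, "subquadratic" refers to replacing softmax attention, whose cost grows quadratically with sequence length, with a linearized form. A generic sketch of the idea (not the exact LoLCATs formulation):

% Generic linear-attention sketch (not the exact LoLCATs formulation).
% Standard softmax attention costs O(n^2 d) in sequence length n:
\mathrm{Attn}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d}}\right) V
% Swapping softmax for a feature map \phi lets the product be reassociated,
% so \phi(K)^{\top} V is computed once and the cost drops to O(n d^2):
\mathrm{LinAttn}(Q, K, V) = \frac{\phi(Q)\,\bigl(\phi(K)^{\top} V\bigr)}{\phi(Q)\,\bigl(\phi(K)^{\top} \mathbf{1}_n\bigr)}
% LoLCATs trains \phi so the linearized layer mimics the original softmax
% attention (attention transfer), then recovers remaining quality with
% low-rank (LoRA) updates, which keeps the conversion cost near a LoRA finetune.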

Read research piece →


The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Our research team cooked up a thrilling read introducing a novel method for distilling Transformers into linear RNN architectures, with a particular focus on the Mamba model. The results were impressive, surpassing open-source hybrid models trained with trillions of tokens.

Read research piece →


Build Multimodal Document RAG with Llama 3.2 Vision and ColQwen2

RAG is the talk of the town, but many are just getting started with building RAG apps. So we rolled up our sleeves and ran a practical webinar where we discussed how you can perform RAG over complex PDF documents using ColPali, Llama 3.2 Vision, and ColQwen2.

Watch the recording →


Together Talks

It's hard to keep up with the growth of AI. So we are bringing the brightest minds to the virtual stage to dive into AI's biggest questions and opportunities. Check out our first episode with our founders, Percy Liang and Vipul Ved Prakash, and the second episode with our Chief Scientist, Tri Dao, and best-selling author and VP of AI & Open Source at Voltron Data, Chip Huyen.


Community Spotlight

Pika 1.5

Huge congrats to our friends at Pika for launching Pika 1.5, a powerful model you can use to create stunning footage, longer clips, and jaw-dropping moves with unreal Pikaffects. Fully trained on Together GPU Clusters!

Try Pika 1.5 →


Llama 3.1 in the wild

Quick shoutout to @KevIsDev for creating a modified version of the bolt.new repo using our Llama 3.1 endpoint. Great source of inspiration for other builders out there!

Check it out →



Always stay in the know—subscribe to the LinkedIn newsletter to receive our future updates.
