Industry News in 1 Line (20th Nov 2024)

1) Qwen2.5-Turbo now supports a context length of up to 1M tokens and can process a full 1M-token context in 68 seconds, a 4.3x speedup. The price remains 0.16 per 1M tokens.

It achieves 100% retrieval accuracy on the 'Passkey Retrieval' test across varying document depths and context lengths.
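For context, 'Passkey Retrieval' is a needle-in-a-haystack style check: a short passkey is buried at some depth inside a long filler document and the model is asked to recall it. Below is a minimal sketch of such a check, assuming an OpenAI-compatible chat endpoint; the base_url, API key, model name and filler size are placeholders, not Qwen's actual evaluation harness.

```python
# Minimal passkey-retrieval sketch. The endpoint, API key and model name
# below are placeholders (assumptions), not the official Qwen eval setup.
import random
from openai import OpenAI

client = OpenAI(base_url="https://your-endpoint/v1", api_key="YOUR_KEY")

def build_prompt(passkey: str, depth: float, n_filler: int = 5000) -> str:
    """Bury a passkey sentence at a relative depth inside repetitive filler text."""
    filler = ["The grass is green. The sky is blue. The sun shines bright."] * n_filler
    filler.insert(int(depth * len(filler)), f"The passkey is {passkey}. Remember it.")
    return " ".join(filler) + "\n\nWhat is the passkey? Reply with the number only."

passkey = str(random.randint(10_000, 99_999))
resp = client.chat.completions.create(
    model="qwen-turbo",  # placeholder model name
    messages=[{"role": "user", "content": build_prompt(passkey, depth=0.5)}],
)
print("retrieved correctly:", passkey in resp.choices[0].message.content)
```

Sweeping `depth` from 0.0 to 1.0 and varying `n_filler` reproduces the "varying document depths and context lengths" axes of the test.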

2) Mistral’s Le Chat got a huge upgrade. New features include web search, vision, canvas ideation, coding assistance and, my favorite, image generation with Flux Pro. All are currently free during the beta. More details are in Mistral's announcement; try Le Chat now.

In my opinion, Le Chat is so far the best free alternative offering all of these premium features.

3) Cerebras’ Llama 3.1 405B now runs at 969 tokens/s. With a 128K context length and 16-bit weights, it delivers the industry’s fastest time-to-first-token at 240 ms.

That makes it the fastest-served open-source model, surpassing proprietary offerings on inference speed. A major win for open-source innovation! (A quick way to measure time-to-first-token yourself is sketched below.)
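Time-to-first-token is easy to measure with a streaming request: record when the request is sent and when the first content chunk arrives. A rough sketch under those assumptions follows; the base_url, API key and model name are placeholders for any OpenAI-compatible streaming endpoint, and the word count is only a crude proxy for token throughput.

```python
# Rough sketch of measuring time-to-first-token (TTFT) and throughput over a
# streaming, OpenAI-compatible API. Endpoint and model name are placeholders.
import time
from openai import OpenAI

client = OpenAI(base_url="https://your-endpoint/v1", api_key="YOUR_KEY")

start = time.perf_counter()
first_token_at = None
words = 0

stream = client.chat.completions.create(
    model="llama-3.1-405b",  # placeholder model name
    messages=[{"role": "user", "content": "Summarize speculative decoding in two sentences."}],
    stream=True,
)
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta.content or ""
    if delta and first_token_at is None:
        first_token_at = time.perf_counter()  # first visible output -> TTFT
    words += len(delta.split())

elapsed = time.perf_counter() - start
print(f"TTFT: {(first_token_at - start) * 1000:.0f} ms")
print(f"approx throughput: {words / elapsed:.0f} words/s (words, not tokens)")
```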

