Fireworks AI Monthly Roundup - Jan 2025


New year = a different look for the newsletter + the same drive for shipping at incredible speeds!

Here’s a peek into what the last month (plus change) has brought Fireworks platform users:


What's New at Fireworks AI

Product Updates & Releases

DeepSeek R1 Release:

We're thrilled to announce the availability of DeepSeek R1 on Fireworks Serverless, On-Demand, and Enterprise platforms. This state-of-the-art reasoning model matches OpenAI's o1 model in performance across math, code, and reasoning tasks.

Built with 671B MoE parameters (37B activated) and released under the MIT License, DeepSeek R1 features transparent thinking tokens and supports a 160K context window. The model comes as part of a broader family including R1-Zero and six dense models based on Llama and Qwen. This early release shows tremendous promise, with further performance and speed improvements on the horizon. Try it now in our playground or via API for your enterprise needs.

Try it out now: R1 Model Playground
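
If you'd like to call R1 programmatically, here is a minimal sketch against Fireworks' OpenAI-compatible chat completions endpoint. The model path follows the usual Fireworks naming convention and your API key is a placeholder, so confirm both on the model page before relying on this.

```python
# Minimal sketch: DeepSeek R1 via Fireworks' OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="YOUR_FIREWORKS_API_KEY",  # placeholder
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-r1",  # conventional path; verify on the model page
    messages=[{"role": "user", "content": "A train travels 120 km in 90 minutes. What is its average speed in km/h?"}],
    max_tokens=2048,
)

# R1 exposes its reasoning as visible thinking tokens before the final answer,
# so expect the returned content to include the chain of thought.
print(response.choices[0].message.content)
```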


DeepSeek V3 Release:

We also announced that DeepSeek V3, a state-of-the-art open model, is available on Fireworks Serverless and Enterprise. This powerful model has established itself as the leading open model for coding and reasoning, consistently outperforming competitors on both Chatbot Arena and WebDev Arena.

With its impressive architecture of 671B MoE parameters (37B activated) and a substantial 131K context window, DeepSeek V3 delivers exceptional performance at speeds up to 30 tokens per second – and we're working to make it even faster. We've made this cutting-edge technology accessible at just $0.9 per million tokens.

Try it out now: V3 Model Playground
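
To see those token speeds for yourself, a sketch like the one below streams V3's output as it is generated, again via the OpenAI-compatible endpoint; the model path is the conventional one and should be checked against the playground.

```python
# Minimal sketch: stream DeepSeek V3 tokens as they are generated.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="YOUR_FIREWORKS_API_KEY",  # placeholder
)

stream = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-v3",  # conventional path; verify before use
    messages=[{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}],
    stream=True,  # tokens arrive incrementally instead of in one response
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:  # the final chunk may carry no content
        print(delta, end="", flush=True)
print()
```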


Serverless 2.0 is live:

We just upgraded our Serverless offering: 10x higher rate limits, reliable uptime, and fast speeds.

What's new:

  • 10x higher rate limits - now up to 6,000 requests per minute and 2.5 billion tokens per day (see the backoff sketch after this list)
  • Reliable uptime and consistently fast response times
  • Access the best & latest open models like DeepSeek V3, Llama 3.3 70B and more
  • Grow to unlimited scale with Fireworks Enterprise
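
Bursty workloads can still hit the per-minute limit, so here is one possible client-side backoff pattern. This is not an official Fireworks recipe, and the Llama 3.3 70B model path is an assumed example; adjust both to your own setup.

```python
# Sketch: retry on HTTP 429 with exponential backoff (1s, 2s, 4s, ...).
import time

import openai
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="YOUR_FIREWORKS_API_KEY",  # placeholder
)

def chat_with_backoff(messages, retries=5):
    """Return a chat completion, backing off whenever the rate limit is hit."""
    for attempt in range(retries):
        try:
            return client.chat.completions.create(
                model="accounts/fireworks/models/llama-v3p3-70b-instruct",  # assumed example path
                messages=messages,
            )
        except openai.RateLimitError:
            time.sleep(2 ** attempt)  # wait longer after each 429
    raise RuntimeError("still rate limited after all retries")

reply = chat_with_backoff([{"role": "user", "content": "Say hello in three languages."}])
print(reply.choices[0].message.content)
```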


Streaming Transcription API Launch:

Our new streaming transcription service delivers Whisper-v3-level quality with just 300ms latency. Perfect for live captions and voice agents, the service opens a WebSocket to stream and transcribe audio chunks in real time. For a limited time, try it free for two weeks, then enjoy competitive pricing at $0.0032 per audio minute.
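
For a rough idea of what a client looks like, the sketch below streams small audio chunks over a WebSocket and prints whatever transcript messages come back. The endpoint URL, header name, and message framing are placeholders and assumptions; take the real values from the Fireworks audio docs.

```python
# Sketch: stream local PCM audio to a streaming-transcription WebSocket.
import asyncio
import json

import websockets  # pip install websockets

# Placeholder endpoint; use the real streaming-transcription URL from the docs.
WS_URL = "wss://YOUR_FIREWORKS_AUDIO_ENDPOINT/v1/audio/transcriptions/streaming"

async def stream_audio(path: str) -> None:
    async with websockets.connect(
        WS_URL,
        # websockets >= 14 uses `additional_headers`; older releases use `extra_headers`.
        additional_headers={"Authorization": "Bearer YOUR_FIREWORKS_API_KEY"},  # placeholder key
    ) as ws:
        with open(path, "rb") as f:
            # ~100 ms of 16 kHz, 16-bit mono PCM per chunk, paced like a live microphone.
            while chunk := f.read(3200):
                await ws.send(chunk)
                await asyncio.sleep(0.1)
        try:
            while True:  # print transcript messages until the server goes quiet
                print(json.loads(await asyncio.wait_for(ws.recv(), timeout=2.0)))
        except (asyncio.TimeoutError, websockets.ConnectionClosed):
            pass

asyncio.run(stream_audio("meeting.raw"))
```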

Document Inlining Launch:

This groundbreaking feature transforms any LLM into a vision model for better digital asset processing. Our tests show OSS models with Document Inlining achieved a 68% win rate against GPT-4o at document processing, while offering ultra-simple implementation through our OpenAI-compatible API.
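
Here is a minimal sketch of what a Document Inlining request can look like through the OpenAI-compatible API: at launch, the feature is triggered by appending `#transform=inline` to the file URL. The PDF URL below is a made-up placeholder, and the fragment syntax should be verified against the current docs.

```python
# Sketch: Document Inlining lets a text model read a linked document.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="YOUR_FIREWORKS_API_KEY",  # placeholder
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-v3",  # conventional path; verify before use
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize the key figures in this report."},
            # "#transform=inline" asks Fireworks to parse the document and feed
            # its contents to the (otherwise text-only) model.
            {"type": "image_url",
             "image_url": {"url": "https://example.com/quarterly-report.pdf#transform=inline"}},  # placeholder URL
        ],
    }],
)
print(response.choices[0].message.content)
```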

Coming Soon: Stay tuned for more compound AI systems designed to process audio, knowledge stores, and other data types more effectively!


Customer Success Story

Cresta's AI-Powered Contact Center Revolution

Industry: Customer Service Technology

Challenge: Cresta needed to scale customized AI models efficiently while maintaining high performance and cost-effectiveness for millions of real-time contact center interactions.

Solution: Implemented Fireworks' infrastructure to power their Knowledge Assist platform, utilizing multi-LoRA capabilities with a single Mistral-based Ocean model cluster.

Results:

  • Achieved 100x cost reduction compared to GPT-4
  • Consistently outperformed GPT-4 in RAG-powered tasks
  • Maintained minimal latency and robust uptime at scale
  • Improved key metrics including handle time and first call resolution


Learning Resources

DeepSeek V3 Vision Enhancement:

Building on the DeepSeek V3 release, we've enhanced the model with vision capabilities through our Document Inlining feature. This addition transforms an already powerful language model into a comprehensive tool capable of handling both text and visual inputs, making it even more versatile for diverse applications. These capabilities are available across both our Serverless and Enterprise platforms.

Read the full article here


All you need to know about the DeepSeek R1 model:

Our comprehensive guide dives deep into this groundbreaking model, exploring its innovative architecture and practical applications. We walk through how DeepSeek R1's 671B MoE parameters and reinforcement learning training create superior reasoning capabilities. The guide also provides practical insights for organizations considering the transition from proprietary models, with detailed examples of implementation across various use cases.

Most importantly, we explain how the model's MIT License and open-source nature empower developers to build and customize their own solutions through model distillation. Read the full article to understand why DeepSeek R1 represents a significant step forward in democratizing advanced AI capabilities.

Read the full article here


Multi-LoRA Implementation Guide:

We're excited to spotlight the Multi-LoRA demo: Revolutionizing Product-Specific AI with Fireworks' Multi-LoRA.

This demo highlights the capabilities of Fireworks' Multi-LoRA, a cutting-edge technique for fine-tuning models across various product domains. By tailoring the AI to answer product-related inquiries in specific categories like beauty, fashion, outdoor products, and baby products, Multi-LoRA ensures more precise, relevant responses. This eliminates the need for users to rely on generic filters and enables them to ask product-specific questions directly, with the AI providing answers based on domain-specific knowledge.
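
As a sketch of how that routing might look in practice, each request simply targets the LoRA adapter for its product category while all adapters share one base-model deployment. The account and adapter names below are made-up placeholders for illustration.

```python
# Sketch: select a product-specific LoRA adapter per request via the model field.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="YOUR_FIREWORKS_API_KEY",  # placeholder
)

ADAPTERS = {
    "beauty":  "accounts/your-account/models/beauty-qa-lora",   # hypothetical adapter names
    "fashion": "accounts/your-account/models/fashion-qa-lora",
    "outdoor": "accounts/your-account/models/outdoor-qa-lora",
}

def ask(category: str, question: str) -> str:
    """Route a product question to the LoRA adapter fine-tuned for that category."""
    response = client.chat.completions.create(
        model=ADAPTERS[category],  # all adapters are served from one base-model deployment
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

print(ask("outdoor", "Which sleeping bag works best below freezing?"))
```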

Read this article if you want to learn more


Company News & Events

CEO Interview Highlight: Lin Qiao, Fireworks AI's co-founder and CEO, recently shared fascinating insights about compound AI in an interview with The Stack. She emphasized how this approach moves beyond the "one model fits all" paradigm to optimize for quality, cost, and speed through specialized expert models.


Join the Conversation - Community Highlights

The conversation around compound AI is growing! Join our community to discuss how multiple specialized models can work together to deliver superior results in specific domains.

Social Shoutout: We love seeing how you're using Fireworks AI! Tag us @FireworksAI to get featured.


We're Hiring!

We are actively looking for high-impact professionals to join our team. Check out our careers page for more information.


Connect With Us

Sign in to Fireworks AI and take the next step today!
