Fireworks AI Monthly Roundup - Jan 2025
Fireworks AI
Generative AI platform empowering developers and businesses to scale at high speeds
New year = different look for the newsletter + the same drive for shipping at incredible speeds!
Here’s a peek into what the last month (plus change) has brought Fireworks platform users:
What's New at Fireworks AI
Product Updates & Releases
We're thrilled to announce the availability of DeepSeek R1 on Fireworks Serverless, On-Demand, and Enterprise platforms. This state-of-the-art reasoning model matches OpenAI's o1 model in performance across math, code, and reasoning tasks.
Built with 671B MoE parameters (37B activated) and released under the MIT License, DeepSeek R1 features transparent thinking tokens and supports a 160K context window. The model comes as part of a broader family including R1-Zero and six dense models based on Llama and Qwen. This early release shows tremendous promise, with further performance and speed improvements on the horizon. Try it now in our playground or via API for your enterprise needs.
Try it out now: R1 Model Playground
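For API use, here is a minimal sketch that calls DeepSeek R1 through the OpenAI-compatible chat completions endpoint. The model id follows Fireworks' usual naming scheme but should be confirmed on the model page, and the parsing step simply separates the transparent thinking tokens from the final answer.

import os
from openai import OpenAI

# Fireworks exposes an OpenAI-compatible API, so the standard client works
# with a swapped base URL.
client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-r1",  # assumed model id; check the playground
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    max_tokens=2048,
)

text = response.choices[0].message.content
# R1 emits its reasoning inside <think>...</think> before the final answer,
# so the thinking tokens can be split off from the reply.
if "</think>" in text:
    _thinking, answer = text.split("</think>", 1)
    print(answer.strip())
else:
    print(text)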
We also announced that DeepSeek V3, a state-of-the-art open model, is available on Fireworks Serverless and Enterprise. This powerful model has established itself as the leading open model for coding and reasoning, consistently outperforming competitors on both Chatbot Arena and WebDev Arena.
With its impressive architecture of 671B MoE parameters (37B activated) and a substantial 131K context window, DeepSeek V3 delivers exceptional performance at speeds up to 30 tokens per second – and we're working to make it even faster. We've made this cutting-edge technology accessible at just $0.9 per million tokens.
Try it out now: V3 Model Playground
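As a rough way to see that throughput for yourself, the sketch below streams a V3 completion and times the chunks as they arrive. The model id is again an assumption based on Fireworks' naming convention, and streamed chunks only approximate token counts.

import os
import time
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

start = time.time()
chunks = 0
stream = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-v3",  # assumed model id; check the playground
    messages=[{"role": "user", "content": "Write a Python function that merges two sorted lists."}],
    stream=True,
    max_tokens=512,
)
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta.content or ""
    if delta:
        chunks += 1
        print(delta, end="", flush=True)

elapsed = time.time() - start
# Each streamed chunk is roughly one token, so this gives a ballpark tokens/sec figure.
print(f"\n~{chunks / elapsed:.0f} chunks/sec over {elapsed:.1f}s")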
Serverless 2.0 is live:
We just upgraded our Serverless offering - 10x higher rate limits, reliable uptime, and fast speeds.
What's new:
Our new streaming transcription service delivers Whisper-v3-level quality with just 300ms latency. Perfect for live captions and voice agents, the service opens a WebSocket to stream and transcribe audio chunks in real time. For a limited time, try it free for two weeks, then enjoy competitive pricing at $0.0032 per audio minute.
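The WebSocket endpoint, auth header, and message framing in the sketch below are assumptions rather than documented values (check the streaming transcription docs before relying on them); it only illustrates the overall flow the service describes: open a socket, push small audio chunks, and read partial transcripts as they arrive.

import asyncio
import json
import os
import websockets

# Hypothetical endpoint; the real URL is in the Fireworks docs.
STREAM_URL = "wss://audio.fireworks.ai/v1/transcriptions/stream"

async def transcribe(path: str, chunk_ms: int = 300) -> None:
    headers = {"Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}"}
    # Note: websockets versions below 14 call this parameter extra_headers.
    async with websockets.connect(STREAM_URL, additional_headers=headers) as ws:

        async def sender() -> None:
            # Stream 16 kHz, 16-bit mono PCM in ~chunk_ms slices, paced like a live mic.
            bytes_per_chunk = 16_000 * 2 * chunk_ms // 1000
            with open(path, "rb") as f:
                while chunk := f.read(bytes_per_chunk):
                    await ws.send(chunk)
                    await asyncio.sleep(chunk_ms / 1000)
            await ws.send(json.dumps({"event": "end"}))  # hypothetical end-of-stream signal

        async def receiver() -> None:
            # Runs until the server closes the connection.
            async for message in ws:
                print(json.loads(message).get("text", ""))  # assumed partial-transcript field

        await asyncio.gather(sender(), receiver())

asyncio.run(transcribe("meeting.pcm"))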
Document Inlining, our newest feature, transforms any LLM into a vision model for better digital asset processing. In our tests, OSS models with Document Inlining achieved a 68% win rate against GPT-4o at document processing, while offering ultra-simple implementation through our OpenAI-compatible API.
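Here is a minimal sketch of that implementation: with the OpenAI-compatible API, the document is attached as an image_url and #transform=inline is appended to the URL so its contents are parsed into text the LLM can read. The model id and document URL below are placeholders.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",  # placeholder id; any text LLM on Fireworks
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize the key terms in this contract."},
            # Appending #transform=inline asks Fireworks to inline the parsed document for the LLM.
            {"type": "image_url",
             "image_url": {"url": "https://example.com/contract.pdf#transform=inline"}},
        ],
    }],
)
print(response.choices[0].message.content)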
Coming Soon: Stay tuned for more compound AI systems designed to process audio, knowledge stores, and other data types more effectively!
Customer Success Story
Cresta's AI-Powered Contact Center Revolution
Industry: Customer Service Technology
Challenge: Cresta needed to scale customized AI models efficiently while maintaining high performance and cost-effectiveness for millions of real-time contact center interactions.
Solution: Implemented Fireworks' infrastructure to power their Knowledge Assist platform, utilizing multi-LoRA capabilities with a single Mistral-based Ocean model cluster.
Results:
Learning Resources
Building on the DeepSeek V3 release, we've further enhanced the model by integrating vision capabilities through our Document Inlining feature. This addition transforms an already powerful language model into a comprehensive tool capable of handling both text and visual inputs, making it even more versatile for diverse applications. These capabilities are available across both our Serverless and Enterprise platforms.
Our comprehensive guide dives deep into this groundbreaking model, exploring its innovative architecture and practical applications. We walk through how DeepSeek R1's 671B MoE parameters and reinforcement learning training create superior reasoning capabilities. The guide also provides practical insights for organizations considering the transition from proprietary models, with detailed examples of implementation across various use cases.
Most importantly, we explain how the model's MIT License and open-source nature empower developers to build and customize their own solutions through model distillation. Read the full article to understand why DeepSeek R1 represents a significant step forward in democratizing advanced AI capabilities.
We're excited to spotlight our Multi-LoRA demo: Revolutionizing Product-Specific AI with Fireworks' Multi-LoRA.
This demo highlights the capabilities of Fireworks' Multi-LoRA, a technique for serving many fine-tuned LoRA adapters on a single base model across various product domains. By tailoring the AI to answer product-related inquiries in specific categories like beauty, fashion, outdoor products, and baby products, Multi-LoRA ensures more precise, relevant responses. This eliminates the need for users to rely on generic filters and enables them to ask product-specific questions directly, with the AI providing answers based on domain-specific knowledge.
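Here is a hedged sketch of that pattern: several domain-specific LoRA adapters share one base-model deployment, and a request selects an adapter simply by passing its model id. The account and adapter names below are placeholders, not real Fireworks model ids.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

# Hypothetical per-domain LoRA adapters fine-tuned from the same base model.
ADAPTERS = {
    "beauty":  "accounts/acme/models/product-qa-beauty-lora",
    "outdoor": "accounts/acme/models/product-qa-outdoor-lora",
}

def ask(domain: str, question: str) -> str:
    response = client.chat.completions.create(
        model=ADAPTERS[domain],  # choosing the adapter is just choosing the model id
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

print(ask("outdoor", "Which tent in your catalog handles heavy rain best?"))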
Company News & Events
CEO Interview Highlight: Lin Qiao, Fireworks AI's co-founder and CEO, recently shared fascinating insights about compound AI in an interview with The Stack. She emphasized how this approach moves beyond the "one model fits all" paradigm to optimize for quality, cost, and speed through specialized expert models.
Join the Conversation - Community Highlights
The conversation around compound AI is growing! Join our community to discuss how multiple specialized models can work together to deliver superior results in specific domains.
Social Shoutout: We love seeing how you're using Fireworks AI! Tag us @FireworksAI to get featured.
We're Hiring!
We are actively looking for high-impact professionals to join our team. Check out our careers page for more information.
Connect With Us
Sign in to Fireworks AI and take the next step today!