OpenAI's Voice Engine, AI21's Jamba, Google's HeAR, DBRX & Stable Audio 2.0: A Glimpse into Emerging AI Technologies
Katonic AI
Katonic AI's award-winning platform allows companies to build enterprise-grade Generative AI apps and traditional ML models.
Welcome to our weekly newsletter, your go-to source for the latest developments and trends in Generative AI.
Each edition brings you a curated selection of impactful news, insightful analyses, and exciting advancements from the dynamic world of generative AI. Stay tuned for a concise and informative exploration of this rapidly evolving field.
1. The Future of Synthetic Voices: OpenAI's Voice Engine Preview
OpenAI unveiled Voice Engine, aiming to enhance communication and accessibility while ensuring ethical use. This advanced technology promises realistic speech synthesis from text and a brief audio sample, focusing on educational and communicative applications. With safety as a priority, OpenAI is setting strict guidelines for its use, highlighting the importance of responsible development and deployment in the evolving landscape of synthetic voices. Read more
2. Jamba: AI21 Labs' Novel Hybrid AI Model
AI21 Labs announced Jamba, which it describes as the first production-grade model to combine the Mamba structured state space model (SSM) architecture with the Transformer architecture, with the goal of overcoming the limitations of pure-Transformer LLMs.
Jamba offers a 256K context window, delivers 3x the throughput of Mixtral 8x7B on long contexts, and, according to AI21 Labs, is the only model in its size class that fits up to 140K tokens of context on a single GPU. Released with open weights under Apache 2.0, it is available on Hugging Face and on NVIDIA's platform, marking a significant advance in model efficiency and scalability. Read more
3. Diagnosing Health Conditions with Google's HeAR
Google's team has developed an AI tool capable of diagnosing diseases such as COVID-19 and tuberculosis by analyzing coughs and other sounds. Named Health Acoustic Representations (HeAR), this machine-learning system uses a vast collection of audio clips for training, employing self-supervised learning to predict health conditions from audio inputs.
Although not yet commercialised, HeAR demonstrates promise for non-invasive disease detection and monitoring, opening new avenues in the field of health acoustics or "audiomics". Read more
4. DBRX: Databricks' Leap in LLM Technology
Databricks unveiled DBRX, an open LLM that it reports surpasses GPT-3.5 and is competitive with Gemini 1.0 Pro, with particular strength in coding. The model uses a mixture-of-experts architecture, so only a subset of its parameters is active for each token. According to Databricks, inference is up to 2x faster than LLaMA2-70B, and the model is about 40% of the size of Grok-1 in both total and active parameter counts. Read more
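For readers curious why a mixture-of-experts design can speed up inference, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration of the general technique only, not DBRX's actual implementation: the function names, the toy experts, and the router scores are all hypothetical. The key idea is that each input runs through only k of the available experts, so most of the model's parameters sit idle for any given token.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and normalize their weights.

    Returns a list of (expert_index, weight) pairs; weights sum to 1.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))

def moe_forward(x, experts, router_logits, k):
    """Run only the chosen experts and blend their outputs by weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Hypothetical toy setup: four "experts" that each scale the input.
toy_experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
toy_logits = [0.1, 2.0, 1.0, 0.5]   # made-up router scores for one token

routing = route_top_k(toy_logits, k=2)
output = moe_forward(1.0, toy_experts, toy_logits, k=2)
print(routing, output)
```

In this toy run only experts 1 and 2 are evaluated; experts 0 and 3 are skipped entirely, which is where the compute savings come from.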
5. Stable Audio 2.0: Elevating AI Music Generation
Stability AI introduced Stable Audio 2.0, a game-changing model for AI-generated audio. It features high-quality music track generation up to three minutes long, audio-to-audio transformation, and extensive sound effect capabilities.
Leveraging a licensed dataset for training, it offers new creative dimensions for artists and musicians, free to use on its platform. Stable Audio 2.0 represents a significant leap in AI music technology, promising innovative tools for the creative industry. Read more
Foundation Model of the Week - Qwen 14B Chat
Qwen-14B is the 14B-parameter model in Qwen, the large language model series from Alibaba Cloud. It is a Transformer-based model pre-trained on over 3 trillion tokens covering multiple languages and domains. With 14 billion parameters, its weights alone occupy roughly 28GB at 16-bit precision, or about 7GB with 4-bit quantization. It supports context lengths of up to 8K tokens, is reported to outperform similarly sized and some larger models on a range of tasks, and uses a vocabulary of roughly 150K tokens for stronger multilingual support.
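As a back-of-the-envelope check on memory figures for a model of this size, the weight footprint can be estimated as parameter count times bits per parameter, divided by eight. The sketch below makes that arithmetic explicit. The function name is hypothetical, gigabytes are counted as 10^9 bytes, and the estimate covers weights only: activations, the KV cache, and framework overhead all add to the real requirement.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Estimate weight storage in GB (10^9 bytes); ignores activations and KV cache."""
    return num_params * bits_per_param / 8 / 1e9

PARAMS_14B = 14_000_000_000  # 14 billion parameters

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(PARAMS_14B, bits):.0f} GB")
```

By this estimate a 14B-parameter model needs about 28 GB for 16-bit weights, 14 GB at 8 bits, and 7 GB at 4 bits, before any runtime overhead.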
Try it on Katonic Playground: Qwen 14B Chat
Subscribe for more exciting AI updates in the future. Have a great weekend!