A large week for Mistral

Mistral Large joins the chat. Alibaba goes EMO. NVIDIA gets technical with nemotron. Let’s dive in!

ML Engineering Highlights:

  • Mistral announces partnership with Microsoft and new AI model: Mistral AI has released Mistral Large, a multilingual text generation model for enterprises, and has formed a strategic partnership with Microsoft, gaining $16 million in fresh capital. Mistral Large is designed to understand, reason with, and generate text in multiple languages, with a context window of 32K tokens. As part of the partnership, the model is available on Azure AI Studio and Azure Machine Learning. Mistral plans to expand its distribution through AWS and a new chat app for business teams.

Credit: Mistral

  • Alibaba's new AI system 'EMO' creates realistic talking and singing videos from photos: Researchers at Alibaba Group's Institute for Intelligent Computing have developed a new AI system called EMO that can animate a single portrait photo and generate lifelike videos of the person talking or singing. The system converts audio waveforms directly into video frames, capturing natural speech motions and identity-specific quirks. EMO outperforms existing methods in video quality, identity preservation, and expressiveness, and may lead to personalized video content synthesized from just a photo and an audio clip.
  • SambaNova debuts 1 trillion parameter Composition of Experts model: SambaNova Systems released the trillion-parameter Samba-1, which is not a single model but a combination of over 50 AI models in what the company calls a Composition of Experts architecture. SambaNova's focus on hardware and efficiency sets it apart from competitors, enabling highly customizable, high-performance deployment for enterprises. Samba-1 can chain expert models together based on prompts and responses, which allows exploration of different perspectives as well as secure, private deployment and inference.
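
To make the idea of chaining experts based on a prompt concrete, here is a toy sketch. It is not SambaNova's implementation: the experts are plain functions standing in for separate models, and the keyword router, the names `EXPERTS`, `route`, and `compose`, and the fallback expert are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical experts: simple functions standing in for separate models.
def translate_expert(text: str) -> str:
    return f"translated({text})"

def summarize_expert(text: str) -> str:
    return f"summary({text})"

def general_expert(text: str) -> str:
    return f"answer({text})"

# Assumed mapping from a trigger keyword to the expert that handles it.
EXPERTS: Dict[str, Callable[[str], str]] = {
    "translate": translate_expert,
    "summarize": summarize_expert,
}

def route(prompt: str) -> List[Callable[[str], str]]:
    """Toy router: select experts whose keyword occurs in the prompt,
    ordered by where the keyword first appears."""
    lowered = prompt.lower()
    hits = sorted((lowered.find(key), key) for key in EXPERTS if key in lowered)
    chain = [EXPERTS[key] for _, key in hits]
    return chain or [general_expert]  # fall back to a general expert

def compose(prompt: str, payload: str) -> str:
    """Pass the payload through each selected expert in turn."""
    result = payload
    for expert in route(prompt):
        result = expert(result)
    return result

chained = compose("Translate then summarize this memo", "memo")
fallback = compose("What changed this week?", "question")
```

In a real system the router would itself be a learned model and each expert a separately served network; the sketch only shows the control flow of selecting and chaining.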

Research Highlights:

  • Nemotron-4 15B Technical Report: NVIDIA's Nemotron-4 15B is a 15-billion-parameter large multilingual language model trained on 8 trillion text tokens, demonstrating strong performance on English, multilingual, and coding tasks. It outperforms existing similarly sized open models in 4 of 7 evaluation areas and performs competitively in the remaining ones. In particular, it shows the best multilingual capabilities among similarly sized models, surpassing even larger models explicitly specialized for multilingual tasks.

Credit: NVIDIA

  • Do Large Language Models Latently Perform Multi-Hop Reasoning?: This paper by Google DeepMind investigates whether Large Language Models (LLMs) perform latent multi-hop reasoning when given complex prompts. The authors find strong evidence of latent multi-hop reasoning for certain types of prompts, with the reasoning pathway used in over 80% of them. However, use of this reasoning is highly contextual and varies across prompt types, and the evidence for the second hop and for full multi-hop traversal is moderate on average. The findings also point to challenges and opportunities for future development and applications of LLMs: there is a clear scaling trend for the first hop of reasoning but not for the second.
  • Executable Code Actions Elicit Better LLM Agents: This paper by Apple and the University of Illinois Urbana-Champaign introduces CodeAct, a method that uses executable Python code to expand the action space of Large Language Model (LLM) agents, enabling them to perform a wider range of tasks and interact with environments over multiple turns. The study shows that CodeAct outperforms existing alternatives by up to 20% in success rate, which motivated the creation of an open-source LLM agent that can execute interpretable code and collaborate with users in natural language. The authors also present a new instruction-tuning dataset, CodeActInstruct, which improves the performance of LLM agents on complex tasks without sacrificing their general capability, and introduce CodeActAgent, tailored to advanced tasks such as model training and self-debugging.
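
The idea of using executable Python as an agent's action space can be illustrated with a minimal sketch. This is not the paper's code: the `scripted_policy` function is a hypothetical stand-in for an LLM, the helper names are invented for illustration, and `exec` is used without any sandboxing, which a real agent would need.

```python
import contextlib
import io
import traceback
from typing import Callable, List, Optional


def run_code_action(code: str, namespace: dict) -> str:
    """Execute one code action; return its printed output or the error text."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)  # unsandboxed: for illustration only
    except Exception:
        # The traceback becomes the observation, so a later turn can self-debug.
        return buffer.getvalue() + traceback.format_exc(limit=1)
    return buffer.getvalue()


def scripted_policy(history: List[str]) -> Optional[str]:
    """Hypothetical stand-in for an LLM: next code action, or None to stop."""
    actions = [
        "total = sum(range(1, 11))",
        "print(total / count)",              # fails: 'count' is undefined
        "count = 10\nprint(total / count)",  # corrected after the error
    ]
    turn = len(history)
    return actions[turn] if turn < len(actions) else None


def run_agent(policy: Callable[[List[str]], Optional[str]],
              max_turns: int = 5) -> List[str]:
    """Multi-turn loop: each action runs in a namespace shared across turns."""
    namespace: dict = {}
    history: List[str] = []
    for _ in range(max_turns):
        action = policy(history)
        if action is None:
            break
        history.append(run_code_action(action, namespace))
    return history


observations = run_agent(scripted_policy)
```

Here the second action raises a `NameError`, which is returned as an observation, and the third action succeeds because variables such as `total` persist between turns.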

Lightning AI Studio Highlights:

Credit: Google

Don’t Miss the Submission Deadline

  • ECCV 2024: European Conference on Computer Vision 2024 Submission Deadline: Fri Mar 08 2024 06:59:00 GMT-0500
  • MICCAI 2024: International Conference on Medical Image Computing and Assisted Intervention Submission Deadline: Fri Mar 08 2024 02:59:59 GMT-0500
  • ECAI 2024: European Conference on Artificial Intelligence 2024 Submission Deadline: Fri Apr 26 2024 07:59:59 GMT-0400
  • RLC 2024: Reinforcement Learning Conference 2024 Submission Deadline: Fri Mar 01 2024 23:59:59 GMT-1200

Want to learn more from Lightning AI? “Subscribe” to make sure you don’t miss the latest flashes of inspiration, news, tutorials, educational courses, and other AI-driven resources from around the industry. Thanks for reading!

Microsoft is now partnered with two of the globally best LLMs. The plans seem clear.

Godwin Josh

Co-Founder of Altrosyn and Director at CDTECH | Inventor | Manufacturer

8 months ago

You mentioned significant developments in AI this week, highlighting Mistral Large, Alibaba's venture into EMO, and NVIDIA's technical strides with Nemotron. Delving deeper, considering the evolving landscape of AI, how do you foresee these advancements shaping the integration of AI technologies in real-world applications, particularly in domains where human interaction is crucial, like healthcare or customer service? Additionally, can you elaborate on the potential ethical considerations arising from the increased emotional intelligence in AI systems, and how industry leaders plan to address these challenges?

Sebastian Raschka, PhD

Machine learning and AI researcher | author of the "Build a Large Language Model From Scratch" book (amzn.to/4fqvn0D) | research engineer at Lightning AI | ex-statistics professor at University of Wisconsin-Madison

8 months ago

Thanks for the kind shoutout for my Gemma Studio!
