Open Weights on Open Studios

Mistral AI’s new model leaked. OpenAI launches next-gen embeddings. Ai2 releases Open Language Model toolkit. Meta drops CodeLlama 70B. Let’s dive in!

ML Engineering Highlights:

  • Mistral CEO confirms 'leak' of new open source AI model: A new open source large language model labeled “miqu-1-70b” was uploaded to the Hugging Face platform, with performance approaching that of OpenAI’s GPT-4 on EQ-Bench. Researchers are speculating about the origin and performance of this new model, as it's believed to match or even exceed GPT-4, which would make it the top-performing open source LLM to date. The release of a GPT-4-class open source model would likely place competitive pressure on OpenAI.

  • OpenAI launches new generation of embedding models: OpenAI introduced new embedding models along with updated versions of its GPT-4 Turbo, GPT-3.5 Turbo, and moderation models, new API usage-management tools, and lower pricing on GPT-3.5 Turbo. The new embedding models offer improved performance at reduced prices compared to previous models and can create embeddings with up to 3072 dimensions. OpenAI also introduced new ways for developers to manage API keys and understand usage.
  • Anthropic confirms it suffered a data leak: AI startup Anthropic experienced a data breach when a contractor inadvertently sent non-sensitive customer information to a third party, raising concerns over entrusting third-party language model providers with proprietary data. The breach came to light just before the Federal Trade Commission announced an investigation into Anthropic's strategic partnerships with Amazon and Google.
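
The note above that OpenAI's new embedding models can create embeddings "with up to 3072 dimensions" refers to the API's option to shorten embeddings to a smaller dimension. A minimal sketch of the idea, with no API call (the 3072-dimensional vector here is a random stand-in for a real embedding, and the assumption is that a shortened embedding behaves like the full vector truncated and re-normalized):

```python
import math
import random

def truncate_and_normalize(embedding, dims):
    """Keep the first `dims` components and re-normalize to unit length.
    Assumption: this mirrors what requesting a smaller `dimensions`
    value from the embeddings API gives you."""
    head = embedding[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Stand-in for a 3072-dimensional embedding returned by the API.
random.seed(0)
full = [random.gauss(0, 1) for _ in range(3072)]

short = truncate_and_normalize(full, 256)
print(len(short))  # 256
```

Shorter embeddings trade a little retrieval quality for much cheaper storage and faster similarity search, which is why the dimension knob matters in practice.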

Research Highlights:

  • Accelerating the Science of Language Models : Language models have become important in NLP research and commercial products, but the most powerful models are closed off and proprietary. This paper introduces OLMo, a truly open language model and framework, to provide access to powerful language models for the research community. OLMo includes training data, training and evaluation code, and is intended to empower the open research community and inspire innovation.

  • Deconstructing Denoising Diffusion Models for Self-Supervised Learning : This study examines the representation learning abilities of Denoising Diffusion Models (DDM) originally designed for image generation by breaking them down into classical Denoising Autoencoders (DAE). The study finds that only a few modern components of DDMs are critical for learning good representations, while many others are nonessential. Ultimately, the study proposes a simplified approach that resembles a classical DAE, hoping to renew interest in classical methods within modern self-supervised learning.
  • Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks : The paper addresses the vulnerability of language models to adversarial attacks and jailbreaking, proposing a defense that is effective, universal, and practical. The authors introduce an adversarial objective for defending language models and an algorithm called robust prompt optimization (RPO) to enforce harmless outputs. The results show that RPO significantly improves robustness to both known and unknown jailbreaking attacks, reducing the attack success rate on Starling-7B from 84% to 8.66% and on GPT-4 from 92% to 6%.

Lightning AI Studio Highlights:

  • Run CodeLlama 70B Instruct: Use this Studio to run Meta’s CodeLlama model and chat with it! CodeLlama accepts an input context of up to 100k tokens and can run on machines as cost-efficient as NVIDIA A10Gs, at speeds of up to 13 tokens per second.

  • Code LoRA from scratch: LoRA, which stands for Low-Rank Adaptation, is a popular technique for finetuning LLMs more efficiently. Instead of adjusting all the parameters of a deep neural network, LoRA focuses on updating only a small set of low-rank matrices. This Studio explains how LoRA works by coding it from scratch, which is an excellent exercise for looking under the hood of an algorithm.
  • How to scrape web data to finetune LLMs : Generate your own dataset by collecting web URLs to fine-tune LLMs. This Studio allows you to easily parallelize web scraping across multiple machines and processes so that instead of taking hours, it takes minutes.
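
The Code LoRA Studio above builds the technique in PyTorch; as a rough illustration of just the arithmetic (plain Python lists, not the Studio's actual code), the core idea is that the forward pass computes x @ (W + (alpha/r) * down @ up), where the frozen weight W stays untouched and only the two small matrices are trained:

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_forward(x, W, down, up, alpha, r):
    """LoRA forward pass: x @ (W + (alpha / r) * down @ up),
    computed as the frozen path plus the scaled low-rank path.
    Only `down` (d_in x r) and `up` (r x d_out) are trained."""
    scale = alpha / r
    base = matmul(x, W)                   # frozen pretrained weights
    low_rank = matmul(matmul(x, down), up)  # rank-r update path
    return [[base[i][j] + scale * low_rank[i][j]
             for j in range(len(base[0]))] for i in range(len(base))]

# Toy example: d_in = d_out = 2, rank r = 1, identity base weight.
x = [[1.0, 2.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
down = [[1.0], [0.0]]
up = [[0.0, 1.0]]
print(lora_forward(x, W, down, up, alpha=2.0, r=1))  # [[1.0, 4.0]]
```

With rank r much smaller than the weight dimensions, the trainable parameter count drops from d_in * d_out to r * (d_in + d_out), which is where the efficiency comes from.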
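
The scraping Studio above parallelizes fetches across multiple machines and processes; a single-machine sketch of the same idea uses a thread pool, since scraping is I/O-bound. The `fetch` function here is a stand-in that returns placeholder text so the sketch runs without network access (a real version would issue an HTTP GET):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    """Stand-in for a real HTTP GET; returns placeholder text
    so this sketch runs offline."""
    return f"<html>content of {url}</html>"

def scrape_all(urls, fetch_fn=fetch, max_workers=8):
    """Fetch many URLs concurrently, preserving input order.
    Threads work well for I/O-bound scraping; the Studio extends
    the same pattern across processes and machines."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_fn, urls))

pages = scrape_all([f"https://example.com/{i}" for i in range(4)])
print(len(pages))  # 4
```

Because `pool.map` preserves input order, each fetched page lines up with its source URL, which keeps downstream dataset construction simple.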

Don’t Miss the Submission Deadline

  • CHIL 2024: Conference on Health, Inference, and Learning Submission Deadline: Mon Feb 05 2024 23:59:59 GMT-0500
  • CoLLAs 2024: Third Conference on Lifelong Learning Agents Submission Deadline: Fri Feb 16 2024 06:59:59 GMT-0500
  • ECCV 2024: European Conference on Computer Vision 2024 Submission Deadline: Fri Mar 08 2024 06:59:00 GMT-0500
  • MICCAI 2024: International Conference on Medical Image Computing and Computer Assisted Intervention Submission Deadline: Fri Mar 08 2024 02:59:59 GMT-0500
  • ECAI 2024: European Conference on Artificial Intelligence 2024 Submission Deadline: Fri Apr 26 2024 07:59:59 GMT-0400

Want to learn more from Lightning AI? “Subscribe” to make sure you don’t miss the latest flashes of inspiration, news, tutorials, educational courses, and other AI-driven resources from around the industry. Thanks for reading!
