AI/ML news summary: week 22

Here are this week's articles, guides, and news about AI, curated so you don't have to.


This week in TTS' AI/ML news summary, there are new models at Google I/O and controversy at OpenAI.

Google’s I/O event presented its answer to many of OpenAI’s latest models, such as “Project Astra”, a live agent assistant, and “Veo”, a video generation model. There was a lot of focus on the improved Gemini 1.5 Pro and the new, much cheaper Gemini 1.5 Flash model, particularly their stand-out long-context capabilities (up to 2 million tokens).

I am also excited by Gemini’s upcoming context caching options (building on KV-cache reuse) and how they could let long context stand in for some fine-tuning or retrieval strategies. Together with the GPT-4o release, I think these LLM improvements will enable many more powerful applications to be built.

However, I believe it is increasingly difficult to identify which AI models and tools can have the greatest impact on your workflows, products, roles, team, and industry, and to ensure employees use them appropriately, safely, and effectively.


Subscribe to the TechTonic Shifts newsletter

Why should you care?

In my view, there is a huge disparity between managers’ growing appreciation of the need to adapt their workflows and products to AI and the actions being taken to manage the risks and make the most of the opportunity.


Hottest news

1. Jan Leike Resigns From OpenAI, Citing Concerns Over Safety And Priorities

OpenAI’s partial GPT-4o release and demo was followed by the resignation of key staff members (Ilya Sutskever and Jan Leike) and the disbandment of its Superalignment team. Jakub Pachocki replaced Ilya Sutskever as OpenAI’s new Chief Scientist. Jan Leike tweeted that his resignation was due to a lack of focus and compute available for safety work and AGI preparation.

2. Google I/O 2024: Here’s Everything Google Just Announced

At its Google I/O conference, Google unveiled a revamped AI-powered search engine, an AI model with an expanded context window of 2 million tokens, AI helpers across its suite of Workspace apps (Gmail, Drive, and Docs), tools for integrating its AI into developers’ apps, and a future vision for AI codenamed Project Astra.

3. Hugging Face Is Sharing $10 Million Worth of Compute To Help Beat the Big AI Companies

Hugging Face is committing $10 million in free shared GPUs to help developers create new AI technologies. The goal is to help small developers, academics, and startups counter the centralization of AI advancements. Hugging Face aims to level the playing field by donating these shared GPUs to the community through a new ZeroGPU program.

4. Stability AI Discusses Sale Amid Cash Crunch, The Information Reports

As reported by The Information, Stability AI recently discussed a sale with at least one potential buyer as it faces a cash crunch. In the first quarter of 2024, Stability AI generated less than $5 million in revenue and lost more than $30 million, the report said, adding that the company currently owes close to $100 million in outstanding bills to cloud computing providers and others.

5. Microsoft Offers Cloud Customers AMD Alternative to Nvidia AI Processors

Microsoft announced plans to sell AMD’s MI300X AI chips through its Azure cloud computing service, offering another option besides Nvidia’s in-demand H100 GPUs. Microsoft said it will reveal more about the AMD service at its Build Conference on May 21–23, when Microsoft is also expected to reveal other new AI offerings for consumers, developers, and businesses.

6. Scarlett Johansson Told OpenAI Not To Use Her Voice — and She’s Not Happy They Might Have Anyway

Scarlett Johansson says that OpenAI asked her to be the voice behind ChatGPT, but when she declined, the company created a voice that sounded just like her. In a statement shared with NPR, Johansson says that she has now been “forced to hire legal counsel” and has sent two letters to OpenAI inquiring how the soundalike ChatGPT voice, known as Sky, was made.


Subscribe to the TechTonic Shifts newsletter

Five 5-minute reads/videos to keep you learning

1. What OpenAI’s New GPT-4o Model Means for Developers

GPT-4 Omni was trained from the ground up to be multimodal and is faster, cheaper, and more powerful than its predecessors. This is incredibly significant for software developers who plan on leveraging AI models in their own apps and features. This article provides an overview of why it is particularly relevant for developers.

2. Crafting QA Tool With Reading Abilities Using RAG and Text-to-Speech

A question-answering (QA) tool is one of the tools most often requested by businesses. This article provides a step-by-step guide to building an LLM-powered QA tool with RAG and text-to-speech (TTS) capability; a minimal sketch of the general pattern follows below.
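
The article's full pipeline is not reproduced here, but the general pattern (retrieve with embeddings, answer with an LLM, then speak the answer) might look roughly like the sketch below. The embedding model, the `answer_with_llm` stub, and the document list are illustrative assumptions, not the article's code.

```python
# Minimal RAG + TTS sketch: retrieve relevant snippets, ask an LLM, speak the answer.
# Assumes sentence-transformers, numpy, and pyttsx3 are installed; the LLM call is a stub.
import numpy as np
from sentence_transformers import SentenceTransformer
import pyttsx3

documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 5pm CET.",
    "Enterprise customers get a dedicated account manager.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = embedder.encode(documents, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the question (cosine similarity)."""
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vectors @ q_vec
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def answer_with_llm(question: str, context: list[str]) -> str:
    """Stub for the generation step; replace with a call to your LLM provider."""
    prompt = "Answer using only this context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
    # A real implementation would send `prompt` to an LLM API; a canned answer keeps the sketch self-contained.
    return "You can return a product within 30 days of purchase."

question = "How long do I have to return a product?"
answer = answer_with_llm(question, retrieve(question))

engine = pyttsx3.init()   # text-to-speech: read the answer aloud
engine.say(answer)
engine.runAndWait()
```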

3. How Do AI Supercomputers Train Large Gen AI Models? Simply Explained

AI supercomputers process information millions of times faster than standard desktop or server computers. This post walks you through what AI supercomputers are and how they train large AI models, such as GPT-3, GPT-4, and even the latest GPT-4o, that power ChatGPT and Bing Chat.

4. Transforming DevOps With AI: Practical Strategies To Supercharge Your Workflows

This article shares top strategies for leveraging AI in DevOps workflows. It recommends options such as random forest classifiers for self-healing systems, LSTMs for anomaly-detection-based smart alerting, Q-learning for autonomous system configuration updates, and more; a rough sketch of the first idea follows below.
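
As an illustration of the first idea (not the article's code), a self-healing check can be framed as a classifier over recent service metrics that decides whether a restart is warranted. The feature set and training data below are synthetic placeholders invented for the sketch.

```python
# Sketch: a random forest that predicts whether a service needs a restart
# from recent metrics. Features and labels are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# columns: [cpu_pct, mem_pct, error_rate, p95_latency_ms]
history = np.array([
    [35, 40, 0.001,  120],
    [90, 85, 0.150, 2400],
    [45, 50, 0.002,  180],
    [95, 92, 0.300, 5000],
    [30, 35, 0.000,  100],
    [88, 90, 0.120, 1900],
])
needed_restart = np.array([0, 1, 0, 1, 0, 1])  # labels recorded by the ops team

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(history, needed_restart)

current = np.array([[92, 88, 0.200, 3100]])   # latest metrics snapshot
if model.predict(current)[0] == 1:
    print("Unhealthy pattern detected: triggering automated restart playbook")
else:
    print("Service looks healthy: no action taken")
```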

5. 105+ Machine Learning Statistics for 2024 (Exploring AI Realms)

More than 80% of companies and businesses need employees with machine learning skills. This article discusses interesting facts, statistics, and machine learning trends. It covers notable areas, including market size, adoption, sales, benefits, and business applications.


Repositories & Tools

  1. Online RLHF presents a workflow of Online Iterative RLHF, which is widely reported to outperform its offline counterpart by a large margin.
  2. Reminisc is an open-source memory framework for conversational LLMs.
  3. MarkLLM is an open-source toolkit for LLM watermarking.
  4. Athina AI helps manage LLMs in production by detecting and fixing hallucinations while efficiently managing prompts, all in a single interface.
  5. Lumina-T2X is a unified framework for ‘text to any modality’ generation.


Top papers of the week

1. RLHF Workflow: From Reward Modeling to Online RLHF

This paper presents the workflow of online iterative Reinforcement Learning from Human Feedback (RLHF), which is widely reported to outperform its offline counterpart by a large margin in the recent large language model (LLM) literature. It aims to fill the gap in openly available resources by providing a detailed recipe for online iterative RLHF that is easy to reproduce.

2. LoRA Learns Less and Forgets Less

This work compares the performance of LoRA and full finetuning on two target domains, programming and mathematics. It considers instruction finetuning (≈100K prompt-response pairs) and continued pretraining (≈10B unstructured tokens) data regimes. The results show that, in most settings, LoRA substantially underperforms full finetuning. However, LoRA better maintains the base model’s performance on tasks outside the target domain.
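
To make the comparison concrete, this is roughly what a LoRA setup looks like versus full finetuning in the Hugging Face `peft` library; the base model, rank, and target modules here are illustrative choices, not the paper's exact configuration.

```python
# LoRA vs. full finetuning: with LoRA only small adapter matrices are trained,
# which is why it tends to forget less but also learn less than full finetuning.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Full finetuning would update all of the base model's parameters. LoRA instead
# freezes them and injects low-rank adapters into selected attention projections.
lora_cfg = LoraConfig(
    r=16,                      # adapter rank
    lora_alpha=32,             # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of the base model's weights
```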

3. Granite Code Models: A Family of Open Foundation Models for Code Intelligence

This paper introduces the Granite series of decoder-only code models for code generative tasks, trained with code written in 116 programming languages. The Granite Code models family consists of models with sizes from 3 to 34 billion parameters, suitable for applications ranging from complex application modernization tasks to on-device memory-constrained use cases.

4. Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model

Xmodel-VLM is a multimodal vision-language model designed for efficient deployment on consumer GPU servers. The authors developed a 1B-scale language model from the ground up, employing the LLaVA paradigm for modal alignment. Xmodel-VLM delivers performance comparable to that of larger models.

5. What Matters When Building Vision-Language Models?

Critical decisions regarding the design of vision-language models (VLMs) are often not justified. This paper addresses that issue by conducting extensive experiments on pre-trained models, architecture choices, data, and training methods, and introduces Idefics2, an efficient foundational VLM with 8 billion parameters.


Quick links

  1. OpenAI partnered with Reddit to integrate rich community content into ChatGPT and other products. Through Reddit’s Data API, OpenAI will get access to a vast repository of community discussions, enabling it to enhance ChatGPT and its other offerings.
  2. CoreWeave raised $7.5 billion in debt for its AI computing push. The deal is one of the largest debt financing rounds for a startup and adds firepower to CoreWeave’s balance sheet as it looks to double its number of data centers to 28 this year.


Think a friend would enjoy this too? Share the newsletter and let them join the conversation.


Well, that's it for now. If you like my articles, subscribe to my newsletter or connect with me. Your likes help LinkedIn surface my articles to more readers.

Signing off - Marco

