AI/ML news summary: week 34

Here are the articles, guides, and news about AI for week 34. I read tons of RSS feeds and blogs, so you won't have to scour the internet yourself for this week's AI news.


Before we start!

If you like this topic and you want to support me:

  1. Comment on the article; LinkedIn appreciates that, and it will really help spread the word
  2. Connect with me on LinkedIn
  3. Subscribe to TechTonic Shifts to get your daily dose of tech


The AI landscape moved fast this week. GPT-4-class models are becoming more prevalent now that xAI has joined the ranks of OpenAI, Anthropic, DeepMind, Meta, Mistral, and DeepSeek.

Yet only the first four offer multimodal capabilities. Anthropic's new prompt caching feature is particularly noteworthy because it can significantly reduce costs for reused input tokens. This innovation opens up new possibilities for complex LLM agent pipelines.
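
To make the cost mechanics concrete, here is a minimal sketch of how prompt caching looked in its public beta, assuming the anthropic Python SDK; the beta header, model name, and the LONG_REFERENCE_DOCUMENT placeholder reflect launch-time documentation plus my own assumptions, so check Anthropic's current docs before relying on it.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder for a large, stable context you reuse across many calls,
# e.g. a codebase, a long document, or detailed agent instructions.
LONG_REFERENCE_DOCUMENT = "..."

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    # Beta header required at launch to enable prompt caching.
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
    system=[
        {
            "type": "text",
            "text": LONG_REFERENCE_DOCUMENT,
            # Mark this block as cacheable: later calls that repeat the same
            # prefix are billed at a reduced rate for those input tokens.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Summarize the key points."}],
)
print(response.content[0].text)
```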

Sakana AI's "The AI Scientist" is another intriguing development this week.

This LLM agent is designed to assist in machine learning research. It brainstorms ideas, conducts literature searches, executes experiments, and writes research papers. I have tried it, and while the quality of the output is not yet groundbreaking, the cost-effectiveness is impressive.
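
Conceptually, the pipeline is a loop over ideation, novelty screening, experimentation, and write-up. The sketch below is my own runnable toy illustration of that loop; every helper is a hypothetical stand-in for an LLM call, and none of it is Sakana's actual code.

```python
# Toy illustration of an AI-Scientist-style pipeline; all helpers are
# hypothetical stand-ins for LLM calls, not Sakana's implementation.

def brainstorm_ideas(codebase: str, n: int) -> list[str]:
    return [f"idea-{i} for {codebase}" for i in range(n)]  # LLM would propose ideas

def is_novel(idea: str) -> bool:
    return True  # LLM + literature search would screen against prior work

def run_experiment(codebase: str, idea: str) -> dict:
    return {"idea": idea, "metric": 0.0}  # LLM would edit and execute code

def write_paper(results: dict) -> str:
    return f"Draft paper on {results['idea']}"  # LLM would draft and self-review

def ai_scientist(codebase: str, n_ideas: int = 3) -> list[str]:
    papers = []
    for idea in brainstorm_ideas(codebase, n_ideas):
        if not is_novel(idea):
            continue
        results = run_experiment(codebase, idea)
        papers.append(write_paper(results))
    return papers

print(ai_scientist("nanoGPT-template"))
```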

But to be honest, I've seen a tsunami of low-quality AI-generated content flooding research journals these past few months. This poses a threat to research integrity.

Sakana's implementation also feeds into the broader discussion on "inference-time scaling laws." Some experts argue that scaling alone will not lead to AGI, and many different approaches are being explored to improve LLM capabilities without increasing training budgets. Since training makes up some 70% of an LLM's cost, this is a big, big plus!

Agent pipelines or research breakthroughs could yield new capabilities, and even small-scale experiments managed by LLM agents may start producing insights that can be scaled up and integrated into state-of-the-art models.

I think that the use of AI in scientific research is a sensitive topic though.

Many scientists are hesitant (rightly so) to delegate human work to AI at this time. But Sakana's agent functions more as a tool to improve or even amplify human researchers! It works best when guided by an experienced AI scientist with promising ideas and codebases, instead of doing the grunt work for you. I think that responsible use of such agents will accelerate research because it allows human researchers to focus on distilling the most promising experimental results.


Rest of the news

  1. xAI’s Grok-2 Beta Release: xAI launched Grok-2 Beta, featuring Grok-2 and Grok-2 mini models on X. Grok-2 shows significant improvements, scoring 75.5% on MMLU-Pro, surpassing Grok-1.5 and even GPT-4o. An enterprise API will soon be available, offering enhanced security and low-latency access.
  2. Anthropic Introduced Prompt Caching: Anthropic’s API now offers prompt caching, reducing costs by up to 90% and latency by up to 85% for long prompts. Available in public beta for Claude 3.5 Sonnet and Claude 3 Haiku.
  3. Perplexity Answers 250 Million Questions a Month: Perplexity, an AI search engine, handled 250 million queries last month, signaling growing interest in AI-driven search despite Google's dominance.
  4. Runway ML Releases Gen-3 Alpha Turbo: Runway ML’s Gen-3 Alpha Turbo, now officially released, is seven times faster and half the cost of Gen-3 Alpha. Available to all users, with more improvements expected.
  5. OpenAI Introduced SWE-Bench Verified: OpenAI released SWE-Bench Verified, a subset of the SWE-Bench benchmark with human verification, to more reliably evaluate AI models’ ability to solve real-world software issues.
  6. xAI’s Controversial Image Generator: xAI’s Grok 2 chatbot on X integrates Black Forest Labs’ Flux model for image generation with minimal restrictions, sparking debate over digital safety and AI regulation.
  7. MultiOn Introduces Agent Q: MultiOn launched Agent Q, an autonomous AI agent with self-healing and planning capabilities. It leverages MCTS, AI self-critique, and RLHF for complex reasoning and decision-making.
  8. Google’s Upgraded AI Image Generator: Google released Imagen 3, its latest AI text-to-image generator, in the US. Available on Google’s AI Test Kitchen, it promises better detail, richer lighting, and fewer artifacts.


Short readings

1. How To Prune and Distill Llama-3.1 8B to an NVIDIA Llama-3.1-Minitron 4B Model

This is a guide on refining the Llama-3.1 8B language model into a compact 4B version using NVIDIA’s structured compression techniques, including weight pruning and knowledge distillation. This approach yields a resource-efficient Llama-3.1-Minitron 4B that delivers high performance on benchmarks while cutting down on computational expenses.
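
For a flavor of the distillation half of that recipe, here is a generic knowledge-distillation loss in PyTorch; this is a minimal sketch of the standard technique, not NVIDIA's Minitron code, and the temperature and mixing weight are illustrative defaults.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend soft-target KL (teacher -> student) with hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients stay comparable across temperatures
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Usage: logits for a batch of 4 examples over a 10-class vocabulary.
student = torch.randn(4, 10, requires_grad=True)
teacher = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
print(distillation_loss(student, teacher, labels))
```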

2. Why I Bet on DSPy

DSPy is an open-source framework for composing multiple LLM calls into programs that tackle complex problems, optimizing them against verifiable feedback rather than hand-tuned prompts. The project is currently focused on improving reliability and accessibility to strengthen its utility within the AI community. This article provides insight into how DSPy forces you to think about the problems you solve with LLMs.
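
For context, here is a minimal sketch of what a DSPy program looks like, assuming a recent DSPy release (older versions configured models via dspy.settings); the model name is a placeholder.

```python
import dspy

# Configure the LM backend (model name is a placeholder).
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# Declare the input/output behavior; DSPy manages the actual prompting
# and can later optimize it against a metric you define.
qa = dspy.ChainOfThought("question -> answer")

result = qa(question="Why do multi-step LLM pipelines need verifiable feedback?")
print(result.answer)
```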

3. Review: ChatGPT’s New Advanced Voice Mode

ChatGPT’s new Advanced Voice Mode enhances speech understanding and production, outperforming predecessors and competitors like Siri and Alexa. In this article, the author reviews the basics of Advanced Voice Mode and explores a few use cases that underscore the leap-forward nature of this technology.

4. The Workflow of PEFT

PEFT is a method designed to fine-tune large models more efficiently by focusing on a subset of parameters. This blog looks under the hood of the PEFT library to better understand how things work and explores how to create a base model and use it to build a LoRA model.
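
To make that concrete, here is a minimal LoRA setup with Hugging Face's peft library; the base model and target modules are illustrative choices for GPT-2.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load a small base model (illustrative choice).
base_model = AutoModelForCausalLM.from_pretrained("gpt2")

# Wrap it with LoRA adapters: only the small low-rank matrices are trainable,
# while the original weights stay frozen.
lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update
    lora_alpha=16,              # scaling factor for the update
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```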

5. Free Tools Every ML Beginner Should Use

This article highlights some of the essential tools that every ML beginner, or anyone willing to get started, should use. It introduces tools such as Jupyter Notebook, Hugging Face and Transformers, Kaggle, and more.

6. A Crash Course of Model Calibration — Part 1

Many experiments have revealed that modern neural networks are often not well-calibrated. A model is perfectly calibrated if the predicted probabilities of outcomes match the frequencies with which those outcomes actually occur. This article explores how to make ML models reflect true probabilities in their predictions.
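
As a quick taste of the diagnostics such a course covers, here is a reliability check with scikit-learn's calibration_curve; the classifier and synthetic data are placeholders for your own model.

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder data and model; substitute your own classifier.
X, y = make_classification(n_samples=5000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
probs = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

# Bin predictions and compare mean predicted probability to the observed
# frequency of positives in each bin; a calibrated model sits on the diagonal.
frac_pos, mean_pred = calibration_curve(y_te, probs, n_bins=10)
print("mean |observed - predicted| per bin:", np.abs(frac_pos - mean_pred).mean())
```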

7. Synthetic Data Solves AI’s Biggest Problem

This article discusses how synthetic data is a useful application of AI technology that is already delivering real, tangible value to customers. Unlike fake data, synthetic data supports data-driven business systems throughout their lifecycle, especially where ongoing access to production data is impractical or ill-advised.


Tools

  1. Qwen2-Audio is the official repository of the Qwen2-Audio chat and pretrained large audio-language models proposed by Alibaba Cloud.
  2. Deep Live Cam allows real-time face swap and one-click video deepfake with only a single image.
  3. The LongWriter dataset contains 6,000 SFT examples with ultra-long outputs ranging from 2k to 32k words.
  4. SWE Agent takes a GitHub issue and tries to automatically fix it using GPT-4 or your LM of choice.
  5. Fabric is an open-source framework for augmenting humans using AI.
  6. MiniCPM-V is a GPT-4V-level MLLM for single-image, multi-image, and video understanding on your phone.
  7. Tinygrad is a deep learning framework that is like a blend of PyTorch and micrograd.


Research papers

1. Imagen 3

This is the official paper for Google’s Imagen 3, a latent diffusion model that generates high-quality images from text prompts. The paper discusses their quality and responsibility evaluations, issues around safety and representation, and methods used to minimize the potential harm of the models.

2. The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Researchers from Sakana AI, Oxford, University of British Columbia, and several other institutions published a paper unveiling the AI Scientist, a pipeline for open-ended scientific research using LLMs.

3. Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Microsoft Research published a paper introducing rStar, a self-play mutual reasoning approach that improves the reasoning capabilities of small language models. rStar uses a generation-discrimination process to decouple the different steps of the reasoning process.

4. Causal Agent based on Large Language Model

This paper explores the difficulty of large language models in mastering causal reasoning and addresses the issue by introducing a Causal Agent. This agent, enhanced with causal reasoning techniques and memory components, shows proficiency in tackling various causal problems.

5. Tree Attention: Topology-Aware Decoding for Long-Context Attention on GPU Clusters

The paper presents a topology-aware decoding approach that improves long-context attention in transformer models on GPU clusters. It connects self-attention to energy-based models, leading to parallel GPU computation, significantly faster processing, reduced inter-GPU communication, and lower memory consumption.

6. Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

The paper reviews model merging strategies in machine learning, underscoring their cost-effectiveness and minimal resource usage. It introduces a new classification system for these techniques, detailing their use in language models, continual learning, and multi-task learning. It points out existing literature deficits, current obstacles, and potential areas for future study.

7. Med42-v2: A Suite of Clinical LLMs

This paper introduces Med42-v2, an advanced clinical large language model based on the Llama3 architecture. It is tailored for healthcare with specialized data and preference alignment and surpasses its predecessor and GPT-4 in medical query performance.


Links

1. Nvidia will train 100,000 California residents on AI in a first-of-its-kind partnership. The program focuses on training students, educators, and workers, supporting job creation, promoting innovation, and using AI to solve challenges that can improve the lives of Californians.

2. Midjourney releases a new unified AI image editor on the web. It combines inpainting, outpainting/canvas extension, and more into a single view. The new web editor is now live and available to all users who have created at least ten images on the platform. Users can access this tool by visiting midjourney.com/imagine.

3. Lambda has partnered with Nous Research to launch Hermes 3, a new fine-tuned version of Meta’s open-source Llama 3.1 405B-parameter large language model (LLM). Hermes 3 is an unlocked, uncensored, open-weights model designed to be highly steerable, enabling users to tailor the model’s responses to their individual needs.

Signing off - Marco


Well, that's a wrap for today. Tomorrow, I'll have a fresh episode of TechTonic Shifts for you. If you enjoy my writing and want to support my work, feel free to buy me a coffee.

Think a friend would enjoy this too? Share the newsletter and let them join the conversation. LinkedIn appreciates your likes by making my articles available to more readers.

