Large language models have been around for several years, but it wasn’t until 2023 that their presence became truly ubiquitous both within and outside machine learning communities. Previously opaque concepts like fine-tuning and RAG have gone mainstream, and companies big and small have been either building or integrating LLM-powered tools into their workflows.
As we look ahead at what 2024 might bring, it seems all but certain that these models’ footprint is poised to grow further, and that alongside exciting innovations, they’ll also generate new challenges for practitioners. The standout posts we’re highlighting this week point at some of these emerging aspects of working with LLMs; whether you’re relatively new to the topic or have already experimented extensively with these models, you’re bound to find something here to pique your curiosity.
- Democratizing LLMs: 4-bit Quantization for Optimal LLM Inference. Quantization is one of the main approaches for making the power of massive models accessible to a wider user base of ML professionals, many of whom might not have access to limitless memory and compute. Wenqi Glantz walks us through the process of quantizing the Mistral-7B-Instruct-v0.2 model and explains this method’s inherent tradeoffs between efficiency and performance. (For a quick taste of what 4-bit loading looks like in code, see the first sketch after this list.)
- Navigating the World of LLM Agents: A Beginner’s Guide. How can we get LLMs “to the point where they can solve more complex questions on their own?” Dominik Polzer’s accessible primer shows how to build LLM agents that can leverage disparate tools and functionalities to automate complex workflows with minimal human intervention. (A bare-bones agent loop is sketched after this list.)
- Leverage KeyBERT, HDBSCAN and Zephyr-7B-Beta to Build a Knowledge Graph. LLMs are very powerful on their own, of course, but their potential becomes even more striking when combined with other approaches and tools. The recent guide by Silvia Onofrei, PhD, on building a knowledge graph with the aid of the Zephyr-7B-Beta model is a case in point; it demonstrates how bringing together LLMs and traditional NLP methods can produce impressive results. (See the keyword-clustering sketch after this list.)
- Merge Large Language Models with mergekit. As unlikely as it may sound, sometimes a single LLM might not be enough for your project’s specific needs. As Maxime Labonne shows in his latest tutorial, model merging, a “relatively new and experimental method to create new models for cheap,” might just be the solution for those moments when you need to mix and match elements from multiple models. (A sample merge configuration follows this list.)
- Does Using an LLM During the Hiring Process Make You a Fraud as a Candidate? The types of questions LLMs raise go beyond the technical; they also touch on ethical and social issues that can get quite thorny. Christine Egan focuses on the stakes for job candidates who take advantage of LLMs and tools like ChatGPT as part of the job search, and explores the sometimes blurry line between using and misusing technology to streamline tedious tasks.
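To make the quantization item above a bit more concrete, here is a minimal sketch of 4-bit loading with the Hugging Face transformers and bitsandbytes libraries. This is our illustration of the general technique, not necessarily the exact configuration Wenqi uses in the article:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# One common 4-bit setup (NF4 with double quantization); the article's
# exact settings may differ.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # "normal float 4" data type
    bnb_4bit_use_double_quant=True,        # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16, # dtype used for matmuls at runtime
)

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on available GPU(s)/CPU automatically
)

# Mistral's instruction format wraps the user turn in [INST] ... [/INST].
prompt = "[INST] Explain 4-bit quantization in one sentence. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The efficiency gain is real (a 7B model fits comfortably on a consumer GPU), and the article digs into the performance side of that tradeoff.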
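For a rough sense of what an agent actually involves, here is a deliberately minimal tool-calling loop. The call_llm function and the single get_weather tool are hypothetical stand-ins of our own; Dominik’s primer works with real frameworks and far more capable tooling:

```python
import json

# Hypothetical stand-in for any chat-completion client; not from the article.
def call_llm(messages: list[dict]) -> str:
    raise NotImplementedError("plug in your LLM client here")

# A "tool" is just a named Python function the model may ask us to run.
def get_weather(city: str) -> str:
    return f"It is sunny in {city}."  # stub for a real API call

TOOLS = {"get_weather": get_weather}

SYSTEM = (
    "Answer the user's question. If you need a tool, reply ONLY with JSON: "
    '{"tool": "<name>", "args": {...}}. Available tools: get_weather(city). '
    "Otherwise reply with plain text."
)

def run_agent(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "system", "content": SYSTEM},
                {"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = call_llm(messages)
        try:
            call = json.loads(reply)  # the model asked for a tool
            result = TOOLS[call["tool"]](**call["args"])
            messages.append({"role": "assistant", "content": reply})
            messages.append({"role": "user", "content": f"Tool result: {result}"})
        except (json.JSONDecodeError, KeyError):
            return reply              # plain-text final answer
    return reply
```

The core idea is just this loop: the model decides whether to act or answer, and tool results are fed back in until it can respond on its own.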
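The keyword side of the knowledge-graph pipeline can be sketched in a few lines with keybert, sentence-transformers, and hdbscan. This is a simplified illustration of the extraction-and-clustering step only; the article’s full pipeline, including the Zephyr-7B-Beta stage, is considerably richer:

```python
import hdbscan
from keybert import KeyBERT
from sentence_transformers import SentenceTransformer

docs = [
    "Transformers use self-attention to model token interactions.",
    "Graph neural networks pass messages along graph edges.",
    "Attention mechanisms let transformers weigh token relevance.",
]

# Extract candidate keyphrases per document with KeyBERT.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
kw_model = KeyBERT(model=embedder)
keywords = [
    [kw for kw, _score in kw_model.extract_keywords(
        doc, keyphrase_ngram_range=(1, 2), top_n=5)]
    for doc in docs
]

# Embed the unique keyphrases and group near-duplicates with HDBSCAN
# (label -1 marks noise points that belong to no cluster).
unique_kws = sorted({kw for doc_kws in keywords for kw in doc_kws})
embeddings = embedder.encode(unique_kws)
clusterer = hdbscan.HDBSCAN(min_cluster_size=2, metric="euclidean")
labels = clusterer.fit_predict(embeddings)
```

In a pipeline like the article’s, clusters of near-duplicate keyphrases can serve as candidate graph nodes, with the LLM helping to normalize labels and relations.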
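Finally, for a flavor of mergekit’s workflow: merges are described in a YAML config and executed with the mergekit-yaml command. The sketch below writes a SLERP config from Python; both model names are placeholders, and Maxime’s tutorial covers the available merge methods and parameters in depth:

```python
import subprocess

# Hypothetical SLERP merge of two Mistral-7B fine-tunes; the model names
# below are placeholders, not a tested combination.
config = """\
slices:
  - sources:
      - model: org/mistral-7b-finetune-a   # placeholder
        layer_range: [0, 32]
      - model: org/mistral-7b-finetune-b   # placeholder
        layer_range: [0, 32]
merge_method: slerp
base_model: org/mistral-7b-finetune-a
parameters:
  t:
    - value: 0.5   # interpolation factor: 0 = first model, 1 = second
dtype: bfloat16
"""

with open("merge_config.yaml", "w") as f:
    f.write(config)

# mergekit's documented CLI entry point: config in, merged model dir out.
subprocess.run(["mergekit-yaml", "merge_config.yaml", "./merged-model"], check=True)
```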
As always, the range and depth of topics our authors covered in recent weeks are staggering; here’s a representative sample of must-reads:
- Going beyond LLMs, Nate Cibik dives deep into the emerging world of large multimodal models (LMMs) and how they’re shaping the autonomous robotics field.
- In a new project walkthrough, Christabelle Pabalan combines NLP techniques, feature engineering, and visualization to assess the links between the semantic attributes of movie dialogue and genre.
- Where is temporal graph learning headed this year? Shenyang Huang and coauthors Emanuele Rossi, Michael Galkin, Andrea Cini, and Ingo Scholtes offer a panoramic overview of the field’s trajectory.
- Causal inference is an essential concept for all data professionals, but its meaning can vary depending on your role. Zijing Zhu zooms in on how these differences play out in academia and in industry.
- Diffusion has made a splash in recent years in the context of AI-generated images, but as Christopher Landschoot shows, its potential extends to the world of music and audio generation as well.
Thank you for supporting the work of our authors! If you’re feeling inspired to join their ranks, why not write your first post? We’d love to read it.