Fine-Tuning Strategically: When to Resist the Urge & What to Do Instead

Elon Musk outlines five steps for improving systems:

  1. Make the requirements less dumb.
  2. Delete the part or process.
  3. Simplify or optimize the design.
  4. Accelerate cycle time.
  5. Automate.

Musk's framework can be applied to almost any system, yet most engineers don't spend enough time on its first two steps. I see this frequently, especially with ML engineers. They tend to gravitate towards step 3 (fine-tuning) and step 5 (the various "Ops": DevOps, MLOps, LLMOps, and so on), which, while exciting, are often unnecessary or premature.

It seems that fine-tuning has an addictive quality for ML engineers and data scientists (who wouldn't want to see that loss function decrease?). They can spend months iterating to achieve marginal improvements. But where's the business value in that?

Here's some advice I've found helpful:

  • Prioritize the Fundamentals: Force yourself to revisit steps 1 and 2. In many cases, fine-tuning is simply not necessary.
  • Focus on Embeddings: If you must fine-tune, consider fine-tuning an embedding model, such as Google's open-source one, instead of the generative model itself. You still get the pleasure of fine-tuning, but the effort goes into better search rather than into the generator.

Why is this worthwhile? There are several reasons, but three matter most: many models already have strong reasoning capabilities out of the box; context is key, and effective search is what supplies relevant context; and almost all ML systems benefit from retrieval-augmented generation (RAG). RAG is even considered "ADOPT" on the Thoughtworks Technology Radar.
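
To make the "effective search" point concrete, here is a minimal sketch of the retrieval step in a RAG setup. Everything in it is an illustrative assumption rather than a specific product's API: `embed` is a toy bag-of-words stand-in for a real (possibly fine-tuned) embedding model, and `DOCS`, `retrieve`, and `build_prompt` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    In a real system this is the piece you would swap for (or fine-tune as)
    an actual embedding model.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Put the retrieved context in front of the question for the generative model."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base, for illustration only.
DOCS = [
    "Refunds are processed within 14 days of the return request.",
    "Our office is closed on public holidays.",
    "Shipping to Europe takes 3 to 5 business days.",
]

if __name__ == "__main__":
    print(build_prompt("How long do refunds take?", DOCS, k=1))
```

The generative model is untouched here. The only component whose quality determines what context it sees is `embed`, which is why improving the embedding model can pay off more than fine-tuning the generator.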
