Fine-Tuning Strategically: When to Resist the Urge & What to Do Instead
Elon Musk outlines five steps for improving systems:

1. Make the requirements less dumb.
2. Delete the part or process step.
3. Simplify or optimize.
4. Accelerate cycle time.
5. Automate.
This framework can be applied to almost any system, yet most engineers don't spend enough time on the first two steps. I see this frequently, especially among ML engineers: they tend to gravitate towards steps 3 (fine-tuning) and 5 (the various "Ops" – DevOps, MLOps, LLMOps, etc.), which, while exciting, are often unnecessary or premature.
It seems that fine-tuning has an addictive quality for ML engineers and data scientists (who wouldn't want to see that loss function decrease?). They can spend months iterating to achieve marginal improvements. But where's the business value in that?
Here's some advice I've found helpful: before reaching for fine-tuning, see how far you can get with a capable off-the-shelf model, good prompts, and retrieval-augmented generation (RAG).
Why is this worthwhile? There are several reasons, but the most important are these: many models already possess strong reasoning capabilities out of the box; context is key, and effective search is what supplies relevant context; and almost all ML systems benefit from RAG, which is even rated "ADOPT" on the Thoughtworks Technology Radar.
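
To make this concrete, here is a minimal sketch of the RAG pattern in Python: retrieve the documents most relevant to a query, place them in the prompt, and let the model answer from that context. The TF-IDF retriever and the `call_llm` placeholder are illustrative stand-ins, not any particular library's API; a production system would typically use embedding-based search and your model provider's client.

```python
# Minimal RAG sketch: retrieve relevant context, then ask the model to answer from it.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 5pm CET.",
    "Premium subscribers get priority access to new features.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (TF-IDF + cosine similarity)."""
    vectorizer = TfidfVectorizer()
    doc_vectors = vectorizer.fit_transform(docs)
    query_vector = vectorizer.transform([query])
    scores = cosine_similarity(query_vector, doc_vectors)[0]
    top_indices = scores.argsort()[::-1][:k]
    return [docs[i] for i in top_indices]

def call_llm(prompt: str) -> str:
    """Placeholder: swap in your LLM provider's completion call here."""
    raise NotImplementedError

def answer(query: str) -> str:
    # Build a prompt that grounds the model in the retrieved context.
    context = "\n".join(retrieve(query, documents))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return call_llm(prompt)

# answer("How long do I have to return a product?")  # wire up call_llm first
```

Notice that nothing here touches model weights: the quality of the answer depends on the quality of retrieval and the prompt, which is exactly why investing in search and context pays off before any fine-tuning does.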