A Better Model? Or a Smarter Approach to Problem Solving?
Frank Corrigan
Making Decision Intelligence for Supply Chain | Economics and Finance MA
I often get lost when reading the mathematical formulations in research papers. Sometimes I quit on the spot entirely. Sometimes, the promise of what’s on the other side is too damn tempting.
I was reading this paper on automating part of the ML process using LLMs. I was cruising along until I hit this:
min_r L_{f*}(D_val ⊕ r)   subject to   f* = argmin_f L_f(D_train ⊕ r)
And there it was. Quit? Or grind through it?
Neither. I realized a Python tutorial, where I could run the code, see the result, and try something different, would be a much more effective way for me to internalize this knowledge. It's what Jeremy Howard refers to as the top-down learning approach.
Enter ChatGPT.
Attempt 1: ChatGPT-4o, turn this research paper into a Python tutorial...
★☆☆☆☆
An OK start. I got code I could run in a notebook and understand. The results seemed promising, but I quickly realized they had a lot of holes. Naturally, I thought: maybe I need a better model?
Attempt 2: ChatGPT o1-mini, turn this research paper into a Python tutorial...
★★☆☆☆
I switched, without much thought, to using o1-mini. Why mini? Honestly, no clue. Sometimes we pretend to know exactly why we did something. In hindsight, I could say I turned to mini because I knew it was cheaper than preview, so it made sense as a first step. But in the moment, it just caught my eye, and I selected it hoping for better luck.
And it worked, sort of. The for-loops looked good. The feature cleanup looked good. Shortly afterward, though, I realized the implementation was actually wrong. The feedback loop to the LLM left out the most important part: the decision tree rules!
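In hindsight, the missing piece looks roughly like this: export the fitted tree's rules as text and fold them into the next prompt, so the model's proposals are informed by what the tree actually learned. A minimal sketch, assuming a scikit-learn tree; export_text is real, but the prompt wording and the call_llm helper are hypothetical stand-ins, not the paper's exact loop.

```python
# A minimal sketch of the missing feedback step, assuming a scikit-learn
# decision tree. export_text is real; the prompt wording and call_llm are
# hypothetical stand-ins for the paper's exact setup.
from sklearn.tree import DecisionTreeClassifier, export_text

def build_feedback_prompt(tree: DecisionTreeClassifier, feature_names, val_loss: float) -> str:
    # Extract the learned splits as plain-text rules the LLM can read.
    rules = export_text(tree, feature_names=list(feature_names))
    return (
        "You previously proposed a new feature for this tabular dataset.\n"
        f"Validation log-loss with that feature: {val_loss:.4f}\n"
        "Rules learned by the fitted decision tree:\n"
        f"{rules}\n"
        "Propose ONE new feature (as a pandas expression) that could improve on these splits."
    )

# Usage, once you've fit a tree and measured validation loss:
# prompt = build_feedback_prompt(fitted_tree, X_train.columns, val_loss)
# next_proposal = call_llm(prompt)   # hypothetical LLM call
```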
So again, maybe I need a better model?
Attempt 3: ChatGPT o1-preview, turn this research paper into a Python tutorial...
★★★☆☆
Surely, this model—5x the cost of o1-mini—would deliver. While performance improved, it wasn’t by leaps and bounds. The cost increase definitely didn’t translate into 5x better results. In fact, I ran into a new problem. The script kept generating the same feature over and over again instead of the variations I was expecting. Still not the breakthrough I needed.
At this point, the cycle was becoming obvious: every time my intuition said, “try a better model,” it only resulted in incremental progress at best.
Quit? Or grind through it?
Neither... maybe I need a different approach.
Finally, this is what worked...
★★★★★
I went back to ChatGPT-4o, but this time with a new strategy. Instead of throwing the entire paper at it, I broke it down bit by bit, paragraph by paragraph, collaborating with the model to understand each section like a study group.
Only after we had a solid grasp of the core ideas did I ask ChatGPT to generate the code to tie it all together.
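For anyone who wants to try the same thing, here's a rough sketch of that loop in code. The prompts are illustrative and call_llm stands in for whatever chat interface or API you use; none of this is prescribed by the paper.

```python
# A rough sketch of the "study group" workflow: build shared understanding
# paragraph by paragraph, then ask for code. call_llm is a stand-in for
# whatever chat interface or API you use; prompt wording is illustrative.
def study_then_code(paragraphs: list[str], call_llm) -> str:
    notes = []
    for i, para in enumerate(paragraphs, start=1):
        # Step 1: work through the paper one paragraph at a time.
        explanation = call_llm(
            "Explain this paragraph from a research paper in plain English, "
            f"and note how it might map to Python:\n\n{para}"
        )
        notes.append(f"Paragraph {i}: {explanation}")

    # Step 2: only after every section is understood, ask for the implementation.
    return call_llm(
        "Using the shared understanding below, write a self-contained Python "
        "tutorial script that implements the paper's method:\n\n" + "\n\n".join(notes)
    )
```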
And you know what? It finally worked.
I ended up with a Python script that captured the nuances. But more importantly, the step-by-step approach wasn’t just helpful for the LLM... it was more effective for me too. I was able to take each of those longer math formulations and talk through what they mean, and how they would be represented in English and in code.
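For example, the formulation that stopped me at the start boils down to a small search loop once you talk it through: try a candidate feature r, fit a model f* on the training data with r appended, and keep the r that scores best on the augmented validation data. Here's a minimal sketch using a stand-in dataset, a decision tree, and hand-written candidate features in place of LLM proposals; none of these specifics come from the paper.

```python
# A minimal sketch of the bilevel objective:
#   min_r L_{f*}(D_val ⊕ r)   s.t.   f* = argmin_f L_f(D_train ⊕ r)
# The dataset, model, and candidate features below are stand-ins; in the
# paper, the candidates r would be proposed by an LLM.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import log_loss

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Candidate feature transformations r (hand-written stand-ins for LLM proposals).
candidates = {
    "area_ratio": lambda df: df["mean area"] / (df["worst area"] + 1e-9),
    "radius_sq": lambda df: df["mean radius"] ** 2,
    "smooth_x_compact": lambda df: df["mean smoothness"] * df["mean compactness"],
}

def augment(df, name, r):
    """D ⊕ r: return a copy of df with the engineered feature appended."""
    out = df.copy()
    out[name] = r(df)
    return out

best_name, best_loss = None, np.inf
for name, r in candidates.items():
    # Inner problem: f* = argmin_f L_f(D_train ⊕ r)
    f_star = DecisionTreeClassifier(max_depth=3, random_state=0)
    f_star.fit(augment(X_train, name, r), y_train)
    # Outer problem: evaluate L_{f*}(D_val ⊕ r)
    loss = log_loss(y_val, f_star.predict_proba(augment(X_val, name, r)))
    print(f"{name}: validation log-loss = {loss:.4f}")
    if loss < best_loss:
        best_name, best_loss = name, loss

print(f"Best candidate feature: {best_name} ({best_loss:.4f})")
```

Each candidate gets a validation score and the best one wins, which is exactly the outer/inner structure the notation was describing.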
There is more than one way to understand a research paper, just like there is more than one way to use an LLM. The meta skill is being able to recognize that we have flexibility in how we solve the problem at hand.
Full Links & Extended Explanation of References:
Data Engineer
4 个月"Only after we had a solid grasp of the core ideas". I like the use of "we". Nice breakdown ??
Data Sciences at Target
4 months ago · This is such a great approach to learning. Breaking complex things down into smaller chunks always helps bring clarity, as long as you can carry forward the previous context, which LLMs tend to do great at as well! Going to use this in my own technical readings!
Pricing & Analytics
4 months ago · Cool approach to transform "reading" into interactive, multi-modal learning. It's like assigning yourself practice problems in the middle of the chapter instead of at the end.