A Better Model? Or a Smarter Approach to Problem Solving?

I often get lost when reading the mathematical formulations in research papers. Sometimes I quit on the spot entirely. Sometimes, the promise of what’s on the other side is too damn tempting.

I was reading this paper on automating part of the ML process using LLMs. I was cruising along until I hit this:

$$\min_{r} \; \mathcal{L}_{f^*}(D_{\text{val}} \oplus r) \quad \text{subject to} \quad f^* = \arg\min_{f} \; \mathcal{L}_{f}(D_{\text{train}} \oplus r)$$

And there it was. Quit? Or grind through it?

Neither. I realized that a Python tutorial, where I could run the code, see the result, & try something different, would be a much more effective way for me to internalize this knowledge. It's what Jeremy Howard refers to as the top-down learning approach.

Enter ChatGPT.

Attempt 1: ChatGPT 4o, turn this research paper into a Python tutorial...

★☆☆☆☆

Ok start. I got code I could run in a notebook & understand. The results seemed promising but I quickly realized they had a lot of holes. Naturally, I thought, maybe I need a better model?

Attempt 2: ChatGPT o1-mini, turn this research paper into a Python tutorial...

★★☆☆☆

I switched, without much thought, to using o1-mini. Why mini? Honestly, no clue. Sometimes we pretend to know exactly why we did something. In hindsight, I could say I turned to mini because I knew it was cheaper than preview, so that's my first step. But in the moment, it caught my eye, and I just selected it hoping for better luck.

And it worked, sort of. The for loops looked good. The feature cleanup looked good. However, I realized shortly thereafter that the implementation was actually wrong. The feedback loop to the LLM left out the most important part: the decision tree rules!
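For illustration, here is a minimal sketch of a feedback loop that does carry the decision tree rules back to the LLM. It is not the code from these sessions and not the paper's implementation. The function name `build_feedback_prompt`, the prompt wording, the example rule, the score, and the tree text are all assumptions made up for the example.

```python
# Hypothetical sketch: every name, score, and rule string here is invented for illustration.

def build_feedback_prompt(task, history):
    """Assemble the next LLM prompt from earlier attempts.

    Each history entry holds the proposed rule, its validation score,
    and the decision tree rules (as plain text) learned with that feature.
    """
    lines = [f"Task: {task}", "Previous attempts:"]
    for attempt in history:
        lines.append(f"- rule: {attempt['rule']}")
        lines.append(f"  validation score: {attempt['score']:.3f}")
        # The part that is easy to leave out: the tree's own reasoning.
        lines.append("  decision tree rules:")
        lines.extend(f"    {line}" for line in attempt["tree_rules"].splitlines())
    lines.append("Propose one new rule that differs from every rule above.")
    return "\n".join(lines)


history = [
    {
        "rule": "new_col = income / household_size",
        "score": 0.812,
        "tree_rules": (
            "|--- new_col <= 21000.0\n"
            "|   |--- class: 0\n"
            "|--- new_col > 21000.0\n"
            "|   |--- class: 1"
        ),
    },
]

prompt = build_feedback_prompt("Suggest a new column for predicting loan default.", history)
print(prompt)
```

With the tree text in the prompt, the LLM sees not only how well a feature scored but also how the model used it.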

So again, maybe I need a better model?

Attempt 3: ChatGPT o1-preview, turn this research paper into a Python tutorial...

★★★☆☆

Surely, this model—5x the cost of o1-mini—would deliver. While performance improved, it wasn’t by leaps and bounds. The cost increase definitely didn’t translate into 5x better results. In fact, I ran into a new problem. The script kept generating the same feature over and over again instead of the variations I was expecting. Still not the breakthrough I needed.

At this point, the cycle was becoming obvious: every time my intuition said, “try a better model,” it only resulted in incremental progress at best.

Quit? Or grind through it?

Neither... maybe I need a different approach.

Finally, this is what worked...

★★★★★

I went back to ChatGPT-4o, but this time with a new strategy. Instead of throwing the entire paper at it, I broke it down bit by bit. Paragraph by paragraph, collaborating with the model to understand each section like a study group.

Only after we had a solid grasp of the core ideas did I ask ChatGPT to generate the code to tie it all together.

And you know what? It finally worked.

I ended up with a Python script that captured the nuances. But more importantly, the step-by-step approach wasn’t just helpful for the LLM... it was more effective for me too. I was able to take each of those longer math formulations and talk through what they mean, and how they would be represented in English and in code.
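As an example of that translation, here is a minimal sketch of the formulation quoted earlier. In English: pick the rule r whose added column gives the lowest validation loss, where the model is first fit on the training data with that same column added. The toy data, the two candidate rules, and the one-split "stump" model are all assumptions chosen to keep the example small; the paper's actual models and rule generator are not shown here.

```python
# Sketch of: min over r of L_{f*}(D_val ⊕ r), with f* = argmin over f of L_f(D_train ⊕ r).
# All data, rules, and the stump "model" below are made up for illustration.

def augment(rows, rule):
    """D ⊕ r: append the column produced by rule r to every row."""
    return [features + [rule(features)] for features in rows]

def error_rate(stump, rows, labels):
    """L_f(D): fraction of rows that the stump misclassifies."""
    col, threshold = stump
    wrong = sum((row[col] > threshold) != label for row, label in zip(rows, labels))
    return wrong / len(rows)

def fit_stump(rows, labels):
    """Inner problem: f* = argmin_f L_f(D_train ⊕ r), searched over one-split stumps."""
    candidates = [(col, row[col]) for row in rows for col in range(len(row))]
    return min(candidates, key=lambda stump: error_rate(stump, rows, labels))

def best_rule(rules, train, y_train, val, y_val):
    """Outer problem: pick the rule r whose fitted model has the lowest validation loss."""
    scores = {}
    for name, rule in rules.items():
        f_star = fit_stump(augment(train, rule), y_train)
        scores[name] = error_rate(f_star, augment(val, rule), y_val)
    return min(scores, key=scores.get), scores


# Toy data: the label is True when x0 * x1 > 10, which no single raw column separates.
train = [[1, 2], [2, 8], [5, 1], [4, 4], [9, 1], [1, 9], [3, 5], [6, 2]]
y_train = [x0 * x1 > 10 for x0, x1 in train]
val = [[2, 3], [7, 2], [1, 8], [3, 4]]
y_val = [x0 * x1 > 10 for x0, x1 in val]

rules = {
    "sum": lambda f: f[0] + f[1],
    "product": lambda f: f[0] * f[1],
}

best, scores = best_rule(rules, train, y_train, val, y_val)
print(best, scores)
```

In this toy setup the "product" rule wins because it hands the model the exact column it needs. In the paper's setting, the candidate rules would come from the LLM rather than a hand-written dictionary.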

There is more than one way to understand a research paper, just like there is more than one way to use an LLM. The meta skill is being able to recognize that we have flexibility in how we solve the problem at hand.

Full Links & Extended Explanation of References:

  1. Annie Duke's book and philosophy on quitting: quitting is OK, and we generally don't quit early enough to divert our resources to more promising avenues. https://www.amazon.com/s?k=quit+annie+duke&hvadid=602286203716&hvdev=c&hvlocphy=9052852&hvnetw=g&hvqmt=e&hvrand=2751827178934580752&hvtargid=kwd-1676463833943&hydadcr=22534_10353871&tag=googhydr-20&ref=pd_sl_602wesns7t_e
  2. Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning. https://arxiv.org/pdf/2406.08527
  3. Jeremy Howard is a legend in the AI industry. His course (and Python package) fast.ai, in which he leverages a top-down learning approach, is beloved. https://forums.fast.ai/t/learning-strategy-for-top-down-approach/66173
  4. LLM usage costs come in all sorts of varieties. It's important to understand your costs if you intend to optimize usage between platforms and models. https://openai.com/api/pricing/

"Only after we had a solid grasp of the core ideas". I like the use of "we". Nice breakdown.

Ashish Batra

Data Sciences at Target

4 months ago

This is such a great approach to learning. Breaking complex things down into smaller chunks always helps bring clarity, as long as you can learn from the previous context, which LLMs tend to do great at as well! Going to use this in my own technical readings!

Christine Carragee

Pricing & Analytics

4 months ago

Cool approach to transform "reading" into interactive, multi-modal learning. It's like assigning yourself practice problems in the middle of the chapter instead of the end.
