#14 The Future of o(a)1
this one is kinda cool

A few things to write about this week:

OpenAI's o1 model, Terence Tao on future iterations, and ecosystem shifts

Product x Engineering Management and new professional challenges

PyTorch Conf!

OpenAI o1

OpenAI came out with a new line of models: o1. In their technical breakdown, they call out two methods: chain of thought and reinforcement learning.

Chain of thought is when you prompt the model to "think step by step". We've seen examples of this working with current models that weren't specifically trained to respond in that format. Now OpenAI has trained a model specifically to give higher-quality answers in that format.
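Here's a minimal sketch of the prompting trick, before any special training enters the picture. The `build_prompt` helper and the prompt wording are my own assumptions for illustration, not anything from OpenAI's breakdown; it only builds the string and doesn't call a model.

```python
def build_prompt(question: str, chain_of_thought: bool = False) -> str:
    """Return a plain prompt, or one that nudges the model to reason step by step.

    Hypothetical helper: the exact wording is an assumption, not an official template.
    """
    prompt = f"Question: {question}\nAnswer:"
    if chain_of_thought:
        # The classic zero-shot chain-of-thought suffix.
        prompt += " Let's think step by step."
    return prompt


if __name__ == "__main__":
    print(build_prompt("What is 17 * 24?"))
    print(build_prompt("What is 17 * 24?", chain_of_thought=True))
```

The only difference between the two prompts is the suffix. With o1, the idea is that you no longer need to add it yourself.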

Reinforcement learning is an interesting addition, and they didn't give much detail on how they're using it. Reinforcement learning is what gave us AlphaZero - the system that taught itself chess and Go through self-play and beat the top programs of its time - and OpenAI Five, the Dota 2 system that beat the reigning world champions.


they really just use "reinforcement learning" three times in the technical breakdown

Looking for others' takes, I found James Chiang's post.

His theory is that they're doing some sort of tree search in parallel - similar to Microsoft's rStar method (which I'm learning about for the first time). That would explain why each response can take up to 2 minutes to return! But I would bet on that improving over time.

maybe? sounds plausible
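To make "tree search over reasoning steps" a bit more concrete, here's a toy sketch. Big caveats: this is not o1's method (OpenAI hasn't published it) and it isn't rStar either. I'm swapping in a plain beam search, and the "reasoning steps" are just two arithmetic moves on a number, with distance to the target standing in for a learned scorer. Every name here is made up for illustration.

```python
from typing import Callable, List, Tuple

# Toy "reasoning steps": each one transforms the current value.
Step = Tuple[str, Callable[[int], int]]
STEPS: List[Step] = [("+3", lambda x: x + 3), ("*2", lambda x: x * 2)]


def beam_search(start: int, target: int, beam_width: int = 3, max_depth: int = 6) -> List[str]:
    """Explore several step sequences at once, keeping only the most promising few.

    Returns the list of step names that reaches `target`, or [] if none is found.
    """
    beam: List[Tuple[int, List[str]]] = [(start, [])]  # (current value, steps so far)
    for _ in range(max_depth):
        candidates: List[Tuple[int, List[str]]] = []
        for value, path in beam:
            for name, fn in STEPS:
                new_value = fn(value)
                new_path = path + [name]
                if new_value == target:
                    return new_path
                candidates.append((new_value, new_path))
        # Stand-in scorer: closer to the target is better. A real system
        # would use a learned model to rank partial reasoning chains.
        candidates.sort(key=lambda c: abs(target - c[0]))
        beam = candidates[:beam_width]
    return []


if __name__ == "__main__":
    print(beam_search(start=1, target=14))
```

Each round expands every branch in the beam, which is the part you could run in parallel. It also hints at why answers get slow: the work grows with both the beam width and the depth.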

Terence Tao is actually optimistic

If you're not familiar, Terence Tao is a renowned mathematician. A joke I've heard is that if you're ever stuck on a research project, get Terry interested and he'll solve it over lunch. Something like that.

He's willing to try new methods, like machine-assisted proof tools.

His latest take is on o1 - and some people have taken some quotes out of context. So I wanted to link it here and talk a little about it.

"mediocre, but not completely incompetent, grad student" ouch

People fixated on the "mediocre" and "incompetent" grad student part. Pretty harsh words (which he clarified). The interesting part is that he's extrapolating "one or two further iterations" until they get to a "competent grad student". It's good and will get better - as a tool.

Can you imagine bringing the brightest minds in every discipline to train the n-th generation of models?


can't replace humans tho

Terry does clarify his words. I love how he expands on the many characteristics that make people successful and impactful.

Insider Look at OpenAI on the Latent Space pod

The Latent Space pod had Michelle Pokrass of OpenAI on to talk about the latest 4o changes. She gave an insider look at how they take in customer feedback and iterate.

What I took away from the convo was that OpenAI is trying to be ~the~ developer platform for AI. Structured Outputs used to be something you would import a library like Instructor for. Chain of thought, you can think of as now being embedded in the model.
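For a sense of what "structured output" means in practice, here's a rough stdlib-only sketch of the job a helper library did for you: take the model's raw text and turn it into a typed object, or fail loudly. The `Talk` schema and `parse_talk` function are hypothetical, and this is neither Instructor's API nor OpenAI's; it just validates a JSON string by hand.

```python
import json
from dataclasses import dataclass


@dataclass
class Talk:
    """Hypothetical schema we want the model's answer to follow."""
    title: str
    minutes: int


def parse_talk(raw: str) -> Talk:
    """Parse a JSON string into a Talk, rejecting anything off-schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(data, dict) or set(data) != {"title", "minutes"}:
        raise ValueError("expected exactly the keys 'title' and 'minutes'")
    if not isinstance(data["title"], str):
        raise ValueError("'title' must be a string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(data["minutes"], int) or isinstance(data["minutes"], bool):
        raise ValueError("'minutes' must be an integer")
    return Talk(title=data["title"], minutes=data["minutes"])


if __name__ == "__main__":
    # Pretend this string came back from a model.
    print(parse_talk('{"title": "Intro to o1", "minutes": 20}'))
```

When the platform guarantees the response already matches your schema, this validate-and-retry layer is the part that moves out of your code and into the platform.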

And OpenAI's corporate structure is likely changing.

It's a scary time to be a developer tool and a first mover.

How will Meta / Google / Anthropic respond?

This might be the first new big architectural change since mixture of experts. I can see open datasets being even harder to accumulate if more refinement of data is needed for CoT.

Meta wants to be the open AI platform - like they've done with React, PyTorch, Llama, and more.

Google is the sleeping giant in reinforcement learning.

Will Anthropic copy OpenAI's lead and release a similar model to o1?

The Manager's Path vs Founder Mode

I liked how these podcasts and posts came out around the same time.

Camille Fournier was on Lenny's pod talking about the relationship between product and engineering leadership.

Shreyas Doshi was on The Skip pod giving a grounded take on the Founder Mode essay by Paul Graham.

Both great. Having recently stepped into eng management responsibilities, I'm starting to see how organizations operate differently depending on scale and scope, just as engineers and systems do.

As yet another person who aspires to build their own destiny, not much to say except to keep learning and building.


Hope you're happy, healthy, and hopeful and have a good rest of the week. DM me if you're going to PyTorch Conf this week!!!

