Sugar rush, then the limit: where is "Good Enough" when building with Generative AI?
Daniel Shi
South Quad LLC | Investments, Growth strategy | Fintech Operator, ex-Remitly | ex-VC
After building in Generative AI for the last few months, I've noticed a pattern emerging. It is remarkably easy to get results from Generative AI, and OpenAI in particular makes it easy to get going because they have such a wonderful developer experience.
You get the effect of a sugar rush of results. You think things are really wonderful: the content you are generating is higher quality than you expected, and the time and cost needed are so low. Or the accuracy of your results looks super promising, and it takes no time at all to get them back.
“Promising” is the key word
But very quickly you also hit a limit. Things start to not work as well as you were hoping. The content you are trying to generate is not quite good enough. GPT-3.5 repeats words so frequently it looks demented. The accuracy of GPT-4 looks good when you eyeball it, but it turns out to be only 60% accurate against the test set.
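The cure for eyeballing is a test set. A minimal sketch of that loop, with a hypothetical `generate_answer` standing in for a real LLM call (the questions and labels here are invented for illustration):

```python
# Hypothetical stand-in for an LLM call; in practice this would hit a model API.
def generate_answer(question: str) -> str:
    canned = {
        "2+2": "4",
        "capital of France": "Paris",
        "3*3": "6",  # deliberately wrong, to show imperfect accuracy
    }
    return canned.get(question, "unknown")

def accuracy(test_set: list[tuple[str, str]]) -> float:
    """Fraction of test cases where the model output exactly matches the label."""
    correct = sum(1 for question, label in test_set if generate_answer(question) == label)
    return correct / len(test_set)

test_set = [("2+2", "4"), ("capital of France", "Paris"), ("3*3", "9")]
print(accuracy(test_set))  # 2 of 3 match, so roughly 0.67
```

Even a tiny harness like this turns "looks good" into a number you can track as you spend more effort.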
Good enough? Should you continue?
I’ve been trying to frame up an approach towards building apps in Generative AI, and I think it looks like the chart above. The X-axis is effort, and the Y-axis is some quality bar. Quality could be accuracy, volume, or some other dimension.
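The shape of that curve can be sketched with a toy saturating function, say Q(e) = Q_max · (1 − e^(−k·e)). The parameters here (Q_max = 0.9, k = 3) are invented for illustration, not fitted to anything:

```python
import math

def quality(effort: float, q_max: float = 0.9, k: float = 3.0) -> float:
    """Toy curve: quality rises fast with early effort, then asymptotes at q_max."""
    return q_max * (1 - math.exp(-k * effort))

# Early effort delivers the "sugar rush" of rapid gains...
print(round(quality(0.2), 3))
# ...while much larger effort barely moves the needle toward the asymptote.
print(round(quality(2.0), 3))
```

If "Good Enough" happens to sit above `q_max`, no amount of effort gets you there, which is exactly the trap discussed below.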
When you start building with LLMs (or image generators), the initial output looks amazing. “It’s magic!” You get a massive sugar rush of quality.
But then the results hit a limit and start to asymptote. And then you have to ask yourself: should I continue? Where is the point of “Good Enough”, the point at which you know you should keep building?
Is “Good Enough” before the limit? Is it after the limit? Or worse, is it at the limit?
If “Good Enough” sits before the limit, you are in luck: you reach it quickly and have the confidence to rally more resources and keep going.
If it sits after the limit, that is also kind of nice. You know it is out of reach for whatever reason, so you can give up and try something else.
But, I would argue, at the limit is the worst. At that point you could keep pushing along the Effort axis and hope to surmount it, but you really don’t know how asymptotic the limit is. It does not help that probabilistic results, model slippage, or commercial failure can throw your project into peril.
Comment (Engagement Manager, 11 months ago): I am still in the "amazing" part of the curve; perhaps I'm not pushing at it hard enough. My experience is possibly a bit of an outlier, as I use ChatGPT for a wide variety of tasks: sales emails, marketing collateral, application prototypes. Context switching on a daily or weekly basis across these can be a huge drag on productivity, but ChatGPT has helped me immensely by getting to 80% of the finished product in a matter of minutes.