AI is getting dumber!
A few friends of mine have been talking about how models are starting to fall apart with the new "reasoning" training techniques. Vendors say it's because the new models are stronger in some areas and weaker in others. That explanation doesn't cut it for me.
Today I was trying to get Claude Sonnet 3.7 to fix the TypeScript errors it had created while generating a test fixture:
First of all, it looks like it was just spitting out random code, given the number of different issues in this one file. And what's the point of AI if it can't apply TypeScript rules better than humans? (I look forward to AGI so that the problem of finding a missing closing parenthesis can finally be solved...) Now, I don't believe in Artificial Intelligence, just Apparent Intelligence. Nevertheless, I don't relish spending time fixing TypeScript errors, especially when my "assistant" created them.
So I posed the question to 3.7, Anthropic's "Most intelligent model yet!"
I decided to ask 3.7, with all its lofty intelligence, to sort this out. Part of me suspected that this might just be a dummy file needing to be deleted, considering all the errors, so I asked for confirmation that this was indeed a real test.
This has got to be one of the laziest messages I have ever received from an AI. I can plainly see that the symbol doesn't appear to be on the type, on line 581. Then it just makes some vague guesses. Then it essentially tells me good luck fixing it.
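For readers who haven't hit this class of error, here's a minimal sketch of what "symbol doesn't appear to be on the type" means in practice. All names here are hypothetical illustrations, not from my actual project:

```typescript
// A hypothetical fixture type; the generated code referenced a
// property this interface never declared.
interface TestFixture {
  id: string;
  payload: { status: string };
}

const fixture: TestFixture = {
  id: "case-581",
  payload: { status: "ok" },
};

// This is what the AI-generated code effectively did:
// fixture.metadata;
// → TS2339: Property 'metadata' does not exist on type 'TestFixture'.

// The fix is simply to use a property the interface actually declares:
console.log(fixture.payload.status); // prints "ok"
```

The compiler tells you the exact symbol and line; there is nothing vague about it, which is why a hand-wavy answer is so galling.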
I was peeved and sent the ball back in its court:
You got that right, bud! You break it, you fix it!
I configured an MCP server to let it access my project files as required, so it went ahead and read the files.
Unfortunately, the response is completely bogus, blaming the TypeScript errors on a glitch in the TypeScript language server (used by VS Code to provide on-the-fly diagnostics and IntelliSense). While I've had a problem with the TypeScript server once or twice in the past, it's really a leap to jump immediately to that. And telling me to look into the tsconfig.json is a great way for me to waste hours trying different TypeScript options. All unnecessarily, of course.
I had heard that Claude 3.5 was actually better for coding, so I decided to switch models and see if the rumor was true with this simple test.
From the same prompt, it gets the problem RIGHT AWAY!!!
It continues:
So there we have it: a simple problem, but where 3.5 went straight ahead and solved it, the lethargic 3.7 gave me a bunch of phoney-baloney answers. I am now switching back!
And let my tale of woe be a warning to everyone expecting progress just because the version number is incrementing.
Sometimes progress isn't all it's cracked up to be!
Martin Béchard is currently annoyed with AI vendors who are passing off their bloated reasoning models as capable of 10x coding. If you are struggling to get these overly verbose AI coding assistants to do their job, try going back a version, or for human-grade intelligence please reach out at [email protected]!