The death of the programmer is greatly exaggerated
Every week, a new model is released; this past week saw the release of Claude 3.5 Sonnet. This model makes better use of Artifacts and is better trained to produce front-end code, among other improvements.
As with any new release, Claude 3.5 Sonnet has sparked a wave of excitement and speculation. Some are hailing it as a game-changer, while others are questioning its impact on the role of front-end engineers. Was this all it was cracked up to be? Or was it yet another guerrilla marketing campaign built on carefully crafted demos?
As always, I embarked on a journey (not really, but it seems fitting given the extreme claims) and fired up Anthropic's chat interface. The aim was to gauge the model's capabilities and understand how much effort people had invested in crafting the perfect prompt, or series of prompts, to generate such impressive results.
The test was straightforward: generate a simple tax calculator in React. Off it went, producing a bunch of TypeScript React code.
I'll go easy on it, as I didn't give it a list of libraries it should use as dependencies, and I didn't prompt it to ensure the code was optimised or to avoid common design flaws such as overusing useState or hand-rolling form validation. I could have written a prompt longer than the generated code itself that would have solved the majority of the problems it produced in the first few iterations.
But more importantly, the output exhibited something I have dealt with continuously, from coaching developers on why it is the wrong approach to fixing the errors and bugs it causes: rounding issues from using floats for currency.
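The problem is easy to demonstrate in a couple of lines of JavaScript (a generic illustration of the floating-point pitfall, not the model's actual output):

```javascript
// IEEE 754 binary floats cannot represent most decimal fractions exactly,
// so seemingly simple sums drift by tiny amounts.
const subtotal = 0.1 + 0.2;
console.log(subtotal);         // 0.30000000000000004
console.log(subtotal === 0.3); // false
```

Across many line items and tax rates in a calculator, these tiny errors accumulate and surface as off-by-a-cent bugs.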
A second prompt asking it to use decimals produced a better outcome: it added the required library as a dependency and adjusted the code accordingly.
I could blame the language. JavaScript does not have a native decimal type, so you either bring in a dependency or do the dance with integers to avoid the common floating-point footgun. A decimal type is coming to JavaScript, but this model is only as good as what it is trained on, and there is a lot of bad code out there that makes this mistake; it will remain in the training data for many years to come.
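The integer dance looks something like this (a hypothetical, dependency-free sketch of the cents-based approach; the helper names are my own, not what the model generated):

```javascript
// Keep all money arithmetic in integer cents to avoid float drift.
function toCents(amount) {
  // Convert a decimal dollar amount to whole cents
  return Math.round(amount * 100);
}

function applyTaxRate(cents, ratePercent) {
  // Tax is computed on integer cents and rounded at the cent boundary
  return Math.round((cents * ratePercent) / 100);
}

const price = toCents(19.99);        // 1999 cents
const tax = applyTaxRate(price, 10); // 200 cents (10% of $19.99)
const total = price + tax;           // 2199 cents
console.log((total / 100).toFixed(2)); // "21.99"
```

Dividing by 100 only at the display boundary means the intermediate sums are always exact integers.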
I was disappointed it had stumbled at the first hurdle. Given the demos I had seen, I was desperately hoping for more.
I heavily use Copilot when I code. Often, it autocompletes specific fragments, avoiding the need for me to type out the actual code, but it still hasn't solved a problem I hadn't already solved. Without a much deeper understanding of the problem domain, I struggle to envision AGI any time soon.
This specific model might streamline some of the tooling that sits between design and delivery; its ability to generate good React code is undoubtedly a win for converting designs between different frameworks and libraries. Developers will continue to ship features faster and faster by leveraging these tools.
These models will continue to improve. Maybe I'm just getting old and becoming a gatekeeper for the craft I have been "perfecting" throughout my career. We are only in the early days of the gold rush, but I'm not buying a farm yet.
Chief Marketing Officer at Ordermentum Pty Ltd
9 months ago: Great article! What about the death of the marketer?
Driving The Next Era of Hiring: AI-Driven and Candidate-Focused
9 months ago: Great article! Zero-shot works best for small snippets of code or for problems the LLM is already pre-trained on (so it's not really generating, it's just regurgitating). I've used LangChain for a PoC, feeding the errors back to the LLM and chaining the process together; the quality of the code it wrote was much better! Nonetheless, I had to add a 'tool' to search when it stumbled across an unknown problem. What I would like to see is the same PoC done with "vision", to understand whether the technique could also be used on the front end!