Fact-Checking Headlines about ChatGPT and Coding
Barclay R. Brown, Ph.D., ESEP
Senior Fellow, AI Research, Collins Aerospace
A post by futurism.com features the headline, "STUDY FINDS THAT 52 PERCENT OF CHATGPT ANSWERS TO PROGRAMMING QUESTIONS ARE WRONG." Today, I'm playing fact checker. When I read the original study, I find that its actual conclusions are: "Our analysis shows that 52% of ChatGPT answers contain incorrect information and 77% are verbose. Nonetheless, our user study participants still preferred ChatGPT answers 35% of the time due to their comprehensiveness and well-articulated language style. However, they also overlooked the misinformation in the ChatGPT answers 39% of the time." The study's conclusions are not at all captured correctly in the futurism.com headline, which leads me to wonder what percent of HEADLINES contain incorrect information. Saying that some percent of responses CONTAIN incorrect information is not at all the same thing as saying that the same percent are "wrong." An answer can include one inaccurate detail and still be largely correct and useful.
If someone would like to make the case that ChatGPT (GPT-4o) is always wrong about coding questions, I believe I can create prompts that will generate wrong answers nearly every time. One easy way is to make the prompt vague: when the LLM chooses one interpretation, we simply claim that we really meant the other.
I use ChatGPT for coding nearly every day. Yes, sometimes the code has a bug. I tell GPT-4o about the bug and it recodes it in a second, eliminating the bug. If you know enough about how LLMs work, you know why its first attempt can differ from the version it produces after specific feedback. Newsflash to no one: human coders often have bugs in first-draft code, many more than ChatGPT, since human coders like me also make syntax errors and forget function and class property names.
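The write, test, and report-the-bug loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fake_assistant` is a hypothetical stand-in for a real LLM call (it returns a buggy first draft and a corrected one once it sees feedback), and `run_check`, `refine`, and the `add` task are names invented for the example.

```python
from typing import Callable, Optional, Tuple


def run_check(source: str) -> Optional[str]:
    """Run the candidate code and one small check; return an error message or None."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # fine for a toy example; do not exec untrusted code
        assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def refine(ask: Callable[[str], str], task: str, max_rounds: int = 3) -> Tuple[str, int]:
    """Ask for code, test it, and feed any failure back until it passes."""
    prompt = task
    for round_number in range(1, max_rounds + 1):
        source = ask(prompt)
        error = run_check(source)
        if error is None:
            return source, round_number
        # Specific feedback: the original task plus the observed failure.
        prompt = f"{task}\n\nYour previous code failed with: {error}\nPlease fix it."
    raise RuntimeError("no working code within the round limit")


def fake_assistant(prompt: str) -> str:
    """Hypothetical assistant: buggy first draft, fixed after feedback."""
    if "failed with" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b):\n    return a - b\n"


code, rounds = refine(fake_assistant, "Write add(a, b) that returns the sum.")
print(rounds)  # 2
```

The point of the sketch is that the second prompt carries information the first one lacked, which is why the second answer can be better than the first.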
I find that using a powerful LLM to assist me in coding is a much more fun and interesting way to work, and on personal projects it enables me to do things I would likely not take the time to code by hand.
There IS a learning curve. Well-formed and clearly worded prompts are as important for coding tasks as they are for writing tasks.
Next steps: get good information, try it yourself and draw your own conclusions. Also, we need some good programming COMPETITIONS between three kinds of teams (echoing Kasparov's Advanced Chess tournaments): humans, AIs, and human+AI teams. If chess is any predictor, the human+AI teams are likely to emerge victorious, at least for now. Competitions would measure overall time to complete working code, not first-pass coding accuracy.
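To make the difference between those two measures concrete, here is a minimal sketch of how such a competition might be scored. Everything in it is an assumption for illustration: the `Attempt` record, the function names, the neutral team labels, and especially the numbers, which are made up and are not results from any real competition.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attempt:
    team: str                            # hypothetical team label
    first_pass_ok: bool                  # did the very first draft pass the tests?
    minutes_to_working: Optional[float]  # None if the team never got working code


def first_pass_accuracy(attempts: List[Attempt]) -> float:
    """Fraction of teams whose first draft already worked."""
    return sum(a.first_pass_ok for a in attempts) / len(attempts)


def rank_by_time_to_working(attempts: List[Attempt]) -> List[str]:
    """Team names ordered by total time to working code; non-finishers excluded."""
    finished = [a for a in attempts if a.minutes_to_working is not None]
    return [a.team for a in sorted(finished, key=lambda a: a.minutes_to_working)]


# Made-up numbers, purely to show that the two metrics can disagree.
attempts = [
    Attempt("team_a", first_pass_ok=True, minutes_to_working=50.0),
    Attempt("team_b", first_pass_ok=False, minutes_to_working=30.0),
    Attempt("team_c", first_pass_ok=False, minutes_to_working=None),
]

ranking = rank_by_time_to_working(attempts)
print(ranking)  # ['team_b', 'team_a']
```

In this invented data, the team with a failed first draft still finishes first on time to working code, which is the kind of outcome a first-pass accuracy score would hide.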
For me, I intend to get better at BOTH AI-assisted coding and also coding itself. What might change for me is that I'll focus more on gaining broad knowledge of coding techniques, capabilities of various libraries and subsystems, and architectural approaches to software systems, rather than trying to remember specific syntax, punctuation, and keywords. That way, I can better guide my AI programming assistant to do my bidding.
Barclay - Right on target! I don't use AI for coding but I do use it for research and analysis in the social science areas. I find that it functions much like I did when I was a law student clerking for trial lawyers. I give ChatGPT instructions that shape its viewpoint and instruct it on my area of inquiry. The better my instructions as to its role and questions to be answered, the better its return. Your hunch that the human + AI team will produce the best quality output is spot on. That works for me now - minimizing the tendency of the AI to hallucinate and maximizing its response to exactly what I need. That's just like a good trial team with litigators supported by well-instructed paralegals. I would also recommend Ethan Mollick's new book, Co-Intelligence. He has lots of good stuff to say about how to relate to AI for the best results.
Sr Mgr, Intelligent Systems | AI Product Manager at Collins Aerospace
9 months ago: I like the coding competitions idea!
Great insight on AI and ChatGPT! AI is truly transforming the way we interact, and ChatGPT is leading the way in this revolution. Thank you for the share!
Senior Project / Program Manager - M&A and Transformation Projects at WTS Solution
9 months ago: Just wanted to say hello, it’s been a while. I hope all is well with you, and I will review your post. Take care. Alex Dryden
Systems Engineering Igniter
9 months ago: I'm waiting for the follow-up about:
- approaches for education to take advantage of LLMs without falling into complacency or creating professionals who lack the ability to steer the LLM like you're doing (my analogy is the "script kid", an ample user of scripts available online but unable to dig in and create something deeper than that)
- your thoughts point to a scenario of augmentation instead of replacement. Which direction should we strive for?
Out of curiosity, do you consider yourself a "team" with the AI system, or is it just a sophisticated tool?
Cheers,