Is Devin on the path to Coding AGI?
Michael Spencer
A.I. Writer, researcher and curator - full-time Newsletter publication manager.
Members of the Cognition AI team in New York. Credit: Levi Mandel for Bloomberg Businessweek
Hey Everyone,
The real AI bros just keep getting younger this year as funding appears limitless for AI startups of this variety recently. To see this blog with proper formatting and images go here.
In 2024 we are seeing more AI startups and coding assistants/agents coming to market with demos suggesting they can do a lot of things. I first wrote about Devin here.
Is Devin just another AgentGPT or something more sophisticated? Many are saying it uses GPT-4, which seems likely to me.
Cognition Labs, which was first backed by Peter Thiel, claims to have built a viable coding assistant called Devin.
Generative AI has meant that we are seeing an increasing number of Chinese Americans and American-educated Chinese found pretty fascinating startups. A lot of the builders and software engineers most interested in machine learning happen to be Asian. Open a random AI paper in 2024, and you will see what I mean.
Cognition Labs of course is making incredible claims like: “Devin can today in March, 2024 make thousands of decisions, recall relevant context, learn over time, and correct mistakes in code.”
The tool, called Devin, has rattled software engineers across the tech sector, and many are skeptical it’s much more than a GPT-4 skin.
To start using Devin for engineering work, please reach out here or get in touch at [email protected].
What Can Devin Do?
See the video demos.
"Devin is an autonomous agent that solves engineering tasks through the use of its own shell, code editor, and web browser," said the company on X.
Follow them on X. (all the following have demos on their blog announcement)
And there are others.
The startup, Cognition Labs, is said to be about two months old, which would put its founding in January 2024.
Devin claims to do things GitHub Copilot hasn’t managed to build in years. GitHub Copilot launched as a technical preview in June 2021.
Cognition has raised $21 million so far, though the company appears rather secretive.
Cognition AI’s founders, from left: Steven Hao, Scott Wu, and Walden Yan
The company claims Devin sets a new state of the art on the SWE-Bench coding benchmark, and that it has passed practical engineering interviews and completed real jobs on Upwork.
In the Bloomberg article, CEO Scott Wu explains: "Teaching AI to be a programmer is actually a very deep algorithmic problem that requires the system to make complex decisions and look a few steps into the future to decide what route it should pick. It's almost like this game that we've all been playing in our minds for years, and now there's this chance to code it into an AI system."
It’s way too early to know what to make of Devin, but the demo has a lot of people on X saying great things about it.
Besides Peter Thiel and Elad Gil (who is in on nearly everything AI), it’s not clear to me how Devin has attracted this much funding so far. The product is said to be slow and to break often.
How different will Devin be from any variation or packaged version of Auto-GPT, Ollama, AgentGPT, GPT Engineer or SuperAGI? I couldn’t tell you.
Other angels include:
Patrick and John Collison, Elad Gil, Sarah Guo, Chris Re, Eric Glyman, Karim Atiyeh, Erik Bernhardsson, Tony Xu, Fred Ehrsam
All this just a few months into 2024? Something very weird is going on.
Is the demo really as good as advertised? When it was evaluated on a benchmark asking AI to resolve issues found in real-world open-source projects on GitHub, Devin managed to fix 13.86% unassisted. That may seem low, but it's a huge leap from the 1.96% of issues a previous top model could correct.
The founders of Cognition AI are Scott Wu, its chief executive officer; Steven Hao (formerly of Scale AI), the chief technology officer; and Walden Yan, the chief product officer.
It’s hard to know what’s going on here or how difficult Devin would be to clone (by Chinese companies, for instance). While the technical details remain undisclosed, Wu hints at a unique combination of large language models and reinforcement learning techniques that enable Devin's advanced reasoning capabilities.
So What Are We to Believe?
Independent testers, including prominent VCs and CEOs, have lauded Devin's ability to maintain coherence and stay on task through hundreds or thousands of steps, a notable improvement over existing AI coding assistants.
Patrick Collison of Stripe says these aren’t just cherry-picked demos.
Gold-medal coders? Scott Wu, 27, is the brother of Neal Wu, who also works at Cognition AI. The two are world-renowned for their coding prowess: the Wu brothers have been competing in, and often winning, international coding competitions since they were teenagers, and they have helped elevate the US national coding team to a more respectable position against its Chinese and Eastern European rivals in recent years.
CEO of Perplexity had this to say:
Devin’s Abilities for Software Engineers
So are programmers supposed to buy Devin to make themselves more productive? Cognition Labs says Devin can help with:
Cognition adds that Devin should scale to become more useful soon.
I don’t know anything about sport coding (competitive programming); I had never heard of it, to be honest. Cognition AI is full of sport coders. Its staff has won a total of 10 gold medals at the top international competition, and Scott Wu says this background gives his startup an edge in the AI wars.
Cognition Labs thinks that "solving reasoning" can "unlock new possibilities in a wide range of disciplines," and that this is now possible with the evolution of Devin.
Team Wu wants Devin to be seen as a "tireless, skilled teammate" capable of building alongside humans, or independently if left to do that. I’m okay if that turns out to be true, but I’m naturally pretty skeptical.
Misc.
Cognition’s founding team has 10 IOI gold medals and includes leaders and builders who have worked at the cutting edge of applied AI at companies like Cursor, Scale AI, Lunchclub, Modal, Google DeepMind, Waymo, and Nuro.
Packy McCormick of Not Boring said:
Fred Ehrsam
I guess this is a pretty decent quote:
“For the first time I have seen AI take a complex task, break it down into steps, complete it, and show a human every step along the way - to a point where it can fully take a task off a human’s plate.”
Linas Beliūnas
Linas featured Cognition in a viral piece (probably sponsored) here on LinkedIn.
Kyle Shevlin
Per Business Insider, Kyle Shevlin, founder and software engineer at software development agency Agathist, expressed frustration on X about the industry "trying to aggressively replace one of the few remaining jobs that provides a legit middle-class income."
While young people have little reason to be critical of AI-bro culture and its flood of AI tools, some older engineers and workers see this as a not-so-healthy development.
As more Magic AIs and Devins come to market, I wonder what exactly we will see. A new era of AI-human hybrid jobs and agency, or us literally teaching AI how to replace us?
The price of increased productivity might mean shorter work weeks for those lucky enough to still have great jobs. There will definitely be winners and losers.