Devin: Introducing the World’s First Ever AI Software Engineer

Introduction

The AI market is booming, especially in generative AI, with the launch of OpenAI’s GPT-4 back in 2023 and Anthropic’s Claude 3 earlier this month. These models are making life easier for content developers, and now there’s a software engineer in town.

Less than 72 hours ago, Cognition released Devin, which it describes as the world’s first fully autonomous AI software engineer, setting a new state of the art on the SWE-bench coding benchmark. With just a single prompt, Devin can write code or build websites, much like a human software engineer.

Before we delve a little deeper into Devin, we’ll familiarize ourselves with its creator – Cognition.

What is Cognition?

Founded in November 2023, Cognition is an applied AI lab based in the United States and focused on reasoning. By building on reasoning, the company intends to unlock a range of disciplines in artificial intelligence. Its team is made up of professionals and leaders who have worked at companies like Google DeepMind, Cursor, Scale AI, and Nuro. Cognition has already secured $21 million in funding in a round led by Peter Thiel’s Founders Fund. Its backers also include Tony Xu, CEO of DoorDash, and Fred Ehrsam, co-founder of the crypto platform Coinbase.

What is Devin?

Devin is an autonomous model that can plan, analyze, and execute complex coding and software engineering tasks from a single prompt. It has its own command line, a code editor, and a separate web browser.

Cognition showed off the model’s capabilities by having it test Meta’s Llama 2 on a couple of different API providers. Devin first drew up a step-by-step “Plan” before tackling the problem. It then went on to build the whole project using the same tools a human software engineer would. Using its built-in browser, Devin pulled up the API documentation to read up and learn how to plug into each of these APIs. Finally, it built and deployed a website with full styling.

What sets Devin apart is its ability to learn from mistakes. It can make thousands of decisions and gets better over time.

It outperformed other solutions when it was tested on a few standard sets of software engineering problems.

According to Cognition, Devin also passed practical engineering interviews from leading AI companies. It has also completed tasks from real jobs posted on Upwork, such as coding tasks, debugging computer vision models, and generating detailed reports.

GitHub Copilot, a code completion tool, offered an early glimpse of what Devin does. With Copilot, programmers can turn prompts into runnable code. It can not only complete chunks of code but also translate them across multiple languages. Pretty impressive, right? But Devin takes it up a notch by completing projects from start to finish without human intervention.

How does Devin work?

As discussed earlier, Devin has its own command line, its own code editor, and its own web browser for gathering resources.

When a prompt is entered, Devin goes into “Planner” mode, where it lays out a step-by-step plan for tackling the problem.

Once this is done, the dashboard moves to a four-section interface:

  • first, a panel that holds all the input prompts
  • second, the command line
  • third, its own code editor
  • fourth, its own browser, which it uses to read through resources and draw inferences

Finally, it presents a visualization of the solution.
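The flow described above (plan first, then work through the steps with a command line, a code editor, and a browser) can be pictured with a toy sketch. This is not Cognition's implementation, which has not been published. Every name here (`Step`, `make_plan`, `run_agent`, and the three tool functions) is hypothetical, and the planner returns a fixed plan where a real system would presumably generate one with a model.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical stand-ins for the three tools mentioned in the article.
# They only return a description of what they would do.
def run_shell(arg: str) -> str:
    return f"[shell] ran: {arg}"


def edit_code(arg: str) -> str:
    return f"[editor] wrote: {arg}"


def browse(arg: str) -> str:
    return f"[browser] read: {arg}"


TOOLS: dict[str, Callable[[str], str]] = {
    "shell": run_shell,
    "editor": edit_code,
    "browser": browse,
}


@dataclass
class Step:
    tool: str  # which tool this step uses
    arg: str   # what the tool is asked to do


def make_plan(prompt: str) -> list[Step]:
    """Toy planner: returns a fixed three-step plan (an assumption for illustration)."""
    return [
        Step("browser", f"documentation relevant to: {prompt}"),
        Step("editor", "main.py"),
        Step("shell", "python main.py"),
    ]


def run_agent(prompt: str) -> list[str]:
    """Build the plan, then execute each step with the matching tool."""
    log = []
    for step in make_plan(prompt):
        log.append(TOOLS[step.tool](step.arg))
    return log


if __name__ == "__main__":
    for line in run_agent("benchmark Llama 2 on different API providers"):
        print(line)
```

The point of the sketch is only the ordering: the plan exists before any tool is touched, and each step is routed to exactly one tool.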

How does Devin stack up against other Models?

Devin has been tested on SWE-bench, a benchmark that asks agents to resolve real-world GitHub issues from widely used open-source projects. According to Cognition, Devin was evaluated on a random 25% subset of the dataset. Devin was unassisted, whereas the other models were assisted, i.e., they were told the exact files that needed to be edited. Devin correctly resolved 13.86% of the issues end to end, which is a huge jump from Claude 2’s 4.8% and GPT-4’s 1.74%. Cognition stated they will post a more detailed technical report soon!
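To put those percentages in perspective, here is a small sketch that works out the gap both as percentage points and as a multiple. The figures are the ones quoted above; the function names are just for illustration. Keep in mind that the comparison is not like for like, since Devin ran unassisted and the other models were assisted.

```python
# SWE-bench resolution rates as quoted in this article (percent of issues resolved).
rates = {
    "Devin (unassisted)": 13.86,
    "Claude 2 (assisted)": 4.80,
    "GPT-4 (assisted)": 1.74,
}


def point_gap(new: float, old: float) -> float:
    """Difference in percentage points."""
    return new - old


def fold_increase(new: float, old: float) -> float:
    """How many times larger the new rate is than the old one."""
    return new / old


devin = rates["Devin (unassisted)"]
for name, rate in rates.items():
    if name.startswith("Devin"):
        continue
    print(
        f"Devin vs {name}: +{point_gap(devin, rate):.2f} points, "
        f"{fold_increase(devin, rate):.2f}x"
    )
```

By this arithmetic, Devin's reported rate is a little under three times Claude 2's and roughly eight times GPT-4's.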

Will Devin replace a Software Engineer?

The impressive benchmark numbers have left many people, especially software developers and engineers, worried about the future of software jobs and related roles.

Cognition claims to be building AI teammates with capabilities that surpass existing AI tools.

Cognition states, “Devin is a tireless, skilled teammate, equally ready to build alongside you or independently complete tasks for you to review. With Devin, engineers can focus on more interesting problems, and engineering teams can strive for more ambitious goals.”

Funnily enough, while many presume that Devin spells the end for software engineers, Cognition, the maker of Devin, is actively hiring “human” software engineers! Opinions are mixed, and until Devin has been fully tested, we cannot come to any conclusions.

As Andrej Karpathy, the ex-director of AI at Tesla, puts it, “In my mind, automating software engineering will look similar to automating driving.” He goes on to say that software engineering is on track to change substantially. It would involve much more supervised automation, with humans pitching in high-level commands, ideas, or progression strategies in English.

Just like any other generative AI tool, Devin can only be as good as the person using it! These are just tools in the hands of an efficient user, making their tasks much less cumbersome and time-consuming!

Conclusion

Devin is a huge stride forward in the generative AI realm, and it could reshape software development by automating coding tasks and complex problems. With models like GPT-4, Claude 3, and now Devin out, the future of generative AI looks hopeful. These tools are not here to replace us but to assist us. See you guys in the next one!
