
ACAD 32: Devin Dares- Ripe or Hype?

Sanyam Singh Sengar

Healthcare Enthusiast | Analyst | SkillCoach

Published: March 15, 2024

Social media is flooded with news of Devin, billed as the first autonomous AI software engineer and launched by Cognition AI Inc. The program acts as an assistant for software engineering tasks such as writing code, debugging errors, and deploying applications in real time. Emerging startups in the “code assistant” arena are intent on turning everyone into a developer.

Cognition is a young startup incorporated in the Bay Area earlier in 2024. The three founders and a team of 10 competitive programmers are shuttling between Airbnbs in Silicon Valley to teach Devin the ropes.

What can Devin do?

Devin has completed real jobs on platforms such as Upwork, including fixing issues and producing reports. Statistically speaking, Devin solved ~14% of the diverse code issues put to it as part of a Real World Software Engineering test. That is roughly 3X the score of the next-best large language model, Claude, which stands at 4.8% (#2). Is 14% good enough? Is the test truly representative of real-world scenarios?
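The “3X” figure can be checked from the two rates quoted above. This is only a quick arithmetic sketch; the variable names are illustrative, and the ~14% is the rounded value from the text.

```python
# Resolution rates quoted in the text, in percent.
devin_rate = 14.0    # approximate ("~14%")
claude_rate = 4.8

# How many times more issues Devin resolved than the runner-up.
ratio = devin_rate / claude_rate
print(f"{ratio:.1f}x")  # prints "2.9x"
```

So “3X” is a fair rounding, though in absolute terms the gap is about 9 percentage points.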

At its heart, Devin is an autonomous LLM agent capable of drafting a detailed execution plan to achieve your stated software engineering goals. The agent then executes the planned activities independently and iteratively: browsing the internet and reviewing API documentation to find the right data, scanning GitHub repositories for starter code, applying debugging techniques drawn from Stack Overflow, and so on.
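The plan-then-execute loop described above can be sketched in a few lines. This is a toy illustration, not Devin's actual implementation: the tool names (browse_docs, scan_repos, run_tests, debug), the fixed plan, and the simulated test failure are all assumptions, and a real agent would ask an LLM to draft the plan and call real tools.

```python
from collections import deque

def make_tools():
    """Hypothetical stand-ins for real tools; each returns (ok, note)."""
    state = {"bug_fixed": False}

    def browse_docs(goal):
        return True, "read API documentation"

    def scan_repos(goal):
        return True, "found starter code"

    def run_tests(goal):
        # Simulated: tests fail until the debug step has run once.
        if state["bug_fixed"]:
            return True, "tests pass"
        return False, "tests fail"

    def debug(goal):
        state["bug_fixed"] = True
        return True, "applied fix"

    return {"browse_docs": browse_docs, "scan_repos": scan_repos,
            "run_tests": run_tests, "debug": debug}

def draft_plan(goal):
    # A real agent would generate this plan with an LLM; here it is fixed.
    return ["browse_docs", "scan_repos", "run_tests"]

def run_agent(goal, max_steps=10):
    tools = make_tools()
    queue = deque(draft_plan(goal))
    log = []
    while queue and len(log) < max_steps:
        step = queue.popleft()
        ok, note = tools[step](goal)
        log.append((step, ok, note))
        if not ok:
            # Re-plan: retry the failed step, but run a debug step first.
            queue.appendleft(step)
            queue.appendleft("debug")
    return log

log = run_agent("build a small website")
for step, ok, note in log:
    print(step, "ok" if ok else "failed", "-", note)
```

The max_steps cap is the design choice worth noting: an agent that re-plans on failure needs a budget, or a step that never succeeds would loop forever.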

Sample activities where Devin has demonstrated proficiency

Develop and refine existing open-source AI models (such as Llama from Meta)
Build and launch a website, developing the front end and back end autonomously
Build automated unit tests and integration tests for stated business scenarios
Detect security vulnerabilities within your code

Want to Access Devin?

Link to Google Form for getting access to Devin

Fill out the form linked above to request access to Devin. Access is limited to a handful of developers with an active, real use case.

Concept of the day- LLM Agents

LLMs are moving from a “predicting the next word” paradigm to an “advancing the reasoning” paradigm. Teaching AI to be a programmer is a deep algorithmic problem: it requires making complex decisions while looking a few steps into the future to decide which route to pick next, quite like a game of chess (#3). A high-level process for setting up such an agent follows. I covered this process for building a specific data-mining LLM agent in one of my previous ACAD blogs.

Preparation: Define the scope (e.g., content generation, customer service), define the autonomy level, and feed in relevant data to tune the model.
Configure Decisioning Engine: Train the LLM to converse with itself and with the user, for example by asking clarifying follow-up questions or narrowing a myriad of choices down to a final solution.
Deployment: Deploy the app in a controlled environment with pilot users, monitoring user interactions and technical performance to form an active feedback loop and improve. The agent should be configured to learn from, and be fine-tuned on, the data collected during every interaction.
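The three steps above can be mocked up without any LLM at all. The sketch below is a minimal illustration under stated assumptions: the class names, the two autonomy levels (“suggest” and “act”), and the keyword matching that stands in for the model's reasoning are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class AgentConfig:
    scope: str      # Preparation: e.g. "customer service"
    autonomy: str   # hypothetical levels: "suggest" (human approves) or "act"

@dataclass
class PilotAgent:
    config: AgentConfig
    feedback: list = field(default_factory=list)

    def decide(self, request, options):
        """Narrow the options; ask a clarifying question if still ambiguous."""
        # Keyword matching stands in for the LLM's reasoning here.
        matches = [o for o in options if o.lower() in request.lower()]
        if len(matches) != 1:
            return "Which do you mean: " + ", ".join(matches or options) + "?"
        verb = "Proceeding with" if self.config.autonomy == "act" else "Suggesting"
        return f"{verb}: {matches[0]}"

    def record(self, request, response, rating):
        """Deployment: store each pilot interaction for later tuning."""
        self.feedback.append(
            {"request": request, "response": response, "rating": rating}
        )

    def average_rating(self):
        """Simple health metric for the feedback loop."""
        if not self.feedback:
            return None
        return sum(f["rating"] for f in self.feedback) / len(self.feedback)

agent = PilotAgent(AgentConfig(scope="customer service", autonomy="act"))
reply = agent.decide("I would like a refund", ["refund", "exchange"])
agent.record("I would like a refund", reply, rating=5)
print(reply)
print(agent.decide("Something is wrong with my order", ["refund", "exchange"]))
```

The second call has no single matching option, so the agent asks a follow-up question instead of guessing, which is the behavior the decisioning step is meant to train.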

Example 1: Personalized Content Recommendation

Scenario: A recommendation system for a news app that uses an LLM to analyze user interests and reading habits to suggest relevant articles.

What does the agent do? The LLM processes user interactions and feedback on various articles to understand preferences. It then decides which new articles might be of interest to the user by:

Identifying topics, authors, or genres that the user prefers.
Analyzing the sentiment or engagement level of past interactions.
Recommending articles that match the user's profile, potentially adjusting the recommendations based on the user's feedback to improve over time.
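The three steps above can be sketched with a simple topic-weight profile. This is a toy under stated assumptions: in a real system the LLM would infer topics and engagement from raw interactions, whereas here they are given as hand-written labels and scores, and the function names and sample data are hypothetical.

```python
from collections import defaultdict

def build_profile(interactions):
    """interactions: list of (topic, engagement), engagement in [-1, 1]."""
    profile = defaultdict(float)
    for topic, engagement in interactions:
        profile[topic] += engagement
    return profile

def recommend(profile, articles, k=2):
    """articles: list of (title, topic). Unknown topics score 0."""
    ranked = sorted(articles, key=lambda a: profile.get(a[1], 0.0), reverse=True)
    return [title for title, _ in ranked[:k]]

# Hypothetical reading history and candidate articles.
history = [("ai", 1.0), ("sports", -0.5), ("ai", 0.5), ("health", 0.8)]
articles = [
    ("Match report", "sports"),
    ("New LLM agent", "ai"),
    ("Sleep study", "health"),
    ("Election recap", "politics"),
]

profile = build_profile(history)
first_picks = recommend(profile, articles)
print(first_picks)   # ['New LLM agent', 'Sleep study']

# Feedback loop: the user dismisses the health article, so its weight drops.
profile["health"] -= 1.0
second_picks = recommend(profile, articles)
print(second_picks)  # ['New LLM agent', 'Election recap']
```

After the negative feedback, the health topic falls below an unseen topic, so the next batch swaps it out.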

Example 2: Customer Support Chatbot

Scenario: A chatbot designed to handle customer service inquiries autonomously, using an LLM to understand and respond to customer requests.

What does the agent do? When a customer asks about tracking their order, the chatbot uses the LLM to comprehend the request and then consults the company's order tracking system to retrieve the specific order status. The decision-making process involves:

Interpreting the customer's inquiry to identify it as a tracking request.
Extracting order details (e.g., order number) from the conversation.
Querying the tracking system with these details.
Communicating the retrieved information back to the customer clearly.
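The four decision steps above map onto a small pipeline. The sketch below is illustrative only: regular expressions stand in for the LLM's language understanding, the ORDERS dictionary stands in for the company's tracking system, and the order-number format (one letter plus four digits) is an assumption.

```python
import re

# Hypothetical stand-in for the company's order tracking system.
ORDERS = {"A1234": "out for delivery", "B5678": "delivered"}

def is_tracking_request(message):
    """Step 1: interpret the inquiry (keyword match stands in for the LLM)."""
    pattern = r"\b(track|tracking|where is|status)\b"
    return bool(re.search(pattern, message, re.IGNORECASE))

def extract_order_number(message):
    """Step 2: pull out an order number, assumed to look like 'A1234'."""
    match = re.search(r"\b[A-Z]\d{4}\b", message)
    return match.group(0) if match else None

def handle(message):
    if not is_tracking_request(message):
        return "Sorry, I can only help with order tracking right now."
    order = extract_order_number(message)
    if order is None:
        return "Could you share your order number?"
    status = ORDERS.get(order)          # Step 3: query the tracking system
    if status is None:
        return f"I could not find order {order}."
    return f"Order {order} is {status}."  # Step 4: reply clearly

print(handle("Where is my order A1234?"))
print(handle("Can you track my order?"))
```

Each early return covers one way the conversation can stall: an out-of-scope request, a missing order number, or an order the tracking system does not know.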

What’s the buzz about?

According to Global Count, software engineering accounts for 26.3 million jobs globally, with average YoY growth of 3.7% over the last 5 years. India leads the pack with 17% YoY growth, followed by North America (#1). The “fear”? Are we reaching a “plateau of growth” or, even worse, will the software engineering industry be turned on its head?

Devin might seem like an enthusiastic teammate, but let's not downplay the limitations of benchmark tests such as the Real World Software Engineering test. We've had tools like Google's AlphaCode for a while now, offering various code solutions in multiple languages.

Most tech companies today embrace agile ways of working (over the traditional waterfall approach), which means software requirements are not set in stone on day 1. By design, developing a meaningful product offering today requires a cross-functional team with representation from product management, software development, analytics, and marketing to make sense of customer requirements in real time. Such an environment often requires software engineers to devise hacky ways of accomplishing objectives, drawing on their prior experience of what drives success in a domain.

As Francois Chollet puts it brilliantly, if software engineering is fully automated, software engineers can move on to “high leverage” positions. In the end, software engineering is about “developing mental models of problems and their solutions”. Sam Altman (co-founder of OpenAI, touted as the Oppenheimer of AI) recently put it this way: “For me, AGI is the equivalent of a median human that you could hire as a co-worker” (#4).

In my view, the future software engineer will operate on a much bigger screen, with perhaps a decent mix of autonomous AI engineers and junior software engineers at their disposal. This dream team would amplify the impact of the lead engineer and reduce the drudgery of mundane tasks, freeing up time for high-level thinking. AI is here not to take away jobs but to eventually shorten the runway to impact. One can only imagine how many more skilled software engineers we would need to graduate in a world where half of GDP is driven by digital enterprises (#5).

Resources:

ACAD - A Concept A Day


Carlo Beltran

Integrating emerging technologies with jurassic methodologies.

8 months ago

Have they made any impact declaration on the human-to-AI engineer team ratio using Devin? Say, for example, the speed of 10 human software engineers (HSEs) versus 1 HSE and 1 autonomous SE?

Sanyam Singh Sengar

Healthcare Enthusiast | Analyst | SkillCoach

8 months ago

Request access to Devin using this link: https://docs.google.com/forms/d/e/1FAIpQLScHG0Kuxf9rVLR2Ceamr9qq85YLxKPx8fxdQeBr5TwvYEsPUg/viewform
