How will AI Impact Software Engineering?

How will software engineering change in the age of strong AI models that can write code, and what should individual engineers and teams do to adapt?

An earlier version of this post is available here.

Generative AI models can now write code. AI-assisted software engineering is on the rise, with tools like GitHub Copilot showing a 180% year-over-year increase in adoption and a $2 billion revenue run rate over the last two years. The scope of what these systems can handle ranges from simple functions an LLM can generate directly (e.g., reverse a string) to medium-complexity apps that are now achievable via multi-agent systems (see Devin, GitHub Copilot Workspace, AutoGen).

As a result of the seemingly expanding scope of what these systems can do, many questions arise: Will AI replace software engineers? Will software engineering degrees still be relevant? How will the software engineering profession change?

This post tries to address these questions by covering:

  • What coding tasks can generative AI models reliably handle today, and what can they not?
  • How will the software engineering trade change?
  • What should individual software engineers and engineering teams do to adapt?

I also discussed some of the ideas below in a video podcast with Ars Technica and MongoDB in Jan 2024 - AI and the Evolution of Developer Experience [2].

What Can Generative AI Do?

Today's capable Generative AI models (e.g., OpenAI GPT-3.5+, Google Gemini, Anthropic Claude 3.5) can interpret complex problem descriptions (prompts) and generate code solutions. However, due to their probabilistic nature, these models exhibit varying levels of proficiency across different tasks. To understand their capabilities, it's useful to view them along a spectrum of reliability:

High Reliability (today):

  1. Writing functions/components: Generate correct code for small to medium-scale problems (e.g., creating a complete React page component with styling for customer data entry).
  2. Code Refactoring: Restructure existing code based on provided guidelines.
  3. Bug Fixing: Identify and resolve minor to moderate issues in code.

Medium/Low Reliability (today):

  • Niche/New Problems: Today's models excel at tasks represented in their training data (in-distribution); see a similar note on this by Yann LeCun. For example, they can reliably write sorting algorithms or build React components with TailwindCSS. However, quality degrades on niche or out-of-distribution tasks: implementing algorithms in a non-mainstream or internal programming language, developing UIs in less common languages or frameworks (e.g., Rust), or writing efficient CUDA kernels for custom or new operations.
  • Cognitive Leaps: Problem settings that require cognitive leaps (perhaps in the presence of new information or feedback) remain challenging for LLMs. An LLM may unreasonably stick to an incorrect position simply because that information already exists in its context. For example, if an LLM is told early in a multi-turn conversation that the sky is black, it may hold that position despite corrective feedback. This matters for problems where a given course of action proves unfruitful and needs to be abandoned.
  • Complex End-to-End Systems: Models by themselves struggle with tasks that require extensive context and iterative processing, such as building an entire software product. Such a system must design and update the architecture, integrate feedback, and iteratively address complex bugs with intricate dependencies, all while respecting contextual nuances.

Note: While AI models by themselves fall short in the ways described above, early results indicate that a multi-agent approach can address some of these issues. Frameworks like AutoGen simplify the development of such systems, and tools like Devin and GitHub Copilot Workspace illustrate what is possible today.


How Will Software Engineering Change?

In my opinion, the question "How will software engineering change with advances in AI?" is perhaps one of the more consequential matters for our profession. I think we can expect two main types of changes. First, we'll see technical changes directly affecting the core aspects of software development: in addition to Software 1.0 (code, rules) and Software 2.0 (ML algorithms learning from data), we will see more advanced Software 3.0+ paradigms (prompting and agents). Second, we'll likely see non-technical changes throughout the software engineering ecosystem: shifts in developer roles and required skill sets, adaptations in software development processes (a focus on integration, verification, and benchmarking of AI-generated code), changes in documentation practices to improve AI-generated code, AI models influencing technology choices, the potential emergence of AI model marketplaces, and more.


From Software 1.0 (writing code to solve problems) to Software 3.0+ (defining agents that address tasks)

Advances in machine learning and generative AI have introduced potentially new ways to create software that solves user problems. In my upcoming book, "Multi-Agent Systems with AutoGen," I describe this change across three phases:

  1. Software 1.0: Traditional software development where we write code and rules that are compiled to solve problems.
  2. Software 2.0 (popularized by Andrej Karpathy's article): We apply ML algorithms to extract rules directly from data to solve problems.
  3. Software 3.0+: This phase is driven by generative AI models and consists of two steps: (a) Prompting: We write natural language instructions that elicit solutions from capable pretrained generative AI models. Tools like GitHub Copilot are an implementation of this approach. (b) Agents: We define agents (configurations of models with access to tools) that can collaborate to solve dynamic, complex problems (see the sketch below).


Excerpt from the upcoming book - Multi-Agent Systems with AutoGen - https://mng.bz/eVP9
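
To make the agents step concrete, here is a minimal sketch using the AutoGen framework mentioned above. The model name, API key placeholder, and task prompt are illustrative assumptions, not prescriptions:

```python
# A minimal two-agent setup with AutoGen (pyautogen): an LLM-backed assistant
# writes code, and a user proxy executes it locally and reports results back.
import autogen

# Illustrative model configuration; substitute your own model and key.
llm_config = {"config_list": [{"model": "gpt-4", "api_key": "YOUR_API_KEY"}]}

# The assistant agent generates code and revises it based on feedback.
assistant = autogen.AssistantAgent(name="assistant", llm_config=llm_config)

# The user proxy runs the generated code and returns output or errors,
# enabling the iterative loop that single-shot prompting lacks.
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "coding", "use_docker": False},
)

# The agents converse until the task is addressed.
user_proxy.initiate_chat(
    assistant,
    message="Write and run a Python script that reverses the string 'hello'.",
)
```

The key design point is the feedback loop: execution results flow back to the model, letting it repair its own errors rather than relying on a single generation.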

Importantly, there are reliability tradeoffs introduced with probabilistic AI/ML models. As systems become capable of independently addressing tasks (reducing developer or user effort), the complexity and probabilistic nature of models introduce reliability issues. It is likely that future software systems will work best with a careful combination of approaches from Software 1.0, 2.0, and 3.0.

A Shifting Focus to Verification, Benchmarking, Red Teaming:

The cornerstone of software development has long been the pursuit of reliable, bug-free code. This goal has spawned entire sub-industries and processes, including code peer review, software testing, and CI/CD pipelines. Many of these processes are deeply human-centric, built around interpersonal interactions and reputation systems. As AI and generative models become potential contributors to the development process, we will need to adapt these processes to verify the reliability of automatically generated code.

This necessitates the development of new verification methods, which may take the form of direct correctness checks tailored to AI outputs, AI reputation systems, or human-in-the-loop integration, as demonstrated by GitHub Copilot Workspace.
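
As a sketch of what a direct correctness check might look like, the snippet below runs a generated function against known test cases before accepting it. The function and names are hypothetical, and a production system would execute untrusted code in a sandbox:

```python
# A minimal, hypothetical correctness gate for AI-generated code: accept a
# candidate only if it passes every known test case.
from typing import Callable

def verify_generated_code(source: str, func_name: str,
                          test_cases: list[tuple[tuple, object]]) -> bool:
    """Return True only if the generated function passes all test cases."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # NOTE: run inside a sandbox in production
        candidate: Callable = namespace[func_name]
        return all(candidate(*args) == expected for args, expected in test_cases)
    except Exception:
        return False

# Example: gating an LLM-generated string-reversal function.
generated = "def reverse_string(s):\n    return s[::-1]"
print(verify_generated_code(generated, "reverse_string",
                            [(("abc",), "cba"), (("",), "")]))  # True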

A considerable amount of energy will be spent on systems that verify AI output, benchmark performance (e.g., Hugging Face LLM leaderboards and Scale AI SEAL benchmarks), and red-team models (see the Microsoft PyRIT tool for finding risks in models) to understand scenarios where they fail or can cause catastrophic errors.
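
Benchmarking code generation often uses the pass@k metric introduced with OpenAI's HumanEval benchmark; a minimal sketch of the standard unbiased estimator is below:

```python
# pass@k: probability that at least one of k sampled completions passes the
# unit tests, estimated from n samples per task of which c are correct.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:  # every size-k sample must contain a correct solution
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 2 of 10 generated solutions pass the tests; estimate pass@5.
print(round(pass_at_k(n=10, c=2, k=5), 3))  # 0.778
```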

The term red teaming has historically described systematic adversarial attacks for testing security vulnerabilities. With the rise of LLMs, the term has extended beyond traditional cybersecurity and evolved in common usage to describe many kinds of probing, testing, and attacking of AI systems. With LLMs, both benign and adversarial usage can produce potentially harmful outputs, which can take many forms, including harmful content such as hate speech, incitement or glorification of violence, or sexual content. (Microsoft Learn - What is red teaming?)

Writing Design Docs for Humans AND Machines:

Traditionally, design documents have been written for human developers. However, with the rise of Generative AI (GenAI), we now need to consider both human and AI audiences.

In the Software 3.0 paradigm, it is plausible that humans describe the systems they are interested in within design documents, and agents consume these documents to build those systems.

For GenAI to address complex problems effectively, it must understand context and follow the specifications laid out in design documents. This presents a challenge due to subtle differences between human communication and effective AI prompts. For example, current Large Language Models (LLMs) can be sensitive to the position of information within documents: important instructions may need specific placement or emphasis for AI processing, unlike human-oriented documentation, where key information can be spread throughout.

To bridge this gap, we should explore:

  1. Structured formats catering to both humans and AI
  2. Standardized emphasis techniques effective for both audiences
  3. Tools for translating between human and AI-oriented documentation styles

IMO, in addition to existing research on automated prompting (e.g., DSPy), more research is needed to understand how best to craft design documents that work well for both humans and AI models/agents.
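
As one hypothetical illustration of a structured format serving both audiences, a design doc could pair a machine-readable requirements section with human-oriented context, and a small tool could surface the requirements when building a prompt. The format and field names below are assumptions, not a standard:

```python
# A hypothetical dual-audience design doc: a structured requirements section
# an agent can rely on, plus free-form context for human readers.
DESIGN_DOC = """\
# Customer Data Entry Page

## Requirements (machine-readable)
- framework: React
- styling: TailwindCSS
- fields: name, email, phone

## Context (human-oriented)
Sales asked for a simple entry form; validation follows the team style guide.
"""

def doc_to_prompt(doc: str) -> str:
    """Surface the requirements first, since LLMs can be sensitive to where
    key instructions appear within a long context."""
    sections = doc.split("## ")
    requirements = next(s for s in sections if s.startswith("Requirements"))
    return f"Follow these requirements exactly:\n## {requirements}\nFull document:\n{doc}"

print(doc_to_prompt(DESIGN_DOC))
```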

Developer Habits Driven by Generative AI

As developers increasingly adopt generative AI, it's likely to influence coding habits and technology choices. This is similar to how recommender algorithms are altering social dynamics (who we follow on Twitter, the information we are exposed to, etc.) [3]. This trend is driven by two key factors:

  1. Model Recommendations: When prompted to complete a task, AI models may default to popular technologies. For instance, when asked to plot a chart, a model might suggest Python with Matplotlib, potentially steering developers away from alternatives like TypeScript with Plotly.js.
  2. Model Performance: AI models tend to perform more reliably with widely-used technologies due to their prevalence in training data. For example, an LLM might generate React components with fewer bugs than Svelte components, inadvertently encouraging React adoption by developers who observe these discrepancies.

This bias towards established technologies can be problematic, as it may hinder the adoption of newer, potentially superior solutions.


LLMs are better at writing code for some frameworks than others; this has implications for developer technology choices.


Model and Data Marketplaces

To address the challenge of generative AI models "favoring" established libraries and tools, we can envision the emergence of marketplaces aimed at enhancing AI models' performance for specific products. Two primary approaches are likely:

  1. Training Data: Companies could offer curated datasets (potentially augmented with synthetic data) for model providers to incorporate into pre-training or fine-tuning processes.
  2. Model Adapters (LoRA): Providers might develop LoRA (Low-Rank Adaptation) adapters that can be easily integrated with existing models to improve performance for specific tools.

These marketplaces would enable interested parties to contribute artifacts that optimize AI models for their products. This could help level the playing field for newer or niche technologies, ensuring they receive fair consideration in AI-assisted development environments.
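
As a sketch of how the adapter approach might work with today's tooling (Hugging Face transformers and peft), the snippet below attaches a vendor-published LoRA adapter to an open base model. The adapter name "acme/plotly-js-lora" is a hypothetical marketplace entry, not a real artifact:

```python
# Attaching a vendor-published LoRA adapter to a base code model so it
# generates better code for a specific framework.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "bigcode/starcoder2-3b"  # an open code-generation model
base = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Hypothetical marketplace adapter that specializes the model for Plotly.js.
model = PeftModel.from_pretrained(base, "acme/plotly-js-lora")

inputs = tokenizer("// Plot a bar chart of sales by region with Plotly.js\n",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```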

Junior Engineer Jobs Will Evolve

Our previous observations on what AI can do today suggest that AI excels at small, compact tasks with high reliability—tasks typically assigned to junior engineers. Consequently, we may see junior roles pivot towards AI integration, verification, benchmarking, and red teaming as discussed earlier.

However, this shift presents a conundrum. If junior engineers aren't writing code or building systems from the ground up, it may interfere with the traditional learning path that develops them into senior engineers. That hands-on expertise is also crucial for effectively creating the verification and benchmarking systems needed in an AI-augmented development environment.

As the industry adapts to these changes, finding new ways to cultivate deep technical understanding and problem-solving skills in early-career developers will be critical. This may involve reimagining software engineering education and creating novel on-the-job training approaches that blend AI-assisted development with foundational coding experiences.

What Should You Do About It?

Given all the changes that might occur, what can individual engineers, engineering teams and organizations do?

Individual Engineers

  1. Learn to code, build critical thinking skills. Current AI models have limitations in synthesizing creative solutions for truly novel or niche problems, especially when training data is private or unavailable (see a similar note by Yann LeCun). For these challenges, human experts who can build such solutions will remain essential. These coding and critical-thinking skills are also crucial for addressing emerging tasks in AI-assisted development, such as verification, benchmarking, and red teaming. Cultivating these abilities will position you to tackle both traditional and AI-related challenges in software engineering.
  2. Improve your communication skills. We will need these to write docs that will be consumed by both colleagues (other software engineers) and AI models.
  3. Use generative AI models - one more tool in your tool belt. Learn what they can do, integrate them into your workflow, and learn when they deliver the best productivity benefits.
  4. Remain laser-focused on the user or business problem. Solving problems creates value. The most effective tool is one that meets the user's requirements. In many instances, addressing a user's need might not necessitate generative AI models or agents at all.
  5. Stay authentic, human. In a future where AI is useful, the things it can't do - creativity, personality, etc. - will remain important.

With AI, we can expect a rise in superficially appealing but low-quality content. But that doesn’t mean there’s no place for craftsmanship.


Engineering Teams

Engineering teams face both challenges and opportunities in the era of AI-assisted software development. To navigate this landscape effectively, teams should consider the following strategies:

  1. Task Delegation and Human Oversight: Develop a clear framework for identifying tasks that can be reliably delegated to AI and those that require human supervision.
  2. Performance Benchmarking and Model Evaluation: Build robust systems to assess and improve AI model performance on the team's own tasks and codebase.
  3. Enhancing AI Capabilities: Invest in infrastructure and practices (context, tooling, documentation) that elevate AI contributions toward senior-engineer quality.
  4. Adapting Team Culture and Processes: Foster a culture that embraces AI as a tool while maintaining human expertise.
  5. Ethical Considerations and Bias Mitigation: Address the ethical implications of AI in software development, including bias in generated code and its dependencies.
  6. Future-Proofing and Continuous Learning: Prepare the team for ongoing AI advancements through training and regular tool evaluation.
  7. Collaboration with AI Tool Providers: For teams developing new tools or frameworks, work with model providers to improve AI support for those technologies.

By implementing these strategies, engineering teams can harness the power of AI while maintaining high standards of code quality, fostering innovation, and ensuring the continuous growth of human expertise.

Conclusion

Generative AI is rapidly changing the landscape of software engineering, but it's not likely to fully replace human engineers in the near future. While AI can reliably handle many coding tasks, from writing simple functions to generating medium-complexity applications, it still struggles with niche problems, complex end-to-end systems, and tasks requiring deep contextual understanding.


What is the future career path for junior engineers?
I often get the question, "Is a software engineering degree still worth it?" My answer: what engineers do will likely change, but the degree and the learning will remain valuable!

In my opinion, the software engineering profession will evolve rather than become obsolete, with a shift towards higher-level design, system architecture, and AI supervision. Software engineering degrees will remain relevant, but their focus may change to include AI literacy and integration skills.

The future of software engineering will likely involve a blend of traditional coding (Software 1.0), machine learning (Software 2.0), and AI-driven development including agents (Software 3.0+).

Engineers (especially junior engineers) will be well-served by focusing on critical thinking, problem-solving, and effective communication of designs and requirements to AI systems. They'll also need to develop skills in AI prompt engineering, verification, and benchmarking.

However, this shift raises potential issues, such as the risk of deskilling junior engineers as AI takes over more basic coding tasks. There's a concern about how future senior engineers will develop if they don't have the opportunity to start as juniors writing code from scratch.


References

  1. Introducing LTM-1 - an LLM with a 5M token context window, allowing it to see your entire repository of code.
  2. AI and the Evolution of Developer Experience. Lee Hutchinson, Emily Freeman, Victor Dibia, and Will Shulman. Produced by Ars Technica with MongoDB.
  3. Brinkmann, Levin, Fabian Baumann, Jean-François Bonnefon, Maxime Derex, Thomas F. Müller, Anne-Marie Nussberger, Agnieszka Czaplicka et al. "Machine culture." Nature Human Behaviour 7, no. 11 (2023): 1855-1868.
  4. Can an AI make a data-driven, visual story? By Russell Samora and Michelle Pera-McGhee
  5. Microsoft New Future of Work Report 2023
  6. Wen, Jiaxin, Ruiqi Zhong, Pei Ke, Zhihong Shao, Hongning Wang, and Minlie Huang. "Learning Task Decomposition to Assist Humans in Competitive Programming." arXiv preprint arXiv:2406.04604 (2024).

