Measuring the Business Value of GitHub Copilot
Copilot Impact Modeling


The most common benefit developers see from using GitHub Copilot is time savings.

It's easy for developers to quantify the time saved from using GitHub Copilot by reflecting on completed work and making a counterfactual estimate of the percentage impact: Copilot made the task 10% faster, or the task would have taken twice as long without Copilot (a 50% time savings).
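As a quick illustration of that counterfactual arithmetic, here is a minimal sketch; the helper name and the sample numbers are just for illustration:

```python
def time_savings_pct(actual_hours, estimated_hours_without_copilot):
    """Counterfactual time savings: share of the 'without Copilot' time that was avoided."""
    return 100 * (1 - actual_hours / estimated_hours_without_copilot)

# A task took 2 hours with Copilot but would have taken 4 hours without it:
print(time_savings_pct(2, 4))    # 50.0 -> "twice as long without Copilot" = 50% faster
# A task took 4.5 hours with Copilot, estimated at 5 hours without:
print(time_savings_pct(4.5, 5))  # 10.0 -> "Copilot made the task 10% faster"
```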

Developers use Copilot in many different ways and for different subtasks, which leads to different levels of impact.

Example counterfactual estimates from developers:

  • 11-20% Time Savings: “Asked Copilot Chat how to use a specific command and got my answer—no need to search through CLI documentation.”
  • 21-30% Time Savings: “For about half of this PR, I completed the first quarter manually, then told GitHub Copilot, ‘See what I did at lines M–N in this file? Do that for…’ It wrote the next quarter of the code for me.”
  • 31-40% Time Savings: “Copilot helped format large data sets for a test—something that would have taken much longer to do manually.”
  • More than 41% Time Savings: “Was able to tab out the entire process.”

We can cross-check, or validate, these developer-reported estimates in several ways.

1) The first way is to compare against results from controlled studies. The graphic below shows what researchers find when they compare how long it takes AI-assisted and non-AI-assisted users to complete the same tasks:

  • The GitHub 2022 study found that Copilot can reduce task duration by 55% on average.
  • The McKinsey study shows that time savings vary by task, but even complex tasks see up to 10% time savings.
  • The MIT study shows that certain subtasks (writing rough drafts) see more time savings than others (brainstorming). It also showed that some subtasks (editing) actually take more time.

If we assume developers spend about 30% of their time coding, then we can estimate hours saved per week from their counterfactual estimates of time savings. The graphic below is one of many examples of research finding that developers spend about 30% of their time on code.



Simple math shows that a 20-50% time savings, applied to the 30% of a 40-hour week spent coding (12 hours), works out to 2.4-6 hours per week. See other possible combinations and the resulting hours saved per week in the lookup table below:


Copilot Time Savings Lookup Table
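Since the lookup table graphic isn't reproduced here, here is a minimal sketch of the same arithmetic, assuming a 40-hour week; the coding-time shares and savings levels shown are illustrative:

```python
WEEKLY_HOURS = 40  # assumed work week

def hours_saved_per_week(coding_share, time_savings, weekly_hours=WEEKLY_HOURS):
    """Hours saved per week = weekly hours x share of time spent coding x savings on coding time."""
    return weekly_hours * coding_share * time_savings

# Rebuild the lookup table for a few coding-time shares and savings levels
print("coding share | 20% saved | 30% saved | 40% saved | 50% saved")
for coding_share in (0.30, 0.40, 0.50):
    row = [hours_saved_per_week(coding_share, s) for s in (0.20, 0.30, 0.40, 0.50)]
    print(f"{coding_share:>12.0%} | " + " | ".join(f"{h:9.1f}" for h in row))

# At 30% coding time, 20-50% savings works out to 2.4-6.0 hours per week,
# matching the "simple math" above; at 50% coding time it tops out at 10 hours.
```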


We can see that the numbers reported by developers, the findings of researchers, and the relationship between time spent coding and time saved all tend to tell the same story: developers are able to save 20-50% of dev task time by using Copilot. This impact level is consistent between controlled studies and time savings reported as part of day-to-day work, and it adds up to roughly 2-10 hours saved per week.

2) The second way to validate the developer estimates is to compare the level of time savings estimated with the type of use reported. When we stack-rank and group use cases/scenarios by level of impact, we can easily see that users getting 10-20% time savings, for example, are using Copilot very differently than those getting 40-50% time savings.

Developers reporting similar impact and similar use cases are essentially confirming each other's reports.

3) The last way to validate the self-reported time savings is to compare them with throughput measurements like PR rate or story-point velocity. Because so many factors affect throughput metrics, development orgs can only measure Copilot's effect on throughput with careful controls for those factors. Such organizations need highly disciplined and efficient SDLC processes as well as room for growth. In those rare situations, the estimates of time savings have been consistent with the throughput improvement measured.

For example, in one case developers estimated an average time savings of 6 hours per week. During this period, the org's dashboards measured an increase in story points of 9% in one case and 14% in another. As a side note, the orgs that are able to link Copilot to throughput are typically very disciplined and efficient in their dev practices. They also tend to have sophisticated, mostly homegrown, mature platforms for capturing and normalizing activity data.
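As a rough consistency check (the 40-hour week is an assumption, not a measurement from that org), the reported hours saved and the measured throughput increases are at least in the same ballpark:

```python
# Rough consistency check: is a 9-14% story-point increase plausible given
# a reported 6 hours/week of time savings? (All figures here are illustrative.)
weekly_hours = 40                  # assumed work week
hours_saved = 6                    # developer-reported average
measured_increases = (0.09, 0.14)  # measured story-point increases

# Ceiling: prior output now takes (40 - 6) hours, so reinvesting every saved hour
# could raise throughput by at most this factor:
ceiling = hours_saved / (weekly_hours - hours_saved)
print(f"Theoretical ceiling on throughput gain: {ceiling:.0%}")  # ~18%

for inc in measured_increases:
    status = "consistent with" if inc <= ceiling else "above"
    print(f"Measured {inc:.0%} is {status} the ceiling")
```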

Why can't we rely on throughput measurements to capture Copilot time savings?

There are many reasons. First, many external factors affect throughput, such as the number of productive work hours per week, which varies with holidays, vacations, outages, code freezes, quarter-end crunches, pending deadlines, and so on.

Beyond these external factors, we also can't rely on throughput measures for internal reasons. The graphic below shows that the same level of Copilot time savings can be large and noticeable for a developer, but "gets diluted" when measured at the team level and is essentially invisible (a 1-3% level of impact) when viewed through an end-to-end measure like "Deployment Frequency".
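Since the dilution graphic isn't reproduced here, here is a minimal sketch of how it plays out; the specific shares are illustrative assumptions, not measurements:

```python
# Illustrative dilution of a 30% coding-time savings as the measurement lens widens.
coding_savings = 0.30            # savings a developer sees on coding tasks
coding_share_of_dev_time = 0.30  # coding vs. meetings, reviews, support, ...
dev_share_of_lead_time = 0.30    # dev work vs. planning, waiting, testing, release, ...

dev_level = coding_savings                              # 30% - obvious to the developer
team_level = coding_savings * coding_share_of_dev_time  # ~9% - visible but muted
e2e_level = team_level * dev_share_of_lead_time         # ~2.7% - lost in deployment-frequency noise

for label, value in [("Developer (coding tasks)", dev_level),
                     ("Team (all dev activities)", team_level),
                     ("End-to-end (Deployment Frequency)", e2e_level)]:
    print(f"{label:<35} {value:.1%}")
```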


The dilution illustrated above makes it clear that the value of AI to software development shows up at the developer level, not at the process or organization level. So why even bother adopting GitHub Copilot?

Simple: it's the best business case you will ever see. With over 4,000% ROI, no business can afford to let this opportunity pass it by, despite the seemingly insignificant throughput improvement.


The Business Case for GitHub Copilot for 1000 Devs coding 40% of the time
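Since the business-case graphic isn't reproduced here, here is a minimal sketch of the kind of arithmetic behind it. The time-savings level, loaded cost, license price, and working weeks are illustrative assumptions, not the figures behind the original graphic:

```python
# Illustrative ROI arithmetic for 1000 developers coding 40% of the time.
devs = 1000
weekly_hours = 40
coding_share = 0.40
time_savings = 0.30         # assumed average savings on coding time
working_weeks = 48          # assumed working weeks per year
loaded_cost_per_hour = 75   # assumed fully loaded developer cost, USD
license_per_dev_month = 19  # assumed Copilot license price, USD

hours_saved_per_dev_year = weekly_hours * coding_share * time_savings * working_weeks
value = devs * hours_saved_per_dev_year * loaded_cost_per_hour
cost = devs * license_per_dev_month * 12
roi = (value - cost) / cost

print(f"Hours saved per dev per year: {hours_saved_per_dev_year:.0f}")  # ~230
print(f"Value of time saved:          ${value:,.0f}")                   # ~$17.3M
print(f"License cost:                 ${cost:,.0f}")                    # $228,000
print(f"ROI:                          {roi:.0%}")                       # well over 4000%
```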

In addition to tracking developer time savings, other dimensions can be included to build a "causal model" of improvement. The causal model connects the factors that must be in place for developers to create downstream impacts with Copilot.

It recognizes that devs must first adopt the tool, achieve consistent activity, achieve consistent time savings, and deliberately allocate those savings toward downstream outcomes. This translates into the following ROI roadmap.


The ROI journey for 90% of 1000 Devs
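Here is a minimal sketch of how those stages chain together into expected downstream impact; apart from the 90% adoption figure in the caption, the stage values are illustrative assumptions:

```python
# Illustrative "causal model": downstream impact only materializes when each stage
# of the roadmap reaches its target, and the stages multiply together.
devs = 1000
adoption_rate = 0.90        # "90% of 1000 Devs" reach adoption
active_rate = 0.85          # assumed share of adopters with consistent weekly activity
hours_saved_per_week = 4.8  # assumed consistent time savings per active dev
allocation_rate = 0.75      # assumed share of saved time deliberately reinvested downstream

reinvested_hours = devs * adoption_rate * active_rate * hours_saved_per_week * allocation_rate
print(f"Hours reinvested into downstream outcomes per week: {reinvested_hours:,.0f}")
# If any stage stalls near zero, downstream impact collapses -- the chain is multiplicative.
```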

When adoption, activity, and time savings all reach their targets, it is reasonable to expect that downstream impact is happening at the target level as well.

In future posts, we'll explore the roadmap in more detail, along with what leadership and developers need to learn to maximize impact.

Dimitar Bakardzhiev

Efficient Product Development

2 weeks ago

I am confident AI code assistants are the future, but this makes no sense: "With over 4000% ROI, no business can afford to let this opportunity pass them by.... despite the insignificant throughput improvement." Why? Because businesses sell not the time saved at Dev level but software produced at Company level. Companies cannot calculate ROI on Dev level because there is no "Return" there. To calculate "Return" on "Investment", where Copilot is the investment you need the Throughput. When developers learn how to use AI code assistants then Throughput will also increase.
