Measuring the Business Value of GitHub Copilot
The most common benefit developers see from using GitHub Copilot is time savings.
It's easy for developers to quantify the time saved from using GitHub Copilot by reflecting on completed work and making a counterfactual estimate of the percentage impact: Copilot made the task 10% faster, or the task would have taken twice as long without Copilot (50% faster).
Developers use Copilot in many different ways and for different subtasks, which leads to different levels of impact.
Example counterfactual estimates from developers:
We can cross-check and validate these developer-reported estimates in several ways.
1) The first way is to compare against results from controlled studies. The graphic below shows what researchers find when they compare how long AI-assisted and non-AI-assisted users take to complete the same tasks:
If we assume developers spend about 30% of their time coding, we can convert their counterfactual percentage estimates into hours saved per week. The graphic below is one of many examples of research finding that developers spend roughly 30% of their time on code.
Simple math shows that 20-50% time savings, applied to the coding portion of a 40-hour week (30%, or 12 hours), works out to 2.4-6 hours per week. See other possible combinations and the resulting hours saved per week in the graphic below:
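The arithmetic above can be sketched in a few lines. This is purely an illustration of the calculation described in the text, using the 40-hour week and 30% coding-time assumptions:

```python
# Hours saved per week for a range of time-savings estimates,
# assuming a 40-hour week with 30% of it spent coding
# (both figures are the assumptions stated in the text above).
HOURS_PER_WEEK = 40
CODING_SHARE = 0.30

coding_hours = HOURS_PER_WEEK * CODING_SHARE  # 12 hours of coding per week

for savings_pct in (10, 20, 30, 40, 50):
    hours_saved = coding_hours * savings_pct / 100
    print(f"{savings_pct}% time savings -> {hours_saved:.1f} hrs/week")
```

At 20% savings this gives 2.4 hrs/week, and at 50% it gives 6.0 hrs/week, matching the range above; varying the coding-share assumption produces the other combinations the graphic shows.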
We can see that the numbers reported by developers, the findings of researchers, and the relationship between time spent coding and time saved all tell the same story: developers save 20-50% of dev-task time by using Copilot. This level of impact is consistent between controlled studies and time savings reported as part of day-to-day work, and depending on the assumptions it adds up to 2-10 hours saved per week.
2) The second way to validate the developer estimates is to compare the level of time savings with the type of use reported. When we rank and group use cases by level of impact, we can easily see that users reporting 10-20% time savings, for example, are using Copilot very differently from those reporting 40-50% time savings.
Developers reporting similar impact and similar use cases are essentially confirming each other's reports.
3) The last way to validate the self-reported time savings is to compare them with throughput measurements such as PR rate or story-point velocity. Because so many factors affect throughput metrics, development orgs can only measure Copilot's effect on throughput with careful controls. Such organizations need highly disciplined and efficient SDLC processes as well as room for growth. In these rare situations, it has been possible to see time-savings estimates that are consistent with the measured throughput improvement.
For example, in one case developers estimated average time savings of 6 hours per week. Over the same period, the org's dashboard measured a story-point increase of 9% for one team and 14% for another. As a side note, the orgs that are able to link Copilot to throughput are typically very disciplined and efficient in their dev practices. They also tend to have sophisticated, mostly homegrown, mature platforms for capturing and normalizing activity data.
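The consistency check in that example can be made explicit. A sketch, using only the figures quoted above (6 hrs/week saved, a 40-hour week, and the two measured story-point gains): if every saved hour were reinvested in delivery, throughput could rise by at most 6/40 = 15%, so measured gains somewhat below that ceiling are what we'd expect.

```python
# Cross-check: are self-reported savings and measured story-point
# gains roughly consistent? Figures are from the example in the text.
reported_savings_hrs = 6   # average self-reported time saved per week
work_week_hrs = 40

# Upper bound on throughput gain if all saved time went to delivery.
ceiling = reported_savings_hrs / work_week_hrs  # 0.15, i.e. 15%

measured_gains = [0.09, 0.14]  # story-point increases from the dashboard

# Both gains sit below but near the ceiling, as expected when only
# part of the saved time flows back into delivery work.
for g in measured_gains:
    print(f"measured {g:.0%} vs ceiling {ceiling:.0%}: consistent={g <= ceiling}")
```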
Why can't we rely on Throughput measurements to capture Copilot Time Savings?
There are many, many reasons. First, many factors affect throughput, starting with the number of productive work hours in a given week: holidays, vacation, outages, code freezes, quarter ends, pending deadlines, etc.
Beyond these external factors, we also can't rely on throughput measures for internal reasons. The graphic below shows that the same level of Copilot time savings can be large and noticeable for an individual developer, but "gets diluted" when measured at the team level and is essentially invisible (a 1-3% impact) when viewed through an end-to-end measure like "Deployment Frequency".
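The dilution effect can be sketched as a chain of multiplications. The fractions below are illustrative assumptions (not measured values from the text), chosen to show how a 30% coding-time gain shrinks to the 1-3% range end to end:

```python
# Hypothetical dilution sketch: the same coding-time savings shrinks
# as the measurement widens from the individual to the full pipeline.
# All fractions are illustrative assumptions.
savings_on_coding = 0.30         # what the developer experiences while coding
coding_share_of_dev_week = 0.30  # coding as a fraction of a developer's week
dev_share_of_lead_time = 0.35    # hands-on dev work as a fraction of end-to-end lead time

dev_level = savings_on_coding                              # 30% while coding
team_level = savings_on_coding * coding_share_of_dev_week  # ~9% of the dev week
pipeline_level = team_level * dev_share_of_lead_time       # ~3% end to end

print(f"developer: {dev_level:.0%}, team: {team_level:.0%}, end-to-end: {pipeline_level:.1%}")
```

With these assumptions, a very noticeable 30% gain at the keyboard surfaces as roughly a 3% change in an end-to-end measure like deployment frequency, well inside normal week-to-week noise.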
The above graphic makes it clear that the value of AI to software development shows up at the developer level, not at the process or organization level. So why bother adopting GitHub Copilot at all?
Simple: it's the best business case you will ever see. With over 4000% ROI, no business can afford to let this opportunity pass them by... despite the insignificant throughput improvement.
In addition to tracking developer time savings, other dimensions can be included to build a "causal model" of improvement. The causal model connects the factors that lead to developers creating downstream impacts with Copilot.
It recognizes that devs must first adopt the tool, reach consistent activity, achieve consistent time savings, and deliberately allocate those savings toward downstream outcomes. This translates into the following ROI roadmap.
When adoption, activity, and time savings all reach their targets, it is reasonable to expect that downstream impact is happening at the target level as well.
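The roadmap's gating logic, that downstream impact is only credible once each upstream stage hits its target, can be sketched in a few lines. Stage names and target values here are illustrative assumptions:

```python
# Minimal sketch of the roadmap check described above: every upstream
# stage must hit its target before downstream impact is assumed.
# Targets and observed values are hypothetical.
targets = {"adoption": 0.80, "weekly_activity": 0.70, "time_savings_hrs": 4.0}
observed = {"adoption": 0.85, "weekly_activity": 0.72, "time_savings_hrs": 4.5}

stages_on_track = all(observed[k] >= v for k, v in targets.items())
print("downstream impact plausible:", stages_on_track)
```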
In future posts, we'll explore the roadmap in more detail and what learnings are needed by leadership and by developers to maximize impact.