Code reviews and software quality, empirical research results.
OpenAIRE Workshop "Open Peer Review" (June 2016, Göttingen/Germany) photo by a deppe (https://www.flickr.com/photos/142136378@N03/)


In previous articles, I wrote about the effects of unit testing and engineering organization structure on software quality. Unit testing is a tool that an individual software engineer can apply, while changing the shape of the engineering organization is a company-wide initiative. In this article, I will address the effects of code reviews on software quality; code review is a practice that can be leveraged by a small engineering team. The contemporary code review practice has been adopted by many organizations and open source projects as a quality assurance measure and typically involves a software tool such as Review Board. In my experience, the process involves uploading a changeset for review and asking teammates to examine the code and comment on the implementation. Since this is a broadly adopted practice, it is worth understanding the effects of code review on software quality before deploying it in your team. Below I have summarized the empirical research on peer code reviews from large software engineering organizations.

Code review, empirical research summary

In [1] Bacchelli and Bird report on the use of modern code review at Microsoft. In this paper, the researchers survey managers and engineers working at Microsoft to understand the motivations for conducting code reviews and to compare those motivations with the actual outcomes of the reviews.

Bacchelli and Bird conclude that finding defects is the top motivating factor for doing code reviews; however, they find that the largest actual contribution of code reviews is code improvements. They suggest this may be because finding defects in a code review requires a detailed understanding of the system under consideration.

The researchers conclude with some recommendations and implications. Using code reviews for quality assurance is not good enough since reviews don’t identify defects and rarely identify subtle defects. Understanding is key to a successful code review; context is important. Code reviews are useful for tasks other than finding defects, for example knowledge sharing or finding alternative solutions. Communication is an important aspect of a successful code review and needs to be supported by the tooling.

In [2] Bosu, Greiler, and Bird report on the usefulness of code review comments. In this paper, the researchers interview engineers working at Microsoft to understand which review comments they find useful. Engineers at Microsoft find comments that relate to the correctness of the implementation useful, and those that relate to structure and alternative approaches somewhat useful.

The researchers then went on to create a machine-learned classifier to identify useful comments and applied it to five projects within Microsoft's code base. A total of 1,496,340 review comments from 190,050 review requests were categorized. Across the five projects analyzed, 64% to 68% of comments were classified as useful. This data set was further analyzed for insight into which attributes of reviewers and changesets lead to a higher density of useful comments. The researchers observed that reviewers experienced with the code under review give more useful comments than reviewers looking at the code for the first time. Based on this information, they recommend picking reviewers carefully, while cautioning that new reviewers still need to be included so they can gain experience with the code base.
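The study's actual classifier is not public, but the idea behind "useful comment density" can be sketched with a crude keyword heuristic. This is purely illustrative, not the Microsoft model: the keyword set below is hypothetical, standing in for a trained classifier that flags comments touching correctness and structure.

```python
# Illustrative sketch only -- NOT the classifier from the Microsoft
# study. A hand-picked keyword set stands in for the trained model.
USEFUL_HINTS = {"null", "race", "leak", "validate", "refactor",
                "rename", "boundary", "error", "test"}

def looks_useful(comment: str) -> bool:
    """Flag comments that touch correctness or structure -- the kinds
    of comments change authors rated useful in the study."""
    words = set(comment.lower().split())
    return bool(words & USEFUL_HINTS)

def useful_density(comments: list[str]) -> float:
    """Fraction of comments flagged useful, mirroring the
    'useful comment density' metric discussed above."""
    if not comments:
        return 0.0
    return sum(looks_useful(c) for c in comments) / len(comments)
```

A team could run a metric like this over its review history to spot dips over time, as the researchers suggest, though a real classifier would need labeled training data rather than a keyword list.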

The researchers also observed that smaller changesets get a higher density of useful comments and that source code gets a higher density of useful comments than build and configuration files. Based on this information they recommend breaking up changesets into small incremental changesets when possible and paying particular attention to build and configuration files under review. Finally, the researchers observed that the useful comment density stabilizes over time for a code base and dips in usefulness can be analyzed by teams to understand issues with the review process.

In [3] Cohen reports on an extensive case study of a code review process applied at Cisco Systems. This paper reports many results on the effectiveness of code review as it applies to finding defects. The study defines a defect as “...When a reviewer or consensus of reviewers determines that code must be changed before it is acceptable, it is a ‘defect’...” That is to say, these are not defects observed in the field or during the post-development testing phase. Cohen provides the following guidance for an effective code review: keep the changeset under 200 lines of code, and never above 400, or the reviewers start to miss defects; inspect at a rate of less than 300 LOC per hour for the best defect detection; and spend around 60 minutes on the review in total, never exceeding 90 minutes.
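Those three thresholds are concrete enough to automate. The helper below is a hypothetical sketch of my own, not part of any cited tool; only the numeric limits (200/400 LOC, 300 LOC/hour, 90 minutes) come from the Cisco study.

```python
# Sanity-check a review against the heuristics reported in Cohen's
# Cisco study [3]. The thresholds are from the study; this function
# is an illustrative helper, not part of any cited tool.
def review_warnings(loc: int, minutes: float) -> list[str]:
    warnings = []
    if loc > 400:
        warnings.append("changeset exceeds 400 LOC; reviewers will start missing defects")
    elif loc > 200:
        warnings.append("changeset above 200 LOC; consider splitting it")
    hours = minutes / 60
    if hours > 0 and loc / hours > 300:
        warnings.append("inspection rate above 300 LOC/hour; slow down")
    if minutes > 90:
        warnings.append("session longer than 90 minutes; stop and resume later")
    return warnings
```

For example, a 500-line changeset reviewed in two hours trips both the size limit and the session-length limit, while 150 lines reviewed in 45 minutes passes all three checks.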

In practice

Facebook leverages peer code reviews to build the main Facebook.com web application. In [4] Feitelson and team share that Facebook developed its own code review system, which integrates with version control to give reviewers historical context, supports discussing suggested code changes in the tool, and ties into bug and task tracking.

Google also leverages peer code reviews and has integrated Tricorder, a tool for large-scale program analysis, into its code review process, as reported in [5] by Sadowski and team. Tricorder provides actionable fixes that can be applied right in the code review tool, and reviewers can ask the person submitting the change to address the issues Tricorder raises.
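The key idea in an "actionable" finding is that it carries a concrete replacement the author can apply with one click. The record type below is a rough sketch of that concept; it is not Tricorder's actual data model or API, and the field names are hypothetical.

```python
# Rough sketch of an "actionable fix" record, in the spirit of
# Tricorder's one-click fixes. Not Tricorder's real API; the field
# names are hypothetical.
from dataclasses import dataclass

@dataclass
class SuggestedFix:
    path: str   # file the finding applies to
    line: int   # 1-based line number
    old: str    # exact current line content
    new: str    # proposed replacement

def apply_fix(source_lines: list[str], fix: SuggestedFix) -> list[str]:
    """Apply a one-line fix, refusing if the file has changed since
    the analyzer produced the suggestion."""
    if source_lines[fix.line - 1] != fix.old:
        raise ValueError("source no longer matches the suggested fix")
    patched = list(source_lines)
    patched[fix.line - 1] = fix.new
    return patched
```

Carrying the expected `old` text makes the fix safe to apply asynchronously: if the author edits the file between analysis and click, the stale suggestion is rejected instead of silently corrupting the code.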

Next Steps

It appears that peer code reviews are better suited for knowledge sharing and code improvements than for eliminating defects. The effectiveness of the process seems to depend on the reviewer's experience with the code, the size of the changeset, and the rate at which the review is conducted. I have found code reviews to be an excellent way to learn a new code base and to share knowledge with new teammates, which, according to the empirical results, is one of the motivations for adopting this process within an organization. When I review code, I prefer to apply the changeset to my workspace and use an IDE to examine the changes; this setup gives me access to the larger context necessary to understand the change. I am curious to learn how other engineers conduct code reviews and what they get out of the process. Does your team perform peer code reviews? What types of issues do you tend to find when you review code? Let me know in the comments.

References

  1. Bacchelli, Alberto, and Christian Bird. "Expectations, outcomes, and challenges of modern code review." Proceedings of the 2013 International Conference on Software Engineering. IEEE Press, 2013.
  2. Bosu, Amiangshu, Michaela Greiler, and Christian Bird. "Characteristics of useful code reviews: an empirical study at Microsoft." Proceedings of the 12th Working Conference on Mining Software Repositories. IEEE Press, 2015.
  3. Cohen, Jason, et al. "Best Kept Secrets of Peer Code Review." Smart Bear Software, 2006.
  4. Feitelson, Dror G., Eitan Frachtenberg, and Kent L. Beck. "Development and deployment at Facebook." IEEE Internet Computing 17.4 (2013): 8-17.
  5. Sadowski, Caitlin, et al. "Tricorder: Building a program analysis ecosystem." Proceedings of the 37th International Conference on Software Engineering. IEEE Press, 2015.
Mahmud A.

Creative Developer, Entrepreneur & Opportunist

1y

Great article! Thank you for sharing your insights on the effects of code reviews on software quality. It's interesting to learn that while finding defects is the top motivating factor for conducting code reviews, the highest contribution from reviews is code improvements. Your summary of the empirical research on peer code reviews from large software engineering organizations is very informative and helpful for understanding the key factors that make code reviews effective. It's good to know that communication is an important aspect of a successful code review and that careful selection of reviewers is recommended. The tips and recommendations provided by the researchers based on their analysis of review comments and review processes are very useful for improving code review effectiveness. Thank you for sharing your personal experience with conducting code reviews as well. I agree that code reviews can be an excellent way to learn a new code base and to share knowledge with new teammates. Your insights and recommendations will be very helpful for organizations and teams looking to implement a successful code review process.

Arun Varma

Account Executive at Otter.ai

5y

Tammy Denney, I came across this article when looking at your profile and found it pretty interesting. Nikolai says, "Using code reviews for quality assurance is not good enough since reviews don’t identify defects and rarely identify subtle defects." I agree. What are your thoughts on this?

João Cavaleiro

IT Specialist & Project Manager | CSPO

7y

Very useful article. Thank you Nikolai.

Carlos Colón-Maldonado

Senior Software Engineer at OpalSoft | Army Veteran

7y

I might add that, while code reviews do not guarantee bug fixes, their role in defect finding remains. The development of software engineers is an integral part of making sure defects aren't introduced through inexperience and bad habits. I've found that pair programming helps moderators, a.k.a. gatekeepers, ensure builds remain intact in a fast-paced development team. While peer reviews aren't the same as code reviews, they are still an essential step in the development process. There's a reason accredited institutions still teach them.

Dwight Spencer, Ph.D.

RETIRED. Former remote Angular / Ionic Hybrid / Web-Native Front End Specialist. Now playing tennis . . . join me!

7y

THIS is the code review that I have evolved to over many years that makes the most sense: (1) Developer schedules a code review with me. (2) Developer organizes an informal "show me" demo/script beforehand. (3) Developer presents his demo to me. It should cover FOUR issues: correctness in processing good input, appropriately responsive to bad input, algorithmic performance sufficient for the client, and maintainability (readable, structured, best practices and safe coding patterns, standards adherence). The goal is for the developer to SELL ME on his code. Goal is to get the review session done in 15 minutes. If it's not a "pass", I will make "suggestions" for improvements. Then I WALK AWAY. If I can't trust the guy, he shouldn't be on the team.
