Practical Application of Test Intelligence
“A significant part of my consulting practice involves conducting autopsies on large, failed IT projects. Appallingly, most of the disasters could have been anticipated from the data available during the project.” Payson Hall (Consulting project manager)
Intro
In July 2019, Vladimir and I were discussing common issues in testing. At some point Vladimir stopped me and pointed out that BI teams had been solving the same problems for a different side of the business, and asked whether we should try to apply the Business Intelligence approach to testing as well.
The next day we started working on this idea and even submitted an application to the Testing Days Conference. We were lucky enough to be approved by the committee and to share the idea with Brisbane’s QA community. This article is a summary of my talk on 15 November; Vladimir’s part will be posted in another article (URL will be confirmed later).
In this article, I’m going to share common quality and testing issues that various professionals have faced on their projects, and describe possible implementations of Test Intelligence in an organisation.
And before we start, I’d like to thank everyone who responded to the survey we conducted on LinkedIn. Your responses supported the idea of Test Intelligence and made this article possible.
So, let’s start with common issues.
Testers
The main problem for testers was a lack of understanding of what’s happening in the background of an application; as a result, some critical bugs were promoted to higher environments. Most of these bugs could only be caught if testers had access to the log files or operational data. However, such access is not always possible because of file complexity, security policies and configuration issues.
Test Analyst
For test analysts, most responses related to issues with test case design and test data preparation. Test cases may be invalid due to a misunderstanding of the requirements, and the same is true for test data. Gaps in testing may also show up as missed critical test targets.
Test Lead/Manager
Test managers need to operate on complete and timely data; without a proper understanding of project progress and test design/execution status, planning and prioritisation go wrong. This may lead to unexpected costs that drain the project/testing budget and make it impossible to redo the testing.
Developers
For developers, the common issue was a lack of visibility into whether the test cases cover the functions they coded. Also, most of the time developers don’t get prompt feedback on how a recent commit affected the system.
Managers & Executives
And finally, executives didn’t have a clear picture of how good and stable their product actually was.
Solution?
So, what could solve all these issues? Probably a system or toolset which gathers and prepares data from various sources and provides easy access to the consolidated result.
Something which can identify possible gaps and risk areas, show reports and meaningful dashboards, and alert us if something goes wrong.
Test Intelligence
Can Test Intelligence be such a solution? Definitely, it can.
The goal of Test Intelligence is to improve software quality by giving easy access to consolidated operational and test execution data, supporting advanced analytics and alerting project members.
In short, Test Intelligence should help us make better decisions, optimise our testing process and improve our team’s productivity.
Let’s discuss this in more detail.
Toolset for Test Intelligence
First, let’s identify the tools we need to build Test Intelligence:
· A source system or logging layer, which can log locally and/or push the data to a server. This should be easy to use, so developers won’t struggle to write a proper logging mechanism.
· A data processing module, self-hosted or in the cloud, responsible for gathering, consolidating, processing and calculating as well as storing the data.
· Dashboards to visualise the computed data in real time.
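To make the logging-layer idea concrete, here is a minimal sketch of a handler that both logs locally (via the standard `logging` machinery) and collects structured records for a processing module. The `CollectorHandler` and the `"checkout"` logger name are illustrative assumptions; in a real setup `emit` would push each record over the network.

```python
import json
import logging

class CollectorHandler(logging.Handler):
    """Hypothetical 'push' handler: a real implementation would POST each
    record to the data processing module; this sketch just collects them."""
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        # Convert the log record into a structured dict the module can store.
        self.records.append({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

logger = logging.getLogger("checkout")  # illustrative logger name
logger.setLevel(logging.INFO)
collector = CollectorHandler()
logger.addHandler(collector)

logger.info("order created")
logger.warning("retrying payment gateway")

# Each record is now structured data ready for consolidation.
print(json.dumps(collector.records[1]))
```

Because the handler plugs into the standard framework, developers keep writing ordinary `logger.info(...)` calls while the data quietly flows to the intelligence layer.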
Testing through Logging
Let’s start with the Testing through Logging (TTL) method. As mentioned before, there are many hidden “gems” in the logs and processing/operational data. Unfortunately, most of the time testers don’t have access to this low-level information. And even if they do, logs may be difficult to understand and search through because of their size and formatting. Look at the examples below: probably not something you’d like to read.
Testing through Logging Benefits
By implementing TTL, you give your testers simple access to log and operational data, so they gain a better understanding of the background processes and can identify errors you cannot see by testing the black box.
Testing through Logging - Dashboard
Here’s an example of what log visualisation may look like. Let’s assume the example below is real-time data handled by our log management system. The dashboard shows statistics for our test execution steps: the total log entries recorded, warnings and their details. Most importantly, we see all the exceptions and know which steps caused them.
Simple but extremely useful.
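The aggregation behind such a dashboard can be sketched in a few lines. The record fields (`step`, `level`, `exception`) and the sample data below are assumptions for illustration only:

```python
from collections import Counter

# Hypothetical structured log records, as a dashboard data source.
records = [
    {"step": "login", "level": "INFO"},
    {"step": "login", "level": "WARNING"},
    {"step": "create_client", "level": "INFO"},
    {"step": "create_client", "level": "ERROR",
     "exception": "NullReferenceException"},
    {"step": "update_client", "level": "INFO"},
]

# The three panels of the dashboard: totals, warnings, exceptions per step.
total = len(records)
warnings = sum(1 for r in records if r["level"] == "WARNING")
exceptions = [(r["step"], r["exception"]) for r in records if "exception" in r]
per_step = Counter(r["step"] for r in records)

print(total, warnings, exceptions, dict(per_step))
```

A real implementation would run this as a continuous query in the log management system rather than over an in-memory list, but the shape of the computation is the same.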
Text logs vs Structured logs
Before we move to the next method, I’d like to highlight two approaches of logging: Text logging and Structured logging.
As you may see, text logs are easier for a human to read because they’re just plain text, but structured logs are more efficient for a computer to handle. The red box in the picture shows that a few lines from the text log may be converted into a sizeable JSON object in the structured log file.
Hence, when choosing a logging framework, you should probably consider one that can produce both types of logs.
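The contrast between the two approaches can be shown side by side. The event fields below (`user`, `amount`, etc.) are made up for illustration:

```python
import json

# Text logging: one human-readable line, easy for people, hard for machines.
text_line = "2019-11-15 10:32:01 INFO payments - charge ok user=42 amount=19.99"

# Structured logging: the same event as machine-parseable JSON.
structured = json.dumps({
    "timestamp": "2019-11-15T10:32:01",
    "level": "INFO",
    "logger": "payments",
    "event": "charge ok",
    "user": 42,
    "amount": 19.99,
})

# A computer can filter structured logs by field instead of regex-parsing text.
event = json.loads(structured)
print(event["user"], event["amount"])
```

To query the text line for the same two values, you would need a regular expression that breaks the moment the message format changes; the structured record survives such changes untouched.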
Performance Engineering
The next method I’d like to talk about is Performance Engineering (PE).
PE is not a complete replacement for Performance Testing, but it will support your application when you cannot run a proper performance test.
Performance testing is an expensive task which also requires highly skilled professionals. Moreover, you must dedicate a separate environment or ask the team to postpone their work while the environments are unavailable for the upcoming testing.
That’s why organisations don’t run performance testing daily. However, ignored performance gaps may lead to random errors and/or user dissatisfaction.
But what if we had a tool which could show us performance improvement or degradation after each commit? Performance Engineering, as part of the Test Intelligence toolkit, is such a tool.
Some benefits of PE:
· A profitable investment
· Doesn’t require PT professionals
· Doesn’t need a particular environment
· Runs in the background.
How to build Performance Engineering
Let’s start by saying that implementing PE is not rocket science; it’s actually quite easy to do. That’s why I put a relaxing stickman doodle below.
All we need to do is set up a proper logging framework with a defined structure and configure the visualisation tool. Here’s also a code snippet showing an implementation example.
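As a minimal sketch of the idea, the snippet below times each instrumented function and writes the duration into a defined log structure that dashboards can read. The `perf_log` sink, the schema, and the `get_client_details` function are illustrative assumptions, not the talk’s original snippet:

```python
import functools
import time

perf_log = []  # stand-in for the log sink the dashboards would read from

def timed(func):
    """Record each call's duration in a defined structure (assumed schema)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        perf_log.append({
            "function": func.__name__,
            "duration_ms": round((time.perf_counter() - start) * 1000, 2),
        })
        return result
    return wrapper

@timed
def get_client_details(client_id):
    # Placeholder for real work being measured.
    return {"id": client_id}

get_client_details(7)
print(perf_log[0])
```

Because every build exercises the same instrumented functions, plotting `duration_ms` per function over builds gives exactly the improvement/degradation view described below, with no dedicated performance environment.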
Performance Engineering - Dashboards
The gauges represent the latest build stats, and the bar graph at the bottom shows historical performance data. In this case, we can see that the Client Form functionality (green) has improved since the latest build, while Get Client Details and Update Client (red and orange) show performance degradation.
We define all the performance metrics and set alerts on performance anomaly events, so our developers are aware of an issue and can fix it before it is promoted to higher environments.
Defining the Confidence Level
How can we be confident that our new build is ready to be promoted to production? And how often in your practice have you found issues in production which were easily missed during the pre-deployment procedures?
In other words, how can we be sure that:
· Our freshly baked candidate won’t crash after deployment to production
· Its performance fits the requirements
· It doesn’t have open major/critical bugs
· All development tasks were completed
· All the test suites were executed
· No exceptions are happening in the background
· There are no confidential or private data leaks
· Static analysis tools don’t report any severe errors in the code
· There have been no major releases from vendors/targeted platforms.
Probably by gathering information from the required sources and building a meaningful dashboard with the consolidated data necessary to make the correct decision: “to go live or not to go live”.
This useful data may come from Project and Test Management systems, Config Management, CI/CD tools, Logs, etc.
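A simple way to turn those consolidated metrics into a single confidence number is a checklist score. The metric names, thresholds and the 90% cut-off below are all illustrative assumptions:

```python
# Hypothetical metrics consolidated from test management, CI/CD and logs.
metrics = {
    "test_pass_rate": 0.99,        # from the test management system
    "open_major_bugs": 0,          # from the bug tracker
    "config_checklist_done": 1.0,  # from config management
    "background_exceptions": 0,    # from the log platform
}

# Each check is a go-live criterion; thresholds are illustrative.
checks = [
    metrics["test_pass_rate"] >= 0.95,
    metrics["open_major_bugs"] == 0,
    metrics["config_checklist_done"] == 1.0,
    metrics["background_exceptions"] == 0,
]

# Confidence level: share of satisfied criteria, on a 0-100 scale.
confidence = 100 * sum(checks) / len(checks)
go_live = confidence >= 90  # assumed decision threshold
print(confidence, go_live)
```

In practice you would likely weight the criteria differently (an open critical bug should veto the release outright), but even this flat checklist makes the go/no-go conversation evidence-based.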
Confidence Level – Dashboard
In this example, we got a pretty high confidence level: the dashboard shows that our test pass rate is excellent, performance is even better than expected, the configuration checklist is 100% complete, there are no open bugs, etc. Hence, we’re pretty confident to go live with this build.
We also believe that calculating the confidence level could be extended significantly by applying Machine Learning and Artificial Intelligence practices and tools, so the machine can advise us even further using historical data from the knowledge base.
Identifying high-risk areas
As a leader, you need complete and timely information on the health of your project and its development status. Identifying high-risk areas is another Test Intelligence method that gives you access to such information. With this evidence in hand, it’s easier to make appropriate decisions and reconsider priorities if required.
So, based on:
· The complexity of the story (story points, number of tasks)
· Number of Unit & Integration tests compared to the complexity
· Number of test cases compared to the complexity
· Previous similar deploys/projects (knowledge base)
· Number of raised bugs
· The complexity of the bugs
· The team’s anonymous quality rank (number of defects discovered, static analysis error reports after commits, broken builds, etc.)
· Upcoming major releases from vendors/target platforms.
These inputs let us identify the functions that may cause trouble and present them on clear, informative dashboards.
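A toy version of such a risk rank can be computed from a few of the inputs above. The weights and field names are illustrative assumptions, chosen here so the sample story lands near the 28-point example discussed below:

```python
def risk_score(story):
    """Toy weighted risk model on a 0-100 scale (weights are assumptions)."""
    # Coverage gap: tests expected (assume 2 per story point) minus tests written.
    coverage_gap = max(0, story["story_points"] * 2 - story["tests"])
    score = (coverage_gap * 4            # missing tests weigh heaviest
             + story["open_bugs"] * 2    # each open bug adds risk
             + story["broken_builds"] * 5)
    return min(score, 100)  # cap at 100, matching the dashboard scale

# Hypothetical story: 5 points, 6 tests, 6 open bugs, no broken builds.
story = {"story_points": 5, "tests": 6, "open_bugs": 6, "broken_builds": 0}
print(risk_score(story))  # → 28
```

A production model would also fold in the knowledge base of previous deploys and vendor release calendars, but even this linear score is enough to rank this sprint’s functions against each other.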
High-Risk Areas – Dashboards
Let’s assume the dashboard above represents six functions we’re developing this sprint, together with their calculated risk probability, where 0 means no risk and 100 means we will have trouble with this particular function if we don’t take action right now.
Let’s have a closer look at Insurance Suspension Rank.
There are still 16 days to complete this function, and according to our completion rate we’re doing great. However, because of low unit and integration test coverage as well as six raised bugs, the story got a score of 28 points, which positions it between a slight and a medium risk. No urgent action is required; however, it’s advisable to keep an eye on this function to ensure the risk probability rank is not growing.
Improving Defect Analytics
To gain a deeper understanding of the software development effectiveness, it is essential to examine the details of the defects you previously found.
Another area where we can successfully utilise Test Intelligence is Defect Analytics.
How can Test Intelligence help? TI provides simple access to execution and operational data from various systems, which simplifies the analysis process. For instance, if an issue is found during a test case execution, testers will see not only the underlying exception details but also the log records preceding the exception. Therefore, testers will be able to provide a solid bug report with a detailed description, reducing the time developers spend on bug investigation.
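The bug-report enrichment just described can be sketched as a small function that, given a stream of structured records, captures the first error together with the records that preceded it. The record schema and sample data are assumptions for illustration:

```python
# Hypothetical structured log buffer collected during a test case run.
log_buffer = [
    {"seq": 1, "level": "INFO", "message": "opening client form"},
    {"seq": 2, "level": "WARNING", "message": "slow response from /clients"},
    {"seq": 3, "level": "ERROR",
     "message": "NullReferenceException in SaveClient"},
]

def build_bug_context(records, n_before=5):
    """Return the first ERROR record plus up to n_before preceding records,
    ready to be attached to a bug report."""
    for i, rec in enumerate(records):
        if rec["level"] == "ERROR":
            return {"exception": rec,
                    "preceding": records[max(0, i - n_before):i]}
    return None  # no error found in this run

context = build_bug_context(log_buffer)
print(context["exception"]["message"])
```

Attaching `context` to the defect record means the developer opens the bug with the failing exception and its lead-up already in front of them, instead of reproducing the run from scratch.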
Test Intelligence Requirements
In my opinion, the four dashboards/methods described in this article are a good starting point for any organisation.
That said, analysing your team’s requirements, you may find more advanced methods, maybe even with ML and AI implementations. Regardless of the TI methods you choose, if you decide to implement Test Intelligence in your project/organisation, the first thing to put in place is Test Automation. With properly configured build pipelines, Test Automation will examine your freshly baked build and generate tons of useful execution data.
Of course, a proper logging mechanism should be implemented in your code. For this, the logging framework should be easy to configure and trivial to use.
I have no particular recommendation for a data visualisation tool; try to reuse one you already have in your organisation.
I hope you found this article useful. Thank you for reading. Let me know if you have questions, and good luck with your quality improvement process.
Quality to everyone!