Why Quality Software Is Impossible Without Proper Root Cause Analysis (RCA)

Originally published: November 19, 2019. Updated: December 29, 2022.

Any qualified medical doctor will confirm that treating the symptoms alone is only rarely enough for a patient to recover from a serious illness. That is why, before starting treatment, a competent doctor orders tests, takes the patient's medical history, and makes a diagnosis. The analogy is often applied to software quality assurance, because a qualified test engineer follows the very same process. To find the real reason behind a defect, QA conducts Root Cause Analysis (RCA), whether consciously or not.

In this article, we describe some of the factors and causes behind the “problem-malfunction” sequence and explain what can be done to prevent such incidents from recurring in the software under test.

RCA: from “Symptoms” to Diagnosis

Root Cause Analysis, in essence, helps to discover where a problem originated and how it grew into the symptoms the test engineer observes. The main goal of RCA is to pinpoint the problem precisely and prevent it from recurring. For example, if RCA shows that a defect resulted from requirements or design flaws, the documentation should be revised to avoid further defects. If a defect reached the production environment because the testing department missed it, the test cases and test suites should be reviewed and refined, and regression testing should be run again.

Based on this, it is possible to distinguish the following phases or, rather, levels (according to the areas of responsibility), at which a defect may occur:

Management (decision-making level mistakes)
Environment (hardware, software, or configuration issues)
Requirements (poor, conflicting, or out-of-date)
Design (architecture flaws)
Development (coding mistakes)
Testing (ineffective or incomplete test cases)
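
The levels listed above can be treated as a simple classification scheme. Below is a minimal sketch of how a team might tag past defects with a level and count them; the `Level` enum, the `tally_by_level` helper, and the defect IDs are all hypothetical and serve only as an illustration.

```python
from collections import Counter
from enum import Enum


class Level(Enum):
    """Levels at which a defect may originate (taken from the list above)."""
    MANAGEMENT = "decision-making level mistakes"
    ENVIRONMENT = "hardware, software, or configuration issues"
    REQUIREMENTS = "poor, conflicting, or out-of-date requirements"
    DESIGN = "architecture flaws"
    DEVELOPMENT = "coding mistakes"
    TESTING = "ineffective or incomplete test cases"


def tally_by_level(defects):
    """Count defects per root-cause level, most frequent level first."""
    return Counter(level for _, level in defects).most_common()


# Hypothetical defect IDs and classifications, for illustration only.
defects = [
    ("BUG-1", Level.REQUIREMENTS),
    ("BUG-2", Level.DEVELOPMENT),
    ("BUG-3", Level.REQUIREMENTS),
    ("BUG-4", Level.TESTING),
    ("BUG-5", Level.REQUIREMENTS),
]

for level, count in tally_by_level(defects):
    print(f"{level.name}: {count} ({level.value})")
```

A tally like this does not replace the analysis itself, but it can show at which level most root causes cluster.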

Errors in the system may occur for many reasons, such as:

Time constraints
Human factor
Inexperienced or insufficiently skilled project participants
Miscommunication between project participants, especially at requirements and design levels
Complexity of the code, design, architecture, technologies, and much more

To identify the root cause of the problem and find out at which level it has occurred, you can use a certain set of steps that can be grouped into the following:

[Figure: the steps of root cause analysis]

At the same time, QA engineers need to be careful and take their own level of competence into account when recommending how to resolve a problem. If a specialist cannot see the full picture of the implementation and does not consider the effects a decision may have, such recommendations can make things worse. Implementing them may lead to better and more effective solutions being discarded.

The ‘5 Whys’ RCA Technique

To determine the cause-and-effect relationships that underlie a particular problem, we often use the ‘5 Whys’ technique. Its essence is that the root cause of a problem is sought by asking “Why?” five times, with each subsequent “Why?” based on the answer to the previous question.

The number of steps in this technique is not strictly limited to five. Sometimes two questions are enough to find the root cause of the problem, while in another case many more analysis steps may be required to find the truth.
‘5 Whys’ can be represented in the form of an infographic, as seen below:

[Figure: the ‘5 Whys’ technique as an infographic]
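
The chain of questions can also be sketched in a few lines of code. The example below is only an illustration under our own assumptions: the `WhyStep` class, the `five_whys` function, and the sample problem and answers are hypothetical. It uses three answers rather than five, since the number of steps is not fixed.

```python
from dataclasses import dataclass


@dataclass
class WhyStep:
    question: str
    answer: str


def five_whys(problem, answers):
    """Build a 'Why?' chain where each question is based on the previous answer.

    Returns the list of steps and the last answer, which is treated as the
    root cause. The number of answers does not have to be five.
    """
    if not answers:
        raise ValueError("at least one answer is needed")
    steps = []
    current = problem
    for answer in answers:
        steps.append(WhyStep(question=f"Why did this happen: {current}?", answer=answer))
        current = answer
    return steps, current


# Hypothetical problem and answers, for illustration only.
problem = "the payment form rejects valid cards"
answers = [
    "the card number validator fails on numbers that contain spaces",
    "the validator was written against an outdated requirement",
    "the requirement change was never communicated to the developer",
]

steps, root_cause = five_whys(problem, answers)
for number, step in enumerate(steps, start=1):
    print(f"{number}. {step.question}\n   {step.answer}")
print("Root cause:", root_cause)
```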

Naturally, alongside its advantages, such as simplicity and speed, this technique has some drawbacks. For example, the findings may not be repeatable, since different specialists using ‘5 Whys’ may arrive at different underlying reasons. At any stage, the analyst may single out just one of several possible causes, or miss the real one entirely, although in practice the causes often form a tree with numerous branches. Besides, if an intermediate cause is determined incorrectly at the first stage, all subsequent steps become invalid and lead to an incorrect root cause.
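
To illustrate the tree-shaped structure of causes, here is a minimal sketch under our own assumptions: the `root_causes` function and the sample cause tree are hypothetical. Each cause maps to the causes behind it, and the leaves are the candidate root causes.

```python
def root_causes(tree, cause):
    """Return every leaf reachable from `cause`; leaves are candidate root causes."""
    children = tree.get(cause, [])
    if not children:
        return [cause]
    found = []
    for child in children:
        found.extend(root_causes(tree, child))
    return found


# Hypothetical cause tree, for illustration only.
tree = {
    "broken feature reached UAT": [
        "regression testing was not finished",
        "unit tests did not catch the bug",
    ],
    "regression testing was not finished": [
        "release deadline was moved forward",
    ],
    "unit tests did not catch the bug": [
        "unit test coverage is low",
        "no code review for complex features",
    ],
}

for leaf in root_causes(tree, "broken feature reached UAT"):
    print("-", leaf)
```

A single chain of “Why?” questions that follows only the first branch would stop at the moved deadline and never reach the other two leaves.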

Micro & Macro RCA

RCA can be roughly divided into micro and macro levels.

At the micro level, the object of analysis is most likely a defect in the system itself. This means that the analysis is mostly technical and looks for the root cause of a particular bug, e.g., checking errors in the web browser DevTools, investigating the logs, reviewing the code, verifying whether the implementation meets the requirements, etc. Have a look at the example below:

[Figure: an example of micro-level RCA]

Macro RCA is a broader analysis: a comprehensive, system-wide review of significant problems. It explores why, in principle, a problem could occur. For example, why did the test engineer not have time to finish regression testing, so that broken functionality was deployed to the UAT environment? Why is unit test coverage so low, and why are there no code reviews of complex features in the project? Why does the developer implement improvements in a certain way without considering other approaches? Why do inconsistencies and inaccuracies arise in the requirements, leading to numerous bugs or to the increased cost of rewriting code? See a detailed example below:

Often, only RCA at the macro level makes it possible to find the true root cause and, together with a competent retrospective of previous bugs, to prevent new problems from emerging.

There are also several other techniques and methods for root cause analysis: CATWOE, Cause and Effect Analysis, Drill Down, etc. All of them aim to trace defects back to their causes in order to identify the best options for solving the problem and to improve both the product and the development process.

Summary: the Importance of RCA

RCA is an excellent method for avoiding product and process defects from the earliest stages of software development. It helps to manage the quality of the software product in a “sooner and cheaper” manner. As you may know, the cost of fixing an issue tends to grow sharply the later in the SDLC it is found, so prevention is better than cure.

As mentioned above, once the true cause of a problem is found, it is possible to prevent such “bad situations” from recurring. Depending on the root cause of the defect, different methods are needed, e.g., static testing, requirement specification review, design and code review, defect analysis, etc. In some cases, it is advisable to establish regular code refactoring practices to improve and restructure existing code without modifying its fundamental behavior. Another solution is to integrate an automated exception tracking and reporting tool, which provides real-time error monitoring and reporting for the team.
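
To give an idea of what exception tracking does, here is a toy, in-memory sketch. It is not the API of any real tool: the `ErrorTracker` class, its fingerprint format, and the `parse_quantity` function are hypothetical. It groups recorded exceptions by type and by the function where they were raised, so that recurring errors stand out.

```python
import traceback
from collections import Counter


class ErrorTracker:
    """Toy in-memory tracker; a real project would use a dedicated tool."""

    def __init__(self):
        self.counts = Counter()

    def record(self, exc):
        """Group an exception by its type and the function where it was raised."""
        frames = traceback.extract_tb(exc.__traceback__)
        origin = frames[-1].name if frames else "<unknown>"
        fingerprint = f"{type(exc).__name__} in {origin}"
        self.counts[fingerprint] += 1
        return fingerprint


tracker = ErrorTracker()


def parse_quantity(raw):
    """Hypothetical function under test: convert user input to an integer."""
    return int(raw)


# Two of these four inputs are invalid and raise ValueError.
for raw in ["3", "three", "", "7"]:
    try:
        parse_quantity(raw)
    except ValueError as exc:
        tracker.record(exc)

for fingerprint, count in tracker.counts.most_common():
    print(f"{fingerprint}: {count}")
```

Grouping by a fingerprint rather than by the raw message keeps repeated occurrences of the same defect together, which gives RCA a natural starting point.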

For test engineers, RCA can also be a good way to avoid professional burnout. It requires close communication with project participants, which allows QA specialists to escape from the routine. Moreover, finding the true cause of a complex defect and applying the results of the investigation to prevent future defects is very rewarding.

At Infopulse, we use this approach to assure software quality from the earliest stages of the SDLC, as we strive to provide the best value for our clients. A qualified “RCA doctor” can collect all available information about a problem's root causes and implement effective ways to prevent such issues in the software under test.

May the RCA be with you!