When to take the risk in software testing: A guide
In today’s faster, smarter, cheaper world, with the technologies, tools and experience that software companies have at their disposal, you might assume that businesses could create and release software that is (almost) defect free.
But, in truth, this is rarely the case. Many organizations knowingly release products with defects, and others time-box testing and release with areas of functionality unproven.
There are a lot of good reasons, from a business perspective, why this risk is taken. Businesses always have deadlines to meet and, with limited resources dedicated to each project, going over budget is often less appealing than releasing a product that may contain issues. In these cases, software with bugs is better than no software at all. This rationale is perfectly valid, but the key is understanding clearly what has been proven, what has not been proven, and the potential impact of the latter manifesting itself as an issue in the released product.
We, as the testing community, need to get better at providing the decision makers with the correct information to allow them to take the call on “when to take the risk” of releasing or deploying a product.
Testing as a part of the risk mitigation business
“Risk” is defined as a probability or threat of damage, injury, liability, loss, or any other negative occurrence that could have been avoided by using pre-emptive actions. In software development, the pre-emptive action required to avoid any risk is testing.
There are going to be risks involved with every software release; things can be, and are, missed. The decision to release before 100% of testing has completed, and before all associated issues have been found, is the key to success in the drive for cost-effective, timely and appropriately tested software releases. In a nutshell, this boils down to two questions: “when has enough testing been done?” and, as we shall see below, the directly related “when has enough risk been mitigated?”.
Capturing all the “real” risks
The assumption made here is that our project, be it agile, waterfall or any of the myriad variations in between, has decided to identify, collate, profile and track risks. The case made by some agile advocates, that the agile process is designed to handle unexpected issues and events and that there is therefore no need to identify risks in advance, only to deal with them when they become issues, is parked for debate another day.
There are two types of risk we need to identify and profile to maximize the probability of our decision makers getting the big call, on when enough is enough, right: traditional or natural risks, and product risks. Let’s look at both.
These are examples of what we would expect to see on our traditional risk logs:
- The test environment will be commandeered for a production fix
- Development will over run impacting test start date
- Higher than estimated defect density will occur impacting test end date
- Key test resources will become unavailable
But we also need to consider product risks such as:
- The new single sign-on portal will allow access to users incorrectly
- The link to the 3rd party payments collection site fails
- When switching to “Super Intergalactic Turbo Mode” the game fails to render on iPad
- The shade of blue displayed in the game sky is not consistent across browsers
- The wrong regional headers and footers are applied to letters and statements
Ah, but wait, I hear you cry: those are just requirements! Correct, they are, but to ensure we get the big call right we need to treat them as risks. Or, more accurately, to profile the impact of their failing in live just as we do our traditional or natural risks.
Initially, as part of project initiation, we define the level of requirement that will be classified as a product risk and then, using our risk profiling tool or method of choice, determine a numeric value for each risk. The sum of these values represents our project Risk Exposure value.
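As a concrete illustration, here is a minimal sketch of that profiling step, assuming a common probability-times-impact scoring convention. The class name, the 1–5 scales and the example scores are all illustrative assumptions, not part of any particular profiling tool.

```python
from dataclasses import dataclass

@dataclass
class ProductRisk:
    description: str
    probability: int  # 1 (unlikely) .. 5 (almost certain) - illustrative scale
    impact: int       # 1 (negligible) .. 5 (severe) - illustrative scale

    @property
    def exposure(self) -> int:
        # One common profiling convention: exposure = probability x impact
        return self.probability * self.impact

# Example scores only - in practice these come from the profiling workshops
risks = [
    ProductRisk("Single sign-on portal grants access incorrectly", 2, 5),
    ProductRisk("Link to 3rd party payments collection site fails", 3, 5),
    ProductRisk("Game fails to render on iPad in turbo mode", 4, 3),
    ProductRisk("Sky colour inconsistent across browsers", 4, 1),
]

# The sum of the individual exposures is the project Risk Exposure value
project_risk_exposure = sum(r.exposure for r in risks)
print(project_risk_exposure)  # 41
```

Whatever scoring scheme is used, the important property is that it produces a single comparable number per risk, so the risks can be ranked and totalled.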
Now the easy part!
We plan our execution following a traditional risk-based testing approach, which should ensure, as far as is possible or practical, that the tests associated with the risks with the highest Risk Exposure values are executed first.
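The ordering step itself is trivial once each test is linked to a profiled risk: sort descending by exposure. The test names and exposure scores below are illustrative assumptions.

```python
# Sketch: ordering test execution by Risk Exposure, highest first,
# assuming each test maps to one profiled risk (scores are illustrative).
tests = [
    ("Verify single sign-on access control", 10),
    ("Verify 3rd party payments link", 15),
    ("Render game on iPad in turbo mode", 12),
    ("Check sky colour across browsers", 4),
]

execution_order = sorted(tests, key=lambda t: t[1], reverse=True)

for name, exposure in execution_order:
    print(f"{exposure:>3}  {name}")
```

In a real project the mapping is rarely one-to-one; a test may cover several risks, in which case its priority might be the sum or maximum of the exposures it mitigates.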
As we progress through execution we plot not only the burn down of our outstanding tests but also the burn down of the total Risk Exposure value. In the image below, the red line is badged as SP or Story Points, but it could also represent number of test scripts to be executed.
NB. The horizontal axis represents whatever time measurement is applicable to our project (Days, Weeks, Sprints, cycles, etc.). The upward spike represents a CR or new piece of functionality being accepted into scope.
The burn down chart above shows that the decision to stop testing, based on the amount of risk mitigated, could potentially be taken significantly earlier than we might expect.
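The mechanics of that earlier stop point can be sketched as follows. This assumes each passed test mitigates its risk's full exposure, and uses an illustrative 90%-mitigated threshold as the stop criterion; the per-period figures are invented for the example, not a standard.

```python
# Sketch: Risk Exposure burn-down, assuming a 41-point starting
# exposure and illustrative mitigation figures per period (sprint,
# week, cycle, etc.). The 90% threshold is an example stop criterion.
total_exposure = 41
mitigated_per_period = [15, 12, 10, 3, 1]  # illustrative figures

remaining = total_exposure
stop_period = None
for period, mitigated in enumerate(mitigated_per_period, start=1):
    remaining -= mitigated
    # Stop candidate: 90% or more of total exposure has been mitigated
    if stop_period is None and remaining <= 0.10 * total_exposure:
        stop_period = period

print(stop_period)  # 3 - most exposure is gone well before testing ends
```

Because risk-based ordering front-loads the high-exposure tests, the exposure line falls much faster than the test-count line, which is exactly why the stop decision can come earlier than a script-count burn down would suggest.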
What else should be considered when making the big decision
There is one further part of the risk and risk-based testing process that needs to be considered and it is one that is often missed when the “big decisions” are being taken and the deadline date is rapidly approaching.
It is ensuring that the impact of a Product Risk coming to fruition in live is understood by the actual person (business owner) who will be impacted by it.
This should be addressed by appropriate representation at the initial risk definition workshops and profiling activities on the project. However, it is not sufficient to “communicate” risk-based decisions to potentially impacted stakeholders via meeting minutes or status reports. In today’s high-pressured workplace, a senior stakeholder’s awareness of a risk, or even acceptance of it, does not always equate to understanding of it.
This approach should not be seen as “do less testing”. Rather, if rigorously applied, it will allow risk-based decisions on when sufficient testing has been done to be taken based on quantified, repeatable science.