Stability


How stable is your solution? This can be difficult to measure, especially when few users take the time to report bugs. For each bug that is caught, there are probably several more that remain undetected. Rather than counting bugs, then, one can monitor for flaws in the quality control gates across the Software Development Life Cycle.

One can thus assume that the longer bugs persist undetected in a system, the more quality control gates (validation checkpoints) have failed, and thus the more undetected bugs are slipping past.

Quantified Tasks answers this problem through the Stability measures: Origin, Caught, and Volatility, as well as the Impact (from Planning).

The three Stability measures should only be scored for issues (a.k.a. “bugs”), which we officially define as “any behavior which is unexpected to the intended end user.”

Caught

Caught icon by Jason C. McDonald is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.

The Caught scores map to the five major phases of the Software Development Life Cycle [SDLC], and represent the phase at which the issue was detected.

  • c1: Planning

  • c2: Design

  • c3: Implementation

  • c4: Verification

  • c5: Production

NOTE: In prior versions of Quantified Tasks, this metric was known as Detected.

In general, public-facing issue trackers should automatically populate this field with c5 for user-reported issues, as those were detected in production. Any issues caught in alpha, beta, or “preview” releases, as well as any issues caught by the QA team, should be reported as c4.
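The defaulting rule above could be automated along these lines. This is only a sketch: the channel names (`public_tracker`, `alpha`, `beta`, `preview`, `qa`) and the `default_caught` helper are hypothetical and would need to match however your own issue tracker labels the source of a report.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The five SDLC phases used by the Caught and Origin scores."""
    PLANNING = 1
    DESIGN = 2
    IMPLEMENTATION = 3
    VERIFICATION = 4
    PRODUCTION = 5


# Hypothetical report channels; rename these to match your tracker.
DEFAULT_CAUGHT = {
    "public_tracker": Phase.PRODUCTION,  # user-reported issues -> c5
    "alpha": Phase.VERIFICATION,         # pre-release builds -> c4
    "beta": Phase.VERIFICATION,
    "preview": Phase.VERIFICATION,
    "qa": Phase.VERIFICATION,            # caught by the QA team -> c4
}


def default_caught(channel):
    """Return the default Caught phase for a channel, or None if a person must choose."""
    return DEFAULT_CAUGHT.get(channel)
```

Channels with no entry return `None`, so issues found internally during earlier phases still need a deliberate Caught score.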

An average Caught score of 5 should generally raise alarms, as it may mean that quality controls have failed and the majority of issues are being detected in Production. However, this average can also appear if tickets are not being created on the issue tracker for issues detected in other phases. One way to prevent this from becoming a problem is to implement the Rule of Issues.

Origin

Origin icon by Jason C. McDonald is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.

The Origin scores map to the same SDLC phases as Caught, but instead represent the phase at which the issue was introduced.

  • o1: Planning

  • o2: Design

  • o3: Implementation

  • o4: Verification

  • o5: Production

As with all Quantified Tasks measures, a score of five is special. In this case, o5 should be flagged for special attention, as it indicates a change made without any form of planning, design, or verification whatsoever. You should always treat o5 issues as a significant problem!

To ensure that Origin is actually identified, we do NOT recommend having a “Triage” value. Instead, the default value should be o1. This ensures that if the team skips evaluating the Origin of an issue, the Volatility (see below) will be high. (See the Rule of Issues.)

Quality Control Gate Monitoring

You should have multiple quality control gates throughout your development workflow. Here are a few to consider at each phase of the SDLC:

  • o1: Planning

    • Requirements validation.

  • o2: Design

    • Design validation.

  • o3: Implementation

    • Coding standards

    • Static analysis tools (e.g. linters)

    • Unit testing

    • Code review

  • o4: Verification

    • Acceptance testing

    • Manual testing

    • Visual Quality Assurance (VQA)

    • User testing (e.g. beta releases)

  • o5: Production

    • Chaos engineering

    • Logging

    • Bug bounties

We strongly recommend documenting what quality control gates you have for each Origin. Statistics around issue Origin can warn you when quality control gates are faulty or inefficient.
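As one possible way to keep that documentation next to the statistics, the sketch below records the example gates from the list above in a dictionary and counts the issues that escaped the phase in which they were introduced. The `GATES` layout and the `escapes_by_origin` helper are illustrative assumptions, not part of the standard.

```python
from collections import Counter

# Example gates per Origin score, taken from the list above.
GATES = {
    1: ["Requirements validation"],
    2: ["Design validation"],
    3: ["Coding standards", "Static analysis tools", "Unit testing", "Code review"],
    4: ["Acceptance testing", "Manual testing", "Visual Quality Assurance", "User testing"],
    5: ["Chaos engineering", "Logging", "Bug bounties"],
}


def escapes_by_origin(issues):
    """Count issues caught in a later phase than they were introduced.

    `issues` is an iterable of (caught, origin) pairs, each scored 1-5.
    """
    return Counter(origin for caught, origin in issues if caught > origin)


if __name__ == "__main__":
    sample = [(5, 3), (3, 3), (4, 3), (5, 1)]
    for origin, count in sorted(escapes_by_origin(sample).items()):
        print(f"o{origin}: {count} escaped; gates to review: {', '.join(GATES[origin])}")
```

A high count for one Origin suggests reviewing the gates listed for that phase first.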


Impact and Stability

Impact icon by Jason C. McDonald is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.

The Impact measure from Planning should be determined for every issue. In practice, this is not difficult, as you only need to ask two questions:

  • Which use case(s) (epics) are affected by this issue?

  • Does this issue block, impede, degrade, or scuff the use case(s)?

Impact should reflect the greatest effect determined. If an issue degrades an Ei5 Epic (i3), but blocks an Ei4 Epic (i4), the Impact on the issue should be i4.
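Once an Impact has been determined for each affected use case, the rule above reduces to taking the maximum. A minimal sketch, assuming the per-epic Impact scores have already been worked out as integers (the `issue_impact` helper is hypothetical):

```python
def issue_impact(epic_impacts):
    """Return the issue's Impact: the greatest Impact found across affected epics."""
    impacts = list(epic_impacts)
    if not impacts:
        raise ValueError("an issue must affect at least one use case")
    return max(impacts)


# The example from the text: one epic scored i3, another scored i4.
print(issue_impact([3, 4]))  # 4
```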

Volatility

Volatility icon by Jason C. McDonald is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.

The Volatility score is calculated from Caught, Origin, and Impact as follows:

Volatility = (Caught - Origin) * Impact

The general principle is that the longer an issue goes undetected, the more likely it is that quality control gates are failing. It is especially important that high Impact issues be caught. There is no such thing as bug-free software, but good quality controls should ensure that more essential and common use cases are free from defects.

Volatility scores range from 0, when an issue is caught in the same phase where it originated, to 20, when a planning flaw in an essential use case ships to production: (5 - 1) * 5 = 20. In a healthy project, the average Volatility will hover somewhere around 5.

If you set up automation to calculate Volatility, consider interpreting iT (Triage Impact) as i5 in the calculation. This will ensure that failure to evaluate the Impact of an issue causes Volatility to be high.
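Such automation might look like the following sketch. It applies the formula above, treats an unevaluated Impact (represented here as `None`, an assumption about how your tracker stores iT) as 5, and defaults Origin to 1 as recommended earlier. Rejecting a Caught score lower than the Origin score is also an assumption, based on the stated range of 0 to 20.

```python
def volatility(caught, origin=1, impact=None):
    """Compute Volatility = (Caught - Origin) * Impact.

    caught, origin: SDLC phase scores from 1 to 5.
    impact: Impact score from 1 to 5, or None for triage (iT), treated as 5.
    """
    if impact is None:
        impact = 5  # iT counts as i5 so unevaluated issues score high
    for name, value in (("caught", caught), ("origin", origin), ("impact", impact)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    if caught < origin:
        # Assumption: an issue cannot be caught before it is introduced.
        raise ValueError("caught cannot be earlier than origin")
    return (caught - origin) * impact


print(volatility(caught=5, origin=1, impact=5))  # 20
print(volatility(caught=3, origin=3, impact=4))  # 0
print(volatility(caught=5))                      # 20 (default origin, triage impact)
```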

Solution Volatility

Your Solution Volatility [SV] is the average (mean) Volatility of all issues for your Solution. The higher the SV, the more likely there are issues that have evaded detection. In general, an SV >10 indicates an unstable solution, and an SV >15 represents a severe breakdown in quality control.
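As a rough illustration, SV and the thresholds above can be computed from a list of per-issue Volatility scores. The function names and the wording of the returned labels are hypothetical.

```python
from statistics import mean


def solution_volatility(volatilities):
    """Return the mean Volatility across all issues for a solution."""
    return mean(volatilities)


def classify_sv(sv):
    """Label an SV using the thresholds from the text (>10 and >15)."""
    if sv > 15:
        return "severe breakdown in quality control"
    if sv > 10:
        return "unstable"
    return "within threshold"


scores = [0, 5, 10]
sv = solution_volatility(scores)
print(sv, classify_sv(sv))  # 5 within threshold
```

Note that `statistics.mean` raises `StatisticsError` on an empty list, so a solution with no recorded issues needs separate handling.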

Drawing Conclusions About Stability

When monitoring Stability with Quantified Tasks, there are three things to watch:

  • Solution Volatility should hover around 5, and should not exceed 10.

  • A high average Origin, or any single Origin of o5, indicates that quality control gates are being bypassed. A high average could also indicate a lack of quality control gates prior to the Verification phase of the SDLC.

  • A high average Caught indicates that quality control gates are consistently failing, or the Rule of Issues is consistently being violated.
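The three checks above could be combined into a single report, sketched below. The text gives a numeric threshold only for Solution Volatility, so the `high_origin` and `high_caught` cutoffs are placeholder assumptions to be tuned per project, and the tuple layout for issues is hypothetical.

```python
from statistics import mean


def stability_warnings(issues, high_origin=4.0, high_caught=4.5):
    """Return a list of warnings for a set of issues.

    issues: list of (caught, origin, impact) tuples, each value from 1 to 5.
    high_origin, high_caught: assumed cutoffs for a "high" average.
    """
    warnings = []
    sv = mean([(c - o) * i for c, o, i in issues])
    if sv > 10:
        warnings.append(f"Solution Volatility is {sv:.1f}, above 10")
    if any(o == 5 for _, o, _ in issues):
        warnings.append("at least one issue has Origin o5")
    if mean([o for _, o, _ in issues]) >= high_origin:
        warnings.append("average Origin is high")
    if mean([c for c, _, _ in issues]) >= high_caught:
        warnings.append("average Caught is high")
    return warnings


if __name__ == "__main__":
    for warning in stability_warnings([(5, 1, 5), (5, 5, 2)]):
        print(warning)
```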