Measuring Improvement

WRITTEN BY Dave Nicolette

Clients engage companies like ours because they are interested in improving their performance. “Performance” for a corporate IT department means effectiveness in delivering application software functionality to support business objectives, maintaining and operating the technical infrastructure, and handling any issues that arise in production.

Most of the organizations we talk to have no idea how to measure their performance. Leadership has a vague feeling that things could be better, but they can’t quantify what is “wrong.” They feel pressure from business stakeholders to do more, but they really aren’t sure more of what or how much is enough. Typically, they aren’t really sure how much they can do right now, let alone how to improve.

Nor are they quite sure what the root causes of the discomfort may be. Do business stakeholders simply expect too much, or is the problem that the IT department lacks the capacity to meet genuine business needs? Those are very different issues, and solving the wrong one would be a waste of time (and, as we all know, time is money).

Some of the popular approaches to achieving agility at scale appear to gloss over this issue, or at least companies tend to implement them without considering how to measure progress. This often occurs when SAFe or LeSS is implemented, or when an organization crafts its own method around agile principles. The focus tends to be on tracking the level of adoption of recommended practices. Tracking actual delivery performance takes a back seat, assuming it is in the car at all. The back seat may be filled with posters, sticky notes, yarn, and snacks, leaving no space for Mr. and Mrs. Metrics to ride.

The result may be a sort of family road trip: The kids are happy at first, occupied with games and snacks, but eventually, they grow cranky and want to get out of the car. This outcome doesn’t resemble what was sold, and organizational leaders become wary of agile scaling methods.

Other approaches are explicit about the need to stabilize current performance and get measurements around it, as a baseline to monitor the effects of improvement efforts. David Anderson’s Kanban Method, Scott Ambler’s Disciplined Agile, and LeadingAgile’s Expedition model all begin by stabilizing the current delivery process and establishing meaningful metrics.

The main theme of LeadingAgile’s Basecamp 1 is to get predictable. That involves stabilizing the existing process and measuring performance, among other things. Organizations then have an objective basis against which to gauge progress toward their performance goals.

Absent this sort of baseline measurement, there’s no practical way to know whether improvement efforts are yielding fruit. Thanks to the Hawthorne Effect, people in the organization may feel enthusiastic about the changes. They will report that things are improving, but this may simply be because something interesting is happening for a change and management finally seems to be actively interested in what’s going on. There may or may not be objective improvement in delivery performance.

It’s been my experience that there are two main reasons to measure delivery performance: To steer planned work, and to see the effects of process improvement efforts. It’s fine to use metrics that depend on the methods and processes currently in use for the purpose of steering work in progress. But improving a process means changing it. In that case, process-sensitive metrics won’t help us.

This is axiomatic: The outcomes you achieve are the result of the actions you take; therefore, to achieve different outcomes you must take different actions. To track the effects of improvement efforts we need to select metrics that provide consistent information regardless of the methods and processes the organization uses, because those methods and processes will change.

For example, a traditional organization may use a linear process model to deliver software. Their transformational journey may involve, among other things, switching to an iterative process model. If we measure performance using metrics that depend on a linear process model, we will not obtain dependable information about the effects of switching to a different model. Similarly, if we begin by measuring performance using metrics that depend on an iterative model while the organization is still using a linear model, our baseline measurements will be meaningless.

This is exactly what happens when an organization tries to adopt a process such as SAFe or LeSS and measures performance using Velocity from the outset, while it still thinks and acts in accordance with the assumptions of a linear model. It ends up with no usable baseline performance measurements and no way to gauge improvement.

Sooner or later, the kids are screaming to get out of the car. Some of them just jump out the window (that’s a metaphor for changing jobs; clever, eh?). And the parents may well conclude that family road trips are overrated (that’s a metaphor for leadership saying, “We tried that and it didn’t work”; sometimes I’m so clever it hurts).

I’ve found three basic metrics to be useful in tracking the effects of process change:

  • Cycle Time – how long it takes to get something done
  • Throughput – how many things you get done in a unit of time
  • Process Cycle Efficiency – what proportion of time is spent adding value

All these come from the Lean school of thought. They’ve been adapted from lean manufacturing to suit the realities of software delivery and IT operations. All of them directly measure delivery performance, and none is dependent on any particular methodology or practices. This makes them useful for the purpose of monitoring an organization’s process improvement efforts.
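To make the three definitions concrete, here is a minimal sketch of how they might be computed from a list of completed work items. Everything in it is an assumption made for illustration: the WorkItem record, its field names, the sample dates, and the choice of a week as the unit of time. Real tracking tools record this information differently, and the value-add time in particular usually has to be estimated or sampled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class WorkItem:
    """Hypothetical record of one completed piece of work."""
    started: date          # when work on the item began
    finished: date         # when the item was delivered
    value_add_days: float  # estimated days of hands-on, value-adding work

    @property
    def cycle_time_days(self) -> int:
        # Cycle Time: elapsed calendar days from start to finish.
        return (self.finished - self.started).days


def mean_cycle_time(items):
    """Average Cycle Time, in days, across the given items."""
    return sum(i.cycle_time_days for i in items) / len(items)


def throughput(items, period_start, period_end, unit_days=7):
    """Items finished per unit of time (weeks by default) within a period."""
    done = sum(1 for i in items if period_start <= i.finished <= period_end)
    units = ((period_end - period_start).days + 1) / unit_days
    return done / units


def process_cycle_efficiency(items):
    """Proportion of total elapsed time that was spent adding value."""
    total_value_add = sum(i.value_add_days for i in items)
    total_elapsed = sum(i.cycle_time_days for i in items)
    return total_value_add / total_elapsed


# Made-up sample data, not taken from any real organization.
items = [
    WorkItem(date(2024, 1, 1), date(2024, 1, 11), 2.0),
    WorkItem(date(2024, 1, 3), date(2024, 1, 8), 1.0),
    WorkItem(date(2024, 1, 8), date(2024, 1, 23), 3.0),
    WorkItem(date(2024, 1, 15), date(2024, 1, 25), 1.5),
]

print("Mean cycle time (days):", mean_cycle_time(items))
print("Throughput (items/week):",
      throughput(items, date(2024, 1, 1), date(2024, 1, 28)))
print("Process cycle efficiency:", process_cycle_efficiency(items))
```

None of these calculations refers to sprints, phases, story points, or any other artifact of a particular process, so the same functions could be run against a baseline taken before a process change and against data collected after it.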

They aren’t particularly hard to understand or difficult to use, but the details are out of scope for a blog post (even the long ones I tend to write). If you aren’t familiar with these metrics, I suggest you do an Internet search for more information or look for a book that covers them. Better still: Give us a call.

3 comments on “Measuring Improvement”

  1. Srikanth Ramanujam

    I totally agree with you about the metric-driven transformation tied to business outcomes. However, I don’t agree with your bundling of Large Scale Scrum (LeSS) as something that is rolled out like SAFe. If you actually practiced it, you would realize that it is a key part of Lean Software Development and Lean Product Management practices, as Lean as Lean can be.

    You can continue to do your work and claim it is the best thing since sliced bread, without dissing other frameworks. Despite your claims, your approach might not be infallible either.

    I don’t agree with your views on DAD and Kanban as metrics-driven. They are only as good as how they are rolled out, and Scott Ambler is not there in every engagement to roll it out by his book, just like I might be bad at rolling out the “Leading Agile” method.

    Assume that you don’t work for “Leading Agile” anymore, what would you do?

  2. Greg Hutchings

    Hi Dave,
    How’re you doing? I enjoyed reading your article, which concludes with some suggested metrics that we both like: cycle time and PCE. Throughput, though, depends like all metrics on how you measure it – complexity points, for example, don’t represent value flow, only effort flow. Most people and orgs trying to use agile don’t focus enough on value estimation; I prefer to measure flow of value relative to cost, rather than just the production of points. What’s your view on how best to measure productivity?

    While we mostly agree on the metrics suggestions, a couple of times in your article you criticize SAFe and LeSS. SAFe’s suggestion to measure cost of delay, while it may not always fit the objectives of an organisation, is still not bad. LeSS is not prescriptive regarding metrics, and as a long-time LeSS practitioner, I have often used cycle time, PCE, and value flow as reasonable metrics, along with customer-centric ones such as NPS. For this reason I also like impact mapping, A/B testing, and other post-deployment-focussed measures.

    The basis of your criticism of SAFe and LeSS seems to me not to be well established logically or based on facts – what is your experience with these frameworks which leads to these conclusions?