Work vs Delivery: It’s All About What We Measure

Erez Morabia
8 min read · Nov 9, 2023


Many global organizations have embraced Agile frameworks in their software product delivery pipelines. Yet despite incorporating various Agile practices, many still struggle to mature beyond the basics.

In my view, the heart of the matter lies in the finer details. Agile frameworks provide overall guidelines and boundaries, leaving organizations to tailor the specifics to their unique attributes, encompassing culture, technology, and size.

This post aims to explore a contentious aspect of Agile adoption — what we choose to measure.

Let’s begin by outlining a typical Agile adoption scenario within an organization and the decision-making flow at the management level. The foundational elements include the adoption of Scrum and/or Kanban at the team level. Subsequently, the organization selects a scaled-up framework (or devises its own) to oversee multi-team products and portfolios. On top of this, metrics and KPIs/OKRs are established, and management dashboards are crafted for transparency, so that decisions are grounded in the data these dashboards surface.

The power of Scrum lies in its ability to spotlight organizational pitfalls. As Ken Schwaber aptly put it, “Scrum is like your mother-in-law; it points out ALL your faults.” Once these shortcomings are laid bare, the organization aspires to improve itself, delving into data analysis to derive practical action items. Data is instrumental in the decision-making process, with velocity (team velocity, group velocity, product velocity, etc.) standing out as a key metric. However, I find that many organizations obsess over HOW to measure activities rather than over WHAT activities to measure.

In essence, my assertion is this: the focus should be on WHAT we measure rather than HOW we measure. ‘HOW’ here refers to the method of measurement, be it time-based (favoring individual absolute sizing) or story point-based (favoring group-wide relative sizing). Conversely, ‘WHAT’ refers to the selection of activities considered worthy of measurement.

What Should We Measure

Let’s kick off with Frederick Winslow Taylor, the pioneer of scientific management during the Industrial Revolution. Taylor proposed that delivery is a direct outcome of optimizing workers’ time: time equals money. While applicable in industrial engineering, these notions fall short in the domain of software engineering.

The Cynefin model underscores the separation between software and industrial engineering, placing the former in the complex domain and the latter in the complicated domain. This implies that concepts effective in industrial engineering may not seamlessly translate to software engineering.

Consider Lionel Messi as an example: observing him off the ball might suggest low work utilization, as he doesn’t seem to run as much as his counterparts. Yet Messi’s delivery, reflected in the goals he scores, tells a different story. This underscores the importance of focusing on delivery rather than obsessing over work utilization.

Activities: Work vs Delivery

Now, let’s distinguish between work and delivery in the software engineering domain by examining various activities performed during the software delivery cycle:

  1. Product Management content: These activities directly contribute to product delivery by introducing new functionalities or enhancing system performance.
  2. Architecture runway: Tasks like code refactoring and infrastructure development directly contribute to the product’s code, serving as enablers for future content delivery.
  3. Lab maintenance, regression testing, and release activities: While crucial, these activities don’t directly contribute to product delivery.
  4. Weekly meetings, coffee breaks, design meetings, HR events, and Scrum events: Important for team cohesion, but they don’t directly contribute to product delivery.
  5. Unexpected production incidents: Handling incidents promptly is essential, but they represent waste rather than delivery.
  6. Vacation days: This off-work activity doesn’t require work effort and doesn’t directly contribute to product delivery.

When measuring work, we estimate all mentioned activities and align them with team velocity. Conversely, when measuring delivery, the focus narrows to activities directly contributing to product delivery — specifically, product management content and architecture runway tasks. This approach simplifies the evaluation process.
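To make the distinction concrete, here is a minimal sketch in Python of the two metrics computed over the same sprint. The activity types and point values are invented for illustration; they are not prescribed by any framework.

```python
# A minimal sketch: work velocity vs delivery velocity computed
# over the same sprint. Activity types and point values are
# invented for illustration.

DELIVERY_TYPES = {"product_content", "architecture_runway"}

sprint_items = [
    {"name": "New checkout flow",       "type": "product_content",     "points": 8},
    {"name": "Refactor payment module", "type": "architecture_runway", "points": 5},
    {"name": "Regression test cycle",   "type": "maintenance",         "points": 3},
    {"name": "Release activities",      "type": "maintenance",         "points": 2},
    {"name": "Production incident",     "type": "incident",            "points": 3},
]

work_velocity = sum(item["points"] for item in sprint_items)
delivery_velocity = sum(
    item["points"] for item in sprint_items if item["type"] in DELIVERY_TYPES
)

print(f"Work velocity:     {work_velocity} points")      # 21
print(f"Delivery velocity: {delivery_velocity} points")  # 13
```

The work metric counts everything; the delivery metric counts only the first two items, which is exactly what makes it a simpler basis for evaluation.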

Planning: Work vs Delivery

Now let’s examine the distinction between work and delivery in the planning domain. A standard velocity graph plots the anticipated effort against the actual completed effort.

When measuring work, both expected and actual efforts encompass activities contributing directly to delivery as well as miscellaneous tasks. However, the backlog of requirements handed to us by stakeholders consists almost entirely of activities that contribute directly to the product. This creates a challenge when Product Management asks how much stakeholder content we can incorporate into the sprint, because our velocity encompasses far more than stakeholder content.

On the flip side, measuring delivery-oriented work involves tracking only the anticipated and actual efforts tied directly to delivery. Consequently, when Product Management inquires about how much stakeholder content we can integrate into the sprint, the response is straightforward and unambiguous.

I acknowledge the miscalculation that might result from the existence of architecture runway activities in our velocity, which may not be present in a stakeholder backlog. However, addressing this is simple. Prior to sprint planning, establishing guidelines on the percentage allocation between stakeholders' content and architecture runway activities is key. For instance, adopting an 80–20 approach implies dedicating 80% of our velocity to the stakeholders’ backlog. This clarifies the understanding of how much stakeholder content can be seamlessly incorporated into a sprint and is easily calculated using the velocity graph numbers.
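As a quick illustration of that arithmetic (the velocity figure and the 80–20 split below are assumptions, not recommendations):

```python
# A minimal sketch of the 80-20 guideline. The velocity figure
# and the split are assumptions agreed upon before sprint planning.

delivery_velocity = 30    # average delivery velocity, in points
stakeholder_share = 0.80  # guideline: 80% to the stakeholders' backlog

stakeholder_capacity = delivery_velocity * stakeholder_share
runway_capacity = delivery_velocity - stakeholder_capacity

print(f"Stakeholder backlog capacity: {stakeholder_capacity:.0f} points")  # 24
print(f"Architecture runway capacity: {runway_capacity:.0f} points")       # 6
```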

Visibility and Transparency: Work vs Delivery

Determining what to measure holds significant implications for visibility and transparency, especially within the software engineering domain. Before diving deeper into the visibility aspect, let’s emphasize the crucial role it plays in software engineering. In industrial engineering, everything is tangible and observable: you can watch materials move through the process. In software engineering, by contrast, everything is intangible, existing in a virtual domain. This is where tools like Scrum boards and Kanban boards become invaluable; they visualize virtual activities, enabling informed decision-making.

Now, let’s revisit the velocity graph. Consider a 10-day sprint with a team of 5, totaling 50 person-days. In the expected effort, we account for delivery and other work-related activities. On the completed side, we include delivery activities, other work-related activities, and unforeseen activities that emerged during the sprint. The traditional approach, as advocated by Taylor, urges maximizing work to hit the 50-day mark. In the software engineering domain, however, this metric holds little relevance: high work utilization simply does not correlate with successful delivery.

A delivery-oriented velocity graph, however, immediately exposes the difference between expected and actual delivery, bringing the issue to light. Making this gap visible is what initiates the recovery process.
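A minimal sketch of such a gap check, with invented numbers and an arbitrary 20% threshold:

```python
# A minimal sketch: flagging sprints where actual delivery falls
# well short of expected delivery. Numbers and the 20% threshold
# are invented for illustration.

sprints = [
    {"sprint": 1, "expected": 30, "delivered": 28},
    {"sprint": 2, "expected": 30, "delivered": 22},
    {"sprint": 3, "expected": 30, "delivered": 18},
]

for s in sprints:
    gap_pct = 100 * (s["expected"] - s["delivered"]) / s["expected"]
    flag = "  <-- investigate" if gap_pct > 20 else ""
    print(f"Sprint {s['sprint']}: delivered {s['delivered']}/{s['expected']} "
          f"(gap {gap_pct:.0f}%){flag}")
```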

Suppose we agree that measuring delivery is the way forward. The next question is how to plan without estimating or measuring the other activities, such as bugs or miscellaneous tasks.

Let’s start with bugs. Estimating bugs isn’t necessary for planning; instead, we should monitor the defect trend: the number of open bugs over time. If the trend rises, it signals a need for increased investment in quality. Sprint planning isn’t about assigning specific days or committing to fix a predetermined number of bugs; it revolves around understanding the defect trend. If the defect trend is on the rise, we pull less content into the next sprint to make room for more bug fixes. If the trend decreases, we can pull more content items into the sprint, as the sketch below illustrates.
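Here is a minimal sketch of that heuristic. The bug counts, the adjustment factors, and the trend test are all illustrative assumptions; a real team would tune them to its own context.

```python
# A minimal sketch of the defect-trend heuristic: pull less content
# when open bugs are trending up, more when they are trending down.
# Bug counts and adjustment factors are illustrative assumptions.

open_bugs_per_sprint = [12, 14, 19, 25]  # open-bug count at each sprint's end
base_content = 30                        # average delivery velocity, in points

trend = open_bugs_per_sprint[-1] - open_bugs_per_sprint[0]
if trend > 0:
    next_sprint_content = base_content * 0.8  # rising trend: leave room for fixes
elif trend < 0:
    next_sprint_content = base_content * 1.1  # falling trend: pull more content
else:
    next_sprint_content = base_content

print(f"Content to pull next sprint: {next_sprint_content:.0f} points")  # 24
```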

Addressing the second issue, activities not explicitly measured (e.g., release activities, meetings), we rely on the average actual delivery (velocity) over the last 3–4 sprints. All the non-delivery activities, including releases, bugs, and unexpected incidents, also took place during those sprints, so their cost is already baked into that average. This is why delivery-oriented velocity provides a simple and solid basis for planning.

Lastly, consider exceptions like holidays during a sprint: if a holiday consumes half the sprint, commit to half the velocity to maintain realistic expectations.
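A minimal sketch tying both ideas together, assuming invented velocity figures and a holiday that consumes half the sprint:

```python
# A minimal sketch: plan from the average delivery velocity of the
# last few sprints, prorated for holidays. All numbers are invented.

recent_delivery_velocity = [26, 30, 30, 34]  # last 4 sprints, in points
average_velocity = sum(recent_delivery_velocity) / len(recent_delivery_velocity)

sprint_days = 10
working_days = 5  # e.g., a holiday consumes half the sprint

commitment = average_velocity * (working_days / sprint_days)

print(f"Average delivery velocity: {average_velocity:.0f} points")  # 30
print(f"Commitment this sprint:    {commitment:.0f} points")        # 15
```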

Conclusion

In conclusion, our journey began with Taylor’s Industrial Revolution wisdom, emphasizing the direct correlation between worker busyness and delivery. While keeping workers busy has its value, it’s not a sustainable path to enhanced delivery. The key lies in transitioning to a delivery-oriented approach — maintaining focus, identifying problems, tackling challenges, and consistently improving our delivery.

The heart of the matter is not the method of measurement but the choice of what to measure. Instead of emulating Taylor’s approach, let’s shift towards a methodology that prioritizes impactful metrics.

In the domain of delivery, let’s take a cue from Messi: don’t just aim to stay busy, as Taylor advocated; aspire to deliver with skill and precision.

It is not about HOW you measure, it is about WHAT you measure.
Don’t utilize like Taylor, deliver like Messi.
