Story Points as Strawberries

Nov 12, 2023 · 14 min read

Let’s unravel the mystery surrounding story points through a tale about strawberries, courtesy of my colleague, Vijay Baholwani. Envision yourself wandering around a busy fruit market, drawn to the lively display of red strawberries. Now, let’s imagine we are curious about the vendor’s daily sales. As we approach the vendor, rather than asking about the hours spent selling strawberries, we ask a more relevant question: what quantity of strawberries was sold? In this narrative, the strawberry vendor becomes our agile team and the strawberries are our software development requirements.

Just as the strawberry vendor measures success in kilograms sold, we, too, assess our agile team’s accomplishments using story points. These story points, similar to the weight of strawberries, rise above the constraints of time, allowing us to explore the vast potential within a given timeframe. Whether it’s 10 story points, 100 story points, or 100,000 story points, the beauty lies in the boundless capacity to deliver valuable software within our agile timeframe — a sprint.

By understanding story points as a representation of software development weight rather than a fixed time, we unlock a paradigm shift in our approach to software delivery estimation. Story points are a departure from the traditional domain of time-based measurements. Unlike hours or days, story points don’t quantify the fixed-time investment in a task. Instead, they encapsulate the complexity, effort, and uncertainties associated with a particular piece of work. It is like allocating a weight to a software requirement: a weight that stays the same regardless of who implements the requirement. 1 kg of software requirement is always 1 kg, no matter which engineer develops it. On the other hand, the time it takes to complete a software requirement is very subjective, because it depends on the engineer doing it.

Advantages

Story points offer a multitude of advantages — let’s delve into the key ones.

Visibility of Improvement

Consider a hypothetical A-team: they’re tireless, unaffected by illness, don’t take vacations, code flawlessly, and work 24 hours in a 24-hour day. Remarkable, right? They consistently deliver 50 days of work out of 50. Yet, despite their continuous improvement, the velocity graph remains static. Why? Because time becomes an obstacle; you can’t deliver more than 50 days of work in a 50-day timeframe. This is where the weakness in relying solely on time becomes evident, and where the shift to story points becomes crucial.

As we shift from measuring time to measuring delivery through story points, the enhanced performance of our A-team becomes evident. The true strength of story points emerges when faced with time constraints. Unlike time-based metrics, story points liberate us from a fixed limit on how many can be accomplished within a specific timeframe. As our team progresses, the count of delivered story points rises — showcasing an increased velocity to deliver more within the same timeframe. This dynamic underscores why there is no fixed correlation between story points and time; rather, it evolves. With each step in improvement, the team efficiently delivers each story point in a shorter timeframe.

Avoiding Non-Productive Arguments

Incorporating story points into our development process has mitigated the longstanding debates between our product management and R&D teams regarding the time required for requirement development. Story points offer us a more nuanced and collaborative approach, allowing us to focus on the complexity and effort involved in each task rather than being obsessed with time estimates. This shift has encouraged a shared understanding between product management and R&D, promoting a culture of cooperation and transparency. By embracing story points, we’ve gone beyond the traditional pitfalls of time-centric disagreements, enabling both teams to work harmoniously toward delivering high-quality results.

Using the Crowd Wisdom

Agile story points serve as a catalyst for using crowd wisdom within our development teams, encouraging collaboration and collective decision-making. One powerful technique that strengthens this collaborative spirit is Planning Poker (see the section “Estimations using Planning Poker” for more details).

Using time estimation in the context of Planning Poker is not suitable primarily because it introduces the potential for bias and inconsistency. Time-based estimates tend to be subjective and influenced by individual perceptions of what constitutes a reasonable amount of time for a task, as different people have different skill sets and technology experiences. This subjectivity can lead to varying interpretations about how much time it takes to deliver a task. It results in difficulties in achieving a shared agreement among team members about a task’s effort estimation.

Planning Poker relies on relative sizing, where tasks are compared in relation to each other rather than being assigned specific time values. This approach allows for a more objective and standardized assessment of the complexity and effort involved in each task. Team members discuss and debate the relative difficulty of tasks, reaching a consensus on their relative size through open and collaborative dialogue.

There is a tension between collaborative crowd estimation and relative sizing, which relies on past delivery activities to inform estimations. While collaborative estimation fosters teamwork, transparency, and shared ownership, relative sizing introduces the risk of anchoring bias and compromises the independence of estimations. To reconcile this conflict, teams must strike a balance between leveraging past experiences and preserving the collaborative spirit of Planning Poker.

Increasing Speed

The use of story points in estimation, particularly through relative sizing, has been shown to expedite the estimation process compared to time-based or absolute sizing. Story points represent a unit of measure that abstracts away from specific time units (hours, days) and focuses on the relative complexity or effort involved in a task. It allows teams to avoid getting bogged down by the quest for precision. The process of relative sizing in story points involves comparing tasks to one another. This comparative approach encourages quicker consensus building, as team members can often agree on the relative difficulty of tasks more readily than on specific time estimates.

The principles behind the speed and effectiveness of story points in agile estimation are widely supported by agile practitioners and industry experts. The emphasis on collaboration, abstraction from time units, and the elimination of precision-related debates contribute to a more streamlined and efficient estimation process.

Estimations using Planning Poker

Planning Poker is a collaborative estimation technique used in agile and Scrum software development methodologies to estimate the effort or complexity of user stories or tasks. It is a way for teams to collectively determine the amount of work required to complete a particular piece of functionality. Here’s how Planning Poker typically works:

Preparation:
The team gathers to estimate user stories or tasks. Each member of the estimating team is given a deck of cards, usually with values like 0, 1, 2, 3, 5, 8, 13, 20, 40, 100, and a “?” card.
Relative Sizing:
Instead of estimating in hours, days, or other time units, Planning Poker uses a relative sizing approach (story points). The values on the cards represent a measure of effort or complexity relative to other tasks.
Discussion:
The team discusses each user story or task to ensure a shared understanding of the requirements and potential challenges.
Any questions or concerns are addressed during this discussion.
Estimation:
After the discussion, each team member privately selects a card representing their estimate for the effort or complexity of the task.
Reveal and Consensus:
All team members reveal their cards simultaneously.
If there is a wide range of estimates, the team engages in a discussion to understand the reasoning behind each estimate and work towards a consensus.
Repeat if Necessary:
The process is repeated until the team reaches a consensus on the estimates.

By leveraging the collective intelligence of the team, Planning Poker helps ensure that everyone has a shared understanding of the work ahead, leading to more accurate and realistic estimates.
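The reveal-and-consensus step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `evaluate_round`, the `max_spread` rule (cards at most one deck position apart count as consensus), and the choice to settle on the higher of two adjacent cards are hypothetical conventions, not part of Planning Poker itself.

```python
# Card values from the deck described above; "?" means "I cannot estimate this".
DECK = [0, 1, 2, 3, 5, 8, 13, 20, 40, 100]


def evaluate_round(votes, max_spread=1):
    """Summarize one simultaneous reveal.

    votes maps a team member's name to the card they played (a DECK value or "?").
    max_spread is an assumed rule: the round reaches consensus when the lowest
    and highest cards are at most this many deck positions apart.
    """
    numeric = {name: card for name, card in votes.items() if card != "?"}
    unsure = sorted(name for name, card in votes.items() if card == "?")
    if not numeric:
        return {"consensus": False, "estimate": None, "explain": unsure}

    # Compare cards by their position in the deck, not by their face value.
    positions = {name: DECK.index(card) for name, card in numeric.items()}
    low, high = min(positions.values()), max(positions.values())

    if high - low <= max_spread and not unsure:
        # Assumed convention: settle on the higher of the adjacent cards.
        return {"consensus": True, "estimate": DECK[high], "explain": []}

    # No consensus: the holders of the lowest and highest cards explain first.
    outliers = []
    if high - low > max_spread:
        outliers = sorted(n for n, p in positions.items() if p in (low, high))
    return {"consensus": False, "estimate": None, "explain": outliers + unsure}


print(evaluate_round({"Ana": 3, "Ben": 5, "Chen": 5}))
print(evaluate_round({"Ana": 2, "Ben": 8, "Chen": 5}))
```

In the second round the spread is too wide, so the sketch asks Ana and Ben to explain their reasoning before the team votes again.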

Anchoring Bias

Anchoring bias is a cognitive bias that occurs when individuals rely too heavily on initial information or “anchors” when making decisions or estimations. The first piece of information encountered serves as a reference point, or anchor, which then influences later judgments or estimations, even if the anchor is irrelevant or misleading.

In the context of estimation, anchoring bias can manifest when individuals are exposed to a specific value or estimate early in the process, causing them to unconsciously adjust their subsequent estimates closer to the anchor value. This adjustment occurs even if there is no logical connection between the anchor value and the actual task being estimated.

For example, in a Planning Poker session, if one team member suggests an estimate of 8 story points for a particular task based on their past experiences with similar tasks, other team members may subconsciously adjust their estimates closer to 8, even if they initially had different estimates in mind. This adjustment occurs because the anchor value of 8 has primed their thinking, making it difficult for them to deviate significantly from that value.

Anchoring bias can lead to inaccurate estimations and decision-making, as it limits the exploration of alternative perspectives and considerations. It can also contribute to groupthink, where individuals conform to the anchor value rather than critically evaluating the task’s complexity or risk factors.

An Example

One classic experiment that demonstrates anchoring bias in estimations is the “Anchoring and Adjustment” experiment conducted by Tversky and Kahneman in 1974. Here’s a simplified version of the experiment that you can share with your team:

Experiment Description:
Participants were asked to estimate the percentage of African countries in the United Nations. However, before making their estimate, they were presented with a randomly generated number, either high or low, as a reference point. This number was intended to serve as an anchor. After being exposed to the anchor, participants were then asked to provide their estimates of the percentage of African countries in the United Nations.
Results:
The researchers found that participants’ estimates were systematically influenced by the anchor provided to them. Specifically, participants who were exposed to a higher anchor provided higher estimates, while those who were exposed to a lower anchor provided lower estimates. Importantly, even when the anchor was clearly arbitrary and irrelevant to the estimation task, its influence persisted.
Implications:
This experiment demonstrates how anchoring bias can impact estimations, even in situations where individuals are explicitly instructed to disregard the anchor or where the anchor is clearly unrelated to the task at hand. It highlights the importance of being aware of anchoring effects and employing strategies to mitigate their impact in decision-making processes.

In Conclusion

To mitigate anchoring bias in estimation processes like Planning Poker, teams can implement strategies such as blind estimation, where past estimates are concealed from team members during the estimation process. However, we need to remember that relative sizing inherently involves the risk of anchoring bias due to its reliance on past experiences to inform current estimations.

Dilemmas While Using Story Points

When using story points in practice, some dilemmas may arise.

Starting Point

Starting the journey of implementing story points in the first sprint introduces a unique set of challenges for any team. The essence of story points lies in their ability to facilitate relative sizing, and the initial difficulty often revolves around defining what precisely constitutes one story point. Once this foundational understanding is established, the team can effectively measure and size other items in relation to this fundamental unit.

Two techniques are commonly employed to establish the basis for one story point:

Smallest Item Approach: In this method, teams select the smallest item from their backlog and designate it as equivalent to one story point. This approach sets the stage for relative sizing by using the smallest identifiable unit of work as the benchmark. The advantage of this technique lies in its simplicity, as it provides a tangible reference point for the team to anchor their sizing estimates.
One-Day Work Approach: Alternatively, teams may adopt the one-day work approach. Here, a specific item that is expected to take approximately one day of effort, encompassing all tasks until the Definition of Done is achieved, is designated as one story point. It’s crucial to emphasize that, beyond this initial setup, the correlation between time and story points is intentionally cut off. From then on, the team is encouraged to transition away from time-based estimations and embrace the inherent flexibility and abstraction that story points offer.

Both techniques aim to establish a consistent and meaningful reference point for story point estimation. The choice between them often depends on the team’s preference and the nature of their work. Regardless of the method chosen, the key is to foster a shared understanding within the team, laying the groundwork for effective and collaborative sprint planning based on the relative sizing principle of story points.

Different Domains

When teams include individuals from diverse domains, such as programmers and testers, a challenge arises during the sizing of items based on their complexity. The programmers inherently assess complexity from a programming perspective, while testers approach it from the testing standpoint. This separation often leads to a dilemma when integrating the two perspectives.

Consider a scenario where a particular item is evaluated as 5 story points by the programmers but only 2 story points by the testers. The question then emerges: How should these different estimates be harmonized? Is an average appropriate, or should the numbers be summed? The recommended approach in such cases is to take the maximum value between the two perspectives and designate that as the size of the item. In the given example, the size would be set at 5 story points. This methodology operates on the assumption that, for most items, there exists a consistent ratio of complexity between programming and testing. In other words, a 5-story point item for programmers would typically translate to a 2-story point item for testers.
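The “take the maximum” rule can be contrasted with the two alternatives the question raises (averaging and summing) in a small sketch. The function name `item_size` and the domain labels are hypothetical, chosen only for illustration.

```python
def item_size(estimates_by_domain):
    """Return the item's size as the largest per-domain estimate.

    estimates_by_domain maps a domain name (e.g. "programming", "testing")
    to that domain's story point estimate for the same backlog item.
    """
    if not estimates_by_domain:
        raise ValueError("at least one domain estimate is required")
    return max(estimates_by_domain.values())


estimates = {"programming": 5, "testing": 2}

print(item_size(estimates))                          # maximum: 5
print(sum(estimates.values()) / len(estimates))      # average: 3.5
print(sum(estimates.values()))                       # sum: 7
```

The average (3.5) understates the programming complexity and the sum (7) mixes two scales, while the maximum keeps the item on the same scale the team already uses for relative sizing.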

To navigate this inherent tension, teams find support in the practice of relative sizing. By referring to items that have been successfully delivered in the past, teams identify comparable benchmarks and assign story points based on the most analogous item. However, should there be a perception that one domain, be it programming or testing, carries a slightly higher level of complexity, teams have the flexibility to adjust the sizing upwards. For instance, a 3 story point item might be designated as a 5 story point item if deemed more complex in one of the domains.

Translating Story Points to Time

Often, top management requires insights into the time spent on delivery, and when utilizing story points, a translation is necessary, as story points serve as a ‘weight’ measurement rather than a direct measure of time. The translation between story points and time is straightforward but should be performed on-demand, recognizing that the ratio between story points and time may evolve as the team improves.

Here are the guidelines for translating story points to time, on-demand:

Determine the total working days for the team in a sprint by multiplying the number of people in the team by the number of working days in the sprint.
Calculate the average story point velocity of the team, based on the most recent 3–4 sprints.
Divide the total number of working days by the average velocity. The result represents the translation of one story point into days.

For instance, consider a team of 8 people working in 2-week sprints (10 working days) with an average velocity of 35 story points. In this scenario, one story point equals roughly 2.3 days (8 * 10 / 35). If, a month later, the team’s average velocity improves to 40 story points, the translation adjusts accordingly, and one story point equals 2 days (8 * 10 / 40). Recalculating on demand keeps the translation in line with the team’s evolving efficiency.
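The three translation steps can be expressed as a short Python function. This is a minimal sketch: the function name `days_per_story_point` and the individual sprint velocities are made up for illustration, chosen so that their averages match the 35 and 40 story points used in the example.

```python
def days_per_story_point(team_size, sprint_working_days, recent_velocities):
    """Translate one story point into working days, on demand.

    recent_velocities holds the story points delivered in each of the most
    recent sprints (the text suggests the last 3 to 4).
    """
    total_working_days = team_size * sprint_working_days
    average_velocity = sum(recent_velocities) / len(recent_velocities)
    return total_working_days / average_velocity


# 8 people, 2-week sprints (10 working days), hypothetical velocities averaging 35.
print(round(days_per_story_point(8, 10, [33, 35, 37]), 1))  # 2.3

# A month later the average velocity has risen to 40.
print(round(days_per_story_point(8, 10, [38, 40, 42]), 1))  # 2.0
```

Because the ratio is derived from recent velocity each time it is requested, nothing has to be updated when the team improves; the next calculation simply returns a smaller number.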

Correlation between Story Points and Time

Although Story Points are not a direct measure of time, they do bear a correlation to time. Understanding this relationship can offer insights into team improvement. Over time, we anticipate that completing a single story point will require less time. This can be a result of various factors such as enhanced skill sets, optimization of delivery pipelines, reduced impediments in the delivery cycle, improved dependency management, and clearer requirements.

A recommended approach for understanding this progress, beyond merely tracking increased velocity, involves examining the cycle time of backlog items assigned the same effort (i.e., the same story point value). For instance, monitoring a cycle time chart exclusively for delivered backlog items of 3 story points should reveal a decline in cycle time over time.

Figure: Decreasing trend in the cycle time of 3-story-point backlog items over a 3-month period.
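A team that tracks delivered items could check for this decline with a small script. The sketch below assumes each delivered item is recorded as a (month, story points, cycle time in days) tuple; the sample records and the names `monthly_average_cycle_time` and `is_declining` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def monthly_average_cycle_time(items, story_points):
    """Average cycle time per month for items of one story point size.

    items is an iterable of (month, story_points, cycle_time_days) tuples.
    """
    by_month = defaultdict(list)
    for month, points, days in items:
        if points == story_points:
            by_month[month].append(days)
    return {month: mean(days) for month, days in sorted(by_month.items())}


def is_declining(averages):
    """True when every month's average is lower than the previous month's."""
    values = list(averages.values())
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Made-up delivery records, for illustration only.
delivered = [
    ("2023-08", 3, 6), ("2023-08", 3, 5),
    ("2023-09", 3, 4), ("2023-09", 5, 9),
    ("2023-10", 3, 3), ("2023-10", 3, 4),
]

averages = monthly_average_cycle_time(delivered, story_points=3)
print(averages)
print(is_declining(averages))
```

Filtering on a single story point size is the important part: mixing sizes would hide the trend, since a month with many large items would look slower even if the team had improved.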

Analyzing the cycle time for each story point category (e.g., 1, 2, 3, 5, 8) enables a better understanding of the average duration for each category. Armed with this knowledge, teams can make informed decisions regarding requirement slicing. For instance, consider a team operating within a 2-week sprint cycle. If this team consistently delivers backlog items of 5 story points within 4–8 days and items of 8 story points within 7–12 days, it’s wise for them to limit sprint commitments to items no larger than 5 story points. Larger items should be sliced into manageable sizes.

This correlation between story points and time should be re-examined every few sprints. Ideally, over time, completing the same story points should demand less time. For example, after six months, if the same team observes that backlog items of 5 story points now take 3–6 days and items of 8 story points take 5–7 days, they may consider adjusting their sprint commitments accordingly and also allow items of 8 story points into the sprint.
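The slicing decision above can be sketched as a simple check of observed cycle-time ranges against the sprint length. The rule used here (an item size is acceptable when its slowest observed delivery fits within the sprint’s working days) is an assumption made for illustration, as are the function names and sample records; the ranges mirror the figures in the text.

```python
from collections import defaultdict


def cycle_time_ranges(delivered_items):
    """Group (story_points, cycle_time_days) pairs into {points: (fastest, slowest)}."""
    by_points = defaultdict(list)
    for points, days in delivered_items:
        by_points[points].append(days)
    return {p: (min(d), max(d)) for p, d in sorted(by_points.items())}


def largest_committable_size(ranges, sprint_working_days):
    """Largest story point size whose slowest delivery still fits in the sprint.

    Assumed rule: slowest observed cycle time <= sprint working days.
    Returns None when no size qualifies.
    """
    fitting = [p for p, (_, slowest) in ranges.items() if slowest <= sprint_working_days]
    return max(fitting) if fitting else None


# Hypothetical records matching the ranges in the text: 5 SP in 4-8 days, 8 SP in 7-12 days.
today = cycle_time_ranges([(5, 4), (5, 6), (5, 8), (8, 7), (8, 12)])
print(largest_committable_size(today, sprint_working_days=10))  # 5

# Six months later: 5 SP in 3-6 days, 8 SP in 5-7 days.
later = cycle_time_ranges([(5, 3), (5, 6), (8, 5), (8, 7)])
print(largest_committable_size(later, sprint_working_days=10))  # 8
```

A stricter team might require a safety margin below the sprint length; that would be a one-line change to the comparison.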

It’s essential to note that backlog items of the same story point category may still exhibit variations in delivery time due to differences in team members’ skill sets.

Note about Delivery-Oriented Approach

It’s important to emphasize that we should encourage the measurement of delivery-oriented activities rather than attempting to quantify every aspect of our work. Following this guideline, story points are intentionally assigned only to activities that directly contribute to product delivery. Consequently, when story points are translated into time, the resulting figure covers not only delivery-centric tasks but also non-delivery activities such as vacation days, sick leave, HR events, Scrum meetings, and the like. This is the main reason why 1 story point is not equal to 1 working day.

Conclusion

In conclusion, the adoption of story points in our estimation process has proven to be an impactful strategy, offering benefits that enhance our agile development practices. Firstly, story points provide transparency in measuring improvement over time.

Moreover, story points encourage a collaborative environment that leverages crowd wisdom, allowing the team to utilize its collective intelligence for more accurate and insightful estimations. This collaborative approach not only enhances the quality of our estimations but also promotes a shared understanding among team members, aligning everyone towards common product goals.

The emphasis on relative sizing in story points also contributes to the fast nature of our estimation process. By avoiding the difficulties of time-based debates and precision paralysis, we make quicker estimations.

It is crucial to repeat that story points are not a direct measurement of time; instead, they serve as a ‘weight’ measurement. This deliberate abstraction from time units allows for a more flexible and adaptive estimation process, ensuring that our focus remains on the essence of the work rather than becoming obsessed with exact time predictions.

Furthermore, story points can be translated into time estimates on demand when needed (e.g., for a top management status report reflecting time investment). As the team evolves and improves, the ratio between story points and time may change.

Story points, in essence, are the strawberries of software development — not constrained by the ticking of the clock but defined by the weight of the tasks at hand. Story points empower our team to navigate the complexities of software development with precision, collaboration, speed, and agility.