Dear Construx: Story Point Inflation Causes Ever-Expanding Project!
- Posted on August 9, 2010 6:22 PM by John Clifford to Retrospectives
- Methods & Processes, Technique, Agile, Scrum, planning, estimation, Management, practices
How do you deal with the risk of Story Point inflation throughout a Scrum project?
My team goes through release planning, ending up with a backlog where each item has been estimated using story points. Based upon our estimates of story points and team velocity, we predict, and commit to, a release date. So far, so good.
As our project progresses, the Product Owner breaks larger ‘epics’ into multiple backlog items that can be completed inside a sprint in a ‘just-in-time’ manner following Lean principles. Here’s the problem: the component items are re-estimated individually, and the sum of their points is greater than the point value assigned to the original epic. The result is that our project scope increases and therefore our ship date slips.
How do we deal with story point inflation? Do we insist that the sum of points for all of an epic’s component stories must not exceed the story point estimate for the original epic? I’m really beginning to dislike telling my stakeholders that my project keeps expanding past its original ship date. Am I the only one who goes through this?
- Deflated by Inflation
As a Product Owner, a key strategy is to drive out uncertainty. This means that, while I might start a project with epics in my backlog, I am going to work hard to decompose them as quickly as I can (as early in the project as I can). My goal, as a Product Owner, is to have a complete backlog with estimated stories as soon as I responsibly can, so that I can narrow the Cone of Uncertainty early in my project and provide accurate project completion estimates to stakeholders. ‘Just in time’ is often misunderstood to mean at the last possible moment. Instead, we should view it as synonymous with ‘The Last Responsible Moment,’ the time after which we can be hurt by not knowing or deciding something. Don’t decompose epics ‘just in time’; break them down at ‘The Last Responsible Moment.’ Yes, this means there may be some waste in the form of additional effort spent decomposing items that won’t be done, but that expenditure buys me more certainty on my product schedule… so maybe it isn’t waste after all.
In my experience, teams want to re-estimate sprint-sized backlog items as a form of sandbagging (buffering). In other words, it's a way to buy themselves additional time, to let them fit less work into a sprint because the work is somehow ‘larger.’ It is only natural to try to ensure that you will be able to meet your commitments, and story point inflation lets us do that not by committing to fewer points per sprint but by bloating stories so it looks like we're doing more. My preferred way of handling this is to NOT let teams re-estimate properly-sized user stories during a project/release. (I allow, and encourage, re-estimation at the beginning of the next release based upon relative sizing against completed items, with re-planning accordingly.) If we have epics on the backlog that need to be decomposed, then as a Scrum Master I am fairly rigid about insisting that the team compare the resultant component user stories to stories we've already completed in terms of size/effort/complexity. Relative sizing is a good technique for detecting and combating unconscious or deliberate story point inflation.
If you size epics in a way that appropriately reflects ambiguity, e.g., as multiples of the largest items you would accept into a sprint, then when you decompose them into their constituent stories, the sum of those stories is usually less than the epic they sprang from. Hence, your points total shrinks, as it should, because decomposition eliminates the points that represented uncertainty. And, because we use large story point numbers to indicate uncertainty, our epic estimates should be large. For instance, what if I were to estimate the population of New York state (a number that I don’t actually know)? If I had to come up with a single number that had to be accurate (an approximation of the actual size), I might respond “approximately 50 million.” Now, I know that is larger than what I think the number is, but I inflated it intentionally to reflect the significant amount of uncertainty in just pulling a number out of my, er… head. A little further exploration and a back-of-the-envelope calculation ("hmmm... the 5 boroughs are the majority of the state population, and I know that their total population is 10 to 15 million, so the state population should be between 20 and 30 million... new estimate is ‘approximately 25 million’") narrows the Cone of Uncertainty by half.
After I wrote the preceding paragraph, I just Googled 'population of New York state' and found it to be 19.5 million... or within one multiple of my estimate. Which means that, as I did more research, finding out the population of the 5 boroughs (8.4 million), I could again re-estimate (17 million). And then, I could start looking at other areas of New York state (Long Island, 3 million, Hudson River Valley, 1.75 million, NYC northern suburbs, 1 million, rest of upstate, 5 million), and I’d get approximately 19 million after a few minutes of ‘decomposition.’ Notice that as I decomposed my problem my numbers got more precise, but not necessarily more accurate!
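The refinement above is just arithmetic, so it can be sketched directly. This uses only the figures quoted in the post (the region populations and the 19.5 million Googled total); the labels are my own shorthand:

```python
# Successive estimates of New York state's population, refined step by
# step. Each refinement decomposes the state into regions whose sizes are
# easier to gauge, just as an epic decomposes into stories.
ACTUAL = 19.5  # millions, the Googled figure quoted in the post

estimates = {
    "wild guess": 50,
    "boroughs (10-15M) are most of it": 25,
    # boroughs + Long Island + Hudson Valley + northern suburbs + upstate
    "regional decomposition": 8.4 + 3 + 1.75 + 1 + 5,
}

for label, est in estimates.items():
    print(f"{label}: {est:.1f}M ({est / ACTUAL:.0%} of actual)")
```

Each pass roughly halves the error: the wild guess lands at 256% of the true value, the first real estimate at 128%, and the regional decomposition at about 98%.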
What?? How can you say that 19 million isn’t more accurate than approximately 25 million? The interesting fact about using the Fibonacci Series as our numbering scheme for story points is that, as we go up the scale, each number is approximately 1.6x the previous one. So, if I am between 60% and 160% of the actual value, that is about as close as we can get when representing the estimate with Fibonacci numbers… and 19.5 million (the actual value) is between 60% and 160% of 25 million (my first real ‘estimate’ that wasn’t a guess). Yes, 25 million is a little over. What if I were estimating the population of each of the 50 states? Then I’d be able to rely on the Law of Large Numbers (“the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed”). In other words, although I’d be over on some, and under on others, the overages and underages would average out to ensure my estimate of the US population would be very accurate… just as the overages and underages of a lot of story point estimates would average out to accurately represent the true amount of work on my project.
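Both claims in that paragraph are easy to check numerically. This is a toy sketch, not from the post: the story-point scale is the usual Fibonacci sequence, and the simulated per-story errors (each estimate off by a random factor between 0.7x and 1.3x) are an assumption chosen only to illustrate the averaging effect:

```python
import random

# 1. Consecutive values on a Fibonacci story-point scale differ by
#    roughly 1.6x, approaching the golden ratio (~1.618) as they grow.
scale = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
ratios = [b / a for a, b in zip(scale, scale[1:])]
print(["%.2f" % r for r in ratios])

# 2. Law of Large Numbers: 200 stories, each estimate individually off
#    by up to +/-30%, still sum to a total close to the truth, because
#    the overages and underages cancel.
random.seed(1)
true_sizes = [random.choice(scale[:5]) for _ in range(200)]
estimates = [s * random.uniform(0.7, 1.3) for s in true_sizes]
print(f"total ratio: {sum(estimates) / sum(true_sizes):.2f}")  # near 1.0
```

The per-story error model is arbitrary; any error distribution centered on the true size produces the same effect, which is the point of the Law of Large Numbers argument.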
So, the secret here is to estimate the original epics accurately enough (not precisely enough) that the estimates hold up when decomposed. If your epics typically decompose into component stories that, in the aggregate, comprise more story points than the original epic, you’re not estimating epics accurately enough… and this is usually because you’re trying to be too precise at the expense of accuracy.