Estimating in Agile Software Development
I've written quite a bit about various aspects of estimating in agile software development. I think it's about time I joined up the dots...
PRODUCT BACKLOG
The Product Backlog is a feature list. Or a list of User Stories if that's your approach. Either way, it is a simple list of things that are of value to a user - not technical tasks - and they are written in business language, so they can be prioritised by the Product Owner. There are no details about each feature until it is ready to be developed, just a basic description and maybe a few notes if applicable.
'POINTS MAKE SIZES'
Each item on the Product Backlog is given a points value to represent its size. Size is an intuitive mixture of effort and complexity. It's meant to represent 'how big it is'.
FIBONACCI
I like to use the Fibonacci number sequence for the points values. Fibonacci goes 1, 2, 3, 5, 8, 13 - where each number is the sum of the previous two. This builds a natural distribution curve into the estimates. The bigger something's size, the less precise the estimate can be, which is reflected in the widening range between the numbers as they get bigger.
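To make that concrete, here's a tiny sketch in Python (purely illustrative - the scale values and the rounding rule are just one reasonable choice) showing how a rough gut-feel number gets snapped onto the Fibonacci point scale:

```python
# Fibonacci point scale commonly used for sizing backlog items
POINT_SCALE = [1, 2, 3, 5, 8, 13]

def nearest_point_size(raw_estimate):
    """Snap a rough size estimate onto the nearest value in the scale.

    The widening gaps between values mean bigger items get coarser sizes,
    reflecting the lower precision of large estimates.
    """
    return min(POINT_SCALE, key=lambda p: abs(p - raw_estimate))

print(nearest_point_size(4))   # -> 3 (ties resolve to the smaller value here)
print(nearest_point_size(11))  # -> 13
```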
RELATIVE ESTIMATING
Points are an abstract number. They do not convert to a unit of time. They are simply a *relative* indication of size. In other words, a 2 is about twice the size of a 1. A 5 is bigger than a 3, but smaller than an 8. Developers find it hard to estimate accurately in hours or days when they don't yet know the details of the requirements and what the solution involves. But it's easier to compare the size of two features relative to each other.
ESTIMATE AS A TEAM
The points should be assigned to each backlog item as a team. The collective intelligence - or wisdom of crowds - is an important way to apply multiple people's experience to the estimate. If you have a very big team, you can split up so it's quicker to do this, but the estimating groups should ideally involve at least 3 people, so you don't just get two opposing opinions.
PLANNING POKER
Planning Poker is a fun technique to facilitate rapid estimating as a team. The team discusses a feature verbally to understand more about what it entails and how it might be done. Each team member writes what they think its size is (in points) on a card. All team members reveal their cards at the same time. Differences in opinion are used to provoke further discussion. Maybe one person saw risks and complexity that others didn't. Maybe another person saw a simpler solution. The team re-votes until there is a consensus, then moves on to the next item.
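As a very rough sketch of the mechanics - illustrative only, not a real tool - one round of Planning Poker boils down to collecting simultaneous votes and checking for consensus:

```python
# Illustrative sketch of one Planning Poker round: everyone votes
# simultaneously, and the spread of votes decides whether to re-discuss.

def poker_round(votes):
    """Return the agreed size if votes converge, otherwise None.

    votes: dict of team member -> points, e.g. {"Ann": 5, "Bob": 8}.
    Here 'consensus' simply means everyone chose the same value.
    """
    unique = set(votes.values())
    if len(unique) == 1:
        return unique.pop()          # consensus reached
    # The lowest and highest voters explain their reasoning,
    # then the team discusses and votes again.
    low = min(votes, key=votes.get)
    high = max(votes, key=votes.get)
    print(f"Discuss: {low} voted {votes[low]}, {high} voted {votes[high]}")
    return None

print(poker_round({"Ann": 5, "Bob": 8, "Cara": 5}))  # prints a discussion prompt, returns None
print(poker_round({"Ann": 5, "Bob": 5, "Cara": 5}))  # -> 5
```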
DONE MEANS DONE
During the Sprint, or iteration, the team only counts something as Done when it is completely done, i.e. tested and signed off by the Product Owner. At that time, and only at that time, the team scores the points for the item.
BURNDOWN
The team shows its commitment and daily progress on a graph, so it is measurable and visible at a glance. This is called a Burndown Chart. The burndown shows the total number of points committed to, decreasing steadily to zero by the end of the Sprint. This is the target line. It also shows the actual points remaining each day - i.e. the committed total minus the points for all items that are 100% done and signed off so far. The team plots this each day before their daily stand-up meeting. When the actual line is above the target line, the team is behind. When it's below, they're ahead.
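Here's a minimal sketch of the arithmetic behind such a chart (the numbers are made up, and a real chart would of course be plotted rather than printed):

```python
# Sketch of burndown arithmetic: points remaining vs. an even target line.
committed = 50            # points committed at the start of the Sprint
sprint_days = 10

def target_remaining(day):
    """Ideal points remaining after 'day' days (straight line to zero)."""
    return committed * (1 - day / sprint_days)

# Cumulative points scored (items 100% done and signed off) per day
scored_by_day = [0, 5, 5, 13, 18, 26]

for day, scored in enumerate(scored_by_day):
    remaining = committed - scored
    status = "behind" if remaining > target_remaining(day) else "ahead/on track"
    print(f"Day {day}: remaining={remaining}, target={target_remaining(day):.0f} -> {status}")
```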
VELOCITY
At the end of the Sprint, the team's score is called their Velocity. The team tracks its Velocity over time. This allows the team to see if it's improving. Of course at some point it will stabilise, if the team is stable. If not, this is an issue in itself. When Velocity is relatively stable - in my experience that will be after 3 or 4 Sprints - it can be reliably used to decide how much (i.e. how many points) the team should commit to in the next Sprint.
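For example, here's a tiny sketch of how a team might track Velocity and use a recent average to set the next commitment (the three-Sprint averaging window is just a common choice, not a rule):

```python
# Velocity per completed Sprint (points scored for items 100% done)
velocities = [32, 41, 38, 40]

def suggested_commitment(history, window=3):
    """Suggest the next Sprint's commitment from the average of recent Sprints."""
    recent = history[-window:]
    return round(sum(recent) / len(recent))

print(suggested_commitment(velocities))  # average of 41, 38, 40 -> 40
```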
RELIABILITY / PREDICTABILITY
As a result, the team can measure how reliable - or how predictable - they are. The metric for this is Velocity (points scored) as a percentage of points planned. As Velocity stabilises, the team's Reliability will get better, and the team will be better at predicting what they can deliver. Ironically, the team doesn't need to get better at estimating to get better at delivering on their commitments. Even if they are terrible at estimating, as long as they are consistently terrible, with this method they will still get better at predicting what they can deliver.
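The Reliability metric itself is simple arithmetic; a sketch, using made-up numbers:

```python
# Reliability = points scored (Velocity) as a percentage of points planned
def reliability(points_scored, points_planned):
    return 100 * points_scored / points_planned

print(f"{reliability(38, 40):.0f}%")   # 95% - delivered slightly under plan
print(f"{reliability(50, 100):.0f}%")  # 50% - over-committed
```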
POINTS VERSUS TIME
One of the benefits of points is that they do not relate to time. Resist the temptation to convert them. If a team plans on 100 points and delivers 50, can you imagine telling your stakeholders that you are only planning future Sprints for half the team's time? If a team commits to 100 points and delivers 150, imagine telling the team you're planning on them doing 60 hours each per week. It just doesn't work. Points are not a measure of time. They are abstract, relative sizes, and a measure of how much can be delivered. That's why it works. It works because the team can adjust its commitment based on what its track record shows it can usually deliver.
PRODUCTIVITY
Velocity does not measure a team's absolute productivity. It does tell you whether a team is getting more or less productive over time. But you can't really use Velocity to compare the productivity of two teams, as their circumstances are different. And you can't use it to determine whether a team's Velocity is as high as it should be. For this, you still need to use your judgement, based on previous experience and taking into account many subjective factors.
PLAYING THE SYSTEM
Using these two metrics - Velocity and Reliability - it's hard to cheat the system. If a team commits low, they achieve Reliability but their Velocity goes down. If a team commits too high, their Velocity goes up but their Reliability goes down. This is like the balanced scorecard concept. The metrics deliberately measure opposing things, so they can't easily be played.
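A small made-up illustration of why the two metrics pull against each other - sandbagging the commitment protects Reliability at the cost of Velocity, and over-committing does the reverse:

```python
# Illustrative scenarios showing Velocity and Reliability pulling in
# opposite directions (all numbers are made up).
scenarios = {
    "commits low":           {"planned": 30, "scored": 30},
    "commits realistically": {"planned": 40, "scored": 38},
    "commits too high":      {"planned": 60, "scored": 40},
}

for name, s in scenarios.items():
    velocity = s["scored"]
    reliability = 100 * s["scored"] / s["planned"]
    print(f"{name}: Velocity={velocity}, Reliability={reliability:.0f}%")
```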
Kelly.
6 April 2009 09:22
Fascinating article. Very inventive and creative. With a highly skilled team it would produce very good results.
The problem I have with the agile approach is not agile itself but with its misuse. People who have consistently failed to deliver using every other method are misleading their clients by suggesting that they will now be able to deliver through agile methods.
The reason these people have failed in the past is their lack of basic skills and ability.
If they were unable to do proper function modeling in the past they will be unable to do it now.
If they were unable to do proper data modeling in the past they will be unable to do it now.
No approach will make up for a lack of basic skills. It's a huge con being perpetrated on the business world at the moment that agile will enable developers to do what they have never been able to do.
If you really want to deliver quality systems then get back to basics. Learn the fundamental skills. Skilled analysts and developers can deliver using waterfall, iterative, agile or any other approach.
Bad ones can deliver using none.
6 April 2009 22:51
"As Velocity stabilises, the team's Reliability will get better, and the team will be better at predicting what they can deliver. [...] Even if they are terrible at estimating, as long as they are consistently terrible, with this method they will still get better at predicting what they can deliver."
Get better to what point? Forever, until they estimate perfectly? And what happens if the team dynamic changes, such as someone leaving or joining? You lose any confidence in their estimates until they have stabilised again. In my mind it is a fallacy that Agile makes us more predictable. As (using Agile principles) we break the work down into the smallest possible components and deliver value as soon as possible, we minimise risk and negate the need for estimates. Suggesting anything else is snake oil.
7 April 2009 10:43
To robbowley...
You make a valid point about a team's stability potentially being temporary, but saying it's 'snake oil' to suggest that agile estimating techniques can improve a team's ability to deliver on its commitments just isn't true.
I have now seen this estimating approach used by many teams. I have frequently seen a significant improvement in a team's ability to deliver on its commitments within just 3-4 Sprints. I have seen this in established teams that had never managed to give a reliable commitment before.
Kelly.
20 April 2009 21:31
We have a great team but I think our estimates during planning meetings are way off. Either they are too high or too low.
Are there any estimation tools that I can introduce to the team?
17 May 2009 16:22
Estimating in points makes it difficult for the commercial side to know how long the project will take. And the commercial side needs a time estimate to work out the cost of the project... Is there a technique for that? Or just the points..? That's why I preferred days in the Product Backlog, even if it sounded weird later or the estimates were terrible...
18 May 2009 01:36
You might want to think about ideas to “lean” out the development process in a way that is really measurable and potentially “provable”. One idea is to have a smaller set of measurement points to choose from, like 1, 2, 4 and 8 (for a 2 week period). For each of these measurements, you would have sample stories to represent each measurement. When the team is selecting the story size to assign, they will compare to the samples. This will help remove debate on how an individual’s skill level may impact the sizing. This is more like measuring shoes – you do it against a standard (though there are men’s, ladies’ and European standards). Having a smaller number of “sizes” to choose from can help make it easier to consistently choose the size. Exercises could also be run, where stories could be adjusted slightly and the team’s re-estimates statistically verified (using approaches like Six Sigma).
Shane Hayes
18 May 2009 07:57
To Plucas,
Estimating in points may sound like it makes it hard for business people to know how long something might take, but actually you very quickly start to get a feel for how many points you can do in a Sprint, which gives them a good idea.
If you want to plan further out - perhaps for a project that is being funded separately - you can simply put a size on all your in-scope backlog items, assume a Velocity (number of points per Sprint) based on what your team has done before, and work out how many Sprints it will take to complete the entire scope. This process is called 'release planning' - I've written a little more about this here:
https://agile-software-development.com/2008/02/agile-release-planning.html
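As a rough worked example of that release-planning arithmetic (the numbers below are made up for illustration):

```python
# Rough release-planning sketch: total backlog size / assumed velocity
import math

backlog_points = 340        # sum of sizes for all in-scope backlog items
assumed_velocity = 40       # points per Sprint, based on the team's track record
sprint_length_weeks = 2

sprints_needed = math.ceil(backlog_points / assumed_velocity)
print(f"{sprints_needed} Sprints (~{sprints_needed * sprint_length_weeks} weeks)")
# -> 9 Sprints (~18 weeks)
```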
Hope this helps,
Kelly.
24 February 2010 17:56
Thank you for the article, it touched on previous experiences I had in moving to Agile.
My team went a step further with estimating when we moved towards a Kanban/pull system. We had done Planning Poker for some time, but our mistake was never going back over our stories to evaluate whether a 5-point story was really an 8-pointer. It was our fault as developers not to insist on this evaluation, but as with many shops, time was always in short supply and management pressured us to go forward and not spend time in the past.
The Kanban development system did away with estimating large product backlogs and focused developers' time on estimating one feature at a time - very lean. In the beginning, we were given a deadline for delivery and the team just worked towards delivering as much value (features) as we could in the time allotted. Over time - I repeat, over time (a period of one year) - we were able to track our cycle time for delivering a feature (one user story) to be about 37 days, with a standard deviation of 7 days, give or take.
The cycle time was defined as the time it took for a pair of developers to pull a story, analyze, design, test, code and integrate their unit of work into the system. Work was not considered done until a Product Owner signed off on it.
Going forward, we were able to use statistical data, not wishful thinking, to accurately forecast, not estimate, how much time it would take to deliver feature sets. If the Product Owner needed an 'estimate' for his business unit, he would just multiply the number of features he needed to deliver by 37 days.
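To illustrate the kind of forecast described above - a sketch only, using the ~37-day mean and ~7-day standard deviation quoted, and assuming roughly normal, independent cycle times worked sequentially:

```python
# Sketch of forecasting delivery from measured cycle time, using the
# figures quoted above (illustrative, not from any particular tool).
mean_cycle_days = 37
std_dev_days = 7
features = 5

expected_days = features * mean_cycle_days
# The spread of a sum of independent cycle times grows with sqrt(n)
spread = 2 * std_dev_days * features ** 0.5   # rough ~95% range
print(f"~{expected_days} days, roughly +/- {spread:.0f} days")
# -> ~185 days, roughly +/- 31 days
```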
One important part of this forecasting system, or any estimation approach you use...
Management tends to hear a date and write it in stone. Management must use collaboration (read: constant communication) to determine if the team is on schedule. A good method for this was for a manager to attend our daily scrums, in order to hear firsthand how work was progressing, whether there were any roadblocks, and whether the team felt we were ahead of or behind our projected due date...
One final thought... There needs to be Trust between management and development in order for this to work... Management must feel the teams are competent enough to deliver, and the team must feel management has clearly defined goals... Without Trust, you have nothing...
19 July 2010 09:19
Very useful blog on a topic that gets debated often! Interesting reading!