Monday, May 20, 2013

Agile Metrics

Metrics play an important role in project management. They are the primary way to monitor and communicate the status and progress of a project to senior stakeholders. I already mentioned several metrics in previous blogs, but I’d like to sum them up all together.

The first metric, and one that was defined in Agile and has no counterpart in traditional project management, is the velocity. It is defined as the number of story points per sprint that the Team can deliver according to the Definition of Done. The velocity is important because it tells you when all scope on the backlog will be done, in other words when you will run out of scope. If the velocity drops from one sprint to the next, there should be an explanation for that. It may be that some team members fell sick. Maybe there were a few national holidays. Maybe the user stories that were put on the sprint backlog were more complicated than thought. There can be a wide variety of reasons why the velocity varies between one sprint and the next or why it deviates from the average so far. If those reasons are one-offs, you need to see if there is a way to make up for the loss to keep the project on track, or have the Product Owner accept the drop in scope. If the reasons are structural, you need to make the Product Owner and senior stakeholders aware that there is an issue and that expectations must be adjusted.
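As a rough illustration of how the velocity tells you when you run out of scope, here is a minimal Python sketch. The function name and all the numbers are made up for the example, and it assumes the average velocity so far is a fair predictor of the coming sprints.

```python
import math

def sprints_to_finish(remaining_points, velocities):
    """Estimate the number of sprints needed to deliver the remaining backlog.

    Assumes the average velocity so far holds for the rest of the project.
    """
    average_velocity = sum(velocities) / len(velocities)
    # Round up: a partly filled sprint is still a sprint.
    return math.ceil(remaining_points / average_velocity)

# Hypothetical data: story points delivered in the last four sprints.
velocities = [21, 18, 24, 17]
# Hypothetical backlog size in story points.
remaining_points = 200

print(sprints_to_finish(remaining_points, velocities))  # prints 10
```

If the velocity drops structurally, rerunning the same calculation with the lower numbers shows how far the end date moves.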

Burn Rate
The burn rate is the budgetary counterpart of the velocity. It is defined as the number of man-days spent during a single sprint by everybody who books on the project. If your project has several scrum teams, you may want to split out the burn rate for each of the teams. If there are people booking on the project who support the teams, like business analysts, product owners, or anyone else who is not part of a scrum team, then these costs should be distributed evenly among the teams, so that the burn rate per team includes all costs incurred to have that team deliver software. Where the velocity tells you when you will run out of scope, the burn rate tells you when you will run out of budget. If you have a fixed deadline, then the velocity tells you what will be delivered by that deadline and the burn rate tells you what you will have spent.
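The per-team burn rate and the budget forecast can be sketched in a few lines of Python. Team names, man-day figures and function names are hypothetical, and the budget is assumed to be expressed in man-days as well.

```python
def burn_rate_per_team(team_days, support_days):
    """Man-days per sprint for each team, with support bookings spread evenly."""
    share = support_days / len(team_days)
    return {team: days + share for team, days in team_days.items()}

def sprints_of_budget_left(remaining_budget_days, rates):
    """Number of sprints the remaining budget covers at the current burn rate."""
    return remaining_budget_days / sum(rates.values())

# Hypothetical bookings for one sprint, in man-days.
team_days = {"team_a": 60, "team_b": 50}
support_days = 20  # business analysts, product owner, and so on

rates = burn_rate_per_team(team_days, support_days)
print(rates)                               # {'team_a': 70.0, 'team_b': 60.0}
print(sprints_of_budget_left(650, rates))  # 5.0
```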

Defect Detection Rate 

The defect detection rate is the number of defects detected per sprint. Assuming that developers produce defects at a more or less constant rate, it is correlated with the velocity: the more story points are delivered, the more defects should be found and fixed as well. Teams tend to be pretty consistent in the quality of the software they deliver, so a drop in velocity combined with a rise in the defect detection rate should trigger the alarm. Something’s cooking and you need to find out what it is. My personal opinion is that a lower defect detection rate isn’t necessarily better than a higher one. Every additional defect found in one of the development and test environments is one defect fewer that makes it into production. From that perspective, you could defend the statement that the more defects found, the better.

Defect Closure Rate 

This is the number of defects fixed and closed per sprint. It should be equal to the defect detection rate. If it’s not, the number of open defects will rise as the project moves along, leaving the largest part of the bug fixing for the end of the project. This brings me to the last metric.

Gap Between Total and Closed Defects 

This is the difference between the total number of defects and the number of closed defects at any one time. This number should be as low as possible. A low number indicates that the quality of the software delivered so far is good. That implies that there will be few if any surprises once UAT and release preparation start. And that in turn implies that the velocity and burn rate you have measured are indeed reliable indicators to forecast the remainder of your project. I consider this the most important metric of all, for if it’s low, it means I can indeed rely on the other indicators.
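To make the relation between the detection rate, the closure rate and the gap concrete, here is a small Python sketch. The per-sprint counts are invented for the example; the gap after each sprint is simply the running total of detected defects minus the running total of closed defects.

```python
from itertools import accumulate

def open_defect_gap(detected_per_sprint, closed_per_sprint):
    """Gap between total and closed defects after each sprint."""
    total_detected = accumulate(detected_per_sprint)
    total_closed = accumulate(closed_per_sprint)
    return [d - c for d, c in zip(total_detected, total_closed)]

# Hypothetical defect counts for four sprints.
detected = [12, 15, 11, 14]
closed = [10, 14, 12, 11]

print(open_defect_gap(detected, closed))  # [2, 3, 2, 5]
```

A gap that keeps growing from sprint to sprint means the closure rate is lagging behind the detection rate.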

A healthy project has a stable velocity and burn rate, combined with a stable and sufficiently high defect detection rate and a low gap between total and closed defects. The velocity and burn rate will ideally indicate that you will run out of scope before you run out of budget, and that you run out of both before the requested delivery date.
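The ideal ordering described above (scope runs out first, then budget, then the delivery date arrives) can be expressed as a simple check. This is only a sketch: the function name and the numbers are hypothetical, and it assumes that velocity and burn rate stay stable for the rest of the project.

```python
def is_on_track(scope_points, velocity, budget_days, burn_rate, sprints_to_deadline):
    """True if scope runs out before budget, and both before the deadline."""
    sprints_for_scope = scope_points / velocity   # when you run out of scope
    sprints_for_budget = budget_days / burn_rate  # when you run out of budget
    return sprints_for_scope <= sprints_for_budget <= sprints_to_deadline

# Hypothetical project: 200 points left at 20 points per sprint,
# 1430 man-days left at 130 man-days per sprint, 12 sprints until the deadline.
print(is_on_track(200, 20, 1430, 130, 12))  # True
```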

It is not possible to give absolute numbers here for any of these metrics that would indicate whether an arbitrary project is in good shape. For instance, you can’t say that a project with x developers and y days per sprint should produce no more than z defects per sprint. Such statements are nonsensical. The actual values of these metrics will depend on the technology you build your systems on, the developers and testers you have, the tools and practices they use, the existing technical debt if there is any, the functional and business context the project is executed in, and many, many more factors. What matters is that you determine the actual numbers that result from the execution of your project given its current context and that you know how to interpret them, so you can act accordingly.

Thursday, April 18, 2013

Agile and Prince2

Since Agile doesn’t say much about how to structure a project on a higher level, apart from release planning, it might be interesting to see if and how Agile practices can be combined with for instance a methodology like Prince2. I am going to limit this exercise to Prince2 since I have a certification in and experience with that particular methodology.

Prince2 divides a project into stages. There is the initiation stage, and then you can divide the remainder of the project into as many delivery stages as you see fit, followed by a closing stage. Prince2 doesn’t say anything about the activities of these stages, except for the initiation stage. In that stage, a project charter, business case, risk register and so on are to be produced. This makes sense in an Agile environment too. You will have a run-up towards the first sprint where you need to define the vision, an initial story map and backlog, provide estimates, establish a first release planning, do resource allocation, and translate this into a budget that can be included in a business case for approval by senior management. All this qualifies as an initiation stage as defined in Prince2.

During the delivery stages, you can plan whatever you see fit. This means you can take your release planning and consider the various releases of your project as stages in a Prince2 sense. You can create your stage plans as per Prince2 and describe in them the high level scope, risks and a baseline release burn-up or burn-down.  If you need to add teams or change team composition from one release to the next, that too can go in the stage plan. The stage will consist of several Sprints that make up the release. Prince2 mandates that the business case be reviewed after each stage to see if the project still makes sense and that comes very close to Agile practices indeed.

As for the roles, the central figure in Prince2 is the project manager. He has one or more teams working for him, each with a team leader. There is no reason why these teams shouldn’t be Scrum teams with a Scrum Master. The project manager then becomes a sort of overall Scrum Master for the project, with all the responsibilities of a project manager in Prince2, as well as supporting the teams and other Scrum Masters in removing impediments. In Prince2 the users are represented in the Steering Committee by the Senior User. This person could take up the role of the Product Owner, or delegate that to someone in his organization, as long as the delegate has full autonomy and authority to make the decisions that correspond to the Product Owner role.

Prince2 is based on a set of principles that are listed in the table below.

| Principle | Description |
| --- | --- |
| Continued business justification | A PRINCE2 project has continued business justification |
| Learn from experience | PRINCE2 project teams learn from previous experience (lessons are sought, recorded and acted upon throughout the life of the project) |
| Defined roles and responsibilities | A PRINCE2 project has defined and agreed roles and responsibilities with an organizational structure that engages the business, user and supplier stakeholder interests |
| Manage by stages | A PRINCE2 project is planned, monitored and controlled on a stage-by-stage basis |
| Manage by exception | A PRINCE2 project has defined tolerances for each project objective to establish limits of delegated authority |
| Focus on products | A PRINCE2 project focuses on the definition and delivery of products, in particular their quality requirements |
| Tailor to suit the project environment | PRINCE2 is tailored to suit the project’s size, environment, complexity, importance, capability and risk |

These are compatible with Agile principles and practices like iterative development, the focus on working software, clear definition of roles and empowered teams. As long as the project stays within tolerance for budget and schedule, the teams can proceed in the way that they think is best.

In short, I don’t see any fundamental incompatibilities between Prince2 and Agile. I think the combination of the two can go a long way in addressing the concerns you sometimes hear from senior management when an organization wants to move to Agile, for example that development teams will do whatever they want or that scope is not fixed up front.

Wednesday, April 3, 2013

Estimating Part III

Sometimes I hear discussions or questions about the reasons for using the Fibonacci scale when estimating in story points. I think the main reason is that, from 2 onwards, the Fibonacci scale jumps in steps of roughly 50 to 65% between consecutive values. For estimating purposes this is fine: it is much easier to make a distinction between something of size 8 and 13 than between 10 and 11. To make this point clearer, let’s do a thought exercise.
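The size of the jump between consecutive values is easy to check. The short Python sketch below prints the relative step for the values typically found on planning poker cards; the list of card values is my assumption for the example.

```python
# Fibonacci values as typically used on planning poker cards (assumed here).
scale = [1, 2, 3, 5, 8, 13, 21]

# Relative increase from each value to the next one.
steps = [(b - a) / a for a, b in zip(scale, scale[1:])]

for a, b, step in zip(scale, scale[1:], steps):
    print(f"{a} -> {b}: +{step:.1%}")
```

Apart from the first jump from 1 to 2, the steps settle close to the golden ratio, i.e. each value is about 1.6 times the previous one.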

Suppose we are looking at 4 buildings: a 1 story building, a 2 story building, a 3 story building and a 4 story building. The buildings have flat roofs. The height of the stories in each building is not necessarily the same. The goal is to estimate the sum of the heights of the buildings in meters just by looking at them. We will start by using a linear scale.

The 1 story building is probably between 2 and 4 meters high, the 2 story building between 5 and 7 meters, the 3 story building between 8 and 10 and the 4 story building between 10 and 14 meters high. In a table this looks like this:

| Building | Minimum height (m) | Maximum height (m) |
| --- | --- | --- |
| 1 story | 2 | 4 |
| 2 stories | 5 | 7 |
| 3 stories | 8 | 10 |
| 4 stories | 10 | 14 |
| Total | 25 | 35 |

So we have a minimum total height of 25 meters and a maximum of 35 meters, with an average of 30 meters. We could have long discussions on each individual building and come up with any number in this range. Deciding for example whether building 3 is 8 or 9 meters high could be difficult, and anyway it wouldn’t make a big difference to the total.

Now let’s do the same exercise using Fibonacci numbers, i.e. we take the Fibonacci sequence as our possible values for the heights of the buildings. A reasonable estimate for each is then shown in the table below:

| Building | Estimated height (m) |
| --- | --- |
| 1 story | 3 |
| 2 stories | 5 |
| 3 stories | 8 |
| 4 stories | 13 |
| Total | 29 |

2 meters is probably a little low for a 1 story building and 5 too high, so we take 3. The 2 story building is obviously higher than 3 meters, but 8 is likely too much, so we take 5 meters. The 3 story building is higher than 5 meters, but 13 meters sounds like too much, so we settle on 8 meters. The 4 story building is likely 13 meters high, since 21 meters is again too high. This gives a total of 29 meters, only 1 meter less than the average we calculated above, and without going into too much detail like the height of the individual floors of the buildings.
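The arithmetic of the two approaches can be verified with a few lines of Python. The ranges and the Fibonacci picks are the ones from the text; only the variable names are mine.

```python
# Estimated height range in meters per building, keyed by number of stories.
ranges = {1: (2, 4), 2: (5, 7), 3: (8, 10), 4: (10, 14)}

# Fibonacci value chosen for each building.
fibonacci_picks = {1: 3, 2: 5, 3: 8, 4: 13}

min_total = sum(low for low, _ in ranges.values())
max_total = sum(high for _, high in ranges.values())
average = (min_total + max_total) / 2
fibonacci_total = sum(fibonacci_picks.values())

print(min_total, max_total, average)  # 25 35 30.0
print(fibonacci_total)                # 29
```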

So by using Fibonacci numbers we can get to a reasonable estimate pretty quickly, even if we don’t know too much detail yet about the individual buildings and floors in this case. In my previous post I showed a table with the results of a fictional planning poker session using Fibonacci numbers and came to some conclusions about accuracy, precision and confidence. I repeat that table below.
How would that work out if we replaced the Fibonacci scale with a linear scale, i.e. 1, 2, and 3 point user stories stay the same, but a 5 point story becomes a 4 point story, an 8 point story becomes a 5 point story, a 13 point story becomes a 6 point story and so on? Below is the table with the individual estimates converted to a linear scale value.
You see that with the linear scale we get a lower estimate and a higher precision: the 2σ value of 12.14 is 28% of the average of 42.6, whereas with the Fibonacci sequence we had a precision of 53%.

We now repeat this exercise with what I will call a quadratic scale (strictly speaking a doubling scale, in which each value is twice the previous one), i.e. a 1 point story and a 2 point story stay the same, but a 3 point story becomes a 4 point story, a 5 point story becomes an 8 point story, an 8 point story becomes a 16 point story, a 13 point story becomes a 32 point story and so on. The result is in the table below.
We now have a much higher average of 218.8 story points and a 2σ value of 163.29, or 75% of the mean. So with a quadratic scale we get much higher and much less precise estimates when compared to using the Fibonacci scale.
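For those who want to redo the conversion on their own planning poker results, here is a minimal Python sketch of the two mappings and of the precision figure (the 2σ value as a fraction of the mean). The function names are mine; the means and 2σ values are the ones quoted above.

```python
# Fibonacci values as used in planning poker (assumed card set).
FIBONACCI = [1, 2, 3, 5, 8, 13, 21]

def to_linear(points):
    """Map a Fibonacci estimate to the linear scale 1, 2, 3, 4, 5, 6, ..."""
    return FIBONACCI.index(points) + 1

def to_doubling(points):
    """Map a Fibonacci estimate to the scale 1, 2, 4, 8, 16, 32, ..."""
    return 2 ** FIBONACCI.index(points)

def relative_precision(two_sigma, mean):
    """The 2-sigma value expressed as a fraction of the mean."""
    return two_sigma / mean

print([to_linear(p) for p in (3, 5, 8, 13)])     # [3, 4, 5, 6]
print([to_doubling(p) for p in (3, 5, 8, 13)])   # [4, 8, 16, 32]
print(f"{relative_precision(12.14, 42.6):.0%}")    # 28%
print(f"{relative_precision(163.29, 218.8):.0%}")  # 75%
```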

My conclusion is that Fibonacci numbers give you a good compromise between accuracy and precision, and that they are also easier to use. There is no need for intermediate numbers, as one might be tempted to add when using the quadratic scale (and if you do, it starts to converge to the Fibonacci scale anyway). Likewise, there’s no need to skip numbers, as you might be tempted to do when using the linear scale (and again, if you do that you converge towards the Fibonacci scale).