Monday, May 20, 2013

Agile Metrics

Metrics play an important role in project management. They are the primary way to monitor and communicate the status and progress of a project to senior stakeholders. I already mentioned several metrics in previous blogs, but I’d like to bring them all together here.

Velocity
The first metric, one that was defined in Agile and has no counterpart in traditional project management, is the velocity. This is defined as the number of story points per sprint that the Team can deliver according to the Definition of Done. The velocity is important because it tells you when all scope on the backlog will be done, in other words when you will run out of scope. If the velocity drops from one sprint to the next, there should be an explanation for that. It may be that some team members fell sick. Maybe there were a few national holidays. Maybe the user stories that were put on the sprint backlog were more complicated than thought. There can be a wide variety of reasons why the velocity varies from one sprint to the next or why it deviates from the average so far. If those reasons are one-offs, you need to see if there is a way to make up for the loss and keep the project on track, or have the Product Owner come to accept the drop in scope. If the reasons are structural, you need to make the Product Owner and senior stakeholders aware that there is an issue and that expectations must be adjusted.
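The scope runout forecast that velocity gives you amounts to a single division. Here is a minimal Python sketch; the backlog size and velocity figures are invented for illustration, not taken from a real project.

```python
import math

def sprints_to_finish(backlog_points: float, velocity: float) -> int:
    """Whole sprints needed to burn down the remaining backlog."""
    return math.ceil(backlog_points / velocity)

remaining = 120   # story points left on the backlog (hypothetical)
velocity = 23     # average story points delivered per sprint (hypothetical)
print(sprints_to_finish(remaining, velocity))  # 6 sprints
```

Tracking this number sprint over sprint makes a velocity drop visible immediately: the forecast end date moves out.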

Burn Rate
The burn rate is the budgetary counterpart of the velocity. It is defined as the number of man-days spent during a single sprint by everybody who books on the project. If your project has several scrum teams, you may want to split out the burn rate per team. If there are people booking on the project who support the teams, like business analysts, product owners, anyone who is not part of a scrum team, then these costs should be distributed evenly among the teams, so that the burn rate per team indeed includes all costs incurred to have that team deliver software. Where the velocity tells you when you will run out of scope, the burn rate tells you when you will run out of budget. If you have a fixed deadline, then the velocity tells you what will be delivered by that deadline and the burn rate tells you what you will have spent.
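The budget runout forecast works the same way as the scope runout. A hedged sketch with made-up numbers:

```python
def sprints_until_budget_exhausted(budget_man_days: float, burn_rate: float) -> int:
    """Whole sprints the remaining budget still covers at the current burn rate."""
    return int(budget_man_days // burn_rate)

budget_left = 400   # man-days of budget remaining (hypothetical)
burn_rate = 45      # man-days spent per sprint, all bookings included (hypothetical)
print(sprints_until_budget_exhausted(budget_left, burn_rate))  # 8 sprints
```

Comparing this number with the scope runout from the velocity tells you which one you will hit first.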


Defect Detection Rate 

The defect detection rate is the number of defects detected per sprint. Assuming that developers produce defects at a more or less constant rate, it is correlated with the velocity; the more story points are delivered, the more defects should be found and fixed as well. Teams tend to be pretty consistent in the quality of the software they deliver, so a drop in velocity combined with a rise in the defect detection rate should trigger the alarm. Something’s cooking and you need to find out what it is. My personal opinion is that a lower defect detection rate isn’t necessarily better than a higher one. Every additional defect found in one of the development and test environments is one fewer defect that makes it into production. From that perspective, you could defend the statement: the more defects found, the better.

Defect Closure Rate 

This is the number of defects fixed and closed per sprint. It should be equal to the defect detection rate. If it’s not, the number of open defects will rise as the project moves along, leaving the largest part of the bug fixing for the end of the project. This brings me to the last metric.

Gap Between Total and Closed Defects 

This is the difference between the total number of defects and the number of closed defects at any one time. This number should be as low as possible. A low number indicates that the quality of the software delivered so far is good. That implies that there will be few if any surprises once UAT and release preparation start. And that in turn implies that the velocity and burn rate you have measured are indeed reliable indicators for forecasting the remainder of your project. I consider this the most important metric of all, for if it’s low, it means I can indeed rely on the other indicators.
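Computing the gap is trivial once detections and closures are logged per sprint. A minimal sketch; the per-sprint counts below are invented for illustration:

```python
# Hypothetical per-sprint defect counts for four sprints.
detected_per_sprint = [12, 15, 11, 14]
closed_per_sprint = [10, 14, 12, 13]

# Gap between total and closed defects after the last sprint.
gap = sum(detected_per_sprint) - sum(closed_per_sprint)
print(gap)  # 3 open defects after four sprints
```

A gap that grows sprint over sprint is the early warning that bug fixing is being pushed towards the end of the project.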

A healthy project has a stable velocity and burn rate, combined with a stable and sufficiently high defect detection rate and a low gap between total and closed defects. The velocity and burn rate will ideally indicate that you will run out of scope before you run out of budget, and that you run out of both before the requested delivery date.


It is not possible to give absolute numbers here for any of these metrics that would indicate for an arbitrary project whether or not it is in good shape. For instance, you can’t say that a project with x developers and y days per sprint should produce no more than z defects per sprint. Such statements are nonsensical. The actual values of these metrics will depend on the technology you build your systems on, the developers and testers you have, the tools and practices they use, the existing technical debt if there is any, the functional and business context the project is executed in, and many more factors. What matters is that you determine the actual numbers that result from the execution of your project in its current context, and that you know how to interpret them, so you can act accordingly.

Thursday, April 18, 2013

Agile and Prince2

Since Agile doesn’t say much about how to structure a project on a higher level, apart from release planning, it might be interesting to see if and how Agile practices can be combined with a methodology like Prince2. I am going to limit this exercise to Prince2, since I have a certification in and experience with that particular methodology.

Prince2 divides a project into stages. There is the initiation stage, then you can divide the remainder of the project into as many delivery stages as you see fit, followed by a closing stage. Prince2 doesn’t say anything about the activities within these stages, except for the initiation stage. In that stage, a project charter, business case, risk register and so on are to be produced. This makes sense in an Agile environment too. You will have a run-up towards the first sprint where you need to define the vision, create an initial story map and backlog, provide estimates, establish a first release plan, do resource allocation, and translate all this into a budget that can be included in a business case for approval by senior management. All this qualifies as an initiation stage as defined in Prince2.

During the delivery stages, you can plan whatever you see fit. This means you can take your release planning and consider the various releases of your project as stages in a Prince2 sense. You can create your stage plans as per Prince2 and describe in them the high level scope, risks and a baseline release burn-up or burn-down.  If you need to add teams or change team composition from one release to the next, that too can go in the stage plan. The stage will consist of several Sprints that make up the release. Prince2 mandates that the business case be reviewed after each stage to see if the project still makes sense and that comes very close to Agile practices indeed.

As for the roles, the central figure in Prince2 is the project manager. He has one or more teams working for him, each with a team leader. There is no reason why these teams shouldn’t be Scrum teams with a Scrum Master. The project manager then becomes a sort of overall Scrum Master for the project, with all the responsibilities of a project manager in Prince2, as well as supporting the teams and other scrum masters in removing impediments. In Prince2 the users are represented in the Steering Committee by the Senior User. This person could take up the role of the Product Owner, or delegate it to someone in his organization, as long as the delegate has full autonomy and authority to make the decisions that correspond to the Product Owner role.

Prince2 is based on a set of principles, which are listed in the table below.

Principle | Definition
Continued business justification | A PRINCE2 project has continued business justification
Learn from experience | PRINCE2 project teams learn from previous experience (lessons are sought, recorded and acted upon throughout the life of the project)
Defined roles and responsibilities | A PRINCE2 project has defined and agreed roles and responsibilities with an organizational structure that engages the business, user and supplier stakeholder interests
Manage by stages | A PRINCE2 project is planned, monitored and controlled on a stage-by-stage basis
Manage by exception | A PRINCE2 project has defined tolerances for each project objective to establish limits of delegated authority
Focus on products | A PRINCE2 project focuses on the definition and delivery of products, in particular their quality requirements
Tailor to suit the project environment | PRINCE2 is tailored to suit the project’s size, environment, complexity, importance, capability and risk

These are compatible with Agile principles and practices like iterative development, the focus on working software, clear definition of roles and empowered teams. As long as the project stays within tolerance for budget and schedule, the teams can proceed in the way that they think is best.

In short, I don’t see any fundamental incompatibilities between Prince2 and Agile. I think the combination of the two can go a long way in addressing the senior management concerns you sometimes hear when an organization wants to move to Agile, for example that development teams will do whatever they want or that scope is not fixed up front.

Wednesday, April 3, 2013

Estimating Part III

Sometimes I hear discussions or questions about the reasons for using the Fibonacci scale when estimating in story points. I think the main reason is that the Fibonacci scale jumps in steps of roughly 50-60% between consecutive values. For estimating purposes this is fine; it is much easier to distinguish between something of size 8 and size 13 than between 10 and 11. To make this point clearer, let’s do a thought exercise.

Suppose we are looking at 4 buildings: a 1 story building, a 2 story building, a 3 story building and a 4 story building. The buildings have flat roofs. The height of the stories in each building is not necessarily the same. The goal is to estimate the sum of the heights of the buildings in meters just by looking at them. We will start using a linear scale.

The 1 story building is probably between 2 and 4 meters high, the 2 story building between 5 and 7 meters, the 3 story building between 8 and 10 and the 4 story building between 10 and 14 meters high. In a table this looks like this:

Building | Min height (m) | Max height (m)
1 story  | 2  | 4
2 story  | 5  | 7
3 story  | 8  | 10
4 story  | 10 | 14
Total    | 25 | 35

So we have a minimum total height of 25 meters and a maximum of 35 meters, with an average of 30 meters. We could have long discussions on each individual building and come up with any number in this range. Deciding for example whether building 3 is 8 or 9 meters high could be difficult, and anyway it wouldn’t make a big difference to the total.

Now let’s do the same exercise using Fibonacci numbers, i.e. we take the Fibonacci sequence as our possible values for the heights of the buildings. A reasonable estimate for each is then shown in the table below:

Building | Estimate (m)
1 story  | 3
2 story  | 5
3 story  | 8
4 story  | 13
Total    | 29

2 meters is probably a little low for a 1 story building and 5 too high, so we take 3. The 2 story building is obviously higher than 3 meters, but 8 is likely too much, so we take 5 meters. The 3 story building is higher than 5 meters, but 13 meters sounds like too much, so we settle on 8 meters. The 4 story building is likely 13 meters high, since 21 meters is again too high. This gives a total of 29 meters, only 1 meter less than the average we calculated above, and without going into detail like the height of the individual floors of the buildings.
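The picking logic of the exercise can be sketched in code: for each building, choose the Fibonacci number closest to the middle of its plausible height range. A small Python illustration:

```python
fib = [1, 2, 3, 5, 8, 13, 21]

# Plausible (min, max) height in meters per building, as in the text.
ranges = [(2, 4), (5, 7), (8, 10), (10, 14)]

def closest_fib(lo, hi):
    """Fibonacci value nearest to the midpoint of the range."""
    mid = (lo + hi) / 2
    return min(fib, key=lambda f: abs(f - mid))

estimates = [closest_fib(lo, hi) for lo, hi in ranges]
print(estimates, sum(estimates))  # [3, 5, 8, 13] 29
```

The result matches the reasoning above: 29 meters, without ever discussing individual floors.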

So by using Fibonacci numbers we can get to a reasonable estimate pretty quickly, even if we don’t know too much detail yet about the individual buildings and floors in this case. In my previous post I showed a table with the results of a fictional planning poker session using Fibonacci numbers and came to some conclusions about accuracy, precision and confidence. I repeat that table below.
How would that work out if we replaced the Fibonacci scale with a linear scale, i.e. 1, 2, and 3 point user stories stay the same, but a 5 point story becomes a 4 point story, an 8 point story becomes a 5 point story, a 13 point story becomes a 6 point story and so on? Below is the table with the individual estimates converted to a linear scale value.
You see that with the linear scale we get a lower estimate and a higher precision: the 2σ value of 12.14 is 28% of the average of 42.6, whereas with the Fibonacci sequence we had a precision of 53%.

We now repeat this exercise again but with a quadratic scale, i.e. a 1 point story and a 2 point story stay the same, but a 3 point story becomes a 4 point story, a 5 point story becomes an 8 point story, an 8 point story becomes a 16 point story, a 13 point story becomes a 32 point story and so on. The result is in the table below.
We now have a much higher average of 218.8 story points and a 2σ value of 163.29, or 75% of the mean. So with a quadratic scale we get much higher and much less precise estimates when compared to using the Fibonacci scale.

My conclusion is that by using Fibonacci numbers you get a good compromise between accuracy and precision, and it is also easier to use. There is no need for intermediate numbers, as one might be tempted to use with the quadratic scale (and if you do, it starts to converge towards the Fibonacci scale anyway). Likewise, there’s no need to skip numbers, as you might be tempted to do with the linear scale (and again, if you do that, you converge towards the Fibonacci scale).

Friday, March 22, 2013

Estimating Part II

So what is an estimate really? Maybe a stupid question, but I think it is important that all parties involved, from developers and testers to the most senior stakeholders, agree on the meaning of the estimates provided, and that includes agreement on their definition. So my definition of an estimate in the context of Agile and Scrum is the following: an estimate for a deliverable is a range or set of possible values for which that deliverable could be delivered according to the previously agreed upon definition of done, with a corresponding probability for each of the values in the range. The values can be expressed in story points, man-days or dollars.
So basically we have a range or set of values with their corresponding probabilities. In statistics, that is the definition of a probability distribution. Assuming that you arrive at your estimates using planning poker, I am going to treat an estimate as a set of discrete values with their corresponding probabilities here.
When it comes to estimates, we want accuracy, precision and confidence. Accuracy is difficult to predict beforehand, after all, it is just an estimate. But there are some assumptions that if true lead to more accurate estimates and when false will lead to inaccurate estimates. These are:
1.  The user stories and epics are independent, i.e. they can be implemented and tested and have value independently of the others
2.  The team performing the estimates is overall knowledgeable and experienced enough in the technology, business domain and estimating techniques. That does not mean they all need to be experts in all 3 areas, but working with a team consisting solely of recently graduated juniors is going to make this a much more difficult exercise than when all of your team members have more than 10 years of experience working together on similar projects with similar technology.

On precision and confidence we can say something by doing some statistical analysis on the estimates provided by the team.

Say we have a team of 5 and we are playing planning poker to estimate 10 user stories and epics for a new project. Some user stories are very detailed and small and will lead to a lower estimate; other epics are only described on a high level and will be subject to change as the project moves along. These will receive higher estimates. The results of the estimating session are shown in the table below, together with mean estimates (μ) and standard deviations (σ) for the individual stories as well as for the project as a whole:
The last column is the 2-sigma value, which is approximately the limit of the 95% confidence interval if we assume that the estimates for a single story can be modeled as a normal distribution. For a single user story that is of course not the case, but according to the central limit theorem, if we add up (or more precisely, convolve) the estimates for each of the user stories, then the probability distribution for the overall project will approach a normal distribution, the more so for more user stories, the less so for fewer. In any case we may take the overall standard deviation of 46.58 story points as more valid for the entire project than the standard deviations of the individual user stories. So we have an estimate for the entire project of 87.2 ± 46.6 story points, with 95% confidence.

These 46.6 story points are 53.4% of the mean, so our precision with 95% confidence is ±53.4%. For user story A the 2-sigma value is 1.10, or 78.6% of the mean, so our precision with 95% confidence for this single user story is ±78.6%. So by convolving the estimates for several user stories, we get a more precise estimate for the overall project than for a single small user story. And that isn’t so strange if you think about it; if you estimate a single user story, you may over- or under-estimate it, and that’s that. But when estimating several user stories and epics, some will be overestimated and others will be underestimated, and these will cancel each other out. The more user stories and epics you have, the more opportunity there is for estimating errors to cancel out.

So what does this all mean? Well, a first conclusion is that you are more likely to come up with a precise estimate for a large project than for a small one. This is a consequence of the central limit theorem. If I look back on my own projects, I have indeed had cost overruns of 1-12% on larger projects, but much larger ones on some of the small projects. All this from taking average estimates as my budget, plus a risk premium that didn’t even come close to the above mentioned 78% or 53%. OK, lesson learned.

A second conclusion is that you need to think about the confidence level that you could live with for your project estimates. If you want 95% confidence, you need to submit a budget of μ + 2σ; if you want 99%, it will be μ + 3σ. If you can live with about 70%, then μ + σ. What this means is that you need to decide on the percentage of projects with cost overruns that you can live with. That number determines your confidence interval.
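The mechanics of this post can be sketched in a few lines of Python. The planning poker votes below are invented for illustration (the original table is not reproduced here); the point is the procedure: per-story means and standard deviations, variances that add up when the estimates are convolved, and a budget of μ + kσ for the chosen confidence level.

```python
import statistics

# Each row holds five estimators' story point votes for one story (hypothetical).
votes = [
    [1, 1, 2, 2, 1],       # a small, well understood story
    [3, 5, 3, 5, 5],
    [8, 13, 8, 8, 13],
    [13, 21, 13, 21, 34],  # an epic: high level, high spread
]

means = [statistics.mean(v) for v in votes]
sigmas = [statistics.pstdev(v) for v in votes]

# Convolving independent estimates: means add, variances add.
mu_total = sum(means)
sigma_total = sum(s ** 2 for s in sigmas) ** 0.5

for label, k in [("~70%", 1), ("~95%", 2), ("~99%", 3)]:
    print(f"{label} budget: {mu_total + k * sigma_total:.1f} points")
```

Note how the epic dominates the total variance: its spread of votes is much wider, which is exactly why less detailed items deserve the larger, coarser Fibonacci values.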

Friday, March 8, 2013

Estimating


Estimating is a very important part of any project manager’s job. Wrong estimates lead to budget and schedule issues in your project. The best way to keep your schedule and budget under control is to be on top of your estimates, both before and during project execution. And you should be, as stakeholders and customers need to be able to rely on your estimates, for they have their own deadlines, targets, investment plans, marketing campaigns etc. that they need to plan as well. Delivering unreliable estimates and the corresponding unreliable schedules and budgets is simply unacceptable.

That being said, how do you come to reliable estimates for a project if there is so much that is still unknown in the early phases, and so much to be learned and changed as the project is executed? When you start a project, make sure that there is a clear vision of what the project is supposed to deliver. This has to come from the product owner and project sponsor. Try to make sure that everybody involved in the project agrees on the vision; nothing can derail a project more than disagreement on this between senior stakeholders. Next, you’ll need an initial story map to start from. This has to come from the product owner. It should have all initial user stories and epics that would be necessary to implement the vision. The epics will be higher level and less precise, as the exact details will not yet be known. The most important user stories should be worked out in enough detail, acceptance criteria included, to be able to start working on them in sprint 1. Other user stories will fall somewhere in between. It is also important that each user story and epic is as independent as possible from the others, i.e. it can stand on its own implementation- and deployment-wise, and it has business value independent of the rest. This is not always easy to achieve. Lastly, your story map must also include enough nice-to-haves. Since we consider scope our variable when we need to make decisions to keep the project on course, the story map, and later the backlog, must always contain some nice-to-haves in case you need to drop something. If you don’t have these, you will be cornered immediately after a higher priority user story takes more budget and time than initially thought. And I think any project manager knows that this will happen.

Once you have this, you should think about which team would best be suited to implement this. In my experience, the best estimates come from teams that are also going to do the implementation. My experience with dedicated estimation groups who estimate a lot of projects that are then handed over to other project teams for execution is that they tend to miss a lot and produce unreliable estimates. But your experience may differ of course. I think it is a psychological issue; if you know you will be held accountable for the correctness of the estimate, and you also know you will have the power to take corrective action during the execution, you’re going to do a much more serious job. Another requirement for the team doing the estimation is that they have enough knowledge of the business domain and technology to come to sensible conclusions. That means you need at least a few experienced people on the team, who have done similar projects in the past.

So now you have a team and a story map. The next step is to schedule a few workshops with the team and the product owner to discuss the vision and the story map. This will lead to architectural and design discussions and decisions. It will also lead to changes to the story map: new user stories, split user stories, rewritten user stories and so on. Each time a user story is considered final, you can play planning poker to come to a story point estimate for it. Now of course you are not going to continue this until you have broken down all epics and user stories into 1 or 2 point stories. That’s just impossible. Usually 2 to 5 sessions spread over 1 or 2 weeks are enough to determine everything the product owner and the team know at this point in time, and the rest should become clear as the project moves along. So now you have a story map with some detailed user stories with small story point estimates, some less detailed user stories with larger story point estimates, and some epics with the largest story point estimates. All major architectural and design decisions should have been taken. The story map also contains some nice-to-haves. The principal risks should also have been identified. At this point you have an estimate in story points for your entire project, which you can convert to a man-day budget with one of the methods I described in my previous post. Or maybe you have yet another method of doing that. Usually at this point I try to determine, together with the team, whether a risk premium is needed on top of the resulting man-day estimate, and if yes, how high it should be.

When project execution starts, you will need to repeat this exercise regularly during backlog grooming meetings, so you can incorporate feedback and changes to the backlog. The backlog is of course a living entity, so some user stories will get dropped, new ones will appear, and other ones will need to be re-estimated based on what we learned. And of course it is vital that you keep track of any scope changes in your burn-up chart, so you can take immediate action when required.

OK, so you have your estimates, your budget and your schedule, and now your project sponsor or CFO says: that's too expensive to make a positive business case, make it cheaper. What to do? Well, ask your product owner which scope can be dropped, that's what you do. And again, make sure that the nice-to-haves are not the only thing removed, because then you may run into trouble later in the project if one of your user stories comes in more expensive. If the project is like a pressure cooker, and the scope is what’s being cooked, then the bottom of your backlog is its escape valve. If you don’t open it in time and let stuff out, the whole thing will blow up.

Another alternative that may be proposed is to look for cheaper resources. That’s fine, but at that point I would do the estimating exercise over again with the new team, because these cheaper resources may not have as much domain knowledge or experience, leading to higher man-day estimates. That would result in a different budget and schedule than you would expect based on the daily rate alone.

Monday, February 25, 2013

Scope vs. Effort

When estimating and executing a project, it is important to have a good view on effort and scope. To do so you need a metric that you can use to measure each of them. For effort it is obvious: man-hours, man-days or man-weeks will be your unit. Often, the same units are used for scope as well. In Earned Value Analysis you have something called a budgeted man-day (or budgeted dollar), which is used to define both scope and estimated effort. Your actual man-days or dollars will then tell you how much you really had to spend in order to achieve a budgeted man-day or dollar. In this setup, estimated effort and scope are treated as one. Personally I find that confusing, since the same unit of measurement is used for two different entities. Another drawback is that you are expected to come up with pretty precise man-day estimates even for the smallest items. That means a lot of estimating to be done up front, and the relevance and accuracy of man-day estimates for small items a long time before they are implemented is questionable.
Of course this is one of the reasons why the Agile community came up with the concept of story points. I must admit I needed a while before I learned to appreciate the story point. Initially I found it too abstract, and I had difficulty relating it to budget. My reasoning went as follows: it is nice to have a project estimate in story points, but you can’t go to your project sponsor or CFO and tell him that this project is going to cost 750 story points. Senior stakeholders need a dollar figure in order to calculate an ROI on the project and see if it makes sense. So at the end of the day you still need to convert your story point estimate to man-days and dollars, and if that is the case, then why bother with story point estimating at all? It just seemed to make estimating unnecessarily complex. So up until a few years ago, I did all my estimates in man-days and used Earned Value Analysis to track progress.
But more recently I realized that the good thing about story points is that they can be used as a measure of scope, independent of effort. The two are related of course through the concept of velocity (story points per sprint), but they are separate entities that have their own measurement unit and can be estimated separately. This means that a team can have discussions on the contents of a user story or epic without thinking about the effort. They can decide whether user story A is larger or smaller than user story B and as a result assign more or fewer story points. There are several ways of doing this (planning poker and comparing with a baseline user story, or sorting the user stories by size, giving the smallest one a value of 1 and then going up the Fibonacci series). Once all user stories and epics are estimated in story points at the beginning of the project, and many functional and technical issues have been addressed during the estimation process, putting a man-day figure on the project as a whole becomes a lot easier. I explicitly consider the story point estimate a measure of scope, because I can then also track it separately from effort. As experience teaches, there are cheap story points and expensive story points. Over the whole project you will average out to a certain velocity, but some user stories that have a low estimate in story points will turn out to be expensive in actual effort, and some large user stories will come in pretty cheap. That’s just a consequence of not knowing everything up front and of unforeseen events that impact the actual effort. In other words, you will tend to overestimate some user stories and underestimate others, and you will encounter team members falling sick or leaving, or on the contrary feeling more productive than usual. This means that the relationship between scope size and actual effort is a statistical one, not a deterministic one.
On average you will achieve a certain velocity but for an individual user story you may either achieve that velocity, or go faster or slower. That will depend on the things you didn’t know when estimating the story and circumstances you encounter while you are implementing it, both of which you don’t have under control at the start of the project.
In Earned Value Analysis, this would come up as some budgeted man-days being cheaper in actual man-days than other budgeted man-days. Again, I just find that confusing. I prefer to work in story points nowadays. If we add a new user story, we only consider its size relative to the ones we already have on the backlog. If we are not entirely sure, we assign more points to incorporate the risk. We don’t bother with getting a precise man-day estimate. After adding a new story, our scope has increased by some points, and our project burn-up chart will tell us whether the total scope is still feasible within the allocated budget and schedule or whether we have to remove something.
All fine and well, but as I said previously, CFOs don’t care about story points when the time comes to assess projects on their potential ROI. So how do we go from story points to man-days and dollars? If you have an existing team with an established velocity, this is straightforward. Just divide the number of story points they estimated by their velocity and you end up with the number of sprints. From there it is easy to get to man-days, and then via their daily rates to dollars. If you have a newly composed team with a still unknown velocity, you can approach this in several ways.
The first is to let the team establish an initial value for the velocity. Let them take a few user stories of different sizes and try to estimate these in man-days. Then compare the results to the story points of those user stories. Continue with this until the man-day estimates for these stories are more or less in line with the story point estimates, i.e. a 1 point story results in an X day estimate, a 5 point story in more or less a 5X day estimate. That should give you an indication of story points per man-day, which you can translate into an initial velocity and total project man-days and dollars.
Another approach is from a resourcing perspective. You can ask the team the question, how much time would this entire project take us, and which team composition would we need for it, based on the information we learned estimating the initial user stories and epics? The result will be a man-day estimate for the entire project, and you can work your way back to an initial velocity.
Or you can let them plan the first sprint based on what they think is realistic and take that as your initial velocity. Again, just divide the number of story points by this initial velocity and from there go to man-days and dollars.
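Whichever way you arrive at an initial velocity, the conversion chain itself is just arithmetic. A minimal sketch, with made-up numbers for illustration (the function and its parameters are mine, not an established formula):

```python
import math

def project_estimate(total_points, velocity, team_size, sprint_days, daily_rate):
    """Translate a story-point backlog into sprints, man-days and dollars.
    All inputs are figures you agree on with the team up front."""
    sprints = math.ceil(total_points / velocity)   # sprints needed to burn the backlog
    man_days = sprints * sprint_days * team_size   # total effort in man-days
    cost = man_days * daily_rate                   # budget in currency
    return sprints, man_days, cost

# Example: 300 points, velocity 25 points/sprint, 6 people,
# 10 working days per sprint, $500/day blended rate
sprints, man_days, cost = project_estimate(300, 25, 6, 10, 500)
# 12 sprints, 720 man-days, $360,000
```

Note the `math.ceil`: a partially filled final sprint still costs a full sprint of team time.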

Friday, February 15, 2013

Bug tracking and monitoring


Developers make bugs. There’s nothing you can do about it. So in order to deliver a high-quality project, the key is to find them and fix them. That means you need a view on how many bugs have been found vs. how many are potentially in the already developed code, and you need a view on how many of these have been solved. If you don’t have a clear view on either of these parameters at all times during your project, your project has an unknown quality. That means that your schedule and budget are becoming unreliable, since you don’t know how much time and money you still need to spend on bug fixing.

A useful indication of how many bugs you have found vs. how many might be in there is the number of defects logged per developer man-day spent. If you have historical bug and time tracking data on projects on the same code base, performed by more or less the same people, you may be able to determine such a ratio and compare it to the ratio of your current project. In the projects I’ve done so far, this number is between 0.5 and 3 bugs per developer man-day spent. I don’t make a distinction between creating the bug and fixing it; I just take the total man-days spent by developers, irrespective of their activities. I also don’t distinguish between types of bugs, so even if the bug cannot be reproduced, or if it’s due to a misunderstanding of the functionality, I still count it.

For instance, if you have seemingly no trouble getting your user stories accepted, but your bug detection rate is lower than what you would expect, that may be a sign you need to dig a little deeper to find out why. Are the test scenarios thorough enough? Have developers improved their unit test code coverage, leading to fewer defects being deployed in the test environment? It is important to know this, because you may have a false perception that everything is going all right during the project, only to be confronted with many post-release issues that should have been avoided. Similarly, if the bug detection rate is higher than you anticipated, it is important to know why. Your developers may have less experience. Your user stories may be insufficiently elaborated. If you have more bugs than anticipated, you need to spend more time fixing and verifying them, so this may impact the amount of scope you can deliver within the set schedule and budget.
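As a sketch of this check (the input numbers are hypothetical; the 0.5–3 range is simply the range from my own projects mentioned above):

```python
def detection_rate(defects_logged, developer_man_days):
    """Defects found per developer man-day spent, all activities and bug types included."""
    return defects_logged / developer_man_days

# Historical range observed on my own projects: roughly 0.5 to 3 bugs per man-day.
HISTORICAL_LOW, HISTORICAL_HIGH = 0.5, 3.0

rate = detection_rate(defects_logged=42, developer_man_days=120)  # hypothetical figures
if rate < HISTORICAL_LOW:
    print("Rate unexpectedly low: check test-scenario thoroughness and unit-test coverage")
elif rate > HISTORICAL_HIGH:
    print("Rate unexpectedly high: check developer experience and user-story elaboration")
```

The point is not the exact thresholds, which will differ per code base and team, but that a deviation in either direction is a prompt to investigate.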
You also need to know how many bugs are fixed and verified in comparison with the total number of bugs. In order to be confident in the quality of the already implemented code, these two numbers should be as close together as possible. If you have many open bugs, that means that you need to spend time and money on fixing them, and that the functionality you have delivered so far is of inferior quality. The way to handle this is to give bug fixing priority over all other tasks, from the start of the project right until the end. It makes sense from a developer standpoint as well. A bug is fixed fastest if the coding he has done is still fresh in his memory. If he needs to address a bug a few weeks or even months after he created it, he will need some time to figure out and remember what exactly he had done. If you have a well-defined Definition of Done that you apply consistently to all your user stories, this comes in a natural way, since an open bug means a user story not done, and a corresponding drop in velocity. If you do this, and you log your total and open numbers of bugs at the end of each sprint, you should be able to pull up a graph more or less like the one below:

You can see that during this project, the red line closely follows the blue line. Only at the end of sprint 16 is there a somewhat larger gap between the two. That means that at that point some action had to be taken to close the gap again. This can be overtime, or taking on fewer user stories in order to focus on bug fixing. In any case, it was dealt with, because in the following sprints you see the gap closing again.
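The bookkeeping behind such a graph is trivial: one snapshot per Sprint Review. A minimal sketch with invented numbers (loosely mimicking the sprint-16 gap described above):

```python
# Per-sprint snapshots taken at each Sprint Review:
# (sprint number, total bugs logged so far, bugs fixed and verified so far)
snapshots = [
    (14, 180, 176),
    (15, 195, 190),
    (16, 215, 196),   # gap widens to 19 open bugs -> time to act
    (17, 222, 215),
    (18, 230, 228),   # gap closing again
]

# Flag any sprint where the open-bug count exceeds a threshold you pick yourself
THRESHOLD = 10
for sprint, total, fixed in snapshots:
    open_bugs = total - fixed
    flag = "  <-- investigate" if open_bugs > THRESHOLD else ""
    print(f"sprint {sprint}: {total} logged, {open_bugs} open{flag}")
```

Any bug tracker can export these two running totals; the discipline is in taking the snapshot every sprint, not occasionally.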
In a more traditional waterfall like approach, this graph would look somewhat like the one below:
You see here that the defect detection rate goes up very fast during the testing phase, and levels off towards the end of the project. Up until the leveling off starts, you basically have no idea how many bugs are still in the code, and you may still be in for negative surprises. Also, you don’t actually have reliable working software until just before the project ends. That means it is difficult to get any functional feedback from end users, since they need to work their way through the bugs first when assessing the application. Finally, the graph indicates that developers spend a good chunk of their time during the last stage of the project fixing bugs. A seemingly never-ending flood of bugs can be demoralizing for your team, since they feel they have no control over it. Even very good developers may resign over situations like this, at the time when you actually need them most.

Tuesday, February 5, 2013

Project baselining and tracking

As a project manager, independent of methodology, you need to keep track of four things: scope, schedule, budget, and quality. Nothing new here. In Agile, the favorite method (or at least my favorite method) of keeping tabs on your project is the burn-up chart. This chart covers scope, schedule and budget. Tracking quality will be the topic of another blog post.
You start out with a certain amount of scope for your project, which you can express in story points, ideal man-days, dress sizes or whatever metric you choose, as long as it is numeric and everybody involved in the project understands and agrees on its meaning. (There will be future blog posts discussing the various options for measuring scope and their implications, as well as estimating the initial scope of a project.) Please note that scope is treated as a variable throughout the project; we keep budget and schedule fixed.
The schedule follows from the speed at which you implement scope; this is of course known as the velocity, the number of story points implemented per sprint. In order to baseline your project, you must have some idea of the velocity up front. If your team has been working together in the same composition on a previous project, they will have a pretty good idea of their velocity. If the team works together for the first time, you need to discuss and agree with them on a velocity that they feel can realistically be achieved. If possible, make corrections for known holiday periods or other absences. The alternative is of course to use the first few sprints to determine this velocity, but then you cannot give a baseline schedule up front. This may or may not be an issue, depending on your business processes regarding project funding, approval and kick-off.
You will also have a fixed budget within which your team must deliver the scope. This budget can be expressed in real man-days or in a currency. The translation from scope to budget must also be agreed upon with your team, at least as far as man-days are concerned. The actual dollar amounts will depend on the rates and salaries of the team members, and invoicing and payment cycles are also not likely to match your sprints. Tracking the budget in monetary terms is therefore a little more complicated, but there are no fundamental reasons against it. Since you have already established a baseline schedule based on your expected team velocity, you can also produce a baseline cost progression based on expected team member availabilities. I will refer to the speed at which you spend budget, i.e. the number of man-days or dollars per sprint, as the burn rate.
So now we have four metrics: the baseline scope to be delivered, a baseline delivery schedule based on expected team velocity, a budget, and a baseline cost progression based on the expected burn rate. You can plot these in a graph versus time, in units of fixed-length sprints; see the figure below.
In this graph you see that up until sprint 10 the expected velocity is at its highest. That’s because one of the teams is expected to leave the project after that. From sprint 20 on, only a minimal team will be taking care of post-release work. The baseline cost progression follows the same pattern. Scope is expressed in story points, budget in man-days.
In order to make this a burn-up chart, you need to update the number of story points delivered (as per your Definition of Done) and the number of man-days spent after each Sprint Review. After several sprints, the result will look something like this:

You see here that the actual scope delivery is above the baseline, and that the actual cost progression is slightly under the baseline, so this project is going well. You also note that the blue project scope line has upward bumps at sprints 5 and 10. At these points, re-estimations and new user stories resulted in additional scope. According to the scope trend line, this project will finish by sprint 15, and according to the cost trend line, around 1800 man-days will have been spent at that point. Of course, the trend lines do not yet take into account the reduced staffing that will come into play after sprint 10, making these projections for cost and schedule too optimistic in this particular case. However, so far this project has performed slightly under budget, and it is ahead of schedule since more scope has been delivered than projected.
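Reading the trend lines is just a linear extrapolation from the actuals so far. A back-of-the-envelope sketch, with illustrative numbers chosen to land on the same sprint-15 / 1800-man-day figures as the example (the function is mine, not a standard formula):

```python
import math

def project_finish(total_scope, points_done, man_days_spent, sprints_elapsed):
    """Linear trend projection from actuals, assuming the velocity and burn
    rate observed so far continue unchanged (i.e. no staffing reductions)."""
    velocity = points_done / sprints_elapsed           # points per sprint so far
    burn_rate = man_days_spent / sprints_elapsed       # man-days per sprint so far
    finish_sprint = math.ceil(total_scope / velocity)  # sprint in which scope runs out
    projected_cost = finish_sprint * burn_rate         # man-days spent at that point
    return finish_sprint, projected_cost

# Hypothetical: 1500 points total scope, 1000 done after 10 sprints, 1200 md spent
finish, cost = project_finish(1500, 1000, 1200, 10)
# finish sprint 15, 1800 man-days
```

Like the chart’s trend lines, this assumes the current velocity and burn rate hold; any planned staffing change means redoing the projection piecewise.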
A different outlook would be the one below:
This project is running behind schedule; less scope has been delivered than planned, and at the same time more budget has been spent. On top of that, additional scope has been added to boot. The scope trend line indicates that all current scope will be delivered by sprint 22 (but only if we keep all teams on board, so no reduction in velocity), and the cost trend line indicates more than 3000 man-days spent at that point, a 50% overrun. This project is headed for disaster, and it’s time for immediate remediating action. That action would be to start dropping non-essential scope items so that the blue project scope line comes down again, as well as tackling any impediments identified by the team as reasons why they need to spend extra time.