Friday, February 15, 2013

Bug tracking and monitoring


Developers make bugs. There’s nothing you can do about it. So in order to deliver a high quality project, the key is to find them and fix them. That means you need a view on how many bugs have been found versus how many are potentially still in the already developed code, and you need a view on how many of these have been solved. If you don’t have a clear view on either of these parameters at all times during your project, your project has an unknown quality. That means your schedule and budget become unreliable, since you don’t know how much time and money you still need to spend on bug fixing.

A useful indication of how many bugs you have found versus how many might still be in there is the number of defects logged per developer man-day spent. If you have historical bug and time tracking data from projects on the same code base, performed by more or less the same people, you may be able to determine such a number and compare it to the ratio of your current project. In the projects I’ve done so far, this number is between 0.5 and 3 bugs per developer man-day spent. I don’t make a distinction between creating the bug and fixing it; I just take the total man-days spent by developers, irrespective of their activities. I also don’t distinguish between types of bugs, so even if a bug cannot be reproduced, or if it’s due to a misunderstanding of the functionality, I still count it.

For instance, if you seemingly have no trouble getting your user stories accepted, but your bug detection rate is lower than what you would expect, that may be a sign you need to dig a little deeper to find out why. Are the test scenarios thorough enough? Have developers improved their unit test code coverage, leading to fewer defects being deployed in the test environment? It is important to know this, because you may have a false perception that everything is going all right during the project, only to be confronted with many post-release issues that should have been avoided. Similarly, if the bug detection rate is higher than you anticipate, it is important to know why. Your developers may have less experience. Your user stories may be insufficiently elaborated. If you have more bugs than anticipated, you need to spend more time fixing and verifying them, so this may impact the amount of scope you can deliver within the set schedule and budget.
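The ratio described above is straightforward to compute from your tracking data. Here is a minimal sketch; the sample numbers and the historical range are illustrative assumptions, not real project data:

```python
# Sketch: compare a project's bug detection rate against a historical range.
# All numbers here are illustrative assumptions, not real project data.

def detection_rate(bugs_logged, developer_man_days):
    """Bugs logged per developer man-day spent, all activities included."""
    return bugs_logged / developer_man_days

# Historical range observed on comparable projects (assumed: 0.5 - 3.0).
HISTORICAL_LOW, HISTORICAL_HIGH = 0.5, 3.0

rate = detection_rate(bugs_logged=42, developer_man_days=60)
print(f"Detection rate: {rate:.2f} bugs per man-day")

if rate < HISTORICAL_LOW:
    print("Lower than expected: check test scenarios and unit test coverage.")
elif rate > HISTORICAL_HIGH:
    print("Higher than expected: check developer experience and story quality.")
else:
    print("Within the historical range.")
```

Note that the comparison is only meaningful when the historical baseline comes from projects on the same code base with a similar team, as stated above.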
You also need to know how many bugs are fixed and verified in comparison with the total number of bugs. In order to be confident in the quality of the already implemented code, these two numbers should be as close together as possible. If you have many open bugs, that means you need to spend time and money on fixing them, and that the functionality you have delivered so far is of inferior quality. The way to handle this is to give bug fixing priority over all other tasks, from the start of the project right until the end. This makes sense from a developer standpoint as well: a bug is fixed fastest while the coding is still fresh in the developer’s memory. If he needs to address a bug a few weeks or even months after he created it, he will need some time to figure out and remember what exactly he had done. If you have a well-defined Definition of Done that you apply consistently to all your user stories, this comes in a natural way, since an open bug means a user story that is not done, and a corresponding drop in velocity. If you do this, and you log your total and open numbers of bugs at the end of each sprint, you should be able to pull up a graph more or less like the one below:
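Logging these two numbers at the end of each sprint can be as simple as the sketch below, which flags any sprint where the gap between bugs found and bugs fixed-and-verified grows beyond a chosen threshold. The sprint figures and the threshold are made-up assumptions for illustration:

```python
# Sketch: track total vs. resolved bugs at the end of each sprint and flag
# sprints where the gap (open bugs) exceeds a chosen threshold.
# Sprint data and the threshold are illustrative assumptions.

sprints = [
    # (sprint number, total bugs found so far, bugs fixed and verified so far)
    (14, 80, 78),
    (15, 90, 86),
    (16, 105, 92),   # the gap widens here
    (17, 112, 109),  # action taken: the gap closes again
]

GAP_THRESHOLD = 5  # assumption: act when more than 5 bugs stay open

for sprint, total, resolved in sprints:
    open_bugs = total - resolved
    flag = "  <-- take action" if open_bugs > GAP_THRESHOLD else ""
    print(f"Sprint {sprint}: found={total} resolved={resolved} "
          f"open={open_bugs}{flag}")
```

Feeding these per-sprint totals into any charting tool gives you the two cumulative lines discussed here, and the flagged sprint is where the lines visibly drift apart.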
[Graph: total bugs found and bugs fixed and verified, per sprint]
You can see that during this project, the red line closely follows the blue line. Only at the end of sprint 16 is there a somewhat larger gap between the two. That means that at that point some action had to be taken to close the gap again. This can be working overtime, or taking on fewer user stories in order to focus on bug fixing. In any case, it was dealt with, because in the following sprints you see the gap closing again.
In a more traditional, waterfall-like approach, this graph would look somewhat like the one below:
[Graph: cumulative defects found and fixed in a waterfall-style project]
You see here that the defect detection rate goes up very fast during the testing phase, and levels off towards the end of the project. Until the leveling off starts, you basically have no idea how many bugs are still in the code, and you may still be in for negative surprises. Also, you don’t actually have reliable working software until just before the project ends. That means it is difficult to get any functional feedback from end users, since they need to work their way through the bugs first when assessing the application. Finally, the graph indicates that developers spend a good chunk of their time during the last stage of the project fixing bugs. A seemingly never-ending flood of bugs can be demoralizing for your team, since they feel they have no control over it. Even very good developers may resign over situations like this, at the time when you actually need them most.
