Although we no longer have to contend with actual moths infiltrating our hardware (yep, a moth found in an early computer’s relay is the popular origin story for the term bug), their digital descendants are still regular visitors to every codebase on the planet. In the same way that bugs have changed over time, so has the way that they are dealt with, and this is particularly pertinent to our new agile way of thinking.
Previously, our waterfalling world viewed the handling of bugs very sequentially (see Figure 1).
With this simplified process as the benchmark, let’s look at what needs to change when implementing Scrum. In particular, it is important to appreciate that programming and testing need to be conducted in tandem rather than in sequential phases if teams hope to deliver working functionality early and often.
New Definitions
Before exploring some new Scrum-friendly processes for handling bugs, I’d like to set up some foundations with a few definitions and principles that I like to use.
Definition 1: Issues
- An issue is a problem that occurs during the sprint and is tightly coupled to a user story that has not yet met its definition of done. Issues will therefore typically be picked up during the sprint in which the corresponding user story is being worked on, by the programmer, the automated build, a tester conducting exploratory testing, or the product owner during a walkthrough.
- An issue is not a product backlog item (PBI). Instead, an issue should be seen as part of the evolving acceptance criteria for a user story. Essentially, what I am saying is that until the issue is resolved, the user story is not complete, and that being the case, an issue is a part of the actual user story rather than an independent, albeit associated, product backlog item.
Definition 2: Bugs
- A bug is a bug only if it is identified after a user story has been completed and accepted by the product owner. Bugs will therefore typically be picked up by users (postrelease) or via an automated regression test (following the implementation of subsequent user stories).
- A bug is a type of product backlog item. A user story is another type of product backlog item. Bugs and user stories should be prioritized together in the same product backlog and estimated using the same approach, such as relative estimation. A particular bug may relate to a specific user story, but it should be treated independently as far as any tracking and prioritizing is concerned. A bug can theoretically be represented utilizing the user story format, although I personally don’t find it to be a suitable format in most cases.
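The two definitions boil down to one question: has the user story already been completed and accepted by the product owner? Here is a minimal Python sketch of that rule. The names (`UserStory`, `ProblemKind`, `classify_problem` and the two boolean fields) are hypothetical and chosen purely for illustration; they don’t come from Scrum or from any tracking tool.

```python
from dataclasses import dataclass
from enum import Enum


class ProblemKind(Enum):
    ISSUE = "issue"  # part of the unfinished user story's acceptance criteria
    BUG = "bug"      # an independent product backlog item


@dataclass
class UserStory:
    # Hypothetical fields, used only to express the two definitions above.
    title: str
    meets_definition_of_done: bool = False
    accepted_by_product_owner: bool = False


def classify_problem(story: UserStory) -> ProblemKind:
    """Classify a problem found against `story`.

    It is a bug only if the story was already completed and accepted;
    otherwise it is an issue that belongs to the story itself.
    """
    if story.meets_definition_of_done and story.accepted_by_product_owner:
        return ProblemKind.BUG
    return ProblemKind.ISSUE


in_progress = UserStory("Reset password")
shipped = UserStory(
    "Login",
    meets_definition_of_done=True,
    accepted_by_product_owner=True,
)

print(classify_problem(in_progress).value)  # issue
print(classify_problem(shipped).value)      # bug
```

Under this sketch, an issue never gets its own backlog entry; only the `BUG` outcome would lead to a new PBI.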
New Principles
Now let’s look at the three new principles.
Principle 1: Remove the Red Tape
Recall the second value from the Agile Manifesto: “Working software over comprehensive documentation.” During my earlier years getting “soaked” in the waterfall world, I observed that a nontrivial amount of time was spent by both testers and programmers carefully documenting in painstaking detail the relevant bug minutiae. I remember regularly asking myself whether it was actually taking longer to document or to fix the damn bugs. Scrum relies on as much real-time communication as possible (rather than formalized, written bug reports), but if documentation is required, it should be fit for purpose and kept to the bare minimum.
Principle 2: Address Issues Immediately
There is nothing worse than your own stale code. Oh, wait a minute, yes there is — someone else’s stale code! Sadly, back in the day when we all used to follow the sequence detailed at the beginning of this article, it was very common to have to return to work on bugs in code that we had well and truly moved on from. The spin-up time to get back into the headspace to address old issues (be they yours or those of a colleague who’s off on vacation) is significant and, frankly, a big fat waste of time. The sooner an issue is found, the cheaper it is to fix, and that is why with Scrum, testing is completely entwined with the programming.
Principle 3: It Ain’t Over ’Til It’s Over
Bottom line is that unless a user story meets the definition of done, it might as well not exist as far as the customer is concerned. Customers are simply interested in final results and obtaining business value. If a user story is not yet done, it needs to be the top priority for the developer(s) working on it, and they should avoid moving on to any new work until it is completely done and dusted!
New Approaches
With our new bug-handling foundations now established thanks to the previous definitions and principles, let’s focus on some approaches that I recommend you follow within the sprint:
- Scenario 1: A tester is conducting some final exploratory testing on a user story and discovers an issue.
First, because the user story should be the top priority of the programmer working on it (see Principle 3), the tester should feel free to walk over to the programmer and explain and/or demonstrate the issue as soon as it is found. Again, because the user story is the top priority, the programmer should drop whatever he or she is doing and immediately jump on the issue. In this situation, there isn’t any requirement for written documentation, as the verbal discussion will suffice, assuming that the issue is immediately resolved and verified.
- Scenario 2: Same as Scenario 1, but this time, the programmer in question is already in the middle of resolving a different issue (related to the same user story).
In this case, the tester, after finding another issue, looks over and sees the programmer with headphones firmly in place and in the zone fixing a previous issue. The last thing the tester wants to do is to disturb the programmer mid-fix. As such, it is important to capture the details somewhere so that the tester can continue with the exploratory testing without forgetting the details of the new issue.
As discussed in the definition of issue, an issue should be treated as part of the acceptance criteria of the user story, saving the tester the pain of creating a new bug, classifying it, assigning it, prioritizing it, and so on. Instead, my recommendation is that the tester simply add a line item to the acceptance criteria with a date/time reference, initial it, and add some bullet point details of the issue. When the programmer is free, a discussion can take place using the notes as a prompt. Also, the documentation ensures that the programmer can get on with the resolution even if the tester is not around for whatever reason.
- Scenario 3: During the final user acceptance testing for a release, a range of trivial user interface bugs are found that were somehow missed during development.
Again, let’s try to reduce the time spent on unnecessary administration; in this situation, I recommend that a single PBI be created as a container for the collection of minor bugs. Each specific fix may take only minutes, so creating individual PBIs for each issue could end up taking longer than the actual fix-ups!
I recommend following this approach only if:
- The trivial bugs are of a similar priority level.
- They are somewhat related and it makes sense to tackle them at the same time.
If these conditions aren’t met, then simply create separate PBIs for the items even if they are seemingly trivial in their own right.
- Scenario 4: During a sprint, a critical bug is found in production requiring some of the Scrum team to resolve it.
The first question to ask is, How critical is critical, or more specifically, can it wait until the next sprint? The last thing you want to do is to change the goal of the sprint. Assuming that the production bug can wait, it should be captured as a PBI, entered into the product backlog, prioritized by the product owner, and potentially picked up at the next sprint planning session.
However, what should happen if the discovered production bug is one of those dreaded villains that simply can’t wait? Well, we then need to ask another question: How long will it take to fix the bug? If you recall from the article about sprint planning, it isn’t wise to max out team capacity on new sprint tasks alone; leaving some slack provides room for handling non-project-related disruptions. As such, this buffer time can also be allocated to resolve the occasional emergency bug without disrupting the sprint.
If, however, the resolution will take longer than the buffer time, you have two choices. First, you can treat these issues as impediments and track them accordingly (check out this article about Incriminating Impediments), or, if the issue is such a major drama (to the extent that it completely destroys the sprint goal), there is always the undesirable fallback position: a sprint cancellation that can be called by the product owner. A cancellation will end the current sprint and send the team back to sprint planning.
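The questions asked in Scenario 4 form a small decision tree, which can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, its parameters, and the idea of measuring the fix estimate and the remaining buffer in hours are all hypothetical, and a real team would make these calls in conversation with the product owner rather than in code.

```python
from enum import Enum


class Action(Enum):
    ADD_TO_PRODUCT_BACKLOG = "capture as a PBI for the product owner to prioritize"
    FIX_WITHIN_BUFFER = "fix now using the sprint's buffer time"
    TRACK_AS_IMPEDIMENT = "fix now and track it as an impediment"
    CONSIDER_SPRINT_CANCELLATION = "product owner may cancel the sprint"


def triage_production_bug(
    can_wait: bool,
    estimated_fix_hours: float,
    buffer_hours_left: float,
    destroys_sprint_goal: bool = False,
) -> Action:
    """Walk through the Scenario 4 questions in order (hypothetical helper)."""
    # Question 1: can it wait until the next sprint?
    if can_wait:
        return Action.ADD_TO_PRODUCT_BACKLOG
    # Question 2: how long will the fix take compared with the buffer?
    if estimated_fix_hours <= buffer_hours_left:
        return Action.FIX_WITHIN_BUFFER
    # Too big for the buffer: impediment, or cancellation as the last resort.
    if destroys_sprint_goal:
        return Action.CONSIDER_SPRINT_CANCELLATION
    return Action.TRACK_AS_IMPEDIMENT


print(triage_production_bug(can_wait=True, estimated_fix_hours=2, buffer_hours_left=8).name)
# ADD_TO_PRODUCT_BACKLOG
print(triage_production_bug(can_wait=False, estimated_fix_hours=2, buffer_hours_left=8).name)
# FIX_WITHIN_BUFFER
print(triage_production_bug(can_wait=False, estimated_fix_hours=20, buffer_hours_left=8).name)
# TRACK_AS_IMPEDIMENT
```

The order of the checks mirrors the order of the questions in the text, so the cancellation outcome is only reachable once the other options have been ruled out.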
Turning Moths into Butterflies
Bugs can certainly cause pain, and like it or not, they’re never going to become an extinct species. However, what we have learned are better ways to deal with them. We now know that disposing of fresh bugs is easier than having to deal with old, festering ones and that painstakingly documenting every issue is a waste of time.
Scrum handles testing and bugs very differently from traditional approaches. By adopting these new definitions and principles, you will start to avoid the unnecessary overhead and communication breakdowns that have previously stopped teams from turning those moths into less ugly butterflies.
Great article Ilan,
Thanks Ryan – glad you enjoyed
Agreed. Defects found in stories that are not ‘done’ should be fixed before the story can be considered done. There may be rare occasions when a defect cannot be fixed within the sprint. If so, the Product Owner has to decide whether to accept the story with limitations or reject it entirely. Either way, a new backlog entry is required to fully implement the story.
The debate about including or not including defects in velocity is unimportant in my view. As long as the team is consistent in its approach, velocity will work itself out.
Nice feedback Vin. My only comment would be that if the PO does in fact reject a problematic story then it can simply move from the sprint backlog back to the product backlog rather than needing to create a new entry.
@vin
a very good point about velocity working itself out when treated consistently.
Excellent post, bugs are an area that I found particularly challenging in my quest for agility. The product I inherited had a long list of bugs that could not be ignored when trying to convince my peers to go Agile. Bugs seem to be an item that it would be impossible to schedule because “they usually need to be fixed YESTERDAY and you’re telling me I may have to wait 2 sprints!” But after finally convincing them to use a method very similar to the impediment option you’ve proposed along with a change to the way we check in, test and release, 95% of bugs are now scheduled the same way user stories are – the other 5% are usually environment or configuration problems that are really just regular everyday impediments.
Estimating bugs is now my greatest challenge. I find that there is definitely a difference in estimation error between bugs and user stories – which is probably why you’ve mentioned their impact on velocity. In my case, implementing Scrum has definitely stabilised the product and reduced the rate of bugs found, but they will always be a part of each sprint (due to the inherited level of technical debt combined with a lack of funding/interest in repaying it) so I think that including them in the velocity is important.
On the topic of ‘when it’s a bug’ I agree with all except one niggly thing: Scenario 2 – if the tester finds 2 problems with the user story then I’d argue that it’s probably important to yank the dev out of the zone and make sure he’s doing the right thing in the zone – but then, this depends on the tester-dev relationship. I like to imagine that scene in the ‘Social Network’ where Eduardo grabs Mark’s laptop while he’s in the zone and smashes it and tells him to “make sure his definition of done is correct!”.
Sounds like you are making some great progress there Nick! I generally agree with you Nick regarding ‘scenario 2’ i.e. if the story is riddled with issues then it is time for an immediate chat…
Ah great scene! That would be particularly satisfying!
Hello Ilan,
Great article, thanks for it. You asked for feedback on how your readers handle bugs; here’s mine.
We are new “real Scrum” practitioners (up until the current project we were not really applying Scrum principles according to me although we called it “Scrum”) and the handling of bugs is a real and big issue that needed to be handled specifically.
What we do:
* “Stories” and “Bugs” are two different kinds of items in the backlog, as you suggest.
* “Bugs” represent not only bugs that need to be addressed in the current product but also various tasks related to shipping the previous product (including, but not limited to, bug fixes), since the new project starts before the result of the previous project has been completely delivered to the customer in its final version, i.e. before the code has been shipped to and validated by the customer.
* During Sprint Planning, the team not only commits to a set of “Stories” items to be DONE by the end of the sprint, but also commits to a set of “Bugs” items to be finished. “Bugs” items do not use the same definition of DONE as “Stories”; a “Bugs” item describing a task is done once the task is finished; a “Bugs” item describing a bug to fix is done once the bug has been fixed and validated.
* However, commitments on “Bugs” are not scheduled for the end of the sprint but for the end of each week of the sprint. This is to ensure that some “Bugs” are fixed every week, since the customer is closely following our progress on the matter and because we are using 4-week sprints. This can be seen as a kind of “mid-sprint deliverable”.
* Finally, in case of an unforeseen impediment (pretty much your Scenario 4), it is possible to replace “Bugs” items in the backlog with new ones during the sprint. This requires agreement from both the Product Owner (that is, acknowledging that the issue is to be fixed NOW and not during the next sprint) and the team, which revises its commitment on “Bugs” items for the sprint. For instance, adding a big item may require removing several items to keep the global sprint commitment on track.
The core idea there is that the commitment of the team during Sprint Planning must NOT be spoiled: otherwise the team may not be able to deliver what has been committed, resulting in a big loss of motivation. Indeed, the feeling would be that “it took so much time to do the unplanned tasks that we could not work effectively and deliver as expected; we don’t feel guilty since we didn’t feel in charge.” That would definitely be a bad Scrum practice 🙂
Right now, applying these principles is more easily said than done, since this process is new. We’ll see how things go in the long run; maybe it was a bad idea, maybe we just need to refine it. Retrospectives will tell.
Let me know what you think of it!
Thanks in advance,
Jean-Pierre Lambert
@Jean-Pierre – thank you for sharing your method so comprehensively with us all. Sounds like you are making some good headway! I will make a few suggestions though based on what I’ve read:
1) I think it is a slippery slope to allow the switching out of PBIs mid-sprint, especially frequently. It is pretty hard to get an exact match especially without impacting dependencies.
2) Based on the above point, I recommend that you perhaps try out some shorter sprint iterations – maybe 2 weeks? This way you can swap things more rapidly (between sprints) without impacting the sprint commitment. Is there a particular reason that you are working with 4 week sprints? From my experience this is too long.
3) Regarding your statement “During Sprint Planning, the team does not only commits on a set of “Stories” items to be DONE by the end of the sprint, but also commits on a set of “Bugs” items to have finished.” – why don’t you just use the priority order to determine what should be delivered? i.e. if bugs are a high priority then they should be near the top. It sounds like you are effectively running two separate backlogs.
I hope this helps 🙂
Thanks Ilan for the constructive comment, let me reply point by point.
1) We could not come up with any better alternative to handle these “unexpected, urgent tasks”. Maybe we should make sure that the team has enough free time to handle these unexpected tasks; however, there is no clear pattern in when they happen. Recurring tasks are already numerous but are not included in the backlog, as the team is expected to take them into account in their estimates, i.e. on average people are not full time on the backlog. This is, however, easier said than done: even these recurring tasks are taking much more time than we initially expected…
Also, this item switching was especially needed now because we are shipping our previous product while starting work on the new one; once the previous product has definitely shipped, we do not expect this to happen too frequently.
Indeed, the “Bugs” on which we discuss advancement every week with the customer are not expected to change mid-sprint: the Product Owner has made it clear to the customer that we are working on 4-week long sprints and that the list of bugs to be fixed cannot be changed in the middle of these 4 weeks.
2) As a Scrum Master, I already suggested that it could be a good idea to reduce the duration of the sprint. However, as much as it is my duty to suggest it, it is not my decision to make. There are also some reasons for keeping 4-week iterations: for instance, it is easier to synchronize with other teams that will certainly keep 4-week iterations. Also, when working on embedded software, short iterations usually do not work well.
I guess I should get people to weigh the pluses and minuses of reducing the iteration length, and if possible give it a try. Experience will tell.
3) Well, the problem is that bugs are not necessarily high priority; actually we are expected to spend some fixed amount of time on them in parallel with delivering the new product. We previously tried to dedicate one day per week to bugs but everybody agreed it was not working well. So instead we treat them as any other items in the backlog, but with an earlier delivery date: the end of the week instead of the end of the iteration.
Maybe another way to look at it is that stories are higher priority than bugs, but bugs should be fixed by a scheduled date… Either way, I feel it is broken, but we have not come up with any better option so far.
Following your advice, maybe we should prioritize bugs by something like “X bugs fixed per story done”, thus giving us a priority rule for mixing stories and bugs in the backlog. If the stories are small enough, this could work pretty well in the area of “delivering some bug fix news to the customer every week”. Again, especially if the sprints were shorter: right now, with 4-week sprints, a “short story” is about one week of work, which is too long for the bug-and-stories mixing to work properly.
Well, thanks! It always helps to discuss with other people 🙂
Hi Pierre – this is an interesting situation. However, having thought about it, I think you need to look at things differently due to your multiple delivery backlogs, i.e. you seem to have a different delivery schedule for legacy bugs than you do for other PBIs. This goes quite counter to Scrum as there is no single source of truth when it comes to prioritizing. As such, if I were you I would either create a separate BAU team (which might be only one person) to work on the legacy bugs or, alternatively, reduce the capacity of the sprint and use the balance of the time to focus on the legacy bugs (outside of the Scrum project).
p.s. sorry for the delay in replying to this.
[…] Software Bugs and Scrum: a proposal for managing bugs with Scrum […]
@IIan, Great blog post. I don’t fully agree with your approach, but I don’t hate it either. You’ve clearly thought through it, and you also clearly have been there, done that!
I don’t like that you assign story points to bugs and I also don’t like that it seems very complicated. My solution is mildly complicated too, but I believe it is simpler than yours and reflects that bugs are actually a drag on velocity not a component of it. Again, I don’t hate yours and it’s probably the second best one I’ve seen (besides my own — but that’s just *MY* opinion) out there and I’ve seen a lot of them!
I totally agree with preferring verbal communication/collaboration over mega-documenting… though some bugs require a fair amount of documentation…(usually, though, as documentation, all you need is “steps to reproduce” and a couple of screen shots).
The solutions I generally advocate are here:
http://scrumcrazy.wordpress.com/2011/01/06/in-scrum-should-i-assign-story-points-to-bugs/
http://scrumcrazy.wordpress.com/2011/08/18/how-do-i-handle-production-support-on-a-scrum-team/
Hi Charles – thanks for the feedback! Out of interest which part(s) seem complicated? Perhaps it was my explanation that wasn’t overly clear? Yes the story point assigning is contentious but again, I think for forecasting purposes it is important to include bugs in the velocity. Perhaps a compromise is to track velocity and (pure) velocity?
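As a rough illustration of what tracking both numbers could look like, here is a short Python sketch that computes a sprint’s velocity with and without bug points. The `CompletedItem` type, the `kind` labels, and the sample point values are assumptions made up for this example; they are not taken from any real sprint.

```python
from dataclasses import dataclass


@dataclass
class CompletedItem:
    # Hypothetical record of a PBI finished during the sprint.
    title: str
    kind: str    # "story" or "bug"
    points: int  # relative estimate, same scale for both kinds


def velocity(items: list[CompletedItem]) -> int:
    """Total points completed, counting stories and bugs alike."""
    return sum(item.points for item in items)


def pure_velocity(items: list[CompletedItem]) -> int:
    """Points completed on user stories only, with bugs excluded."""
    return sum(item.points for item in items if item.kind == "story")


sprint = [
    CompletedItem("Export report", "story", 5),
    CompletedItem("Password reset", "story", 3),
    CompletedItem("Crash on empty cart", "bug", 2),
]

print(velocity(sprint))       # 10
print(pure_velocity(sprint))  # 8
```

The gap between the two figures (here 2 points) is one way to see how much of a sprint went to bugs rather than new functionality.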
Bah, Scrum Bug! How to manage those bugs and issues http://t.co/358ryYLm
A few questions:
– If the team is geographically distributed then documenting the issue will be required, correct?
– Coder and tester working together is an excellent way to ensure a good quality product; however, if the tester does not document the issues/bugs, how do we ensure that the coder is taking proper care in writing the logic, i.e. that nothing half-baked is handed over which will consume more time on the tester’s side?
– Whenever the developer says that the module/task is coded (Dev complete) and good to test, any failure should be reported as a bug, although the overall story is still under development, don’t you think so?
– If there is no bug raised during the user story coding period, how can the RnD manager go back and enhance the coding standards and review process?