Agile: Story Completion Problems

On a recent project we ran into an interesting problem – or rather we ran into it at the end of every iteration, and especially at the end of each release. The essence of the problem was that the stories were “done” but the customer had not signed off on them. In non-agile shops the mantra “QA has it” is heard for the same symptom; in fact our morning stand-ups were starting to have the same flavor. On this project there are 2-3 developers for each tester, and our currently immature state of acceptance test automation means a full regression of our primary product takes about a week, including automated, manual and some ad hoc testing.

During the iteration all the stories that were selected are “development complete” by the end of the iteration, but almost never completely acceptance tested by the customer. In our case the testing team is the customer advocate/proxy, so for every iteration the story signoff lags several days, sometimes weeks, behind “code complete”. Our iteration story completion graph looks something like this:

Some on my team (myself included) originally argued that this lag or completion delta was a natural part of software development and that story completion shouldn’t be tracked by iteration, but rather by release. The argument went something like “until we get better at automated customer tests we’ll just have to live with it”.

We went by this assumption for several release cycles and were very unsatisfied with the results, because the customer never had a good idea when any particular story would be complete (“by the release date” wasn’t good enough). Additionally, the developers were having a hard time estimating their velocity (at this time we were tracking individual developer velocity), since an unknown amount of time would be spent fixing defects in stories from previous iterations. In essence, the customer couldn’t drive.

Things didn’t get any better at the end of the release either. Since all those incomplete stories kept stacking up, we started having a week-long “release” iteration, which was composed of prepping any new hardware, creating roll-back scripts for the database and fixing bugs in old “completed” stories. Since we were completing bug fixes in the days before release, sometimes up to hours before the release, our quality confidence was not always high. Not surprisingly, a couple of problems did slip through into production. Several times the release date slipped a week to allow for finishing the stories. Ultimately we came to grips with the fact that allowing a completion delta was not going to allow us to predictably deliver quality software. In retrospect our story completion graph for the release looked something like the following:

Not a lot different than the story completion graph for the iteration.

Since we had come to the realization that our story completion delta was preventing us from reaching our goal of predictable delivery of quality software, we knew that we had to change something. The obvious thing to change was our assumption that there needed to be a completion delta at all: stories would have to be completed (customer accepted) by the end of the iteration in which they were developed.

Obviously there are some difficulties to overcome. Our team (and most developers) had a background that included an us-vs.-them relationship with the testing team. The first thing to realize was that development and testing are on the same team with the same goals, but different skills for achieving them. Developers turn requirements (stories) into working code with unit tests, and communicate with testers and the customer. Testers (customer advocates) assist in defining and refining stories (requirements) with the customer and communicating with developers. They also create and execute the customer tests that indicate whether the story is really complete. Testers in fact seem to have the more difficult job, as they turn sometimes vague business needs into stories that can be tested and described to developers in a way that can be turned into usable software.

Since both teams are trying to achieve the same goal it makes sense to have them work on common areas – like acceptance tests. This is not to say that developers should write the tests, merely that they should execute existing tests written by the customer. The side benefit is that if you have “lazy” programmers they will soon find a way to automate the tests, whether through a homegrown, open source (e.g. FIT) or commercial framework. If automation doesn’t spring up on its own, steps need to be taken to find out why not.
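For anyone who hasn’t seen FIT, a customer test is just an HTML table whose columns map onto a small fixture class that calls into the production code. A minimal sketch of what that looks like in Java – the discount story, the class names and the DiscountCalculator it calls are purely illustrative, not from our project – might be:

    // Hypothetical FIT ColumnFixture for an illustrative "order discount" story.
    // Public fields receive the input columns from the customer's HTML table;
    // the discount() method is called for the expected-value column.
    import fit.ColumnFixture;

    public class DiscountFixture extends ColumnFixture {
        public double orderTotal;          // input column: order total
        public boolean preferredCustomer;  // input column: preferred customer?

        public double discount() {
            // Delegate to the (illustrative) business object under test.
            return new DiscountCalculator().discountFor(orderTotal, preferredCustomer);
        }
    }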

Another benefit of automating some of the acceptance tests is that the code gets written in a way that facilitates testing – which usually means loosely coupled, highly cohesive, well-layered code, i.e. the kind you want anyway.
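To make that concrete, the fixture above only stays that small when the business rule lives in its own plain class, free of UI and database dependencies, so that both the FIT tables and the developers’ unit tests can exercise it directly. Again, purely illustrative:

    // Illustrative business object: no UI, no database, just the rule.
    // Because it is a plain class, the FIT fixture and the unit tests
    // can both drive it without any setup ceremony.
    public class DiscountCalculator {
        public double discountFor(double orderTotal, boolean preferredCustomer) {
            double rate = preferredCustomer ? 0.10 : 0.05;
            return orderTotal * rate;
        }
    }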

Theory is great, but how about reality? Will developers execute tests and try to automate them? Will testers trust developers to execute tests correctly? Can the communications gap between testers and developers be bridged?

We considered reinforcements to help change habits, such as demonstrating stories or acceptance tests from the previous iteration at the iteration planning meeting to give people a sense of urgency during the iteration and to help plan realistically regarding what is really complete. We also talked about using burn-down charts or some other information radiator so that everyone always knew the status of the iteration.

In the end we enforced the rule that a story must be customer accepted before it is considered “done”. To help ensure that the highest priority stories were being worked on first, we changed the format of our standup meeting to work through the status of the stories in priority order, rather than going around the individuals. Additionally, I tried to add FIT to the mix to bring some level of automation to the customer tests. Other developers worked with the testers to create some comparison tools to automate some of the nastiest and largest tests.
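The comparison tools themselves were homegrown and specific to our data, but the general shape was a small harness that diffs the files a story produces against an approved baseline and reports the mismatching lines. A rough sketch of that idea, with hypothetical file names:

    // Rough sketch of a baseline-comparison harness; paths are illustrative.
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.List;

    public class BaselineComparison {
        public static void main(String[] args) throws IOException {
            List<String> expected = Files.readAllLines(Paths.get("baseline/report.txt"));
            List<String> actual   = Files.readAllLines(Paths.get("output/report.txt"));

            if (expected.equals(actual)) {
                System.out.println("PASS: output matches baseline");
                return;
            }
            System.out.println("FAIL: output differs from baseline");
            int lines = Math.max(expected.size(), actual.size());
            for (int i = 0; i < lines; i++) {
                String e = i < expected.size() ? expected.get(i) : "<missing>";
                String a = i < actual.size()   ? actual.get(i)   : "<missing>";
                if (!e.equals(a)) {
                    System.out.println("  line " + (i + 1) + ": expected \"" + e + "\" but was \"" + a + "\"");
                }
            }
        }
    }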

One of the big things we learned that we didn’t expect was that our stories had a tendency to get too large to finish in one iteration. This showed up as stories that “carried over” into the next iteration, sometimes more than once.

 
