Planning for technical debt
There’s a problem with technical debt in the software field.
Well, yeah. Okay. Sure. Actually, I didn’t mean that problem. I meant this one: Many people seem to be willing to incur technical debt in situations that don’t call for it.
This article by Nishi Grover Garg, “Paying Off the Technical Debt in Your Agile Projects,” is representative of the kind of thinking that has come to be accepted as “normal” regarding technical debt. I don’t mean to pick on Garg personally. It’s just that Garg happens to have written it down really nicely.
The article lists several options a team may employ when they find their code has accumulated some cruft:
- Negotiate with the product owner on the number of user stories planned for the upcoming sprint in order to have some extra time for refactoring the code
- Dedicate an entire sprint to code refactoring
- Divide all errors and warnings among the development team and let them handle the task of corrections within the next sprint, along with their regular development tasks, by scheduling extra hours
- Plan to spread this activity over a number of sprints and have a deadline for this report before the end of the release
- Estimate the size of refactoring stories and either plan them into upcoming sprints as new user stories or accommodate them as part of existing user stories
The article then goes on to say, “These are all viable options.”
Um…no, they’re not.
Every one of those actions is an anti-pattern. Here’s why.
Negotiate with the product owner to allow time for refactoring
First, product owner means Product Owner, a term specific to Scrum and not generic to “agile.” But that’s nit-picking words. The substance of the problem is that the Product Owner has nothing to say about whether or how a development team refactors code. The Product Owner is responsible for the what. The development team is responsible for the how. Refactoring is part of the how; it’s integral to the baseline software engineering practice known as test-driven development, and it’s implicit in other robust approaches to software engineering, as well.
If a team discovers they’ve neglected to refactor incrementally and their code has some problems, then they need to remediate the code incrementally. User Stories may be larger than the team members thought, but that’s just the way it is. It isn’t a question of negotiation. It isn’t optional.
Refactoring Sprint
No.
Just…no.
I mean, dear God, no.
Schedule extra hours for refactoring
Refactoring is not extra work, so it doesn’t require extra hours. If necessary, increase the relative sizes of the User Stories by one notch (whatever a “notch” means in the team’s sizing method) until the problems have been remediated. But don’t make it a formal matter that involves the Product Owner in any way. It isn’t in their purview, and it isn’t their problem; it’s yours.
Deadline for a report (?)
I’m not even visualizing what this might look like. It sounds horrendous. I think I’d rather walk to the middle of Death Valley barefoot and with no water, and try to swallow a tablespoon full of cinnamon.
Refactoring stories
There’s no such thing as a refactoring story. Refactoring is part and parcel of how the work is done. It isn’t a separate piece of work in its own right.
Common misconceptions
All the options mentioned in the article are misconceptions about technical debt that I hear and read about repeatedly. The key underlying errors appear to be:
- you can go faster if you cut corners with respect to code quality
- refactoring is a separate piece of work from analysis, design, coding, and testing, and is not considered a normal task that is part of every User Story
- the Product Owner has to give technical professionals explicit permission to use generally-accepted good technical practices
- it’s okay to pause value-add delivery in order to remediate accumulated technical debt
Those are misconceptions. They are incorrect. They are harmful.
A story about technical debt
Here’s an anecdote from a past job that illustrates what it really means to incur technical debt intentionally.
The company was a financial institution. There was a (then-)emerging market for high-end luxury boat sales. We wanted to be able to approve loan applications for nouveau riche customers who were excited to buy a bigger boat than their nouveau riche neighbors.
The business problem to be solved had to do with the turnaround time for loan approvals. It took about 24 hours. By then, the prospective customer had left the boat showroom and returned to reality, where they had a chance to sleep on the decision. I trust you can guess what they decided, in most cases.
The business opportunity lay in approving the loan application within five minutes. That way, customers would proceed with the purchases before they left the showroom and returned to their right mind.
Financial analysts figured we could clear $700,000 the first year and about $2,000,000 per year thereafter, provided we were able to get a solution into the hands of salespeople by spring of the current year. It was February.
Spring was the high sales season for that sort of product. If we missed that date, it would be a full year before the solution would be of any use. By then, competitors would have had a chance to deploy their own solutions. First mover advantage was a critical success factor.
The business sponsor sent a formal request to the IT department. The IT department said they would need 10 months and 24 people to deliver a solution.
The business sponsor did what any sensible person would do under the circumstances: She enlisted a team of four developers and provided management “air cover” so they (we, actually) could deliver the solution using “agile” methods, going around the IT department altogether.
But there was a wrinkle.
To complete all the steps in the loan approval process, we needed to query a system owned by an external company. That company did not provide an API to access the system. They offered to write an API for us. It would cost $144,000 and take four months (according to their optimistic estimate). That would have been too late for us to gain first mover advantage.
So we screen-scraped their system.
Technical purists in the company squealed like pigs on a hotplate. We met the deadline. The system achieved its financial and business objectives.
Two years later, I asked the business sponsor how that system had worked out for her. She said it was bringing in the anticipated amount of revenue and was now able to pay for its own upgrade. A team was remediating the technical debt (the screen scrape) and replacing it with a proper API. But it wasn’t done speculatively. It was bought and paid for by the success of the original implementation.
I also asked the head of production support how much trouble the screen scrape solution had been in production. Three times in two years, the external company had changed the layout of the screen we were scraping. A production ticket was opened. A programmer adjusted the screen scrape parameters. Average cycle time for the fix: 2 hours. Total annual support cost of the screen scrape implementation: $135.
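A two-hour fix is plausible if the scrape is driven by layout parameters rather than by positions hard-coded throughout the program. The original code isn't shown in this story, so the following is only a minimal sketch in Python, assuming a fixed-position text screen; the FieldSpec type, the field names, and the sample screen are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Where one field sits on the scraped screen (all values hypothetical)."""
    row: int     # zero-based line on the screen
    col: int     # zero-based starting column
    length: int  # number of characters to read


# Hypothetical layout parameters: the only thing that needs to change
# when the external company rearranges its screen.
LAYOUT = {
    "applicant": FieldSpec(row=1, col=11, length=20),
    "score": FieldSpec(row=2, col=11, length=3),
}


def scrape(screen: str, layout: dict[str, FieldSpec]) -> dict[str, str]:
    """Extract each named field from a fixed-position text screen."""
    lines = screen.splitlines()
    return {
        name: lines[spec.row][spec.col:spec.col + spec.length].strip()
        for name, spec in layout.items()
    }


# Made-up sample screen, for illustration only.
SCREEN = (
    "CREDIT INQUIRY\n"
    "APPLICANT: JANE DOE\n"
    "SCORE:     742\n"
)

result = scrape(SCREEN, LAYOUT)
print(result)
```

With this shape, a layout change on the external side means editing the row, column, or length values in LAYOUT; the scrape function itself stays untouched.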
That’s technical debt incurred intentionally for a specific business purpose and paid off when it was cost-effective to do so.
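For a rough sense of the proportions, the figures from the story can be put side by side. The sketch below uses only numbers quoted above; the hourly rate it derives is an inference from those numbers, not something the head of production support stated.

```python
# Figures taken from the anecdote above.
layout_changes = 3         # times the external screen layout changed
period_years = 2           # observation period
hours_per_fix = 2          # average cycle time per fix
annual_support_cost = 135  # dollars per year, as reported
api_cost = 144_000         # quoted price of the vendor-built API, in dollars

# Derived values. The hourly rate is inferred, not stated in the story.
fix_hours_per_year = layout_changes * hours_per_fix / period_years
implied_hourly_rate = annual_support_cost / fix_hours_per_year
years_of_support_for_api_price = api_cost / annual_support_cost

print(f"Fix hours per year: {fix_hours_per_year}")
print(f"Implied hourly rate: ${implied_hourly_rate:.2f}")
print(f"Years of scrape support for the API price: {years_of_support_for_api_price:.0f}")
```

On those numbers, the quoted API price would have covered roughly a thousand years of screen-scrape support, which is why deferring the API was a reasonable debt to take on.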
Garden-variety sloppy coding doesn’t rise to the same level. It isn’t really “debt.” It’s just slop. And let’s face it: That “debt” will not be repaid. It never is.
Comments (6)
Robert
Clear and spot on, good article!
P.S.: The link to the other article you comment on is broken.
Thanks for letting us know. The link is fixed now.
Junilu Lacar
Nice article and great points.
There is some nuance to the first and second points.
Negotiating with the Product Owner to allow time for refactoring: a big fat NO. Setting expectations with the Product Owner so they understand the cost of paying off technical debt and how it might impact the pace of delivering value: probably a good idea.
It’s also important to understand why a refactoring sprint is so horrific that it leaves those who know better almost speechless just thinking about it.
In and of itself, a refactoring sprint is an admission not only of the poor state of the code base and design but also of the ineffectiveness of the development team and their engineering practices. This is not to say that blame-laying and finger-pointing should follow, since that would be another anti-pattern. It should, however, be taken as a signal to reset and recalibrate. This ties in with the first point of setting expectations with the Product Owner.
If your code base and/or your development practices are in such a sad state that you have to stop delivering value for an entire iteration just to clean up a mess that is slowing you down, then steps definitely need to be taken so the delivery team can get back on track and start producing value at a sustainable pace again.
Those steps may include assessing the severity and scope of the debt that needs to be paid off, figuring out a strategy for spreading out payments of that debt so the team can start producing value again, redefining exactly what is considered to be a reasonable “sustainable pace” while the debt is still being slowly paid off, and estimating how long it might take to get back into “the black” again. Coaching/mentoring, training, retooling, automating, and reorganizing teams are some things that might also be considered.
Junilu Lacar
While calling what really is a sloppy mess “technical debt” runs counter to the idea that “a mess is not debt, it’s just a mess,” at this point it seems the distinction is moot given the stickiness of the meaning in the real world. As coaches, we should see these as opportunities to help the team pivot and redirect their energy towards getting into a more virtuous cycle instead of continuing down the path of a known anti-pattern. Referring to a mess as “waste” may force a team to admit to what they really have, but there’s also the risk of pushing a team into denial or defensiveness. I’m reminded of an Aikido sensei who would always say, “Ok, that’s good, that’s good!” to students who were struggling with technique but then follow it up immediately with “… But here’s how you can do it better.” You wouldn’t say that a refactoring sprint is good, but you can give the team a positive perspective by saying something like “Ok, it’s good that you realize the kind of effort you need to put into refactoring but instead of a refactoring sprint, here’s how you can deal with this better…”
Junilu Lacar
(Note to moderator: I don’t know if the markdown for links works so please fix if not)
It is often disappointing to see the semantic diffusion that has happened to Ward Cunningham’s Debt Metaphor over the years. What’s worse is that the “slop” meaning associated with poor quality seems to spread faster and wider than the original “leveraged” meaning associated with value and time.
It would be nice if more people spent the five minutes it takes to listen to Ward explain how he came up with the Debt Metaphor and how debt was really meant to be a Good Thing.
Hey Junilu – Welcome to the team, I fixed your links for you.