Lean Thinking and the Parable of the Underbrush
Years ago we lived in Ohio. In our front yard there was a circular bed with a small tree in the center, surrounded by thick shrubbery.
The shrubs throve, but the tree looked as if it was barely alive. We called in a tree specialist and he told us the bad news: The tree was dying and we should have it removed.
We couldn’t afford to do anything about the tree just then, but we thought the circular bed would look a lot better if the tree were surrounded by flowers rather than by plain bushes. We pulled out the bushes and planted annuals. Removing the bushes was quite an ordeal; it turned out their roots reached out in all directions and were pretty dense and tangled.
We were surprised to see the tree start to recover within a week. Before long, it was filling out and the bark looked far healthier than before. The tree didn’t die; it got stronger. The bed looked great.
In hindsight, it’s clear the bushes were taking all the resources in the immediate area and the tree was starving. Once the bushes were removed, the tree recovered beautifully.
The Moral of the Story
Most organizations we visit have complicated procedures in place to control just about every aspect of the work they do. There tends to be a focus on perfecting those procedures rather than asking whether they are necessary in the first place.
Lean Thinking starts with identifying customer-defined value, and proceeds from there by focusing on ways to maximize the delivery of that value. Anything that directly contributes to the delivery of customer-defined value is deemed “value-add activity.” Everything else is “non-value-add activity.” Overhead.
Customers pay for the results they obtain by using your software products. They don’t pay for regulatory compliance reviews. They expect regulatory compliance, but the cost is on you. They don’t pay for software testing. They expect the software to work, but the cost of ensuring that is on you. They don’t pay for security assessments. They expect the system and its environment to be secure, but the cost of making it so is on you.
The methods you use to comply with regulations, to validate software functionality, to ensure reasonable security, and other details are up to you. Customers don’t care how you do it. But those things are not what customers are paying for. Doesn’t it make sense to minimize the overhead of achieving those results?
What effect would it have on your profit margin? What effect would it have on your ability to compete? What effect would it have on your ability to pivot in response to market changes…or to drive such changes?
If your staff are constantly busy checking the functionality of the code, looking for security holes, reviewing systems for compliance, and other activities that may be necessary but that are not what customers pay for, then how much time do they have left over to deliver customer-defined value?
Your mechanisms for delivering customer-defined value can grow strong and thrive if only you remove or minimize the underbrush in your processes. Regulatory compliance is necessary. A manual, after-the-fact, formal review process is optional. Correct software functionality is necessary. Manual, after-the-fact “testing” is optional. Security is necessary. Manual, after-the-fact, formal security review is optional.
Can you think of ways to support the same needs with less overhead?
Impact of the Overhead
The three needs mentioned above – regulatory compliance, correct functionality, and system security – apply to most, if not all, software product development processes.
Conventional wisdom holds that the best or only way to ensure the product and the delivery process support these needs is through careful and painstaking review by qualified specialists in each area. Is that assumption true, or do manual methods of review amount to underbrush that prevents the development and delivery process from flourishing as it could?
It’s common for all three of these reviews to take place after the software has been written, and before it is delivered. This approach has been the industry standard for decades.
The outcomes from these reviews are important, but the process itself is overhead: non-value-add activity. What kinds of overhead costs do we incur when we use manual, after-the-fact reviews?
- Extended lead times. Each type of review takes time, and progress toward the release is halted until all the reviews are complete, even in the best case. I have seen each of these types of reviews take several weeks. They can overlap, but there is still significant time involved.
- Bottlenecks. The reviews are carried out by specialists in each area. Few, if any, organizations can afford to staff every development team with specialists in multiple areas of expertise. That means a few specialists are in demand to support a potentially large number of teams concurrently. Much of the extended lead time occurs because work has to wait for specialists to become available to review it. In the case of software testing in large organizations, the bottleneck affects the availability of test environments and test data, as well as the availability of the specialists themselves.
- Inflexibility. When serious issues are discovered, the discovery occurs so late in the delivery process that there may be little or no time or money remaining to correct the problems. Products may have to be pushed to production or placed in the market with known flaws, lest a business opportunity window close. In all three areas, this leads to direct business risk.
- Insufficient time to do a thorough job. In keeping with the 80/20 rule, most of the review activity comprises relatively routine, repetitive, and predictable tasks. When specialists must perform these tasks manually, they often lack the time to complete all the kinds of review activities for which human observation and creativity are critical. The routine activities tend to be baseline requirements that cannot be set aside. The result in many cases is that the specialists must use all the available time performing routine, repetitive tasks, and never have an opportunity to apply their unique skills. In software testing and in security reviews, the opportunity to learn about previously-unknown possibilities is lost when the specialists lack time to explore the system.
- Mistakes due to rushing. The late stage of the delivery process, the pressure to move on to the next request in order not to delay delivery, and the overwork caused by the supply and demand balance for specialized skills all increase the chances of careless mistakes.
Getting Unstuck
The importance of these three needs leads people to be very cautious about changing the way they handle them. Do practical alternatives exist that might shorten lead times, reduce mistakes, and improve quality while still providing high confidence of regulatory compliance, correct functionality, and system security?
The current industry trend toward continuous delivery has caused everyone in the field to think about ways to automate as many repetitive and routine functions as possible. Referring again to the 80/20 rule, or Pareto Principle, the majority of review activities are repetitive and routine. Can some of those activities be automated?
Yes, they can. Regulatory compliance is about following rules. Computers are very good at checking to see that rules have been followed. For example, rules pertaining to the display of personally identifiable information (PII) are straightforward.
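To make that concrete, here is a minimal sketch of what an automated PII display check might look like. The rule itself is an assumption made for illustration (a full US Social Security number must never appear in displayed text; only the last four digits may be shown), and the function names are hypothetical. Your own rules will come from whatever regulations apply to you.

```python
import re
from typing import List

# Assumed rule, for illustration only: a full SSN in the form NNN-NN-NNNN
# must never appear in text that is displayed to a user.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_unmasked_ssns(rendered_text: str) -> List[str]:
    """Return every full SSN found in the rendered text (empty list if none)."""
    return SSN_PATTERN.findall(rendered_text)


def mask_ssn(ssn: str) -> str:
    """Replace all but the last four digits of an SSN with asterisks."""
    return "***-**-" + ssn[-4:]


if __name__ == "__main__":
    page = "Customer: J. Smith, SSN: 123-45-6789"
    violations = find_unmasked_ssns(page)
    if violations:
        print(f"PII rule violated {len(violations)} time(s)")
```

A check like this can run against every rendered page or report on every build, which is something no human reviewer has time to do.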
Another aspect of compliance is the ability to document that we are in compliance. Continuous Integration (CI) servers can generate reports that verify standard procedures around testing and deployment of software have been followed.
Validation that software performs the functions it is intended to perform generally comprises two kinds of activity: Checking that the system produces known outputs when fed particular inputs under various conditions, and exploring the behavior of the system under less well-known conditions to discover behaviors that were not anticipated. The former is repetitive and routine, and lends itself nicely to automation. When the routine checking is automated, testing specialists have proportionally more time for exploration.
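The routine kind of checking can be sketched in a few lines: a table of known inputs and expected outputs, and a loop that compares them. The shipping_cost function below is a hypothetical stand-in for real application code, and the prices are invented for the example.

```python
from typing import List, Tuple


def shipping_cost(weight_kg: float) -> float:
    """Hypothetical function under test: flat fee plus a per-kilogram charge."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    return 5.00 + 1.50 * weight_kg


# Known inputs paired with the outputs we expect (illustrative values).
KNOWN_CASES: List[Tuple[float, float]] = [
    (1.0, 6.50),
    (2.0, 8.00),
    (10.0, 20.00),
]


def run_known_output_checks() -> List[Tuple[float, float, float]]:
    """Return (input, expected, actual) for every case that does not match."""
    failures = []
    for weight, expected in KNOWN_CASES:
        actual = shipping_cost(weight)
        if abs(actual - expected) > 1e-9:
            failures.append((weight, expected, actual))
    return failures


if __name__ == "__main__":
    failed = run_known_output_checks()
    print("all known-output checks passed" if not failed else failed)
```

In practice you would likely use a test framework rather than a hand-rolled loop, but the idea is the same: the machine repeats the known checks on every change, and the testers spend their time exploring.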
Security review has some characteristics in common with functional validation. Most of the work consists of checking for known risks. The exploratory work is where security specialists discover previously-unknown risks. Many of those then become known risks that can be checked automatically. Given the highly dynamic and rapidly-changing nature of hacking, it is especially important for security specialists to have ample time to look for new risks.
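Here is a rough sketch of how checks for known risks might be automated, and how a newly discovered risk can be added to the list once a specialist finds it. The configuration keys and the rules are hypothetical; real checks would be based on your own systems and on the guidance of your security specialists.

```python
from typing import Any, Callable, Dict, List

# Each known risk maps a (hypothetical) configuration key to a predicate
# that returns True when the configured value is risky.
KNOWN_RISKS: Dict[str, Callable[[Any], bool]] = {
    "debug_enabled": lambda value: value is True,
    "admin_password": lambda value: value in {"admin", "password", "changeme"},
    "tls_min_version": lambda value: value < (1, 2),
}


def register_risk(key: str, is_risky: Callable[[Any], bool]) -> None:
    """Add a newly discovered risk so it is checked automatically from now on."""
    KNOWN_RISKS[key] = is_risky


def find_known_risks(config: Dict[str, Any]) -> List[str]:
    """Return the sorted names of all known risks present in the configuration."""
    return sorted(
        key
        for key, is_risky in KNOWN_RISKS.items()
        if key in config and is_risky(config[key])
    )


if __name__ == "__main__":
    config = {"debug_enabled": True, "admin_password": "s3cret!", "tls_min_version": (1, 2)}
    print(find_known_risks(config))
```

The register_risk step is the point of the exercise: what the specialist learns through exploration today becomes a routine, automated check tomorrow.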
Streamlining the Process
One of the simplest steps an organization can take is to engage the services of specialists in these areas (among others) as early as possible in the software development process. If they can prevent software from being designed in a way that creates risk, then problems will not be propagated all the way to production or to the market. Projects will have sufficient time and budget to make course corrections early.
Automating the routine checks reduces the impact of the specialist bottleneck and gives specialists more time to perform exploratory work. In turn, that reduces the pressure to work fast, which reduces the chances of careless errors and reduces the general stress and frustrations of the work. This results in more-effective feedback loops and organizational learning as well as higher quality.
Moving most of the routine checking from the end of the delivery process closer to the beginning, and automating as much of it as possible, significantly reduces the length of time the product must pause on its way to release for the purpose of human review. Once the organization gains high confidence in compliance, quality, and security, the human review can shift to spot checks after release, rather than a fear-based delay in the release itself.
Conclusion
The fact that things have always been done a certain way doesn’t mean that is the most effective way to do things going forward. Take a fresh look at your process from a Lean point of view. Any time the work pauses to wait for attention from a specialist, it’s an opportunity for improvement. Look for ways to achieve the same goals with less overhead.
Comment (1)
Bob Williams
The weight of regulatory compliance and system security is getting heavier with each passing security breach. The standard controls are changing every year and it’s become an industry that feeds itself. You’re right Dave, we must look for ways to streamline and continuously improve. Otherwise we’ll wake up one day buried beneath regulations and rules.