Continuous Delivery (CD) means the delivery of software changes incrementally and continuously (or nearly continuously) to internal production or to the external market. In recent years, CD has become a trend in the IT industry, as technological advances have made it a practical goal for many organizations.
Well-known success stories are making the rounds, like the story of Amazon pushing code to production every eleven seconds. Or was it Google? Or Twitter, maybe? Anyway, success stories are making the rounds.
Conventional value proposition
The usual value proposition for CD, and for many people the only imaginable value proposition, is that CD is necessary for businesses that operate on the bleeding edge of technology and culture. Certain companies must adapt to changes in the market instantaneously, or be superseded by competitors or startups who are better able to do so.
According to the LeadingAgile compass, these are companies operating in the Adaptive-Emergent (AE) quadrant. They operate similarly to Lean Startup organizations. Companies that have at least some business units moving from the Adaptive-Convergent (AC) quadrant to the AE quadrant are also advised to cultivate the capability to support CD.
But there is a problem: People aren’t considering the entire value proposition.
Is there a business need for your friendly neighborhood twentieth-century behemoth financial institution to deliver small software changes to production on a continuous basis? Your friendly neighborhood clothing manufacturer? Trucking company? Supermarket chain? Hospital? Appliance manufacturer? Embedded fire control system maker? Agricultural equipment manufacturer? Nuclear power plant? Industrial safety equipment manufacturer?
Most people who are considering the value of CD for their organizations stop their analysis at this point. And so these organizations continue to rely on twentieth-century methods to deploy, release, and support software. Should they consider other forms of value CD might offer?
Resiliency and business agility
Using conventional twentieth-century methods for testing, packaging, deploying, releasing, monitoring, and supporting business-critical software systems, it can take anywhere from weeks to months to get a software release into production from the moment management makes the call to release it.
This post is about CD, and the deployment clock starts after the code has been written/modified and functionally tested. But how long did it take to make the changes to the code? How long did it take to test the systems affected by the changes? How long did it take to upgrade the infrastructure to accommodate the modified system? How long after the release before the system is stable again? How slow is slow? Pretty slow, it seems.
This may sound okay when the company’s release schedule calls for one or two production releases per year. We’ve already established that the behemoth doesn’t require continuous delivery to production. Customers couldn’t absorb change that quickly, or that frequently. Even a release cadence of two or three months is a far cry from continuous delivery. So there’s no problem, right?
You never know when something urgent may come up. The government may change a regulation that pertains to your industry on short notice. You have to comply by the first of next month, or pay a five-figure weekly penalty, if not lose your license. Too bad your release cadence is three months (optimistically). A competitor may release a hot new feature that all your customers want…yesterday. You might learn the hard way, after a critical system crashes, that your backup database has been corrupted for the past five years. You never practiced restoring it because the manual methods your staff uses didn’t allow time for that sort of thing. Now it’s all hands on deck for three weeks to keep the ship afloat, while planned value-add work languishes on hold.
Can you spell “risk?”
Wouldn’t it be convenient if you had mechanisms in place to ensure released code was clean, to recover broken systems quickly, to restore data reliably and quickly, and to promote code changes rapidly through the delivery pipeline? Wouldn’t it be convenient, even if your normal release cadence is three, six, or nine months?
Is the fact you normally release every six months really a good business reason to build IT processes that can only move at a six-month pace?
Operating cost control and reliability
Companies that use a lengthy, cumbersome, mostly-manual process to move code forward after it has been deemed “production-quality” are locking down resources that could be used for activities that build market share, improve quality, enhance customer satisfaction, or deliver requested features. The cost of heavyweight, manual deployment and release procedures is not merely the sum of the costs of the individual activities and the time lost waiting in queues for the next manual step in the process. The total cost is effectively double that, due to opportunity cost: every hour spent on those procedures is also an hour not spent on value-add work.
In addition, processes like these are almost always accompanied by other organizational characteristics that raise operating costs. These include manual testing procedures, suboptimal software engineering practices, insufficient test and staging environments, inappropriate test data management, hard cross-system and cross-organization dependencies, and insufficient funding to remediate any of the problems just mentioned.
The net result of all these suboptimal practices is higher cost across the board. The risk of making a change to software is high, because mechanisms are not in place to prevent defects from being introduced. The time required to vet the software before approving it for release is extended by the dependency on manual methods for code review, testing, packaging, deployment, release, monitoring, and support. Most releases introduce new problems, requiring people to halt work on value-add activities to address post-release issues. Sometimes, a release must be backed out entirely and re-done.
Costs continue to mount as the extended lead times create a need to overlap releases. If it takes nine months to move a software change from concept to cash, then the company will likely establish three overlapping releases with a target date every three months. This introduces significant complexity to any bug fix work that has to be done, and the largely manual methods used in every stage of development, testing, and deployment tend to introduce numerous bugs.
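To make that arithmetic concrete, here is a minimal Python sketch. It assumes a deliberately simplified model in which a new release starts on every cadence boundary and each one needs the full lead time to reach production. The function name `overlapping_releases` is purely illustrative.

```python
import math


def overlapping_releases(lead_time_months: float, cadence_months: float) -> int:
    """Estimate how many releases are in flight at the same time.

    Simplifying assumption: a new release starts every `cadence_months`
    and each release needs `lead_time_months` to go from concept to cash.
    """
    if lead_time_months <= 0 or cadence_months <= 0:
        raise ValueError("lead time and cadence must be positive")
    return math.ceil(lead_time_months / cadence_months)


# The example from the text: nine-month lead time, a target date every three months.
print(overlapping_releases(9, 3))  # 3
```

Under this model, the only ways to reduce the overlap are to release less often or to shorten the lead time. Shortening the lead time is what a CD capability does.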
Can you spell “vicious circle?”
Even if the company has no business need to deliver software changes incrementally and continuously to the market or to internal production, the capability to do so automatically brings with it solutions to most, if not all, of these problems.
For continuous delivery to be possible at all, the parts of the organization responsible for different systems must be structured to minimize cross dependencies. There must be clear separation between development/test and production environments so that engineered test data with no security or privacy issues can be used for testing. There must be sufficient test environments to move code forward without holding versions in abeyance to wait for a shared test environment to become available. Development teams must use engineering practices shown to minimize defects, and collaboration methods that make it possible for code to be reviewed without the need for a separate, after-the-fact manual review step. Routine functional checks must be fully automated at all levels of abstraction, and exploratory testing must be a routine part of the development process and culture. Environment configuration and provisioning, application packaging, and deployment procedures must be fully automated. Production system monitoring must be automated, and at least some degree of automatic recovery from known types of issues must be in use.
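As a rough illustration of how the automated pieces fit together, here is a toy Python sketch of a delivery pipeline: automated stages run in order, and a failed post-deployment health check leaves the previous version in service. Everything in it is an assumption made for illustration, including the stage names and the `run_pipeline` and `release` functions. A real pipeline would call out to actual build, test, deployment, and monitoring tools.

```python
from typing import Callable, List, Tuple

# A stage is a name plus an automated check that returns True on success.
Stage = Tuple[str, Callable[[str], bool]]


def run_pipeline(version: str, stages: List[Stage]) -> List[str]:
    """Run the stages in order and stop at the first failure.

    Returns the names of the stages that passed.
    """
    passed = []
    for name, check in stages:
        if not check(version):
            break
        passed.append(name)
    return passed


def release(candidate: str, current: str, stages: List[Stage],
            is_healthy: Callable[[str], bool]) -> str:
    """Return the version left running after a release attempt."""
    passed = run_pipeline(candidate, stages)
    if len(passed) < len(stages):
        return current  # a stage failed, so the candidate is never promoted
    if not is_healthy(candidate):
        return current  # monitoring flagged a problem, so roll back automatically
    return candidate


# Hypothetical stages; each lambda stands in for a real automated step.
stages: List[Stage] = [
    ("automated functional checks", lambda v: True),
    ("package", lambda v: True),
    ("deploy to staging", lambda v: True),
]

print(release("2.0", "1.0", stages, is_healthy=lambda v: True))   # 2.0
print(release("2.1", "2.0", stages, is_healthy=lambda v: False))  # 2.0
```

No step in the sketch waits on a person or a shared environment. That is the property the requirements above are meant to guarantee.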
Just having these pieces in place eliminates most of the problems inherent in twentieth-century manual methods.
And there’s more.
A technical infrastructure that can support continuous delivery can also support seamless failover to a disaster recovery site. From a technical standpoint, failover amounts to the same thing as a normal production release. There’s no need to stop the train a couple of times a year to go through a DR dry run (most of which don’t fully exercise the DR procedure anyway). Every release cycle is, in effect, a full-blown practice run of the DR procedure. Actually, I misspoke: Every release is a DR failover. With CD, a DR event is a non-event.
Technological and social change
There’s technological change. We’ve seen that throughout our careers. There’s social change. We’ve seen that throughout our lives. Now there’s conjoined, inseparable technological-plus-social change. We’ve seen that throughout the last…well, decade-ish.
We tend to exploit technological advances to support the Old World a little better. We tend to treat social change as a problem for the Marketing Department. Now we have to understand and exploit combined technological and social change.
I’m sure I needn’t remind you what has happened to the publishing industry and retail book sales, or to the recording industry and music sales. Do you think the changes will stop there? Are you ready to bet your company on it?
Technological and social changes are already chipping away at established traditional businesses like banks. You can order groceries, clothing, and furniture through your phone. You can pay your bills and apply for a mortgage through your phone. Even such a mundane business as the manufacture and sale of mattresses can be disrupted by technological and social change.
Can you spell “competition?”
The fact your company didn’t need to deliver software continuously to stay in business yesterday doesn’t mean you can stay in business tomorrow without it. If your technical organization is in a state typical of larger, older enterprises, then you’re looking at a significant and lengthy effort to bring everything up to date. If you wait until you see blood on the floor before you begin, it will be too late.
Technical staff morale and retention
It’s a commonplace to say that people are our most valuable asset. It also happens to be literally true.
Depending on how high you have ascended in your management career to date, you may or may not recall what a “release party” is. It’s a party that lasts all night, if not all weekend. All your closest friends are invited. And nobody leaves until the production release is up and running.
It’s the kind of party where nobody has any fun.
And it’s an expensive sort of party to throw.
It isn’t just the cost of the pizza. It isn’t just the cost of the work lost the following week as staff members recover from the weekend. Your top technical staff members will be the first to burn out or become fed up with the situation. They’re the ones who can most easily find other jobs. And they’re the ones who pay attention to what’s happening in the industry; they know the era of release parties is over, and they have options.
That leaves you with relatively junior or generally less capable technical staff. In turn, that situation leads to more errors in the manual procedures, leading to longer and more expensive release parties. Word-of-mouth reputation of your organization as a “sweatshop” will precede recruiters’ efforts to find replacement personnel.
Can you spell “death spiral?”
Organizations that practice continuous delivery also tend to be organizations that qualify for the label, “humane workplace.” Technical staff have the opportunity to learn continuously as they keep up with technological advancements. They can deliver 40 hours’ worth of results in 40 hours of work. They work in an environment that keeps them challenged and rewarded without needless burnout. They get weekends and holidays off, like real people. They aren’t going anywhere soon. And if they do go somewhere, recruiters have an easy time filling the empty slots; there’s a queue of eager candidates trailing out the door and down the street…even if it’s raining.
Just because you can release continuously doesn’t mean you must do so. If it isn’t necessary or desirable for your company to deliver continuously, you can still enjoy significant benefits internally by having the capability in place. Deploy routinely to a staging environment. Release whenever you please, painlessly. Why make things any harder, riskier, or more costly than necessary?
Continuous delivery has become a baseline operational function. As the sun sets on 2017 and dawn prepares to break on 2018, the question isn’t, “Do we need continuous delivery (and how can we justify avoiding it)?” but rather, “Are there any excuses left not to implement continuous delivery?”