Recovering Productivity Through Agile IT Operations

Situation

LeadingAgile was approached by a long-time client to improve the customer time-to-market for IT infrastructure needed by the client’s Product Development Teams, as well as to address cost overruns that occurred when leased equipment was not swapped out in time to be returned to the leasing company. There were several areas of concern; however, the client’s transformation leadership team decided that a narrow focus on the logjam of Linux servers awaiting provisioning was the highest priority.

Approach

LeadingAgile applied its hallmark Transformation Approach to the problem space. The most notable activities were:

  • Discovery
  • Applying Structure to Teams
  • Using Metrics to Focus on Outcomes
  • Combining all demands into a Single, Well-Refined Backlog of service requests

Discovery

A two-day visioning workshop followed by three weeks of discovery kicked off the transformation. In the workshop, the basics of the LeadingAgile approach to transformation were shared and then used to uncover several potential transformation targets.

Over the course of the two days, it was decided that the transformation focus with the highest possible return was on-premises, Linux-based server provisioning to service legacy applications.
LeadingAgile and the client's Transformation Leadership Team then took the next three weeks to educate change leaders, interview change agents, and develop a candid, agreed-upon AS-IS Assessment, a TO-BE Vision, and a high-level transformation roadmap.

Applying Structure to Teams

Structure and IT Operations are no strangers. The Information Technology Infrastructure Library (ITIL) for Information Technology Service Management (ITSM) is the best-known mechanism for IT infrastructure governance. The challenge is that most organizations create functional silos around server management, network management, storage management, security management, etc. The result is that to build a server, you need people from different reporting structures with competing priorities. The handoffs between them become long customer lead times. The variance in the time between the completion of one work cycle and the beginning of the next accumulates into large queues and an inability to deliver working, tested products on time. The governance around these structures is usually tied to large initiatives rather than to flow through the network of teams.

Our client was no different. So, we built two teams as part of the initial launch of Agile IT Operations: one cross-functional service development team with every skillset needed to create a Linux server, including a ScrumMaster and Product Owner; and one cross-functional service management team consisting of a service manager (Product Manager in Agile parlance), a Program Manager, System Architects, and various domain stakeholders and subject matter experts (SMEs). The service development team was co-located so that pairing could occur with zero latency. Its members were asked to teach their skillsets to one another.

A four-tier governance model was put in place: 1) an IT Strategy tier for investment decisions and annual plan items; 2) a Portfolio tier for quarterly planning/prioritization and deconstruction of annual plan items into Quarterly Initiatives; 3) a Program tier to deconstruct Quarterly Initiatives into Feature sets, along with service design and prioritization of the features; and 4) a Service Development tier to deconstruct features into User Stories in the form of Release Items, to be prioritized and flowed through the development team two at a time. A separate service delivery team was given all product support activities.
A quarterly, monthly, and weekly planning cadence, typical of Agile methodologies, was implemented at the Portfolio, Program, and Development tiers, with weekly check-points and daily stand-ups for rapid reaction to shifting priorities.

A Kanban system was established at all four tiers using Microsoft Visual Studio Team Services (VSTS), with each tier having unique value stream stages. VSTS was chosen because it was what the client already had, but it has proven to lack even the most basic Lean Management reporting capabilities. As a result, considerable expense is going into building reports in Microsoft’s Power BI reporting and analysis tools.

Using Metrics to Focus on Outcomes and Performance Improvement

All models are wrong; some are useful. The same can be said about metrics. Agile introduces more metrics than most organizations are used to monitoring. During discovery, a small set of metrics was identified, and one of them was designated the One Metric That Matters (OMTM). In this case, Customer Lead Time, the time from a service request to delivery to the customer, was selected as the OMTM. Other metrics included Production Lead Time (the time it takes to develop a Release Item, from the point the Release Item is prioritized for the service development team until it is completed), the amount of re-work due to miscommunicated requirements versus the number of escaped defects, and the amount of time the team was distracted by non-development work.
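
As a rough illustration of the two lead-time definitions above, the following minimal Python sketch computes both from four timestamps per Release Item. The field names, the `ReleaseItem` type, and the sample dates are assumptions made for the example; they do not come from the client's tooling or data.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean

SECONDS_PER_DAY = 86_400


@dataclass
class ReleaseItem:
    """Hypothetical record of one service request (field names are assumptions)."""
    requested_at: datetime    # service request submitted by the customer
    prioritized_at: datetime  # prioritized for the service development team
    completed_at: datetime    # development of the Release Item finished
    delivered_at: datetime    # delivered to the customer


def customer_lead_time_days(item: ReleaseItem) -> float:
    """Customer Lead Time: service request until delivery to the customer."""
    return (item.delivered_at - item.requested_at).total_seconds() / SECONDS_PER_DAY


def production_lead_time_days(item: ReleaseItem) -> float:
    """Production Lead Time: prioritized for development until completed."""
    return (item.completed_at - item.prioritized_at).total_seconds() / SECONDS_PER_DAY


# Made-up sample data, purely for illustration.
sample_items = [
    ReleaseItem(datetime(2024, 1, 1), datetime(2024, 1, 8),
                datetime(2024, 1, 10), datetime(2024, 1, 11)),
    ReleaseItem(datetime(2024, 1, 2), datetime(2024, 1, 3),
                datetime(2024, 1, 4), datetime(2024, 1, 6)),
]

avg_customer_lead_time = mean(customer_lead_time_days(i) for i in sample_items)
avg_production_lead_time = mean(production_lead_time_days(i) for i in sample_items)

print(f"Average Customer Lead Time: {avg_customer_lead_time:.1f} days")
print(f"Average Production Lead Time: {avg_production_lead_time:.1f} days")
```

Measuring both from the same records makes the gap between them visible: time a request spends waiting before prioritization or after completion shows up in Customer Lead Time but not in Production Lead Time.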

Create Clarity around a Single, Well-Refined Backlog

When LeadingAgile arrived, the client had four different work intake/service request systems. The work to be performed was spread across three to five spreadsheets, systems, and/or SharePoint lists. BMC Remedy’s ITSM solution had been chosen, but was months away from being implemented. A legacy service request mechanism already existed in the form of a Microsoft InfoPath form and its related SharePoint list. To provide clarity, all service requests were required, by mandate, to be submitted through the InfoPath form. The Service Management team and Product Owner were then responsible for determining the readiness of each request: was the request actionable by the Service Development team without more than an hour or two of additional analysis?

Because the task of wrangling all the backlogs was daunting, the service development team was enlisted to vet the consolidated backlog. Over three-fifths of the backlog was found to be outdated and no longer necessary, bringing the backlog size from over 500 servers down to approximately 200 servers, of which fewer than a handful could be acted upon immediately. The service management team members then reached out to each requestor to determine whether a set-based server design could be used in lieu of every server request being a customized solution. The goal became: how small a set of standard server definitions could be created so that self-service automation could replace service development for on-premises bare-metal and virtual environments, matching the client’s private cloud capabilities?
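
One way to reason about how small a standard set could be is to check how many backlog requests each candidate set would cover. The sketch below is a hypothetical illustration only: the `ServerSpec` fields, the three catalog sizes, and the sample requests are assumptions, not the client's actual server definitions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical server sizing; the real definitions are not given in the case study."""
    name: str
    cpus: int
    memory_gb: int
    disk_gb: int


# Assumed catalog of standard definitions, for illustration only.
CATALOG = [
    ServerSpec("small", 2, 8, 100),
    ServerSpec("medium", 4, 16, 250),
    ServerSpec("large", 8, 64, 500),
]


def smallest_fit(request: ServerSpec, catalog: List[ServerSpec]) -> Optional[ServerSpec]:
    """Return the smallest standard definition that satisfies the request, or None."""
    candidates = [
        spec for spec in catalog
        if spec.cpus >= request.cpus
        and spec.memory_gb >= request.memory_gb
        and spec.disk_gb >= request.disk_gb
    ]
    return min(candidates, key=lambda s: (s.cpus, s.memory_gb, s.disk_gb), default=None)


def coverage(requests: List[ServerSpec], catalog: List[ServerSpec]) -> float:
    """Fraction of requests that a standard definition can serve without customization."""
    if not requests:
        return 0.0
    covered = sum(1 for r in requests if smallest_fit(r, catalog) is not None)
    return covered / len(requests)


# Made-up requests: two fit a standard definition, one would still need custom work.
requests = [
    ServerSpec("req-a", 2, 4, 50),
    ServerSpec("req-b", 4, 32, 200),
    ServerSpec("req-c", 16, 128, 1000),
]

for r in requests:
    fit = smallest_fit(r, CATALOG)
    print(r.name, "->", fit.name if fit else "custom build required")
print(f"Catalog coverage: {coverage(requests, CATALOG):.0%}")
```

Requests that return no fit are the ones worth a conversation with the requestor: either the request can be adjusted to a standard definition, or the catalog needs one more entry.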

Outcomes

While only in its first expedition (the first 3 months), the transformation team has automated the service build and provisioning as well as network provisioning (IP address and DNS orchestration), although network provisioning is still partially manual due to a legacy approach to network segmentation and design.
The single backlog and the governance around service development flow have produced the following results:

  • Reduction in Customer Lead Time from over 300 days to under 5 days. (60x improvement)
  • Reduction in Production Lead Time from 15 days to an average of 15 minutes, through automation and the removal of extra-team handoffs and competing priorities. (480x, which corresponds to counting 8-hour working days)
  • Same-day surge capability when an approaching hurricane was expected to create a need for additional processing capacity due to increased insurance claims.
  • Previously, production incidents were a daily event that distracted the team, with major incidents occurring at least once a week. Post-transition, there have been no major production incidents related to the new server builds, thanks to test-driven server build automation. Misconfigurations never make it into production.
  • The additional head-count needed for Product Support is starting to wane and will plateau at a very minimal level, resulting in additional cost savings.
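
The improvement factors quoted in the list above can be checked with simple arithmetic. The short Python sketch below does so; note that the 8-hour working day is an assumption introduced here because it reproduces the stated 480x figure, and the calendar-day variant is shown only for comparison.

```python
def improvement_factor(before: float, after: float) -> float:
    """Ratio of the old duration to the new one (both in the same unit)."""
    if after <= 0:
        raise ValueError("'after' must be a positive duration")
    return before / after


# Customer Lead Time: over 300 days down to under 5 days.
customer_factor = improvement_factor(300, 5)

# Production Lead Time: 15 days down to 15 minutes.
# Assumption: an 8-hour working day, which matches the stated 480x.
MINUTES_PER_WORKING_DAY = 8 * 60
production_factor_working = improvement_factor(15 * MINUTES_PER_WORKING_DAY, 15)

# For comparison only: the same reduction counted in 24-hour calendar days.
MINUTES_PER_CALENDAR_DAY = 24 * 60
production_factor_calendar = improvement_factor(15 * MINUTES_PER_CALENDAR_DAY, 15)

print(f"Customer Lead Time improvement: {customer_factor:.0f}x")
print(f"Production Lead Time improvement (working days): {production_factor_working:.0f}x")
print(f"Production Lead Time improvement (calendar days): {production_factor_calendar:.0f}x")
```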