Legacy technical debt retirement decisions

Last updated on January 5th, 2019 at 10:32 am

Decisions to retire the legacy technical debt carried by irreplaceable assets are not to be taken lightly. As decision makers gather information and recommendations from all around the organization, most will discover that information and recommendations aren’t sufficient for making sound decisions about technical debt retirement. The issues are complex. Education is also needed. It’s entirely possible that in some organizations, confronted with a set of decisions regarding legacy technical debt retirement in irreplaceable assets, the existing executive team might be out of its depth. To understand how this situation can arise, let’s explore the nature of legacy technical debt retirement decisions.

A common technical debt retirement scenario

What compels the leaders of a large enterprise to consider retiring the technical debt encumbering one of its irreplaceable assets is fairly simple: cost. Decision makers usually begin by investigating the cost of replacing the asset—the option I’ve oh-so-cleverly called “Replace the Asset.” They then typically conclude that replacement isn’t affordable. At this point, many decision makers choose the option I’ve called “Do nothing.” Time passes. A succession of incidents occurs, in which repairs to the asset or enhancements of the asset are required. And I use the term required here to mean “essential to the viability of the business.”

Two alternatives to retiring legacy technical debt in irreplaceable assets
Two alternatives to retiring legacy technical debt in irreplaceable assets. Neither one works very well.

Engineers then do their best to meet the need, but the cost is high, and the work takes too long. The engineers explain that the problems are due, in part, to the heavy burden technical debt in this particular asset. Eventually the engineers are asked to estimate the cost of “cleaning things up.” Decision makers receive the estimates and conclude that it’s “unaffordable right now.” They ask the engineers to “make do.” In other words, they stick with the Do Nothing option.

After a number of cycles repeating this pattern, decision makers finally agree to provide time and resources for technical debt retirement, but only because it’s the least bad alternative. The other alternatives—Replace the Asset, and Do Nothing—clearly won’t work and haven’t worked, respectively.

So there we are. The organization has been forced by events to address the technical debt problems in this irreplaceable asset. And that’s where the trouble begins.

Decisions about retiring legacy technical debt

In scenarios like the one above, the fundamental decision has already been made: the enterprise will be retiring legacy technical debt from an irreplaceable asset. But that’s just the first ripple of waves of decisions to come, made by many people in a variety of roles throughout the enterprise. Let’s now have a look at a short catalog of what’s in store for such an enterprise.

Recall that most large technical debt retirement projects probably exhibit a high degree of wickedness in the sense of Rittel and Webber [Rittel 1973]. One consequence of this property is the need to avoid do-overs. That is, once we make a decision about how to proceed to the next bit of the work, we want that decision to be correct, or at least, good enough. It should not leave the enterprise in a state that’s more difficult to resolve than the state in which we found it. Since another property of wicked problems is the prevalence of surprises, most decisions must be made in a collaborative context, which affords the greatest possibility of opening the decision process to diverse perspectives. We must therefore regard collaborative decision-making at every level as a highly valued competency.

What follows is the promised catalog of decision types.

Strategic decisions

This decision category leads the list because it provides the highest leverage potential for changing enterprise behavior vis-à-vis technical debt. Organizations that are confronting the problem of technical debt retirement from irreplaceable assets would do well to begin by acknowledging that although they might be able to devise tactics for dealing with the debt burdening these assets right now, they must make a strategic change if they want to avoid a recurrence. Accumulating debt to a level sufficient to compel chartering a major debt retirement project took time. It took years of deferring the inevitable. A significant change of enterprise strategy is necessary.

When changing complex social systems, applying the concept of leverage provides a critical advantage. In this instance, following the work of Meadows [Meadows 1997] [Meadows 1999] [Meadows 2008], we can devise interventions at several points that can have great impact on both the level of technical debt and its rate of accumulation. The leverage points of greatest interest are Feedback Loops, Information Flows, Rules, and Goals. For example, the enterprise can set a strategic goal of a specific volume of incremental technical debt incurred per project, normalized by project budget, as I discussed in the post, “Leverage points for technical debt management.”

One might reasonably ask why enterprise strategy must change; wouldn’t a change in technology strategy suffice? Changing how engineers go about their work would help—indeed in most cases it’s necessary. But because the conditions and processes that lead to technical debt formation and persistence transcend engineering activities, additional changes are required to achieve the objective of controlling technical debt.

Some technical debt is strategic—it’s incurred as the result of a conscious business decision. But some is non-strategic. We might even be unaware of how it occurred. However, both kinds of technical debt can arise as a result of non-technical factors. Read a review of non-technical precursors of non-strategic technical debt.

Organizational decisions

Before chartering a technical debt retirement project (DRP) for an irreplaceable asset, or a group of irreplaceable assets, it’s wise to consider how to embed that project in the enterprise.

The default organizational form for debt retirement projects concerned with an asset A is usually the same form that would be used for major projects focused on asset A. If the Information Technology (IT) unit would normally address issues in A, the debt retirement effort usually would be organized under IT. If A is a software product normally attended to in a product group, that same group would likely have responsibility for the DRP for asset A.

Although these default organizational structures are somewhat sensible, both technically and politically, there’s an alternative approach worth investigating. It entails establishing a technical debt retirement function that becomes a center of excellence for executing technical debt retirement projects, and for developing and injecting sound technical debt management practice into the enterprise. Such an approach is especially useful if multiple debt retirement projects are needed.

The fundamental concept that makes the center-of-excellence approach necessary is the wickedness of the technical debt retirement problem. To address the problem at scale requires capabilities beyond what IT can provide; beyond what product units can provide; indeed, beyond what any of the conventional organizational elements can provide. The reason for this is that the explosion of technical debt in most organizations is an emergent phenomenon. Every organizational unit contributed to the formation of the problem. And every organizational unit must contribute to its resolution.

A technical debt center of excellence is an approach that might be capable not only of synthesizing the expertise of all elements of the enterprise, but also might be capable of bringing new approaches into the enterprise from external sources.

Engineering decisions

Engineers have a tendency to identify and classify technical debt items on technical grounds. Further, they tend to set technical debt retirement priorities on a similar basis. That is, they’re inclined to set priorities highest for those debt items that they (a) recognize as debt items and (b) see as imposing high levels of MICs charged to engineering accounts. Engineers are less likely to assign high priorities to technical debt that generates MICs that are charged to revenue, or to other accounts, because those MICs are less evident—and in many cases invisible—to engineers.

Decisions regarding recognition of technical debt items and setting priorities for retiring them must take technological imperatives into account, but they must also account for MICs of all forms. Priorities must be consistent with enterprise imperatives.

Decisions about pace

Paraphrasing Albert Einstein, technical debt retirement projects should be executed as rapidly as possible, and no faster. The tendency among non-engineers and non-technical decision-makers is to push for rapid completion of debt retirement projects, for three reasons. First, everyone, like the engineers, wants the results that debt retirement will bring. Second, everyone, like the engineers, wants an end to the inevitable disruptions debt retirement projects cause. And finally, the longer the project is underway, the more it might cost.

For these reasons, once the decision to retire the debt is firmly in hand, the enterprise might have a tendency to apply financial resources at a rate that exceeds the ability of the project team to execute the project responsibly. When that happens, rework results. And for wicked problems like debt retirement, rework is the path to catastrophe.

Decisions about pace and team scale need to be regarded as tentative. Regular reviews can ensure that the resource level is neither too low nor too high. Even when the engineers are given control over these decisions, they must be reviewed, because pressures for rapid completion can be so severe that they can compromise the judgment of engineers about how well they can manage the resources applied to the project.

Resource decisions

Debt retirement projects concerned with legacy irreplaceable assets are different from most other projects the enterprise undertakes. Estimates of the labor hours required are more likely to be incorrect on the low side than are analogous estimates for other projects, because so much of the work involves pieces of assets with which few engineering staff have any experience. But with respect to resources, underestimating labor requirements isn’t the real problem. Non-labor resources are the real problem.

Because the assets are irreplaceable, it’s likely that they’re needed for ongoing operations. In some cases, the assets are needed continuously. Many organizations have kept such assets operational by exploiting hours of downtime during periods of low demand, usually scheduled and announced in advance. While these practices are likely sufficient for the relatively minor and infrequent changes usually associated with routine maintenance and enhancement, debt retirement imposes much more severe burdens on the organization than these short access windows can support. Effective debt retirement projects need far more access to the asset—a level of access that continuous delivery practices can provide [Humble 2010].

However, assets whose designs predate the widespread use of modern practices such as continuous delivery might not be compatible with the infrastructure that these practices require. And in organizations that haven’t yet adopted such practices, staff familiar with them might be in short supply. For these reasons, we must regard as developmental any early projects whose objectives are retiring technical debt from irreplaceable assets. They’re retiring the technical debt, of course, but they’re also developing the practices and infrastructure needed to support technical debt retirement projects. This dual purpose is what drives the surprisingly high non-labor costs and investments associated with early technical debt retirement projects.

The investments required might include such “items” as a staging environment, which “is a testing environment identical to the production environment” [Humble 2010]; extensive test automation, including results analysis; blue-green deployment infrastructure; automation-assisted rollback; and zero-downtime release infrastructure. Decisions to make investments require an appreciation of their value to the enterprise. They enable the enterprise to deal effectively with the wicked problem of technical debt retirement.

Last words

Because every situation and every organization is unique, few general guidelines are available for making these decisions. The criteria most organizations have been using for dealing with (or avoiding) the issue of technical debt have produced the problems they now face. So, to succeed from this point, whatever criteria they use in the future must be different. My own view is that short-term thinking is at the heart of the problem, but it’s a wicked problem. The long-term solution will not be simple.

References

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[Meadows 1997] Donella H. Meadows. “Places to Intervene in a System,” Whole Earth, Winter 1997.

Available: here; Retrieved: June 28, 2018

Cited in:

[Meadows 1999] Donella H. Meadows. “Leverage Points: Places to Intervene in a System,” Hartland VT: The Sustainability Institute, 1999.

Available: here; Retrieved: June 2, 2018.

Cited in:

[Meadows 2008] Donella H. Meadows and Diana Wright. Thinking in Systems: A Primer. White River Junction, VT: Chelsea Green Publishing, 2008.

Order from Amazon

Cited in:

[Rittel 1973] Horst W. J. Rittel and Melvin M. Webber. “Dilemmas in a General Theory of Planning”, Policy Sciences 4, 1973, 155-169.

Available: here; Retrieved: October 16, 2018

Cited in:

Other posts in this thread

Managing technical debt

Last updated on August 24th, 2019 at 05:57 pm

Managing technical debt is something few organizations now do, and fewer do well. Several issues make managing technical debt difficult and they’re discussed elsewhere in this blog. This thread explores tactics for dealing with those issues from a variety of initial conditions. For example, tactics that work well for an organization that already has control of its technical debt, and which wants to keep it under control, might not work at all for an organization that’s just beginning to address a vast portfolio of runaway technical debt. The needs of these two organizations differ. The approaches they must take might then also differ.

A jumble of jigsaw puzzle pieces. Managing technical debt can be like solving a puzzle.
A jumble of jigsaw puzzle pieces. Where do we begin? With these puzzles, we usually begin with two assumptions: (a) we have all the pieces, and (b) they fit together to make coherent whole. These assumptions might not be valid for the puzzle of technical debt in any given organization.

The first three posts in this thread illustrate the differences among organization in different stages of developing technical debt management practices. In “Leverage points for technical debt management,” I begin to address the needs of strategists working in an organization just beginning to manage its technical debt, and asking the question, “Where do we begin?” In “Undercounting nonexistent debt items,” I offer an observation about a risk that accompanies most attempts to assess the volume of outstanding technical debt. Such assessments are frequently undertaken in organizations at early stages of the technical debt management effort. In “Crowdsourcing debt identification,” I discuss a method for maintaining the contents of a database of technical debt items. Data maintenance is something that might be undertaken in the context of a more advance technical debt management program.

Whatever approach is adopted, it must address factors that include technology, business objectives, politics, culture, psychology, and organizational behavior. So what you’ll find in this thread are insights, observations, and recommendations that address one or more of the issues related to these fields. “Demodularization can help control technical debt” considers mostly technical strategies. “Undercounting nonexistent debt items” is an exploration of a psychological phenomenon.  “Leverage points for technical debt management” considers the organization as a system and discusses tactics for altering it. And “Legacy debt incurred intentionally” explores how existing technical debt can grow as long as it remains outstanding.

Accounting issues also play a role. “Metrics for technical debt management: the basics” is a basic discussion of measurement issues. “Accounting for technical debt” looks into the matter of accounting for technical debt financially. And “Three cognitive biases” is a study of how technical debt is affected by the way we think about it.

Posts in this thread:

References

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[Meadows 1997] Donella H. Meadows. “Places to Intervene in a System,” Whole Earth, Winter 1997.

Available: here; Retrieved: June 28, 2018

Cited in:

[Meadows 1999] Donella H. Meadows. “Leverage Points: Places to Intervene in a System,” Hartland VT: The Sustainability Institute, 1999.

Available: here; Retrieved: June 2, 2018.

Cited in:

[Meadows 2008] Donella H. Meadows and Diana Wright. Thinking in Systems: A Primer. White River Junction, VT: Chelsea Green Publishing, 2008.

Order from Amazon

Cited in:

[Rittel 1973] Horst W. J. Rittel and Melvin M. Webber. “Dilemmas in a General Theory of Planning”, Policy Sciences 4, 1973, 155-169.

Available: here; Retrieved: October 16, 2018

Cited in:

Leverage points for technical debt management

Last updated on December 11th, 2018 at 10:40 am

Adopting a program of technical debt management entails significant change to the system we call the enterprise. The problem can seem so daunting that we don’t know where to begin. The places to begin are the places where the change agents have greatest leverage—what systems analysts call leverage points. Consider this scenario.

You’re sitting in the kickoff meeting of the new Technical Debt Management Task Force. The CEO is talking about how she realized that the company had a technical debt problem. It was when the Marigold project went through delay after delay, and was finally declared done, with multiple objectives waived. She’s saying something about, “we were trying to do backflips with millstones around our necks. So I want this task force to show us how to get rid of the millstones, and then get rid of them.”

McMurdo Station, Antarctica, as seen from nearby Observation Hill
McMurdo Station, Antarctica, as seen from nearby Observation Hill. The United States Antarctic Program, a unit of the National Science Foundation, operates the station. It can house as many as 1258 people in Summer. Photo (cc) Gaelen Marsden courtesy .

OK, you think. But how? We’re a global enterprise with thousands of engineers and operations on every continent. Except maybe Antarctica. No wait, we’re there, too. McMurdo I think. We have software we don’t even know much about, acquired long ago along with the companies that built it. And we’re building new systems or modifying old ones all the time, trying to move everything to the cloud while enhancing data security. Where do we begin to look for the millstones of technical debt?

Have you been in that meeting? If not, can you imagine being in that meeting? Meetings like that are happening around the globe. We’re all in the same soup.

It turns out that the answers to the millstone questions are available, but the pioneers and deep thinkers who have shown the way aren’t working on technical debt. Their field is called systems analysis. They work on problems like the collapse of the North Atlantic fishery, urban deterioration, unemployment, poverty, climate change, and the causes of the Great Recession of 2008—really difficult problems. Although the technical debt problem isn’t quite that challenging, it’s challenging enough to justify taking a look at the methods of systems analysis.

And when we do that, we immediately encounter a concept many call leverage points.

What are leverage points?

Leverage points are places in complex systems where a small change in one thing can produce big changes in system behavior. In a brilliant 1997 article, Donella Meadows describes what she calls “places to intervene in a system.” [Meadows 1997] She followed this article, making improvements each time, in 1999 [Meadows 1999] and 2008 [Meadows 2008]. Let me summarize Meadows’ work here.

To alter the behavior of a complex system, intervene at one or more of 12 categories of leverage points. For example, one category is called “Rules.” It consists of the incentives, punishments, and constraints that govern the behavior of the people and institutions that comprise the system. By adjusting the system’s rules, we can alter overall system behavior.

One more thing: the leverage points form an ordered hierarchy, ordered by effectiveness. Acting at a higher-level leverage point is more effective than acting at a lower-level leverage point. And more difficult, too. The ordering of the categories is a bit fuzzy, because every situation has its own quirks, but generally, the order is as given in the list below.

In a moment I’ll give an example of using leverage point #9, Delays, to bring about change in the way the enterprise deals with technical debt. But first, here’s a brief summary of the leverage points in increasing order of leverage; not enough to truly understand what they are, but probably enough to pique your interest. As I write posts that illustrate interventions at these leverage points, I’ll link to them from here.

  1. Numbers: Constants and parameters such as subsidies, taxes, and standards
  2. Buffers: The sizes of stabilizing stocks relative to their flows
  3. Stock-and-Flow Structures: Physical systems and their nodes of intersection
  4. Delays in feedback loops
  5. Balancing Feedback Loops: The strength of the feedbacks relative to the impacts they are trying to correct
  6. Reinforcing Feedback Loops: The strength of the gain of driving loops
  7. Information Flows:  The structure of who does and does not have access to information
  8. Rules: Incentives, punishments, and constraints
  9. Self-Organization: The power to add, change, or evolve system structure
  10. Goals: The purpose or function of the system
  11. Paradigms: The mind-set out of which the system—its goals, structure, rules, delays, parameters—arises
  12. Transcending Paradigms

Changing systems that have delays in feedback loops

When we use feedback to control systems, and there are delays in the feedback, we can potentially create destructive system behavior. And that can happen when we try to control technical debt.

Whenever we try to control a quantity in an enterprise process, we must (a) set a target value for that quantity; then (b) measure its current value; and then (c) take action as appropriate to move the current value toward the target value. Systems analysts (and control theorists) call that arrangement a feedback loop. The action taken to move the current value to the target value is sometimes called the control signal. Under certain conditions, the feedback works as expected.

For example, to control the profitability of the enterprise, we can examine its net income, say, quarterly. And at the end of each quarter we can make adjustments if net income isn’t in the target range.

Feedback loops generally work pretty well, but under some conditions, oscillations can develop. One of those troublesome situations occurs when there’s a delay in the loop that’s of the same order as (or longer than) the time the system takes to respond to adjustments. Meadows uses the example of adjusting the water temperature of a shower when there’s a long delay between making the adjustment and feeling its effects. Overcorrection is almost inevitable, and that’s what causes system oscillation.

So let’s suppose that we’re trying to control the rate of accumulation of technical debt. One approach is to set a target for TDnew, the new technical debt generated in a project. To be fair to all projects, we decide to normalize this quantity according to the project budget B. So we set targets for each project’s N = TDnew/B, and we require that projects estimate N, on an ongoing basis, with a goal of having N in some target range when the project is complete.

One problem with this approach is that we rarely identify accurately all the technical debt we’ve incurred until some time has passed after project delivery. With time, as the newly produced assets go into production and learning accumulates, we acquire the wisdom needed to identify more of the technical debt we created. This is one source of delay in this feedback loop.

So let’s assume that this happens for several projects, and management decides that delayed recognition of incurred technical debt is a common occurrence. To account for this, management lowers the target ranges for N for future projects. This causes project managers and project sponsors to include in their project plans additional effort directed at retiring more of their incremental technical debt before their projects complete, to enable them to project lower values of N. They must therefore identify as much of the incremental technical debt as they can, and retire it, to meet the lower targets for N.

But recall that technical debt identification sometimes requires time and experience using the newly produced asset. And the reverse process also occurs. Technical artifacts that we thought were technical debt prove to be useful in unexpected ways, and actually turn out not to be debt items after all. As a result, some of the incremental technical debt that got retired before the project was completed actually should not have been retired. Eventually, people realize that this happens with uncomfortable frequency, and so the targets for N are raised once more.

Oscillations thus set in. Long delays inevitably cause them. To prevent oscillations, shorten the delays.

How to shorten delays in feedback controlling technical debt

With technical debt, we can shorten delays in several ways.

  • If the asset is meant for human use, involve representatives of the user population in the development and design process as soon as practical. Have them exercise the asset, or prototypes, early. Listen to their suggestions. Observe how they use the asset.
  • If the asset must interact with non-human assets, exercise it early and often. Don’t think of this as testing, though it might look very much like testing. What you’re actually doing is searching for shortcomings in how the asset interacts with non-human assets, in design and implementation in an asset that already works.
  • Subject the asset to multiple reviews all along the development trajectory. Don’t wait for final release to review it.

These practices expose technical debt items early—potentially, during initial design—thereby reducing delays in identifying what is and what is not technical debt. They help to advance the date at which we uncover missing capabilities or capabilities designed or implemented in awkward ways. No surprise, I’m sure, but these practices are consistent with Agile approaches to technological development.

Indirect effects can add to delayed recognition of technical debt

Most of the argument above assumed that the incremental technical debt associated with the project was incurred within the asset undergoing development or maintenance. But technical debt can occur in other assets as well. When the development team is unaware of such “remote” or “indirect” incremental technical debt, recognition of that new incremental technical debt can be significantly delayed. The project’s N will appear to be smaller that it actually is, until that remote incremental technical debt is recognized.

This form of delay is likely to occur when the debt incurred is asset-exogenous. Recall the example of line extension of mobile phones. In that example, the enterprise incurs technical debt in one set of products as a result of the introduction of a different product. In some cases, the newly incurred technical debt is immediately evident. When it is not, delays can be substantial.

This effect is by no means rare. Any organizational change can potentially add to the technical debt portfolio—reorganizations, acquisitions, expansions, wholly new products, and much more.

Conclusions

Interventions at the leverage points of an organization can produce the changes we want with a minimum of effort. Some subtlety is involved, because Meadows’ leverage points are expressed at a high level of abstraction.  But applying them to the problem of technical debt management is a promising approach.

Bookmark this post. I’ll be linking to more examples of using leverage points to manage technical debt. So far:

References

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[Meadows 1997] Donella H. Meadows. “Places to Intervene in a System,” Whole Earth, Winter 1997.

Available: here; Retrieved: June 28, 2018

Cited in:

[Meadows 1999] Donella H. Meadows. “Leverage Points: Places to Intervene in a System,” Hartland VT: The Sustainability Institute, 1999.

Available: here; Retrieved: June 2, 2018.

Cited in:

[Meadows 2008] Donella H. Meadows and Diana Wright. Thinking in Systems: A Primer. White River Junction, VT: Chelsea Green Publishing, 2008.

Order from Amazon

Cited in:

[Rittel 1973] Horst W. J. Rittel and Melvin M. Webber. “Dilemmas in a General Theory of Planning”, Policy Sciences 4, 1973, 155-169.

Available: here; Retrieved: October 16, 2018

Cited in:

Other posts in this thread