Automation-assisted technical debt retirement

Last updated on July 16th, 2021 at 04:23 pm

Collapse of the I-35W bridge in Minneapolis, Minnesota
The I-35W Bridge collapse, day 4, Minneapolis, Minnesota, August 5, 2007. The proximate cause of the collapse was underweight gusset plate design, which made the bridge vulnerable to the increased static load due to concrete road surfacing additions over the years, and to the weight of construction equipment and supplies during a repair project that was then underway [NTSB 2008].

When we conduct maintenance or technical debt retirement projects involving assets that must remain operational during project execution, we risk stressing the asset in ways that extend beyond its safe operation envelope. The National Transportation Safety Board found that this occurred in the case of the I-35W bridge collapse. These effects are more difficult to imagine in software systems, but they can occur when we shift load from the systems undergoing modification to other systems. Those load-bearing systems can then become overloaded. Or these effects can occur when we shift load not from one asset to another, but from one time window to another on the same asset. The result can be high loads in some time windows.

Photo by Kevin Rofidal, United States Coast Guard, courtesy Wikimedia Commons.

When we transform assets to retire some of the technical debt they carry, service disruptions are sometimes necessary. To minimize service disruptions while technical debt retirement efforts are underway, it’s advantageous to automate some procedures. Automation-assisted technical debt retirement provides two important benefits: reduced disruption of operations and reduced incidence of errors.

The meaning of automation

I’m using the concept of automation a bit loosely here. I don’t mean to imply that these procedures are autonomous. What I mean is that engineers have available tools for performing many operations with a minimum of thought. For example, in this sense of automated, an engineer can issue a command such as, “Test Module Alpha Using Test Suite Delta.” That command executes a predefined set of tests. Following execution, the appropriate engineers receive the results. The tool also archives those results appropriately. If the results are anomalous, engineers can then take appropriate action.

Benefits of automation-assisted technical debt retirement

The more obvious benefit of automated procedures is speed. For example, an asset removed from service for testing can be returned to service more quickly if the testing is automated. And if trouble erupts during operations of a newly transformed asset, engineers can swap the untransformed asset back into place quickly. So-called roll-out and roll-back tools are just a few of the many elements of a set of practices collectively known as continuous delivery [Humble 2010].

The second benefit of this kind of automation is error avoidance. For example, inconsistent or incomplete testing can fail to find errors and defects, and that leads to rework and further disruptions. Performing tests incorrectly, finding “defects” that aren’t there, is another way to generate trouble. Automated procedures are much less prone to error if we maintain, test, and certify them periodically. For example, consider subjecting a module to a particular test suite. With automation assistance, engineers needn’t remember (or take time to look up) how to prepare the asset for tests. They  needn’t remember (or take time to look up) how to run the tests, or what the members of the test suite are. Long advocated as an essential element of sound engineering practice, test automation can avoid some of these problems. But it’s far short of a panacea [Bach 1999].

Other automation opportunities

In some situations, we can automate debt retirement itself. When we can retire instances of the technical debt in question by performing an automated transformation on an asset, the transformation is faster and more reliable.

A most important practice associated with automation-assisted technical debt retirement is automation-assisted regression testing. Investments in thorough and focused regression testing have potentially shockingly high returns in the debt retirement context. They can be just as valuable during development and routine maintenance.

To perform a regression test on an asset that has undergone a change is to examine its behavior under a specified set of conditions. Such investigations can determine whether those changes caused the asset to misbehave. So a regression test determines whether the asset has regressed as a result of the change. Automated or automation-assisted regression tests help the project team detect problems in assets that they’ve transformed. And that’s much better than having the business units that depend on those assets encounter problems during operations [Ge 2014].

Many of these same regression tests can also be useful during enhancement and ongoing maintenance of the asset. Often, investing in automated regression tests in advance of the debt retirement project can enhance development and maintenance performance relative to those assets. Later, when the debt retirement project begins, the previously obtained results of regression tests will already be available.

Last words

For some debt retirement projects, specially created automated regression tests might be beneficial. Assign engineers to continual automation tool development for debt retirement projects. That’s probably the best way to support these needs.

These automation capabilities are unlikely to be available commercially, because they’re so specialized to the asset being tested. Because general applicability is unnecessary, building them in-house is both practical and economical.  If people with the necessary skills are unavailable, acquire them. We can justify these investments economically if we take into account the savings from reduced service disruptions during technical debt retirement projects.

References

[Bach 1999] James Bach. “Test Automation Snake Oil!” (1999).

Available: here; Retrieved: January 2, 2019

Cited in:

[Ge 2014] Xi Ge and Emerson Murphy-Hill. “Manual Refactoring Changes with Automated Refactoring Validation,” Proceedings of the 36th International Conference on Software Engineering. ACM, 2014.

Available: here; Retrieved: January 1, 2019

Cited in:

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[NTSB 2008] National Transportation Safety Board. “Board Meeting Executive Summary: Collapse of I-35W Highway Bridge, Minneapolis, Minnesota, August 1, 2007,”, November 13, 2008.

Available: here; Retrieved: January 3, 2019.

Cited in:

Other posts in this thread

Retiring localizable technical debt

Last updated on July 17th, 2021 at 06:53 am

Electricity pylons, Hamilton Beach, Ontario, Canada
Electricity pylons, Hamilton Beach, Ontario, Canada: a small part of the AC power grid, which seems destined one day to manifest a great deal of nonlocalizable technical debt. Photo by Ibagli courtesy Wikimedia Commons
Pylons in the same line are visible in Google Street View.

When technical debt appears as discrete chunks—when it’s localizable—we can often retire the debt incrementally. We can retire it system-by-system, module-by-module, or even instance-by-instance. These approaches offer great flexibility, both technically and financially. That makes retiring localizable technical debt a particularly manageable challenge.

Localizable technical debt

In “Technical debt in a rail system,” I explored the case of Amtrak’s Acela Express. In that example, I explained that Acela’s passenger cars can tilt to compensate for centrifugal forces that appear when the train rounds curves. The technical debt is in the form of tracks that are too close together to permit the trains to tilt as much as they’re designed to. That limits the trains’ speed rounding curves. The instances of this debt are the curves in which the tracks are too close together. These instances are thus inherently localizable.

In “Debt contagion: how technical debt can create more technical debt,” I described an example in which an organization is unable to upgrade its desktop computers from Windows 8 to Windows 10. In this case, each computer running Windows 8 is an instance of this form of technical debt.

Both of these examples illustrate localizable technical debt. Each instance is self-contained. We can “point” to it as an instance of the debt in question. But localizable technical debt need not be associated with hardware. In software systems, for example, localizable technical debt can exist in a module interface. If interface meets a requirement that’s no longer relevant, it might contain technical debt. That module and any other modules that interact with that interface therefore manifest that technical debt.

Nonlocalizable technical debt

Nonlocalizable technical debt is debt for which the instances are amorphous or system-wide. Or they span the bulk of the system, if not all of it. Retiring nonlocalizable technical debt typically requires extensive reengineering of the assets that manifest it.

For the most part, nonlocalizable technical debt arises at the level of system architecture or above. One can easily imagine this occurring in software systems, where physical constraints have little meaning. But let’s consider a hardware system to illustrate the importance of this concept.

An illustration of nonlocalizable technical debt

Until relatively recently in the United States, most electric power consumers used power for incandescent lamps, heating, or for electric motors. These applications are compatible with an alternating current power distribution system (AC grid). The AC grid is more efficient than an equivalent direct-current architecture (DC grid) when power generation plants are few and relatively distant from power load sites. The efficiency advantage is due to AC’s lower transmission losses compared to DC.

However, advances in electronics and in distributed power generation are eroding the advantages of the AC grid [Dragičević 2016]. Most electronic devices—phones, computers, rechargeable power tools, LED lighting, and electric vehicles—use DC internally. To access the AC grid, they change AC power into DC power, which entails efficiency losses.

Moreover, most renewable power generation systems generate DC inherently. Wind turbines generate AC at a frequency determined by wind power conversion efficiency, but they then convert it to DC before a second conversion to AC at the frequency of the AC grid. And because solar and wind power generators are geographically dispersed, they’re often situated near their load sites. Therefore, the losses due to transmission from generation site to load site are less important than they would be if the generation sites were few, concentrated, and at great distances from the loads they serve.

Our current AC grid architecture is likely to become a net disadvantage in the not-too-distant future. If that happens, we could come to regard the current AC grid as manifesting technical debt. The devices that are designed for the AC grid would also manifest that debt. However, localizing that debt in each device and each component of the AC grid would make little sense. The technical debt in question would reside in the grid architecture, as a whole. It would be inherently nonlocalizable.

Addressing localizable technical debt

As noted above, we can often retire localizable technical debt incrementally—instance-by-instance. In many cases, this enables engineers to address the debt at times and in sequences that are compatible with organizational priorities. By spreading the effort over time, the organization can ensure that costs are within the organization’s capacity in any given fiscal period. This isn’t always possible for localizable technical debt. And engineers are often justifiably averse to the temporary non-uniformity that results from incremental debt retirement. But exploiting localizability when planning debt retirement is often a useful strategy for retiring technical debt economically.

Temporary structures

Retiring localizable technical debt incrementally does present some problems. During the retirement process, for any given instance, temporary structures might be necessary to support continued operation with minimal service disruption. For example, with the Acela tracks, an alternate line might serve while the new track installation is in progress. Or the new track might follow a course at some distance from the existing track while trains continue using the existing track. Both approaches require investment beyond the investment required for the new track itself. Some managers have little appetite for such temporary investments. But temporary investments are in a real sense part of the MICs on that debt. They’re unusual in the sense that they’re part of the debt retirement effort, but they’re still MICs. In a way, they’re analogous to the charges that might appear when terminating an auto lease.

Entanglements of different kinds of technical debt

Another consideration when addressing localizable technical debt is its entanglement with other forms of technical debt. With respect to the effort to retire one kind of localizable technical debt, these other forms of technical debt are what I’ve called auxiliary technical debt (ATD). Consider carefully the time order of efforts to retire the localizable technical debt and one or more forms of its ATD. Because retiring localizable technical debt can seem deceptively straightforward, the temptation to deal with it before addressing some of the ATD can be difficult to resist. But dealing with some of the ATD first might actually be the wiser course. For example, when doing so eliminates numerous instances of the localizable technical debt, dealing with the ATD first can produce real savings.

One note of caution

Within the category of localizable technical debt are kinds of debt that are so widespread that retiring them affects a large part of the asset. Each instance of such debt might be identifiable and localized. But the instances are so widespread that they collectively have the properties of nonlocalizable debt. Incremental retirement might still be possible, but a more global retirement effort might be safer and less disruptive.

One approach technologists usually favor is suspending all other work while the debt in question undergoes retirement. While that approach might indeed be safest, all stakeholders must accept and understand the technical issues. And the technologists must understand the concerns of all stakeholders. A joint decision about the retirement strategy among all stakeholders, including technologists, is probably safest.

Last words

In the context of debt retirement projects, localizable technical debt provides needed flexibility. Often, the non-uniformity that results from retiring localizable technical debt instance-by-instance can be reduced before the debt retirement project is completed. In the meantime, the team can be relatively free to retire the localizable debt in whichever order is most fitting.

References

[Bach 1999] James Bach. “Test Automation Snake Oil!” (1999).

Available: here; Retrieved: January 2, 2019

Cited in:

[Dragičević 2016] Tomislav Dragičević, Xiaonan Lu, Juan C. Vasquez, and Josep M. Guerrero. “DC Microgrids–Part II: A Review of Power Architectures, Applications and Standardization Issues,” IEEE Transactions on Power Electronics, vol 31:5, 3528-3549, 2016.

Cited in:

[Ge 2014] Xi Ge and Emerson Murphy-Hill. “Manual Refactoring Changes with Automated Refactoring Validation,” Proceedings of the 36th International Conference on Software Engineering. ACM, 2014.

Available: here; Retrieved: January 1, 2019

Cited in:

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[NTSB 2008] National Transportation Safety Board. “Board Meeting Executive Summary: Collapse of I-35W Highway Bridge, Minneapolis, Minnesota, August 1, 2007,”, November 13, 2008.

Available: here; Retrieved: January 3, 2019.

Cited in:

Other posts in this thread

Legacy technical debt retirement decisions

Last updated on July 15th, 2021 at 07:17 pm

Two alternatives to retiring legacy technical debt in irreplaceable assets
Two alternatives to retiring legacy technical debt in irreplaceable assets. Neither one works very well.

Some irreplaceable assets carry legacy technical debt. Although retiring an asset retires its technical debt, that option isn’t available for irreplaceable assets. We need another option. For irreplaceable assets, we need to find a way to retire the debt. As decision makers gather information and recommendations from around the organization, most will come to an unsettling conclusion. They’ll find that information and recommendations aren’t sufficient for making sound decisions about technical debt retirement. The issues are complex. Education is also needed. It’s entirely possible that in some organizations, the existing executive team might be out of its depth. To understand how this situation can arise, let’s explore the nature of legacy technical debt retirement decisions.

A common technical debt retirement scenario

What compels the leaders of a large enterprise to consider retiring the technical debt encumbering one of its irreplaceable assets is fairly simple: cost. Decision makers usually begin by investigating the cost of replacing the asset. This is the option I’ve cleverly called “Replace the Asset.” They then typically conclude that replacement isn’t affordable. At this point, many decision makers choose the option I’ve called “Do nothing.” Time passes. A succession of incidents occurs, in which teams attempt required repairs or enhancements of the asset. And I use the term required here to mean “essential to the viability of the business.”

Engineers then do their best to meet the need, but the cost is high, and the work takes too long. The engineers explain that the problems are due, in part, to the heavy burden of technical debt. Eventually someone asks the engineers to estimate the cost of “cleaning things up.” Decision makers receive the estimates and conclude that it’s “unaffordable right now.” They ask the engineers to “make do.” In other words, they stick with the Do Nothing option.

After a number of cycles repeating this pattern, decision makers finally agree to provide time and resources for technical debt retirement, but only because it’s the least bad alternative. The other alternatives—Replace the Asset, and Do Nothing—clearly won’t work and haven’t worked, respectively.

So there we are. Events have forced the organization to address the technical debt problems in this irreplaceable asset. And that’s where the trouble begins.

Decisions about retiring legacy technical debt

In scenarios like the one above, people have already made the fundamental decision: the enterprise will be retiring legacy technical debt from an irreplaceable asset. But that’s just the first ripple of waves of decisions to come. Many people in a variety of roles throughout the enterprise will be making many decisions. Let’s now have a look at a short catalog of what’s in store for such an enterprise.

Recall that most large technical debt retirement projects probably exhibit a high degree of wickedness in the sense of Rittel and Webber [Rittel 1973]. One consequence of this property is the need to avoid do-overs. That is, once we make a decision about how to proceed to the next bit of the work, we want that decision to be correct, or at least, good enough. The consequences of that decision should not leave the enterprise in a state that’s more difficult to resolve than the state in which we found it. Since another property of wicked problems is the prevalence of surprises, most decisions must be made in a collaborative context, which affords the greatest possibility of opening the decision process to diverse perspectives. We must therefore regard collaborative decision-making at every level as a highly valued competency.

What follows is the promised catalog of decision types.

Strategic decisions

This decision category leads the list. It provides the highest leverage potential for changing enterprise behavior vis-à-vis technical debt. Organizations confronting the problem of technical debt retirement from irreplaceable assets would do well to begin by acknowledging that although they might be able to deal with the debt burdening these assets right now, they must make a strategic change if they want to avoid even worse trouble. Accumulating debt to a level sufficient to compel chartering a major debt retirement project took years of deferring the inevitable. A significant change of strategy is necessary.

When changing complex social systems, applying the concept of leverage provides a critical advantage. Following the work of Meadows [Meadows 1997] [Meadows 1999] [Meadows 2008], we can devise interventions at several points that can have great impact on the rate of accumulation of technical debt. The leverage points of greatest interest are Feedback Loops, Information Flows, Rules, and Goals. For example, the enterprise can set a strategic goal of a specific volume of incremental technical debt incurred per project, normalized by project budget. See “Leverage points for technical debt management.”

One might reasonably ask why enterprise strategy must change; wouldn’t a change in technology strategy suffice? Changing how engineers go about their work would help—indeed in most cases it’s necessary. But because the conditions and processes that lead to technical debt formation and persistence transcend engineering activities, additional changes are required to achieve the objective of controlling technical debt.

Some technical debt is incurred as the result of a conscious decision. But some is nonstrategic. We might even be unaware of how it occurred. Both kinds of technical debt can arise as a result of nontechnical factors. Read a review of nontechnical precursors of nonstrategic technical debt.

Organizational decisions

Before chartering a technical debt retirement project (DRP) for an irreplaceable asset, it’s wise to consider how to embed the DRP in the enterprise.

The default organizational form for DRPs concerned with an asset A is usually analogous to that used for major projects focused on asset A. If the Information Technology (IT) unit would normally address issues in asset A, the debt retirement effort usually would be organized under IT. If A is a software product normally attended to in a product group, that same group would likely have responsibility for the DRP for asset A.

Although these default organizational structures are both technically and politically sensible, there’s an alternative approach worth investigating. It entails establishing a technical debt retirement function that becomes a center of excellence for executing technical debt retirement projects. That unit is also responsible for developing sound technical debt management practice. Such an approach is especially useful if the organization contemplates multiple debt retirement projects.

The fundamental concept that makes the center-of-excellence approach necessary is the wickedness of the technical debt retirement problem. To address the problem at scale requires capabilities beyond what IT, product units, or any conventional organizational elements can provide. The explosion of technical debt in most organizations is an emergent phenomenon. Every organizational unit contributed to the formation of the problem. And every organizational unit must contribute to its resolution.

Engineering decisions

Engineers tend to identify and classify technical debt items on technical grounds. Further, they tend to set technical debt retirement priorities on a similar basis. That is, they tend to set priorities highest for those debt items that they (a) recognize as debt items and (b) see as imposing high levels of MICs charged to engineering accounts. Engineers are less likely to assign high priorities to technical debt that generates MICs that are charged to revenue, or to other accounts, because those MICs are less evident—and in many cases less visible—to engineers.

Decisions regarding recognition of technical debt items and setting priorities for retiring them must take technological imperatives into account, but they must also account for MICs of all forms. Priorities must be consistent with enterprise imperatives.

Decisions about pace

Paraphrasing Albert Einstein, technical debt retirement projects should be executed as rapidly as possible, and no faster. The tendency among nonengineers and nontechnical decision makers is to push for rapid completion of debt retirement projects, for three reasons. First, everyone, like the engineers, wants the results that debt retirement will bring. Second, everyone, like the engineers, wants an end to the inevitable disruptions debt retirement projects cause. And finally, the longer the project is underway, the more it might cost.

For these reasons, once the decision to retire the debt is firmly in hand, the enterprise might have a tendency to apply financial resources at a rate that exceeds the ability of the project team to execute the project responsibly. When that happens, rework results. And for wicked problems like debt retirement, rework is the path to catastrophe.

Decisions about pace and team scale need to be regarded as tentative. Regular reviews can ensure that the resource level is neither too low nor too high. Even when the engineers are given control over these decisions, these decisions must be reviewed, because pressures for rapid completion can be so severe that they can compromise the judgment of engineers about how well they can manage the resources applied to the project.

Resource decisions

Debt retirement projects concerned with legacy irreplaceable assets are different from most other projects. Estimates of the labor hours required are more likely to be incorrect on the low side, because so much of the work involves pieces of assets with which few staff have experience. But with respect to resources, underestimating labor requirements isn’t the real problem. Nonlabor resources are the real problem.

Irreplaceable assets probably provide critical support of ongoing operations. In some cases, the need for the assets is continuous. Many organizations have kept such assets operational by using periods of low demand for maintenance, usually scheduled and announced in advance. These practices are likely adequate for routine maintenance and enhancement. But debt retirement need a level of access to the asset that continuous delivery practices can provide [Humble 2010].

However, assets whose designs predate the widespread use of modern practices such as continuous delivery might not be compatible with the infrastructure that these practices require. In organizations that haven’t yet adopted such practices, few if any staff are experienced with them. We must therefore regard as developmental any projects whose objectives are retiring technical debt from irreplaceable assets. They’re retiring the technical debt, but they’re also developing the practices and infrastructure needed for debt retirement projects. This dual purpose drives the surprisingly high nonlabor costs associated with early debt retirement projects.

The investments required might include such “items” as a staging environment, which “is a testing environment identical to the production environment” [Humble 2010]; extensive test automation, including results analysis; blue-green deployment infrastructure; automation-assisted rollback; and zero-downtime release infrastructure. Decisions to make investments require an appreciation of their value to the enterprise. They enable the enterprise to deal effectively with the wicked problem of technical debt retirement.

Last words

Because every situation and every organization is unique, few general guidelines are available for making these decisions. The criteria most organizations have been using for dealing with (or avoiding) the issue of technical debt have produced the problems they now face. So, to succeed from this point, whatever criteria they use in the future must be different. My own view is that short-term thinking is at the heart of the problem, but it’s a wicked problem. The long-term solution will not be simple.

References

[Bach 1999] James Bach. “Test Automation Snake Oil!” (1999).

Available: here; Retrieved: January 2, 2019

Cited in:

[Dragičević 2016] Tomislav Dragičević, Xiaonan Lu, Juan C. Vasquez, and Josep M. Guerrero. “DC Microgrids–Part II: A Review of Power Architectures, Applications and Standardization Issues,” IEEE Transactions on Power Electronics, vol 31:5, 3528-3549, 2016.

Cited in:

[Ge 2014] Xi Ge and Emerson Murphy-Hill. “Manual Refactoring Changes with Automated Refactoring Validation,” Proceedings of the 36th International Conference on Software Engineering. ACM, 2014.

Available: here; Retrieved: January 1, 2019

Cited in:

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[Meadows 1997] Donella H. Meadows. “Places to Intervene in a System,” Whole Earth, Winter 1997.

Available: here; Retrieved: June 28, 2018

Cited in:

[Meadows 1999] Donella H. Meadows. “Leverage Points: Places to Intervene in a System,” Hartland VT: The Sustainability Institute, 1999.

Available: here; Retrieved: June 2, 2018.

Cited in:

[Meadows 2008] Donella H. Meadows and Diana Wright. Thinking in Systems: A Primer. White River Junction, VT: Chelsea Green Publishing, 2008.

Order from Amazon

Cited in:

[NTSB 2008] National Transportation Safety Board. “Board Meeting Executive Summary: Collapse of I-35W Highway Bridge, Minneapolis, Minnesota, August 1, 2007,”, November 13, 2008.

Available: here; Retrieved: January 3, 2019.

Cited in:

[Rittel 1973] Horst W. J. Rittel and Melvin M. Webber. “Dilemmas in a General Theory of Planning”, Policy Sciences 4, 1973, 155-169.

Available: here; Retrieved: October 16, 2018

Cited in:

Other posts in this thread

Auxiliary technical debt: Rules of engagement

Last updated on July 16th, 2021 at 05:08 pm

Guardrails in a track bed as a rail line crosses a bridge
Guardrails in a track bed as a rail line crosses a bridge. The guardrails are the inner pair of rails. The rails just outside the inner pair are the running rails. Guardrails (also known as check rails) keep the wheels of derailed cars from straying too far from their proper locations. This is a risk mitigation function in high-risk geometries such as curves. It’s also advantageous even if the probability of risk events is low, as in this straight section of track. It’s worthwhile when the consequences of risk events are extremely costly. A derailment on a railway bridge or in steep terrain can result in rail vehicles falling to the earth below. That can cause them to pull other vehicles with them. Derailments under highway overpasses can also be problematic. Such derailments can result in damage to rail or highway bridge structures. That damage can result in service interruptions for periods much longer than the time needed to clear the derailment. For this same reason, guardrails are also used in tunnels and tunnel approaches. Because uncontrolled scope expansion can have such devastating effects, we need policy guardrails to control scope expansion when retiring technical debt from assets that contain auxiliary technical debt.

As noted earlier, a technical debt retirement project (DRP) emphasizes retiring a particular kind or kinds of technical debt from a set of assets. But those assets might also carry other kinds of technical debt. With respect to a given DRP, we can regard this other technical debt as Auxiliary Technical Debt (ATD). Because the presence of ATD can defocus debt retirement projects, it presents a risk we must anticipate and mitigate.

A word of caution

One word of caution. The technical debt discussed in this post is assumed to be localizable. Localizable technical debt is technical debt that manifests itself as discrete chunks. Each instance is self-contained, and we can “point” to it as an instance of the debt in question. More: “Retiring localizable technical debt

Mitigating the risks associated with the auxiliary technical debt

This post explores concepts and approaches for mitigating the risks associated with the auxiliary technical debt (ATD) of a given debt retirement project (DRP). As might already be evident, these initialisms (ATD, DRP, and one more to come) can be difficult to keep straight. Here’s a quick guide:

  • T always means Technical
  • D always means Debt or Design
  • R always means Retirement
  • P always means Project

So, DRP is Debt Retirement Project.

Also, if you have a pointing device, you can hover the cursor over the first mention of each initialism in each section. Then your browser displays the expansion of the initialism. Touch screen users and keyboarders: sorry, I haven’t yet figured out how to help you in an analogous way, so let me know if you have an idea.

I’ve been using the term TDIQ—Technical Debt In Question—to denote the kinds of technical debt whose retirement is the objective of a given DRP. The ATD of that DRP, then, is the collection of instances of any other kinds of technical debt. That is, all types differing from the TDIQ of the DRP, and which are present in the assets being modified by the DRP. Notice that the property of being auxiliary technical debt is relative. It’s relative to the objectives of a given DRP. A particular instance of technical debt might be ATD for one DRP, and TDIQ for another DRP, depending on the respective objectives of each DRP. Notice also that the ATD of a given DRP can include several different kinds of technical debt.

The temptation to retire auxiliary technical debt

Let’s now examine a scenario in which ATD can generate risk for a DRP. In this scenario, we’ll consider only one kind of ATD; call it ATD0.

Suppose several members of the DRP team undertake work to retire the DRP’s TDIQ in a portion of one of the debt-bearing assets. In performing this work, they encounter some instances of ATD0. Studying these instances of ATD0 carefully, they devise a plan. They realize that “fixing” the ATD0 along with the TDIQ in that portion of the asset would be easier and less risky than amore focused approach. The more focused approach would be to leave the ATD0 in place and attend only to the TDIQ. Let’s call the approach they adopted the ATD approach. And let’s say that the TDIQ approach is one in which the team addresses only the TDIQ. It leaves in place the ATD0 and all other ATD it finds.

Compared to the TDIQ approach, the advantages of the ATD approach are fairly clear. After the work is complete, in either approach, the team must test and re-certify the asset. In the TDIQ approach, when a subsequent DRP tries to retire ATD0, that second DRP team will need to test and re-certify the asset again. In the ATD approach, we can avoid modifying, re-testing, and re-certifying the asset a second time. We can avoid it because we’ve already retired all instances of ATD0 from the asset. Thus, in the ATD approach we can avoid a second round of modification, testing, and re-certification.

Risks associated with retiring auxiliary technical debt

But the ATD approach also has some serious disadvantages.

Enterprise assets might be left in a mixed state

Unless the team plans to retire all instances of ATD0, then upon completion of the DRP, enterprise assets will be in a mixed state. Some will be free of both the TDIQ and ATD0; some will be free of the TDIQ but continue to harbor ATD0. This non-uniformity can create complications for subsequent maintenance, documentation, testing, training, enhancement, automation assisted development, and so on.

Complications in testing and re-certification

If test results for the modified assets indicate the possibility of new defects, the cause might be associated with the TDIQ work, or the ATD work, or both. Resolving any issues in the test results is thus more complicated under the ATD approach than it is under the TDIQ approach. Similar considerations affect re-certification. Thus, there is a risk that the ATD approach will complicate interpretation of test and re-certification results.

Questions about the reliability of technical debt inventory data

As noted in an earlier post, for any given DRP, the DRP team needs to know which assets bear that project’s TDIQ. In the TDIQ approach, any data previously or concurrently gathered about the location of instances of ATD0 remains valid. It maintains its validity because the TDIQ approach doesn’t retire any instances of ATD0. However, in the ATD approach, such inventory data must be corrected to account for the retirement of whatever instances of ATD0 are retired in the ATD approach. Thus, there are problems if ATD0 inventory data has already been collected, or if it’s being collected in parallel with the DRP. The DRP team must then take steps to adjust the inventory data regarding locations of ATD0 as it retires instances of it.

There is one minor exception to this claim about TDIQ’s validity-preserving qualities. In some instances, retiring an instance of TDIQ can incidentally retire an instance of ATD0. It happens.

There is of course a risk that this will not occur as needed. If that risk materializes, it can create problems for any subsequent DRP for which the ATD0 is contained in its TDIQ. This can be especially challenging if there are multiple DRPs in process simultaneously. Suppose each DRP is working on different TDIQs, potentially in different debt-bearing assets. If they all encounter and retire instances of ATD0, keeping the inventory current can be a complicated affair.

Unconstrained scope creep

Suppose there is a DRP whose objective is retiring its TDIQ. And suppose it has decided to also retire instances of a particular kind of ATD, say ATD0. Although that activity would represent an expansion of scope beyond retiring the TDIQ, it might be acceptable and it might even be prudent. But as the team undertakes to retire ATD0, it might confront a similar problem. That problem relates to the relationship between the ATD0 and yet another kind of ATD, which we might call ATD1. The DRP team might then decide to expand scope again. And so on. In general, there is no self-evident stopping point for such a chain of scope expansion. In these circumstances, scope creep can become an unmitigated risk, threatening the coherence and focus of the DRP, with consequences for its budget and schedule.

Last words

In some cases, some of the ATD might be so intertwined with the TDIQ that retiring some instances of the TDIQ necessarily retires some of the ATD. And in other cases, leaving the ATD in place severely complicates retiring the TDIQ. In still other cases, leaving the ATD in place leaves the assets in a complex state that makes ongoing maintenance or enhancement work more difficult. In these cases, what I called the ATD approach above is plainly the wiser course, compared to the TDIQ approach.

Policymakers have a role to play here. They can develop guidance for DRP teams to apply as they come upon these difficult situations. That guidance can help them decide whether to take the ATD approach or the TDIQ approach. The military calls this guidance “rules of engagement,” while politicians call it “guardrails.”

Deciding between the ATD and TDIQ approaches on a whim, or on what feels right at the moment, inevitably leads to a chaos of inconsistency and scope creep. The safest course is to adopt wise rules of engagement. Then adjust them as the organization learns more about retiring technical debt from its assets.

References

[Bach 1999] James Bach. “Test Automation Snake Oil!” (1999).

Available: here; Retrieved: January 2, 2019

Cited in:

[Dragičević 2016] Tomislav Dragičević, Xiaonan Lu, Juan C. Vasquez, and Josep M. Guerrero. “DC Microgrids–Part II: A Review of Power Architectures, Applications and Standardization Issues,” IEEE Transactions on Power Electronics, vol 31:5, 3528-3549, 2016.

Cited in:

[Ge 2014] Xi Ge and Emerson Murphy-Hill. “Manual Refactoring Changes with Automated Refactoring Validation,” Proceedings of the 36th International Conference on Software Engineering. ACM, 2014.

Available: here; Retrieved: January 1, 2019

Cited in:

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[Meadows 1997] Donella H. Meadows. “Places to Intervene in a System,” Whole Earth, Winter 1997.

Available: here; Retrieved: June 28, 2018

Cited in:

[Meadows 1999] Donella H. Meadows. “Leverage Points: Places to Intervene in a System,” Hartland VT: The Sustainability Institute, 1999.

Available: here; Retrieved: June 2, 2018.

Cited in:

[Meadows 2008] Donella H. Meadows and Diana Wright. Thinking in Systems: A Primer. White River Junction, VT: Chelsea Green Publishing, 2008.

Order from Amazon

Cited in:

[NTSB 2008] National Transportation Safety Board. “Board Meeting Executive Summary: Collapse of I-35W Highway Bridge, Minneapolis, Minnesota, August 1, 2007,”, November 13, 2008.

Available: here; Retrieved: January 3, 2019.

Cited in:

[Rittel 1973] Horst W. J. Rittel and Melvin M. Webber. “Dilemmas in a General Theory of Planning”, Policy Sciences 4, 1973, 155-169.

Available: here; Retrieved: October 16, 2018

Cited in:

Other posts in this thread

Where is the technical debt?

Last updated on July 15th, 2021 at 10:47 am

Part of the cutting head of an 84-inch (2.13 m) tunnel boring machine. An obsolete sewer line is out of sight to nearly everyone. Invisibility can raise the question, Where is the technical debt?
Part of the cutting head of an 84-inch (2.13 m) tunnel boring machine. The machine was used for installing a sewer in Chicago, Illinois, USA, in 2014. An obsolete sewer line is out of sight to nearly everyone. Invisibility can raise the question, Where is the technical debt? Photo © 2014 by J. Crocker.

When we first set out to plan a large technical debt retirement project (DRP), a question arises very early in the planning process. It is this: Which assets are carrying the kind of technical debt we want to retire? And a second question is: Which operations will be affected—and when—by the debt retirement work? Although these questions are clear, and easily expressed, the answers might not be. And the answers are important. So where is the technical debt?

The challenges of identifying debt-bearing assets

Determining which of the enterprise’s many technological assets might be carrying the Technical Debt In Question (TDIQ) can be a complex exercise in itself. It’s challenging because inspecting the asset might be necessary. Inspection might require temporarily suspending operations, or determining windows of time during which inspection can be performed safely and without interfering with operations. Further, inspection might require knowledge of the asset that the DRP team doesn’t possess. Moreover, access to the asset might be restricted in some way. In these cases, staff from the unit responsible for the asset must be available to assist with the inspection.

Although asset inspection might be necessary or preferable, it might not be sufficient for determining which assets are carrying the TDIQ. This is easy to understand for physical assets. For example, physical inspection cannot determine the release version of the firmware of the hydraulic controller electronics of a tunnel boring machine. But asset inspection might also be insufficient for purely software assets. For determining the presence of the TDIQ in software assets, reading source code might not be sufficient or efficient.

To locate the technical debt, it might be easier, faster, and more accurate to operate the asset under special conditions. For example, an inspector might want to provide specific inputs to various assets and then examine their responses. As a second example, we might use automation assistance to examine the internal structure of an asset, searching for instances of the TDIQ. And as with other assets, the assistance of the staff of the business unit responsible for the asset might be necessary for the inspection.

Which enterprise operations depend on debt-bearing assets?

Knowing which assets bear the TDIQ is useful to the DRP team as it plans the work to retire the TDIQ. But part of that plan could include service disruptions. If so, it’s also necessary to determine how those disruptions might affect operations. That information enables the team to control the effects of the disruptions and negotiate with affected parties. Thus for each asset that bears the TDIQ, we must determine what operations would be affected by service suspension.

Observations of actual operations in conditions in which the asset is out of service in whole or in part can be valuable. Such observations might be the only economical way to discover which enterprise functions depend on the assets that carry the TDIQ. Other techniques include examining historical data such as trouble reports and outstanding defect lists, and correlating them across multiple asset histories and operations histories.

Last words

In some cases, these investigations produce results that have a limited validity lifetime, or “shelf life.” The short shelf life is mainly due to ongoing evolution of the debt-bearing assets and the assets that interact with them. That’s why the work of retiring the TDIQ must begin as soon as possible after the inventory is complete. This suggests that the size of the DRP team is a critical success factor. Larger size teams can complete the inventory inspections rapidly. Speed is important because of the validity lifetime of the team’s research results.

Managing teams of great size is a notoriously difficult problem. One approach that can help involves delegating some of the DRP research effort. The people most qualified for this work are in the business units that own the assets in question. Properly motivated, they can provide the labor hours and expertise needed for the research. In this way, the DRP can deploy a team-of-teams structure, known as a Multi-Team System (MTS) [Mathieu 2001] [Marks 2005]. The DRP team can then bring to bear a large force in a way that renders the overall MTS manageable.

References

[Bach 1999] James Bach. “Test Automation Snake Oil!” (1999).

Available: here; Retrieved: January 2, 2019

Cited in:

[Dragičević 2016] Tomislav Dragičević, Xiaonan Lu, Juan C. Vasquez, and Josep M. Guerrero. “DC Microgrids–Part II: A Review of Power Architectures, Applications and Standardization Issues,” IEEE Transactions on Power Electronics, vol 31:5, 3528-3549, 2016.

Cited in:

[Ge 2014] Xi Ge and Emerson Murphy-Hill. “Manual Refactoring Changes with Automated Refactoring Validation,” Proceedings of the 36th International Conference on Software Engineering. ACM, 2014.

Available: here; Retrieved: January 1, 2019

Cited in:

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[Marks 2005] Michelle A. Marks, Leslie A. DeChurch, John E. Mathieu, Frederick J. Panzer, and Alexander Alonso. “Teamwork in multiteam systems,” Journal of Applied Psychology 90:5, 964-971, 2005.

Cited in:

[Mathieu 2001] John E. Mathieu, Michelle A. Marks and Stephen J. Zaccaro. “Multi-team systems”, in Neil Anderson, Deniz S. Ones, Handan Kepir Sinangil, and Chockalingam Viswesvaran, eds., Handbook of Industrial, Work, and Organizational Psychology Volume 2: Organizational Psychology, London: Sage Publications, 2001, 289–313.

Cited in:

[Meadows 1997] Donella H. Meadows. “Places to Intervene in a System,” Whole Earth, Winter 1997.

Available: here; Retrieved: June 28, 2018

Cited in:

[Meadows 1999] Donella H. Meadows. “Leverage Points: Places to Intervene in a System,” Hartland VT: The Sustainability Institute, 1999.

Available: here; Retrieved: June 2, 2018.

Cited in:

[Meadows 2008] Donella H. Meadows and Diana Wright. Thinking in Systems: A Primer. White River Junction, VT: Chelsea Green Publishing, 2008.

Order from Amazon

Cited in:

[NTSB 2008] National Transportation Safety Board. “Board Meeting Executive Summary: Collapse of I-35W Highway Bridge, Minneapolis, Minnesota, August 1, 2007,”, November 13, 2008.

Available: here; Retrieved: January 3, 2019.

Cited in:

[Rittel 1973] Horst W. J. Rittel and Melvin M. Webber. “Dilemmas in a General Theory of Planning”, Policy Sciences 4, 1973, 155-169.

Available: here; Retrieved: October 16, 2018

Cited in:

Other posts in this thread

Retiring technical debt from irreplaceable assets

Last updated on July 14th, 2021 at 07:38 pm

A map of the U.S. Interstate Highway System, which many regard as one of our irreplaceable assets
A map of the U.S. Interstate Highway System. The map shows primary roadways, omitting most of the urban loop and spur roads that are actually part of the system. In 2016, the total length of highways in the system was about 50,000 miles (about 80,000 km). About 25% of all vehicle miles the U.S. occur in this system. The cost to build it was about USD 500 billion in 2016 currency. Significant advances have occurred since the 1950s in technologies such as rail, electronics, data management, and artificial intelligence. And the effects of fossil fuel combustion on global climate are well known. One wonders whether such a system would be the right choice if construction were to begin today. If alternatives would be better, then this system might be regarded as technical debt. But replacing it might not be practical. Finding a way to retire the technical debt without replacing the entire asset might be the most viable solution. Image by SPUI courtesy Wikipedia.

Designing a project to retire some portion of the technical debt from a critical, irreplaceable asset, can be a daunting task. It’s best to acknowledge that the project design problem is very likely a wicked problem in the sense of Rittel and Webber [Rittel 1973]. See my post “Retiring technical debt can be a wicked problem” for more. In this thread, of which this is the first post, I suggest some basic preparations for dealing with irreplaceable assets. They form a necessary foundation for success in approaching the debt retirement problem for irreplaceable assets.

Wicked problems

As I’ve noted in previous posts, the problems associated with retiring technical debt can be wicked problems. And if some of these problems aren’t strictly wicked problems, they can possess many of the attributes of wicked problems in degrees sufficient to challenge the best of us. That’s why approaching a technical debt retirement project as you would any other project is risky.

For convenience and to avoid confusion, in my last post I adopted the following terminology:

  • DRP is the Debt Retirement Project
  • DDRP is the effort to design the DRP
  • DBA is the set of Debt Bearing Assets undergoing modification in the context of the DRP
  • IA is the set of assets, excluding the DBA assets, that interact directly or indirectly with assets in the DBA

In the posts in this thread, convenience demands that we add at least one more shorthand term:

  • TDIQ is the Technical Debt In Question. That is, it’s the kind of technical debt we’re trying to retire from the DBA assets. Other instances of the TDIQ might also be found elsewhere, in other assets, but retiring those instances of the TDIQ is beyond the scope of the DRP.

Know when and why we must retire technical debt

For those technical debt retirement projects (DRPs) that exhibit a high degree of wickedness, a critical success factor is clear communication of the mission of the DRP. Clear communication is important because the DRP team must deal with many stakeholders who are in the early stages of familiarity with the concept of technical debt. Some of them might be cooperating reluctantly. Expressing the objectives and benefits of the DRP in a clear and inspiring way is very helpful. With that in mind, I offer the following reminder of the reasons for tackling such a large and risky project that produces so few results immediately visible to customers.

Examining alternatives to retiring the TDIQ is a good place to begin. One alternative is simply letting the TDIQ remain in place. Call this alternative “Do Nothing.” A second alternative to retiring the TDIQ is replacing the debt-bearing asset with something fresh and clean and debt-free. Call this alternative “Replace the Asset.” The problem many organizations face is that they cannot always rely on these alternatives. And because these two alternatives to debt retirement aren’t always practical, some organizations must develop the expertise and assets necessary to retire widespread technical debt in large, critical, irreplaceable systems. Below is a high-level discussion of these two alternatives to debt retirement.

Do Nothing

The first alternative is to find ways to accept that the DBA will continue to operate in their current condition, carrying the technical debt that they now bear. This alternative might be acceptable for some assets, including those that are relatively static and which need no further enhancement or extension. This category also includes those assets the organization can afford to live without.

One disadvantage of the “Do Nothing” approach is that technology moves rapidly. What seems acceptable today might not be acceptable in the very near future. It might become old-fashioned, behind the times, or non-compliant with future laws or regulations. Styles, fashions, technologies, laws, regulations, markets, and customer expectations all change rapidly. And even if the asset doesn’t change what it does, the organization might need to enhance the asset. The enhancements might become very expensive to accomplish due to the technical debt the asset carries.

An especially troubling scenario takes shape when the DBA contains portions that are severely out of date. When that happens the organization might no longer be able to find qualified candidates who can perform needed work on the DBA. This situation can also arise when portions of the DBA were developed in-house. In that case, there might not be any qualified candidates outside the organization. When everyone who understands the DBA has departed the organization, work can proceed only if the DBA is properly documented and a training and mentoring program is healthy and current.

For these reasons, Do Nothing can be a high-risk strategy.

Replace the Asset

The second alternative to retiring the TDIQ is to replace the entire asset. For this option, the question of affordability arises. In some instances this alternative is practical, but for many assets, the organization simply cannot afford to purchase or design and construct replacements.

Pay special attention to those assets that “learn.” They might contain data gathered from experience over a long period of time. Retiring the asset can require developing some means of recovering the experience data and migrating it to the replacement asset. That task is a potentially daunting effort in itself.

Replacement is especially problematic when the asset is proprietary. If the organization created the asset itself, they might have constructed it over an extended period of time. Replacement with commercial products could require extensive adaptation of those products, or adaptation of organizational processes. Worse yet, replacement with assets of its own making will likely be costly.

Last words

When organizations depend on assets that they must enhance or extend, and which they cannot afford to replace, they face a daunting problem. They must develop the expertise and resources needed to address the technical debt that such assets inevitably accumulate.

This series of posts explores the issues that arise when an organization undertakes to retire the technical debt that its irreplaceable assets are carrying. Below, I’ll be inserting links to the subsequent posts in this series.

Other posts in this thread

References

[Bach 1999] James Bach. “Test Automation Snake Oil!” (1999).

Available: here; Retrieved: January 2, 2019

Cited in:

[Dragičević 2016] Tomislav Dragičević, Xiaonan Lu, Juan C. Vasquez, and Josep M. Guerrero. “DC Microgrids–Part II: A Review of Power Architectures, Applications and Standardization Issues,” IEEE Transactions on Power Electronics, vol 31:5, 3528-3549, 2016.

Cited in:

[Ge 2014] Xi Ge and Emerson Murphy-Hill. “Manual Refactoring Changes with Automated Refactoring Validation,” Proceedings of the 36th International Conference on Software Engineering. ACM, 2014.

Available: here; Retrieved: January 1, 2019

Cited in:

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[Marks 2005] Michelle A. Marks, Leslie A. DeChurch, John E. Mathieu, Frederick J. Panzer, and Alexander Alonso. “Teamwork in multiteam systems,” Journal of Applied Psychology 90:5, 964-971, 2005.

Cited in:

[Mathieu 2001] John E. Mathieu, Michelle A. Marks and Stephen J. Zaccaro. “Multi-team systems”, in Neil Anderson, Deniz S. Ones, Handan Kepir Sinangil, and Chockalingam Viswesvaran, eds., Handbook of Industrial, Work, and Organizational Psychology Volume 2: Organizational Psychology, London: Sage Publications, 2001, 289–313.

Cited in:

[Meadows 1997] Donella H. Meadows. “Places to Intervene in a System,” Whole Earth, Winter 1997.

Available: here; Retrieved: June 28, 2018

Cited in:

[Meadows 1999] Donella H. Meadows. “Leverage Points: Places to Intervene in a System,” Hartland VT: The Sustainability Institute, 1999.

Available: here; Retrieved: June 2, 2018.

Cited in:

[Meadows 2008] Donella H. Meadows and Diana Wright. Thinking in Systems: A Primer. White River Junction, VT: Chelsea Green Publishing, 2008.

Order from Amazon

Cited in:

[NTSB 2008] National Transportation Safety Board. “Board Meeting Executive Summary: Collapse of I-35W Highway Bridge, Minneapolis, Minnesota, August 1, 2007,”, November 13, 2008.

Available: here; Retrieved: January 3, 2019.

Cited in:

[Rittel 1973] Horst W. J. Rittel and Melvin M. Webber. “Dilemmas in a General Theory of Planning”, Policy Sciences 4, 1973, 155-169.

Available: here; Retrieved: October 16, 2018

Cited in:

Nine indicators of wickedness

Last updated on July 16th, 2021 at 08:03 pm

For the new interchange between Interstate 35 and U.S. Route 30, just outside Ames, Iowa, a new flyover ramp (red) had to be rebuilt. Rollbacks are one of the indicators of wickedness.
The interchange between Interstate 35 and U.S. Route 30, just outside Ames, Iowa. The new flyover ramp is indicated in red. It replaces the cloverleaf ramp in the upper right quadrant of the cloverleaf. A construction error has forced a delay, while the piers of the flyover ramp bridge are corrected. The cloverleaf, designed with curves a bit too tight, was a high-accident area. It constituted a technical debt, with the ongoing vehicle accidents comprising the metaphorical interest charges. The construction error in the flyover ramp piers necessitated a rollback. Rollbacks are one of the indicators of wickedness in the projects to design technical debt retirement projects. But in this case, the indicated wickedness isn’t much greater than was anticipated by the project designers. Construction drawing by Iowa Department of Transportation  [Iowa DOT 2016].

Several properties of the problem of designing technical debt retirement projects tend to make those design problems more likely to be wicked problems. These properties make these projects more likely to satisfy all ten of the criteria of Rittel and Webber [Rittel 1973]. I call these properties indicators of wickedness.

We usually have some notion of the degree of wickedness of a given design effort for a technical debt retirement project. But actually executing the debt retirement project can reveal unanticipated issues and complexity. Some of what’s revealed can cause us to adjust our estimate of the degree of wickedness of the design effort. If we know in advance what kinds of revelations are most likely to cause such adjustments, we can reduce the incidence of unanticipated revelations.

Criteria for wickedness of wicked problems

In my post, “Degrees of wickedness,” I noted that we can regard all problems as lying on a Tame/Wicked spectrum, with wicked problems lying at the extreme Wicked end of the spectrum, and the tamest of the tame lying at the opposite end. As for the ten criteria of wickedness developed by Rittel and Webber, I proposed that they could be satisfied in degrees, with the most wicked problems satisfying all ten criteria absolutely.

As a quick review, here are the attributes of wicked problems as Rittel and Webber see them [Rittel 1973], rephrased for brevity:

  1. There is no clear problem statement
  2. There’s no way to tell when you’ve “solved” it
  3. Solutions aren’t right/wrong, but good/bad
  4. There’s no ultimate test of a solution
  5. You can’t learn by trial-and-error
  6. There’s no way to describe the set of possible solutions
  7. Every problem is unique
  8. Every problem can be seen as a symptom of another problem
  9. How you explain the problem determines what solutions you investigate
  10. The planner (or designer) is accountable for the consequences of trying a solution

Conditions or situations that tend to increase wickedness

Below is a sample of conditions or situations that tend to increase the wickedness of the problem of designing a technical debt retirement project. I have no data to support these conjectured effects. But the principles I used to generate them are three. If a condition tends to…

  • …expand the set of stakeholders in a debt retirement project, it tends to enhance the wickedness of the design problem.
  • …increase the number or heterogeneity of the assets or processes that we must consider, it tends to enhance the wickedness of the design problem.
  • …create a need for a rollback of work performed as part of the debt retirement project, and that rollback creates a need to redesign the debt retirement project, it tends to enhance the wickedness of the design problem.

In what follows, I use the term “DRP” to indicate the Debt Retirement Project itself. The effort to design the DRP is the “DDRP.” The problem whose wickedness we’re considering isn’t the DRP itself. Rather, it is the DDRP. Also, let DBA (for debt-bearing assets) be the set of assets undergoing modification in the context of the DRP. And let IA (for interacting assets) be the set of assets, excluding the DBA assets, that interact directly or indirectly with the DBA assets.

With all this in mind, I offer the following nine examples of indicators of wickedness of the DDRP.

1.   A previous attempt to retire this debt was abandoned

Two indicators of the wickedness of the DDRP are perhaps most significant. The first is the failure of a previous attempt to execute a DRP with similar objectives. And the second is the failure of a previous attempt to execute a DDRP for a DRP with those objectives. There are two reasons why such failures are significant indicators of wickedness.

First, it’s reasonable to assume that these previous attempts weren’t founded on any recognition of the wickedness of the DRP or the DDRP. Few such efforts are.  (A Google search for the two phrases “technical debt” and “wicked problem” yields less than 1000 results) (update 12 Nov 2018: 1160 results; 24 May 2021: 246,000 results) Consider first the DDRP. If it is a wicked problem, proceeding as if it were not would very likely fail. If the designers of the previous DDRP did assume that it was a wicked problem, investigating their approach could prove invaluable, and save much time and effort. An analogous argument applies for the DRP itself.

Second, if the previous attempt to execute a DRP with similar objectives has left traces of itself in the DBA, and if those traces must be taken into account while executing the DDRP, they might complicate the DRP, and they might be incompletely addressed in the DDRP. To the extent that these conditions prevail, Criterion 5 is satisfied, and the DDRP exhibits wickedness.

2.   The Debt Retirement Project (DRP) will interrupt some revenue streams

If the work of the DRP entails temporary interruption of revenue streams, executing the DRP can have significant and long lasting effects on the organization. In estimating the cost of the DRP, it’s clearly necessary to account for the financial impact of any revenue shifted into the future, and any revenue irretrievably lost as well. And in some cases, market share might also suffer. All of these factors tend to increase the wickedness of the DDRP.

When these effects are expected, political opposition to the DRP can develop. Senior management can prevent this opposition from halting the DDRP inappropriately by requiring that the business case for the DRP include these financial factors and demonstrate clearly the need to proceed despite them. For example, including these factors might entail adjusting revenue targets downward to account for the interruptions due to the DRP. Involving potential political opponents of the DRP in business case development can be an effective means of ensuring the strength of the business case.

The ability to model all these financial effects is an important organizational asset that can be developed and maintained, for deployment across multiple DDRPs. The organization can monitor DRPs, gathering actual experience data for comparison to the effects projected in the respective business cases of the DRPs. Those comparisons are useful for enhancing the modeling capability.

3.   We need to re-certify some assets the DRP doesn’t directly touch

A DDRP is more likely to be a wicked problem if, as a result of the changes executed in the DRP, any of the assets in IA need to be re-tested after or during DRP execution. The need to re-test any assets in IA typically arises when one of two conditions occurs. One condition occurs when there’s some risk that the DRP’s changes in the DBA could somehow affect the performance of the assets in IA. The second is when the consequences of such a risk event are severe.

Five ways this need to re-certify increases wickedness

This scenario enhances the wickedness of the DDRP for at least five possible reasons.

  • Baseline testing of IA is necessary to enable the DRP team to recognize the effects of the DRP on IA behavior. But this baseline testing can reveal pre-existing and unaddressed faults. Leaving those faults in place can seriously complicate interpreting anomalies that appear in IA assets after DRP work has begun. That’s why the DDRP team might insist that the owners of the IA assets in question address some of these faults. With regard to these issues, political differences between the DDRP team and the owners of IA assets are possible. The additional testing of IA assets can…
  • …expand dramatically the set of stakeholders affected by the DRP, to include the owners, users, and maintainers of the IA assets.
  • …increase the need to interrupt revenue streams temporarily, and increase the number, duration, and frequency of such interruptions.
  • …require expertise and staffing beyond the DRP project team, which can disrupt other elements of the organization as the people needed are temporarily assigned to IA testing.
  • …reveal unanticipated consequences of the DRP alterations, which can trigger re-planning or redesign of the DRP during its execution. That re-planning or redesign, in turn, can trigger alterations in the DDRP.

The need to re-test assets not directly touched in the DRP is more likely when the DRP alters the external behavior of any of the DBA assets. The goal of many DRPs is improvement of the internals of assets without altering their external behavior, except possibly for performance improvements. This goal is desirable. It limits the need for re-testing and re-certification of IA assets.

Two classes of debt affect re-testing and re-certifying

The need to re-test and re-certify IA assets distinguishes two classes of debt in the DBA assets. Externally detectable debt in the DBA assets is debt that can be detected in the externals of the DBA assets. It includes their architecture, behavior, appearance, or interfaces. Externally undetectable debt in the DBA assets is any other debt not facially evident in the DBA assets. Retiring externally undetectable debt from the DBA assets is relatively straightforward. Only the DBA assets require re-testing and re-certification. Retiring externally detectable debt from the DBA assets is inherently more difficult and riskier. It requires more extensive re-testing and re-certification of both DBA and IA assets.

4.   The DRP directly touches multiple sites

DRPs that entail modifying technological assets of geographically dispersed organizations tend to be more wicked. This comes about because of factors including the following:

  • Sites might be geographically dispersed. But they might also be separated by language boundaries, legal jurisdictions, cultural divides, time zones, financial reporting practices, and much more. The required work of actually retiring the debt can vary from site to site for both technical and nontechnical reasons.
  • The multiple sites might have different landlords, with different lease agreements governing the organization’s occupancy of the property. This is just one of many factors that increase the numbers of stakeholders involved. It also exacerbates their heterogeneity. And the leases might constrain the kind of work that is permissible according to the day of the week or time of day.
  • If local vendors provide services such as communications or Internet connections to some of the sites, the DRP can be more complicated. If the work of the DRP involves these technologies and the local vendors, the task of coordinating all the different players can be complex and can encounter unanticipated obstacles.
Examples of unanticipated obstacles

For example, consider a case in which the work of the DRP involves networking hardware and software. That is work that we might prefer to perform at night or on a weekend, when it is less disruptive for users. For a global enterprise, there might not be a suitable time of day for such work.

As a second example, consider a network upgrade for retail branch offices of a global bank. If that upgrade requires trenching for new cable connections, the project design must take into account local regulations governing the trenches. Factors to consider include the permitting process and trench requirements. Trench requirements include specifications for filling, covering, and marking while still open. These regulations vary with national and sometimes local jurisdiction. The complexity causes most organizations to rely on local vendors. But even then, the vendor selection process must include reliable vendor assessment and evaluation. Scheduling becomes a complex and risky endeavor.

For these reasons, a DDRP that involves technological assets housed at multiple geographically dispersed sites has an elevated probability of exhibiting the properties of a wicked problem.

5.   Government agencies and/or industrial standards organizations must re-certify assets

Another driver of stakeholder expansion is the need for re-certifying assets after the DRP has modified them. The certification agencies can range from local and municipal regulators to national regulators and pan-industrial standards organizations. The number of possible agencies itself contributes to increased wickedness. But the operating style of these organizations merits special notice.

Many of these agencies operate without competitors. Perhaps for this reason, “customer service” might not be their strength. Gaining timely cooperation from them might be a challenging undertaking. Even though re-certification might be a small part of the DRP, it can become a blocking obstacle. Researching these requirements and their associated lead times, and maintaining a current knowledge base about them, can be an important task of the DDRP.

To acquire experience and information about their performance, consider using a pilot approach. Try to gain certification for an asset similar to the target of the DRP.

6.   Nontechnical stakeholders must change their behavior

Generally, people don’t like to change how they work. There are exceptions, of course, if they recognize a benefit that arrives in some direct way. But unless there is a direct benefit, requiring people to change how they work as part of a DRP is likely to increase the wickedness of the DDRP. And the difficulty is more problematic if the people affected are technically unsophisticated. They’re less likely to appreciate the value of managing technical debt, and less likely to accept explanations of that value.

DDRP wickedness increases in this case for another reason. In addition to retiring the technical debt, the DRP must address the tasks of motivating and training the affected population. That requires preparing materials, scheduling and accounting for the time spent in training, and monitoring training effectiveness. The business case must also address these issues. It must also provide the evidence required to defuse any political opposition that might otherwise develop.

7.   Major unanticipated complexity triggers redesign of the DRP

Unanticipated complexity happens in almost every project of almost any kind. But for DDRPs, unanticipated complexity that triggers adjustment of the DRP is especially unpleasant. Such a discovery can mean that the DBA assets or their connections to the IA assets have changed since the design team devised the plan. Or it can mean that the design team had an incomplete or incorrect understanding of the problem at hand. These events can occur for a number of reasons.

Examples of nontechnical causes

Imagining technical causes might be easier. So I’ll focus on nontechnical causes, which can actually be more serious. For example, suppose a political alliance enabled the VP of Sales and the VP of Engineering to reach a deal. They agreed that the DRP team would work on some important DBA assets, taking them off line for defined periods. If that political alliance weakens, or if the deal between the two VPs collapses for other reasons, the scheduled downtime of those assets might vanish. This pattern is more likely to arise in situations in which the DDRP team isn’t a party to such agreements. The DDRP team must be a party to any agreements regarding access to assets by the DRP team.

As a second example, consider what happens when the enterprise undertakes an acquisition of another enterprise. And suppose the acquisition team doesn’t inform the DDRP team during their design effort. Because chances are good that the DDRP would have a significant amount of rework to do for the acquisition, this scenario is illustrates a problem. The DDRP team must be aware of any organizational changes that could affect the DRP, for the active life of the DRP.

Redesigning the DDRP can take time. The DDRP team must periodically revisit elements of the DDRP that have short shelf lives during the design period. And the need to redesign can also indicate gaps in the DDRP team’s understanding of the problem.

All of these conditions tend to move the DDRP in the direction of increased wickedness.

8.   The DRP requires weekend or middle-of-the-night work periods

The need to perform critical operations on weekends or in nighttime hours suggests three things. First, the work is risky in the sense that undetected faults that go into production can lead to costly operational errors. Second, the organization lacks a simulated operating environment that emulates the actual operating environment faithfully enough to enable defect detection before deployment. Such environments are also known as staging environments. Third, and finally, the organization lacks a rapid rollback mechanism that can restore the original state of an asset if the new modified state proves problematic when deployed.

If you anticipate multiple DRPs,before undertaking a DRP, it’s wise to construct a staging environment and devise a rollback mechanism. Cost is usually the blocking issue. But compare that cost to the cost of retarding all future DRPs, and the cost of any operational failures arising from deploying faulty systems. Staging and rollback capabilities are usually good investments.

Continued refusal to provide a staging environment with rapid rollback increases the wickedness of this and any future DDRPs.

9.   Rollback of attempted changes triggers redesign

In the course of executing the DRP, if reverting some (or all) of the work performed becomes necessary, we say that we’re ordering a rollback. Minor rollbacks do happen. But if we discover the need for a rollback long after completion of the work in question, the damage can be catastrophic. When these incidents occur, they can indicate a deep misunderstanding of the consequences of the work of the DRP. Because that misunderstanding could have consequences not yet recognized, such a rollback could suggest that the DDRP team underestimated the wickedness of the DDRP.

Let DBAf (faulty DBA) be the set of assets in the DBA that formerly contained some of the debt being retired. And suppose the DRP alterations contained or led to exposure of some kind of fault(s). Suppose further that the faults forced a rollback after they were deployed. Let DBAfw (wicked-faulty DBA) represent the subset of DBAf for which that rollback did trigger a redesign of the DDRP. Then wickedness of the DDRP is correlated with the size of DBAfw and the extent of the DDRP redesign that the rollback triggered.

An illustration of the effect of defects

For example, let Efw be a member of DBAfw. And suppose Efw is a modular element of a system that monitors the clicks of users of a Web site. It records data for later analysis, and because of the fault it does so incorrectly. When the site operators discover the errors, the DRP orders the rollback of Efw. They replace Efw with its original, unaltered, debt-bearing form. Because Efw contaminated the original database, data rollback is impossible. The site operators did discover the error, but they can’t re-capture lost data. That’s why the DDRP team must re-design the DRP. This scenario is an example of Criterion 5. If there are political consequences for the loss of data, this scenario could be an example of Criterion 10.

This example suggests how the frequency of incidents that trigger redesign of the DDRP can be an indicator of the wickedness of the DDRP.

Example of a non-indicator: the I-35 SR-30 interchange near Ames, Iowa

Just outside Ames, Iowa, is an interchange between Interstate 35 (a four-lane, divided, limited-access roadway) and U.S. Route 30 (four-lane, divided, not limited-access). The interchange is a conventional cloverleaf design. The “leaves” are rather tight, though, and consequently, there have been numerous rollovers and crashes at this interchange. We can regard these tight cloverleaf ramps as technical debt in the highway system, and the rollovers and crashes as metaphorical interest charges on that debt.

In 2016, construction began on a new “flyover” exit ramp from northbound Interstate 35 onto westbound U.S. Route 30. The objective was to reduce the number of accidents at the interchange by replacing the current tight-curvature cloverleaf ramp with a flyover exit ramp with a longer radius of curvature. We can regard this project as a Debt Retirement Project (DRP). The project that planned that DRP was an effort to Design a Debt Retirement Project (DDRP).

Much went well, but an error occurred

Completion of the DRP was scheduled for November 2018. When completed, the new ramp will replace the northeast leaf of the cloverleaf. Like most civil engineering projects, this project does have some elements of wickedness. But the project dealt with those elements effectively. Nevertheless, a construction error is delaying completion [Magel 2018] [Iowa DOT 2018].

The error involves the height and position of the bolt anchors where steel bridge beams will connect to the concrete piers of the new flyover ramp. The contractor constructed six piers to support the flyover. Correcting the piers involves jackhammering the concrete tops, leaving the steel reinforcement in place. After positioning the beam anchors correctly, and re-pouring the concrete, the piers will be ready to support the beams. At this writing, the contractor has not yet announced the new completion date.

How they’re correcting the error

This effort, which includes a rollback and re-deployment, is a significant project in itself. It requires scheduling the work. But it also requires scheduling highway lane closures and lane shifts, and working around high-volume traffic periods. Depending on the schedule, they will possibly pour concrete in winter conditions. And after correcting the piers, the bridge beam placement and bridge roadbed work must proceed on a new schedule.

Consequently, the construction error triggered a redesign of the flyover project’s DRP. But it probably did not trigger a significant redesign of the DDRP. The construction error is therefore unlikely to be an indicator of significant additional wickedness for the DDRP.

Last words

You can become better managers of the risk of unanticipated wickedness. If your organization is embarking upon a long-term program of technical debt retirement, you’ll be executing many DDRPs and DRPs. Gathering data about incidents of unanticipated wickedness in DDRPs can be a useful practice, if you use that data when you design new technical debt retirement projects.

References

[Bach 1999] James Bach. “Test Automation Snake Oil!” (1999).

Available: here; Retrieved: January 2, 2019

Cited in:

[Dragičević 2016] Tomislav Dragičević, Xiaonan Lu, Juan C. Vasquez, and Josep M. Guerrero. “DC Microgrids–Part II: A Review of Power Architectures, Applications and Standardization Issues,” IEEE Transactions on Power Electronics, vol 31:5, 3528-3549, 2016.

Cited in:

[Ge 2014] Xi Ge and Emerson Murphy-Hill. “Manual Refactoring Changes with Automated Refactoring Validation,” Proceedings of the 36th International Conference on Software Engineering. ACM, 2014.

Available: here; Retrieved: January 1, 2019

Cited in:

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[Iowa DOT 2016] “Construction drawing for the Northbound I-35 Flyover Ramp at U.S. 30 Near Ames,” Iowa Department of Transportation, February 2, 2016.

Available: here; Retrieved: October 31, 2018

Cited in:

[Iowa DOT 2018] “Work Continues on the Northbound I-35 Flyover Ramp at U.S. 30 Near Ames,” Iowa Department of Transportation News Release, June 27, 2018.

Available: here; Retrieved: October 31, 2018

Cited in:

[Magel 2018] Todd Magel. “Uh-oh! Construction crews must redo $23 million project after big mistake,” KCCI News, July 11, 2018.

Available: here; Retrieved: October 31, 2018

Cited in:

[Marks 2005] Michelle A. Marks, Leslie A. DeChurch, John E. Mathieu, Frederick J. Panzer, and Alexander Alonso. “Teamwork in multiteam systems,” Journal of Applied Psychology 90:5, 964-971, 2005.

Cited in:

[Mathieu 2001] John E. Mathieu, Michelle A. Marks and Stephen J. Zaccaro. “Multi-team systems”, in Neil Anderson, Deniz S. Ones, Handan Kepir Sinangil, and Chockalingam Viswesvaran, eds., Handbook of Industrial, Work, and Organizational Psychology Volume 2: Organizational Psychology, London: Sage Publications, 2001, 289–313.

Cited in:

[Meadows 1997] Donella H. Meadows. “Places to Intervene in a System,” Whole Earth, Winter 1997.

Available: here; Retrieved: June 28, 2018

Cited in:

[Meadows 1999] Donella H. Meadows. “Leverage Points: Places to Intervene in a System,” Hartland VT: The Sustainability Institute, 1999.

Available: here; Retrieved: June 2, 2018.

Cited in:

[Meadows 2008] Donella H. Meadows and Diana Wright. Thinking in Systems: A Primer. White River Junction, VT: Chelsea Green Publishing, 2008.

Order from Amazon

Cited in:

[NTSB 2008] National Transportation Safety Board. “Board Meeting Executive Summary: Collapse of I-35W Highway Bridge, Minneapolis, Minnesota, August 1, 2007,”, November 13, 2008.

Available: here; Retrieved: January 3, 2019.

Cited in:

[Rittel 1973] Horst W. J. Rittel and Melvin M. Webber. “Dilemmas in a General Theory of Planning”, Policy Sciences 4, 1973, 155-169.

Available: here; Retrieved: October 16, 2018

Cited in:

Degrees of wickedness

Last updated on July 12th, 2021 at 09:23 am

Window blinds with some slats open and some closed, a metaphor for the degrees of wickedness of a wicked problem
Window blinds with some slats open and some closed, a metaphor for the degrees of wickedness of a wicked problem. Think of the slats as the wicked problem criteria of Rittel and Weber. A closed slat is a criterion satisfied; a partially open slat is a criterion satisfied to some extent. A wicked problem has all slats closed; a tame problem has at least one slat at least partially open. When only a few slats are open, the problem isn’t a wicked problem, but finding a solution might be very difficult. The number of open slates corresponds to the degrees of wickedness of most technical debt retirement project design problems. When even one slat is partially open, we can peek through the blinds, and use that information to pry open other slats.

In a recent post I explored conditions that tend to make designing a project to retire technical debt a wicked problem. And in another post I noted some conditions that tend to make designing a project to retire technical debt a super wicked problem. But not all technical debt retirement project design efforts are wicked problems. “Wickedness” can occur in degrees. Designing these projects can be a tame problem, especially if we incurred the technical debt recently. In this post I explore degrees of wickedness in retiring technical debt. I propose a framework for dealing with technical debt retirement project design problems that are less-than-totally wicked.

The degrees of wickedness of a problem

As a quick review, here are the attributes of wicked problems as Rittel and Webber see them [Rittel 1973], rephrased for brevity:

  1. There is no clear problem statement
  2. There’s no way to tell when you’ve “solved” it
  3. Solutions aren’t right/wrong, but good/bad
  4. There’s no ultimate test of a solution
  5. You can’t learn by trial-and-error
  6. There’s no way to describe the set of possible solutions
  7. Every problem is unique
  8. Every problem can be seen as a symptom of another problem
  9. How you explain the problem determines what solutions you investigate
  10. The planner (or designer) is accountable for the consequences of trying a solution

Rittel and Webber held that wicked problems possessed all of these characteristics, but Kreuter, et al., take a different view, which I find compelling [Kreuter 2004]. Their view is that wicked problems and tame problems lie at opposite ends of a spectrum. A problem that satisfies all ten of the criteria would lie at the wicked end of the spectrum; one that satisfies none would lie at the tame end.

The ten criteria aren’t black-and-white

A close examination of Rittel’s and Webber’s ten criteria reveals that they aren’t black-and-white. We can regard each one as occurring in various degrees. For example, consider Criterion 1: “There is no clear problem statement,” which Rittel and Webber express as, “There is no definitive formulation of a wicked problem.” Burge and co-author McCall, who was a student of Rittel, offer this interpretation [Burge 2015]:

Here by the term formulation Rittel means the set of all the information need [sic] to understand and to solve the problem. By definitive, he means exhaustive.

The original language of Rittel and Webber, with the interpretation of Burge and McCall, is black-and-white. But we can imagine problems that satisfy this criterion to varying degrees. That is, one problem formulation might have almost everything needed to understand and solve the problem, while another might have almost none of what’s needed. In some cases, the problem solver might make progress toward a solution by making reasonable assumptions to fill gaps. Or the formulation as given might be incomplete. If so, by working on a solution despite gaps, the missing information might reveal itself, or it might arrive as a result of other research.

A continuum hypothesis

For these reasons, I regard the degree to which a problem satisfies Criterion 1 as residing on a continuum. And I expect that we could find analogous arguments for all ten criteria. This “continuum hypothesis” doesn’t conflict with the definition of a wicked problem. Wickedness still requires that all ten criteria be satisfied absolutely. But how well the problem satisfies the criteria of Rittel and Webber determines its position on the Tame/Wicked spectrum. In other words, as we address the problem of designing a technical debt retirement project, we can consider the degree of wickedness of the problem, not merely whether a problem is wicked.

The degree of a problem’s wickedness provides useful guidance. If a problem clearly satisfies nine of the ten criteria, but not the tenth, according to Rittel and Webber, it would not be a wicked problem. Because solving it might be extraordinarily difficult, we would treat it as wicked with respect to the nine criteria it satisfies. We would use that information to guide our decisions about resource choice and resource allocation. The model of wicked problems provided by Rittel and Webber would be useful, even though the problem itself might not meet their definition.

And so emerges the concept of the dimensionality of wickedness.

The dimensionality of wickedness

We can regard the ten criteria of Rittel and Webber as dimensions in a ten-dimensional space. When we do, our “wickedness spectrum” becomes much richer. Maybe too rich, in the sense that its complexity presents difficulty when we try to think about it. But the concept of dimensionality of wickedness can be useful, if we consider each dimension as having a degree of wickedness. This enables us to choose problem-solving techniques that work well for wicked problems that owe their wickedness to specific dimensions. That is the approach of Kreuter, et al. [Kreuter 2004].

This suggests a framework for designing (or redesigning) technical debt retirement projects:

  • Deal separately (and first) with any parts of the technical debt retirement project design problem that are tame
  • Determine the importance of each one of a set of “nine indicators of wickedness
  • Use that information to determine which of the ten criteria of Rittel and Webber are most relevant to this particular technical debt retirement project design problem
  • Apply established approaches that account for the relevant criteria to formulate a project design

This program is too much for a single post. But I can make a start in my next post with descriptions of the indicators of wickedness. That post includes an examination of the implications of each of these indicators relative to the presence of each of the ten criteria of Rittel and Webber. The next step will be to suggest techniques for technical debt retirement project design problems that meet, to some degree, the criteria of Rittel and Webber.

Buckle up.

References

[Bach 1999] James Bach. “Test Automation Snake Oil!” (1999).

Available: here; Retrieved: January 2, 2019

Cited in:

[Burge 2015] Janet E. Burge and Raymond McCall. “Diagnosing Wicked Problems,” Design Computing and Cognition 14, 2015, 313-326.

Available: here; Retrieved: October 25, 2018

Cited in:

[Dragičević 2016] Tomislav Dragičević, Xiaonan Lu, Juan C. Vasquez, and Josep M. Guerrero. “DC Microgrids–Part II: A Review of Power Architectures, Applications and Standardization Issues,” IEEE Transactions on Power Electronics, vol 31:5, 3528-3549, 2016.

Cited in:

[Ge 2014] Xi Ge and Emerson Murphy-Hill. “Manual Refactoring Changes with Automated Refactoring Validation,” Proceedings of the 36th International Conference on Software Engineering. ACM, 2014.

Available: here; Retrieved: January 1, 2019

Cited in:

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[Iowa DOT 2016] “Construction drawing for the Northbound I-35 Flyover Ramp at U.S. 30 Near Ames,” Iowa Department of Transportation, February 2, 2016.

Available: here; Retrieved: October 31, 2018

Cited in:

[Iowa DOT 2018] “Work Continues on the Northbound I-35 Flyover Ramp at U.S. 30 Near Ames,” Iowa Department of Transportation News Release, June 27, 2018.

Available: here; Retrieved: October 31, 2018

Cited in:

[Kreuter 2004] Marshall W. Kreuter, Christopher De Rosa, Elizabeth H. Howze, and Grant T. Baldwin. “Understanding wicked problems: a key to advancing environmental health promotion.” Health Education and Behavior 31:4, 2004, 441-454.

Available: here; Retrieved: October 26, 2018

Cited in:

[Magel 2018] Todd Magel. “Uh-oh! Construction crews must redo $23 million project after big mistake,” KCCI News, July 11, 2018.

Available: here; Retrieved: October 31, 2018

Cited in:

[Marks 2005] Michelle A. Marks, Leslie A. DeChurch, John E. Mathieu, Frederick J. Panzer, and Alexander Alonso. “Teamwork in multiteam systems,” Journal of Applied Psychology 90:5, 964-971, 2005.

Cited in:

[Mathieu 2001] John E. Mathieu, Michelle A. Marks and Stephen J. Zaccaro. “Multi-team systems”, in Neil Anderson, Deniz S. Ones, Handan Kepir Sinangil, and Chockalingam Viswesvaran, eds., Handbook of Industrial, Work, and Organizational Psychology Volume 2: Organizational Psychology, London: Sage Publications, 2001, 289–313.

Cited in:

[Meadows 1997] Donella H. Meadows. “Places to Intervene in a System,” Whole Earth, Winter 1997.

Available: here; Retrieved: June 28, 2018

Cited in:

[Meadows 1999] Donella H. Meadows. “Leverage Points: Places to Intervene in a System,” Hartland VT: The Sustainability Institute, 1999.

Available: here; Retrieved: June 2, 2018.

Cited in:

[Meadows 2008] Donella H. Meadows and Diana Wright. Thinking in Systems: A Primer. White River Junction, VT: Chelsea Green Publishing, 2008.

Order from Amazon

Cited in:

[NTSB 2008] National Transportation Safety Board. “Board Meeting Executive Summary: Collapse of I-35W Highway Bridge, Minneapolis, Minnesota, August 1, 2007,”, November 13, 2008.

Available: here; Retrieved: January 3, 2019.

Cited in:

[Rittel 1973] Horst W. J. Rittel and Melvin M. Webber. “Dilemmas in a General Theory of Planning”, Policy Sciences 4, 1973, 155-169.

Available: here; Retrieved: October 16, 2018

Cited in:

Other posts in this thread

Retiring technical debt can be a super wicked problem

Last updated on July 10th, 2021 at 10:51 am

In my last post I provided a list of attributes of wicked problems [Rittel 1973]. I included the reasons why I feel that designing technical debt retirement projects can be wicked problems. As a review, here are the attributes of wicked problems as Rittel and Webber see them, rephrased for brevity:

  1. Wicked problems have no definitive formulation
  2. Wicked problems have no stopping rule
  3. Solutions to wicked problems aren’t true-or-false; they’re good-or-bad
  4. There is no immediate ultimate test of a solution to a wicked problem
  5. Every solution to a wicked problem is a “one-shot operation”; because there is no opportunity to learn by
    trial-and-error, every attempt counts significantly
  6. Wicked problems do not have an enumerable (or an exhaustively describable) set of potential solutions; nor is there a
    well-described set of permissible operations that we can incorporate into the plan
  7. Every wicked problem is essentially unique
  8. We can regard every wicked problem as a symptom of another problem
  9. The existence of a discrepancy representing a wicked problem can be explained in numerous ways. The choice of explanation
    determines the nature of the problem’s resolution
  10. The planner (or designer) has no right to be wrong

Four properties of super wicked bproblems

We can regard a subset of wicked problems as super wicked [Levin 2012]. Levin, et al. list the following four properties of super wicked problems. With each one, I’ve added reasons why planning a technical debt retirement project can qualify as a super wicked problem.

Time is running out

Super wicked problems have inherent timescales. For example, many believe that climate change will become irreversible within 30 years if current practices continue.

Technical debt retirement can have an inherent time scale. For example, Microsoft ended mainstream support for Windows 7 in January 2015. At that time, computers running Windows 7 incurred a technical debt. Yet by September 2018, 46.7% of computers running Windows were still running Windows 7. That was a 0.6% increase from the previous month [Keizer 2018]. At this writing, standard support will end in January 2020. That confers a timescale on this kind of technical debt.

When time runs out for solving wicked problems, the consequences can be severe. With respect to Windows 7, the consequences are more serious than forcing conversion to Windows 10. Some applications running on those machines might be compatible with Windows 10, but some might require udates. And all users of converted machines must learn how to use Windows 10 and any updated or replaced applications. So there’s also a training issue, a learning curve, and period of elevated user error rates.

Letting a time-boxed technical debt remain in place can be financially dangerous. As the retirement window closes, the cost of debt retirement concentrates on a declining number of fiscal quarters. If retirement costs are high enough, the impact of debt retirement on net income can be severe and negative. For enterprises whose securities are publicly traded, this effect can be costly for shareholders. At this writing there are only about five fiscal quarters remaining for Windows 10 conversions. For other technical debts, the number of fiscal quarters available for diluting the costs of retirement might be more—or less.

Those who cause the problem also advocate a solution
Two pilots line up their F/A-22 Raptor behind a tanker
Two pilots line up their F/A-22 Raptor behind a tanker. When Hurricane Michael made landfall on October 10, 2018, it passed over Tyndall Air Force Base. Tyndall has responsibility for air dominance training for F-22s. As Hurricane Michael approached, 33 of the 55 F-22s at Tyndall were repositioned to Wright-Patterson Air Force Base in Ohio [Phillips 2018a]. The remaining aircraft at Tyndall were undergoing maintenance and weren’t operational [Gabriel 2018]. The storm damaged some of them, but they’re believed to be repairable.

Because of climate change [Cook 2016], increases in storm intensity and frequency are likely. Air bases in coastal regions are at risk [Phillips 2018b]. They now constitute a technical debt. Relocating them could be a wicked problem, and possibly a super wicked problem. But if federal policies continue to fail to address climate change, they could prevent relocation. If that happens, the policies themselves represent a technical debt. U.S. Air Force photo by TSgt Ben Bloker, courtesy Wikipedia

The phrase “seek to provide a solution” might be somewhat tactful. I expect that some super wicked problems have the property that those who cause the problem exert some degree of control over what kinds of solutions are acceptable, or even discussible. In many cases, this represents a conflict of interest that can prevent the organization from deploying the more effective options.

That conflict of interest is certainly present in the context of many technical debt retirement projects. Technical debt formation and persistence are due, in part, to a failure to commit resources to retiring it, or, at least, to inhibiting its formation. That failure is the responsibility of those in leadership roles in the enterprise. Typically, these are the same people who must decide to commit resources to retire technical debt in the future.

The central authority needed to address the problem is weak or nonexistent

Again, I find this description unnecessarily limiting. I would prefer a phrasing such as, “The central authority, for whatever reason, chooses to exert, or is unable to exert, its authority in furtherance of solution, or even investigation.” In other words, the central authority need not be weak for it to be a source of difficulty in addressing the super wicked problem. It need only choose not to act. This can happen when those who cause the problem are the people who constitute the central authority, or they capture the central authority, or they capture the function to which the central authority has delegated responsibility for solution.

With respect to technical debt retirement, consider this scenario. At AMUFC, A Made-Up Fictitious Corporation, the sales and marketing functions have repeatedly struggled with the engineering function for shares of budget resources. Engineering has argued repeatedly, and unsuccessfully, that it needs additional resources to address the technical debt that has accumulated in several products. But the CEO is a former VP Sales, and a close friend of the CFO. Together, they have always decided to defer technical debt retirement in favor of new products and enhancements favored by the VP Marketing, by customers, and by investors.

Scenarios like this are common. Enterprise leadership is strong, but not inclined to address the technical debt retirement issue.

Partly as a result, policy responses discount the future irrationally

Irrational discounting of future costs and benefits occurs when policies are deployed that give too much emphasis to producing short-term benefits and/or to avoiding short-term costs or inconveniences. Benefits are pulled in from the future towards the present; costs and inconveniences are pushed out toward the future and deferred. One form of this discounting scheme—one of many—is hyperbolic discounting.

This tendency is one way of distracting attention from the actual problem. It is the principal tactic that enables the persistence of technical debt, and the means by which enterprises repeatedly defer attention to the problem of retiring technical debt.

Both the problem of managing technical debt and the problem of designing technical debt retirement projects, exhibit all of these properties to some degree. It’s likely, in my view, that these problems are super wicked problems.

Intervention strategies for super wicked problems

Levin, et al., recommend four distinct strategies for resolving super wicked problems [Levin 2012]. They are all approaches to devising policies that are difficult to alter, thus committing the organization to a particular path forward.

Lock-in

Lock-in is usually regarded as dysfunctional adherence to a strategy or course of action despite the existence of superior alternatives [Brenner 2011]. It occurs when a policy confers some kind of immediate benefit on a subset of the population. If that benefit is significant, if the population subset would be harmed by alterations of the policy that remove the benefit, and if the subset has enough political power to defend the benefit, the policy will be “locked in” and thus difficult to change. Levin, et al., suggest that this phenomenon can serve a beneficial purpose by protecting a constructive policy, thus preventing its abandonment.

Most technical debt retirement efforts focus solely on retiring the debt. All (or most) of the benefit appears in the form of increased engineering productivity, decreased sources of frustration for engineers, or increased engineering agility. Benefits for non-engineering stakeholders tend to be indirect. To establish policies that exploit lock-in we must craft them so that they provide ongoing, direct benefit to the most politically powerful stakeholders. For example, addressing first the forms of technical debt that are most likely to lead to product innovations that non-engineering stakeholders would value highly could cause those stakeholders to favor further technical debt retirement efforts.

Positive feedback

Exploiting lock-in makes policies durable when people or organizations already supporting the policy derive some kind of increased benefit, leading to others not yet supporting or covered by the policy to decide to support it. This mechanism is sometimes known as a “network effect.” When network effects are present, the value of a product or service increases as the size of the population using it increases [Shapiro 1998].

To exploit network effects when devising technical debt retirement efforts, focus on retiring the kinds of technical debts that confer benefits on stakeholders of platform assets. A platform asset is an asset that supports multiple other assets. Examples: an application development tool suite, a product line architecture, or an enterprise data network. Platform assets that support collaboration communities are more likely to generate network effects.

Increasing Returns

Policies and interventions that enable increasing returns to the population are more likely to be durable than those that offer steady returns. Because people adapt to steady levels of stimuli, policies that produce a change in the context only during the period immediately following initial adoption of those policies are less likely to maintain popular support than are policies that continue to provide increasing returns as long as they’re in place.

But among policies that provide increasing returns, Levin, et al., identify two types. Type I policies, which are less durable, confer their benefits on an existing population of supporters. They don’t cause others to become supporters. Type II policies also confer benefits on supporters, but they do cause others to become supporters. They are thus far more durable than are Type I, because they foster growth in the supporting population.

Framing technical debt retirement projects as individual projects with the objective of retiring a specified kind of technical debt is likely to lead to the enterprise population viewing the effort as a Type I policy at best. But framing each project as a phase of a longer-term effort could position the larger effort as a Type II policy, if the larger effort affects increasing portions of the enterprise population.

Self-reinforcing

Self-reinforcing policies create a dynamic that makes them more durable. Reinforcement can come about for two reasons. It can be a result of increases in the benefits the policy generates, or it can result from increases in the cost of rescinding the policy. In some cases, reinforcement can result from a combination of both effects. As with the strategy of Increasing Returns, there are two types of self-reinforcing policies. Type I self-reinforcing policies focus on maintaining support for the policy within the subset of the population consisting of its original supporters. In analogy with Type II Increasing Returns policies, Type II Self-reinforcing policies affect both the original supporting population and portions of the population not yet affected directly by the policy.

To exploit self-reinforcement, technical debt retirement programs must emphasize retiring debts that have curtailed organizational agility in recognizable ways, or which have prevented introduction of capabilities that the population values. Communicating these objectives is an important part of the program, because self-reinforcing popular support is possible only if the population understands the strategy and how it benefits the enterprise.

Last words

Because of the essential uniqueness of any wicked problem (Proposition 7 of Rittel and Webber), it is futile to attempt to apply as a template any retirement program that worked for some other organization, or for some other portion of a given organization at an earlier time with a different form of technical debt. But these four strategies, implemented carefully and communicated widely and effectively within the organization, can build organizational commitment to a long-term technical debt retirement program, even though retiring technical debt may be a super wicked problem.

References

[Bach 1999] James Bach. “Test Automation Snake Oil!” (1999).

Available: here; Retrieved: January 2, 2019

Cited in:

[Brenner 2011] Richard Brenner. “Indicators of Lock-In: I,” Point Lookout 11:12, March 23, 2011.

Available: here; Retrieved: October 23, 2018.

Cited in:

[Burge 2015] Janet E. Burge and Raymond McCall. “Diagnosing Wicked Problems,” Design Computing and Cognition 14, 2015, 313-326.

Available: here; Retrieved: October 25, 2018

Cited in:

[Cook 2016] John Cook, Naomi Oreskes, Peter T. Doran, William R.L. Anderegg, Bart Verheggen, Ed W. Maibach, J. Stuart Carlton, Stephan Lewandowsky, Andrew G. Skuce, Sarah A. Green, Dana Nuccitelli, Peter Jacobs, Mark Richardson, Bärbel Winkler, Rob Painting, and Ken Rice. “Consensus on consensus: a synthesis of consensus estimates on human-caused global warming,” Environmental Research Letters 11, 2016, 048002.

Available: here; Retrieved: October 23, 2018

Cited in:

[Dragičević 2016] Tomislav Dragičević, Xiaonan Lu, Juan C. Vasquez, and Josep M. Guerrero. “DC Microgrids–Part II: A Review of Power Architectures, Applications and Standardization Issues,” IEEE Transactions on Power Electronics, vol 31:5, 3528-3549, 2016.

Cited in:

[Gabriel 2018] Melissa Gabriel. “Hurricane Michael: Fate of costly stealth fighter jets at Tyndall Air Force Base still unknown,” USA Today: Pensacola News Journal, October 17, 2018.

Available: here; Retrieved: October 23, 2018

Cited in:

[Ge 2014] Xi Ge and Emerson Murphy-Hill. “Manual Refactoring Changes with Automated Refactoring Validation,” Proceedings of the 36th International Conference on Software Engineering. ACM, 2014.

Available: here; Retrieved: January 1, 2019

Cited in:

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[Iowa DOT 2016] “Construction drawing for the Northbound I-35 Flyover Ramp at U.S. 30 Near Ames,” Iowa Department of Transportation, February 2, 2016.

Available: here; Retrieved: October 31, 2018

Cited in:

[Iowa DOT 2018] “Work Continues on the Northbound I-35 Flyover Ramp at U.S. 30 Near Ames,” Iowa Department of Transportation News Release, June 27, 2018.

Available: here; Retrieved: October 31, 2018

Cited in:

[Keizer 2018] Gregg Keizer. “Windows by the numbers: Windows 10 backtracks, Windows 7 remains resilient,” Computerworld, October 2, 2018.

Available: here; Retrieved: October 18, 2018

Cited in:

[Kreuter 2004] Marshall W. Kreuter, Christopher De Rosa, Elizabeth H. Howze, and Grant T. Baldwin. “Understanding wicked problems: a key to advancing environmental health promotion.” Health Education and Behavior 31:4, 2004, 441-454.

Available: here; Retrieved: October 26, 2018

Cited in:

[Levin 2012] Kelly Levin, Benjamin Cashore, Steven Bernstein, and Graeme Auld. “Overcoming the tragedy of super wicked problems: constraining our future selves to ameliorate global climate change,” Policy Science 45, 2012, 123–152.

Available: here; Retrieved: October 17, 2018

Cited in:

[Magel 2018] Todd Magel. “Uh-oh! Construction crews must redo $23 million project after big mistake,” KCCI News, July 11, 2018.

Available: here; Retrieved: October 31, 2018

Cited in:

[Marks 2005] Michelle A. Marks, Leslie A. DeChurch, John E. Mathieu, Frederick J. Panzer, and Alexander Alonso. “Teamwork in multiteam systems,” Journal of Applied Psychology 90:5, 964-971, 2005.

Cited in:

[Mathieu 2001] John E. Mathieu, Michelle A. Marks and Stephen J. Zaccaro. “Multi-team systems”, in Neil Anderson, Deniz S. Ones, Handan Kepir Sinangil, and Chockalingam Viswesvaran, eds., Handbook of Industrial, Work, and Organizational Psychology Volume 2: Organizational Psychology, London: Sage Publications, 2001, 289–313.

Cited in:

[Meadows 1997] Donella H. Meadows. “Places to Intervene in a System,” Whole Earth, Winter 1997.

Available: here; Retrieved: June 28, 2018

Cited in:

[Meadows 1999] Donella H. Meadows. “Leverage Points: Places to Intervene in a System,” Hartland VT: The Sustainability Institute, 1999.

Available: here; Retrieved: June 2, 2018.

Cited in:

[Meadows 2008] Donella H. Meadows and Diana Wright. Thinking in Systems: A Primer. White River Junction, VT: Chelsea Green Publishing, 2008.

Order from Amazon

Cited in:

[NTSB 2008] National Transportation Safety Board. “Board Meeting Executive Summary: Collapse of I-35W Highway Bridge, Minneapolis, Minnesota, August 1, 2007,”, November 13, 2008.

Available: here; Retrieved: January 3, 2019.

Cited in:

[Phillips 2018a] Dave Phillips. “Tyndall Air Force Base a ‘Complete Loss’ Amid Questions About Stealth Fighters,” The New York Times, October 11, 2108.

Available: here; Retrieved: October 23, 2018

Cited in:

[Phillips 2018b] Dave Phillips. “Exposed by Michael: Climate Threat to Warplanes at Coastal Bases,” The New York Times, October 17, 2108.

Available: here; Retrieved: October 23, 2018

Cited in:

[Rittel 1973] Horst W. J. Rittel and Melvin M. Webber. “Dilemmas in a General Theory of Planning”, Policy Sciences 4, 1973, 155-169.

Available: here; Retrieved: October 16, 2018

Cited in:

[Shapiro 1998] Carl Shapiro and Hal R. Varian. Information rules: a strategic guide to the network economy. Harvard Business Press, 1998.

Cited in:

Other posts in this thread

Retiring technical debt can be a wicked problem

Last updated on July 16th, 2021 at 07:38 pm

Prototypes of President Trump’s “border wall.”

Prototypes of President Trump’s “border wall.” Building the wall is an example of a wicked problem. Building prototypes in short segments of the wall is a tame problem. But these are just prototypes of short segments of the wall. They aren’t prototypes of the project.

Prototypes of the wall itself don’t demonstrate the process for taking private property, or how to build construction access roads, or the effects on wildlife, or how the government of Mexico will respond, or how to repair the wall when drug gangs destroy sections of it in isolated regions, or even the effectiveness of the wall. Prototyping works well for tame problems. It helps us project how the finished project will perform, and how difficult completing the project will be. But for wicked problems, prototyping is of limited value. As Rittel observes, prototyping can make the problem worse.

Photo by Mani Albrecht, U.S. Customs and Border Protection Office of Public Affairs, Visual Communications Division.

The theory of wicked problems originated with Horst Rittel in the mid-1960s. He was addressing “that class of problems which are ill-formulated, where the information is confusing, where there are many decision makers and clients with conflicting values, and where the ramifications in the whole system are confusing.” [Churchman 1967] The term wicked isn’t a moral judgment. It suggests the mischievous streak in these problems. Many of them have the property that proposed solutions can lead to conditions even more problematic than the original situation. Is it just me, or are you also thinking, “Ah, technical debt”? In this post, I suggest that retiring technical debt can be a wicked problem. I’ll show how wickedness explains many of the difficulties we associate with retiring forms of technical debt that involve many stakeholders, assets, revenue streams, policies, or strategies.

Introduction

Horst Rittel was a design theorist at the University of California at Berkeley. His interest in wicked problems came about because designers must deal with the interactions between architecture and politics. In today’s technology-dependent enterprises, analogous problems arise when we retire technical debt. When we do, we affect multiple sets of quasi-independent stakeholders.

Applicability to the technical debt problem

In the years since Rittel originated the wicked problem concept, others have extended it. These extensions have led some to regard the concept as inflated and less than useful. But extension or less-than-useful concepts rarely occurs, I take it as an indicator of worth. The focus of this post, then, is applying Rittel’s version of wicked problems to the problem of designing a complex technical debt retirement project.

The wicked problem concept has propagated mostly in the realm of public policy and social planning. Certainly wicked problems abound there. Poverty, crime control, and climate change are examples. But I know of no attempt to explore the wickedness of retiring technical debt in large enterprises, but have a look below and see what you think.

Rittel defines a problem as the discrepancy between the current state of affairs and the “state as it ought to be.” For the purposes of technical debt retirement planning, the state as it ought to be might at times be a bit ambitious. So I take the objective of a technical debt retirement project to be an attempt to resolve the discrepancy between the current state of affairs and some other state that’s more desirable. For the present purpose, then, the problem is designing a technical debt retirement project that converts the current state of an asset to a more desirable state that might still contain technical debt in some form. But in that new state, the asset is in a better configuration.

A note on super-wicked problems

Actually, there is a subset of wicked problems—super-wicked problems—that I think might include some technical debt retirement problems. I address them in the post “Retiring technical debt can be a super wicked problem.”

For now, though, let’s examine the properties of wicked problems. Let’s see how well they match up with the problem of designing technical debt retirement projects.

Attributes of wicked problems

Rittel’s summary of the attributes of wicked problems [Rittel 1973] convinced me that major technical debt retirement projects present wicked problems. Here are those attributes. In what follows, I use Rittel’s term tame problem to refer to a problem that isn’t wicked. (See also [Kreuter 2004])

1. [No Definitive Formulation] Wicked problems have no definitive formulation

For any given tame problem, it’s possible to state it in such a way that it provides the problem-solver all information necessary to solve it. That’s what definitive formulation means. For wicked problems, on the other hand, our understanding of the problem depends on the solution we’re considering. Each candidate solution might potentially require its own understanding of the problem.

When designing a technical debt retirement project, we must fully grasp the impact of the effort on all activities in the enterprise. Each proposed project plan has its own schedule and risk profile. Each proposed project plan affects enterprise activities in its own way. In principle, each candidate approach to the effort affects a different portfolio of enterprise assets in its own unique order. Because examining all possible candidate project plans is impractical, choosing a project plan by seeking an optimal set of effects is also impractical. By the time you’re ready to execute a given project plan, the data supporting your decision might be obsolete.

2. [No Stopping Rule] Wicked problems have no stopping rule

For any given tame problem, solutions have “stopping rules.” Stopping rules of solutions are signatures that indicate clearly that they are indeed solutions. For example, in a chess problem to be solved in N moves, N and checkmate provide a stopping rule. We know how to count to N and the position of checkmate is well defined.

Wicked problems have no stopping rule.

When planning a major technical debt retirement project, we must determine the attributes of the project. The attributes include a task breakdown, a sequence for performing the tasks, a resource array including both human and non-human resources, a risk plan including risk mitigations and risk responses, a revenue stream interruption schedule, and so on. For each such plan, we can estimate the direct and indirect costs to the enterprise. We can project the effects of the plan on market share for every affected product or service. Every plan has these attributes. When we compute them for a given candidate plan, the result doesn’t reveal that we’ve found “the solution.” We will have found only an estimate for that given solution. What we learn by doing this doesn’t reveal whether or not a “better” solution exists.

There is no indicator contained in any given candidate solution that tells us we can “stop” solving the problem. Most often, we just stop when we run out of time for finding solutions. In some cases, we stop when we find just one solution.

3. [Solutions Are Good/Bad] Solutions to wicked problems aren’t true-or-false, but good-or-bad

The criteria for finding solutions to tame problems are unambiguous. For example, if a candidate function satisfies a differential equation, it’s a solution to the equation. The volume of concrete required to pave a section of roadway is a single number, determined by computing the area of roadway and multiplying by the thickness of the roadbed, and subtracting the volume of any reinforcing steel.

The solutions to wicked problems have no such clarity. When evaluating a candidate project plan for retiring a technical debt, we can estimate its cost, the time required, interruptions in revenue streams, and the timing of resource requirements. But determining how “good” that is might be difficult. Much depends on what other demands there might be for those resources or funds. Much also depends on the political power of the people making those demands. No single number measures that.

4. [No Test of Solutions] There is no immediate ultimate test of a solution to a wicked problem

To test a candidate solution to a tame problem, the problem-solving team determines whether the solution meets the requirements set in the tame problem statement. The consequences of implementing the solution are all evident to the problem-solving team. The team has everything it needs to judge the success of the solution.

Not so with wicked problems. Any candidate solution to a wicked problem generates waves of consequences. As these waves propagate, some of the problem’s stakeholders might find the solution unsatisfactory. They’ll report their objections, possibly through politically powerful people or organizations. Because the consequences can be so diverse, the team can’t anticipate all of them. In some cases, the team might have difficulty understanding how the troubles that plague some stakeholders were actually related to the implemented solution. Some undesirable consequences can be far more harmful than any intended benefits are helpful. In other cases, the undesirable consequences might remain undiscovered until long after the solution is in use and operational.

When designing a technical debt retirement project, it’s necessary to determine everything that must be changed, what resources must be assembled to do the work, and what processes might be interrupted, when and for how long. Only rarely, if ever, can we determine all of that with certainty in advance. For that reason, determining that the design of the project is “correct” isn’t possible, except perhaps in the probabilistic sense. We never really know in advance that we’ve found a solution. Most of the time, after execution begins, we must make adjustments along the way, in real time.

5. [No Trial-and-error] Every solution to a wicked problem is a “one-shot operation”; because there is no opportunity to learn by trial-and-error, every attempt counts significantly

When solving tame problems, we can try candidate solutions without incurring significant penalties. That is, trying a solution might require some effort, and therefore incur a cost. But it doesn’t otherwise affect the ability to find other solutions. Wicked problems are different. Every attempt to “try” a solution leaves traces that can potentially make further solution attempts more difficult, costly, or risky than they would have been if we hadn’t tried that solution. These traces of past solution attempts might also impose constraints on future solutions. Those constraints can effectively transform the wicked problem into a different wicked problem. This property makes trial-and-error approaches undesirable and possibly infeasible. Indices of such undesirability are the half-lives of the traces of attempts to address the problem. A long half-life might mean that the problem solver has only one shot at addressing the problem.

When designing a technical debt retirement project, we sometimes try to “pilot” a potential approach to determine difficulty, costs, feasibility, political issues, or risk profiles. Even when we can revert the asset to its former state after a pilot is completed or suspended, the consequences for stakeholders and for stakeholder operations might not be reversible. When we next try another “pilot,” or perhaps a fully committed retirement project, these stakeholders might be significantly less willing to cooperate. Every attempted solution can thus leave political or financial traces like these, making future attempts riskier and more challenging.

6. [Solutions Are Not Describable] Wicked problems don’t have an enumerable (or an exhaustively describable) set of potential solutions, nor is there a well-described set of permissible operations that we can incorporate into the plan

In devising solutions to tame problems, one common approach entails first gathering the full set of possibilities. Next, we screen them according to a set of favorability criteria. Reducing the field of possibilities is a useful strategy for finding optimal or acceptable solutions to tame problems.

Wicked problems defy such strategies. Gathering the full set of possible solutions to a given wicked problem can be a wicked problem in itself. We cannot parameterize the set of possible solutions to a wicked problem. We cannot define a finite set of attributes that fully covers the solution space. For these reasons, we can never be certain that the set of candidate solutions is complete.

Candidate designs for technical debt retirement projects present this same quality. We have a dizzying array of choices. In what order should we retire different kinds of technical debts? In what order should we address the debts different assets bear? Can we “refinance” portions of the debt to intermediate forms [Zablah 2015]? What kinds of refactoring should we perform and when? Because options like these are neither denumerable nor parameterizable, we cannot know whether a given set of candidate project designs is complete.

7. [Essential Uniqueness] Every wicked problem is essentially unique

Among tame problems, we can define classes or categories of problems that share a solution method. That is, using the method associated with a given class, we can solve all problems in that class. For example, we can solve all second order linear differential equations with the same method.

Even though we can define classes of wicked problems whose members are in some sense similar, that similarity doesn’t enable us to find a unified solution strategy that works for every member of the class.

So it is with designing technical debt retirement projects. Certainly, the collection of all technical debt retirement projects is a class. But the problem of designing a given retirement project is essentially unique. What “works” for one project in one enterprise in one fiscal year probably won’t work for another project in another enterprise in another fiscal year. It might not even work for another project in that same enterprise in that same fiscal year. Elements of the solution for one project might be useful for another project. But even then, we might need to adapt them to the conditions of that next project.

This essential uniqueness property of technical debt retirement projects collides with a common pattern decision makers use when chartering major efforts. That pattern is reliance on consultants, employees, or contractors who “have demonstrated success and experience with this kind of work.” Because each technical debt retirement project is essentially unique, relying on a history of demonstrated success is a much less viable strategy than it would be with tame problems. Decision makers would do well to keep this in mind when they seek approaches, leaders, and staff for major technical debt retirement efforts: no major technical debt retirement project is like any other.

8. [Problems as Symptoms] We can regard every wicked problem as a symptom of another problem

With tame problems or wicked, we typically begin the search for solutions by inquiring as to the cause of the current condition. When we find the cause or causes, and remove them, we usually find a new problem underlying them. Thus, for wicked problems, what we regarded initially as the problem is thereby converted into a symptom of a newly recognized underlying problem. By repeating this process, we escalate the “level” of the problem we’re addressing. Higher-level problems do tend to be more difficult to resolve, but addressing symptoms, though easier, isn’t a path to ultimate resolution.

Rittel also observes that incremental approaches to resolving wicked problems can be self-defeating. The difficulty arises from the traces left behind by incrementalism, as described in the discussion of the unworkability of trial-and-error strategies. Rittel provides the example of the increase in difficulty of changing processes after we automate them.

To regard the wicked problem of designing a technical debt retirement project as a symptom of a higher-level wicked problem, we must be willing to regard as problems the very things that make the technical debt retirement project design effort a wicked problem. That is, the processes that lead to formation of technical debt, or that enhance its persistence, are themselves wicked problems. For example, one might inquire about how to change the enterprise culture so as to reduce the incidence of technical debt contagion. To undertake major technical debt retirement efforts without first determining what can be done to limit technical debt formation or persistence due to contagion or due to other processes, might be unwise.

9. [No Controlled Experiments] The existence of a discrepancy representing a wicked problem can be explained in numerous ways. The choice of explanation determines the nature of the problem’s resolution

When addressing tame problems, problem-solving teams can often perform controlled experiments. The general framework of these experiments is as follows. The team forms a hypothesis H as to the cause of the problem, conjecturing a solution. Then assuming H is correct, and given a set of conditions C, they deduce the consequences E that must follow. If any elements of E don’t occur, then H is incorrect. The process repeats until an H’ is found that provides all elements of E. H’ then provides the basis of a solution. Essentially, this is the scientific method.

With wicked problems, the method fails in numerous ways. Foremost among these failure modes is the inability to control C. That is, interventions that might be required to set C to be a desired C0 tend to be impossible. Moreover, even if we can establish C0, the experiments that determine whether E is observed tend to leave the traces discussed in Proposition 5 [No Trial-and-error]. Finally, the determining the presence or absence of the elements of E is usually subjective.

When planning enterprise-scale technical debt retirement projects, as with many projects of similar scale, we believe that we can benefit from running a pilot of our proposed plan, to determine its fitness. These trials are sometimes called “proof of concept” exercises. However, because we cannot control the conditions in which we execute the pilot, we cannot be confident that our interpretation of the results of the pilot will apply to the actual project. Moreover, a small-scale pilot cannot generate some of the effects we most want to observe because they occur only at full scale. These effects include staff shortages, resource contention, and revenue interruption incidents.

10. [100% Accountability] The planner (or designer) has no right to be wrong

In solving tame problems, solvers can experiment with proposed solutions. They make conjectures about what might work, and gather the results of trials to determine how to improve their conjectured solutions. There is no social or legal penalty for failed conjectures.

In solving wicked problems, experiments don’t exist. Any trial solution is a real solution, with real effects on stakeholders and later, real effects on the problem solvers themselves. Problem solvers are accountable for the undesirable consequences of each solution, whether it’s a trial or not.

In planning a technical debt retirement project, any attempt to gather data about how the approach would affect the enterprise could potentially have real, lasting, deleterious effects. The project bears the costs associated with these consequences, if not officially and financially, then politically. The politics of failure can lead to serious consequences for the problem solvers. Any approach that the team deploys, on any scale no matter how small, can potentially create financial problems for the enterprise, and political problems for anyone associated with the technical debt retirement project.

Last words

The fit between wicked problems and technical debt retirement project design looks pretty good to me. But the research on a subset of wicked problems—super wicked problems—is also intriguing. I’ll look at that in my next post. After that, we’ll be ready to examine which approaches to retiring technical debt take these matters into account.

References

[Bach 1999] James Bach. “Test Automation Snake Oil!” (1999).

Available: here; Retrieved: January 2, 2019

Cited in:

[Brenner 2011] Richard Brenner. “Indicators of Lock-In: I,” Point Lookout 11:12, March 23, 2011.

Available: here; Retrieved: October 23, 2018.

Cited in:

[Burge 2015] Janet E. Burge and Raymond McCall. “Diagnosing Wicked Problems,” Design Computing and Cognition 14, 2015, 313-326.

Available: here; Retrieved: October 25, 2018

Cited in:

[Churchman 1967] C. West Churchman. “Wicked problems,” Management Science 14:4, 1967, B-141–B-142

Available: here; Retrieved: October 16, 2018

Cited in:

[Cook 2016] John Cook, Naomi Oreskes, Peter T. Doran, William R.L. Anderegg, Bart Verheggen, Ed W. Maibach, J. Stuart Carlton, Stephan Lewandowsky, Andrew G. Skuce, Sarah A. Green, Dana Nuccitelli, Peter Jacobs, Mark Richardson, Bärbel Winkler, Rob Painting, and Ken Rice. “Consensus on consensus: a synthesis of consensus estimates on human-caused global warming,” Environmental Research Letters 11, 2016, 048002.

Available: here; Retrieved: October 23, 2018

Cited in:

[Dragičević 2016] Tomislav Dragičević, Xiaonan Lu, Juan C. Vasquez, and Josep M. Guerrero. “DC Microgrids–Part II: A Review of Power Architectures, Applications and Standardization Issues,” IEEE Transactions on Power Electronics, vol 31:5, 3528-3549, 2016.

Cited in:

[Gabriel 2018] Melissa Gabriel. “Hurricane Michael: Fate of costly stealth fighter jets at Tyndall Air Force Base still unknown,” USA Today: Pensacola News Journal, October 17, 2018.

Available: here; Retrieved: October 23, 2018

Cited in:

[Ge 2014] Xi Ge and Emerson Murphy-Hill. “Manual Refactoring Changes with Automated Refactoring Validation,” Proceedings of the 36th International Conference on Software Engineering. ACM, 2014.

Available: here; Retrieved: January 1, 2019

Cited in:

[Humble 2010] Jez Humble and David Farley. Continuous delivery: reliable software releases through build, test, and deployment automation, Pearson Education, 2010.

Cited in:

[Iowa DOT 2016] “Construction drawing for the Northbound I-35 Flyover Ramp at U.S. 30 Near Ames,” Iowa Department of Transportation, February 2, 2016.

Available: here; Retrieved: October 31, 2018

Cited in:

[Iowa DOT 2018] “Work Continues on the Northbound I-35 Flyover Ramp at U.S. 30 Near Ames,” Iowa Department of Transportation News Release, June 27, 2018.

Available: here; Retrieved: October 31, 2018

Cited in:

[Keizer 2018] Gregg Keizer. “Windows by the numbers: Windows 10 backtracks, Windows 7 remains resilient,” Computerworld, October 2, 2018.

Available: here; Retrieved: October 18, 2018

Cited in:

[Kreuter 2004] Marshall W. Kreuter, Christopher De Rosa, Elizabeth H. Howze, and Grant T. Baldwin. “Understanding wicked problems: a key to advancing environmental health promotion.” Health Education and Behavior 31:4, 2004, 441-454.

Available: here; Retrieved: October 26, 2018

Cited in:

[Levin 2012] Kelly Levin, Benjamin Cashore, Steven Bernstein, and Graeme Auld. “Overcoming the tragedy of super wicked problems: constraining our future selves to ameliorate global climate change,” Policy Science 45, 2012, 123–152.

Available: here; Retrieved: October 17, 2018

Cited in:

[Magel 2018] Todd Magel. “Uh-oh! Construction crews must redo $23 million project after big mistake,” KCCI News, July 11, 2018.

Available: here; Retrieved: October 31, 2018

Cited in:

[Marks 2005] Michelle A. Marks, Leslie A. DeChurch, John E. Mathieu, Frederick J. Panzer, and Alexander Alonso. “Teamwork in multiteam systems,” Journal of Applied Psychology 90:5, 964-971, 2005.

Cited in:

[Mathieu 2001] John E. Mathieu, Michelle A. Marks and Stephen J. Zaccaro. “Multi-team systems”, in Neil Anderson, Deniz S. Ones, Handan Kepir Sinangil, and Chockalingam Viswesvaran, eds., Handbook of Industrial, Work, and Organizational Psychology Volume 2: Organizational Psychology, London: Sage Publications, 2001, 289–313.

Cited in:

[Meadows 1997] Donella H. Meadows. “Places to Intervene in a System,” Whole Earth, Winter 1997.

Available: here; Retrieved: June 28, 2018

Cited in:

[Meadows 1999] Donella H. Meadows. “Leverage Points: Places to Intervene in a System,” Hartland VT: The Sustainability Institute, 1999.

Available: here; Retrieved: June 2, 2018.

Cited in:

[Meadows 2008] Donella H. Meadows and Diana Wright. Thinking in Systems: A Primer. White River Junction, VT: Chelsea Green Publishing, 2008.

Order from Amazon

Cited in:

[NTSB 2008] National Transportation Safety Board. “Board Meeting Executive Summary: Collapse of I-35W Highway Bridge, Minneapolis, Minnesota, August 1, 2007,”, November 13, 2008.

Available: here; Retrieved: January 3, 2019.

Cited in:

[Phillips 2018a] Dave Phillips. “Tyndall Air Force Base a ‘Complete Loss’ Amid Questions About Stealth Fighters,” The New York Times, October 11, 2108.

Available: here; Retrieved: October 23, 2018

Cited in:

[Phillips 2018b] Dave Phillips. “Exposed by Michael: Climate Threat to Warplanes at Coastal Bases,” The New York Times, October 17, 2108.

Available: here; Retrieved: October 23, 2018

Cited in:

[Rittel 1973] Horst W. J. Rittel and Melvin M. Webber. “Dilemmas in a General Theory of Planning”, Policy Sciences 4, 1973, 155-169.

Available: here; Retrieved: October 16, 2018

Cited in:

[Shapiro 1998] Carl Shapiro and Hal R. Varian. Information rules: a strategic guide to the network economy. Harvard Business Press, 1998.

Cited in:

[Zablah 2015] Raul Zablah and Christian Murphy. “Restructuring and Refinancing Technical Debt.” Proceedings of the IEEE 7th International Workshop on Managing Technical Debt (MTD). IEEE, 2015.

Available: here; Retrieved: February 13, 2016

Cited in:

Other posts in this thread

Show Buttons
Hide Buttons