The resilience error and technical debt

Last updated on October 4th, 2018 at 02:09 pm

I’ve mentioned the reification error in a previous post (see “Metrics for technical debt management: the basics”), but I haven’t explored its dual, the resilience error. Let me correct that oversight now.

The future USS Zumwalt (DDG 1000) in sea trials, December 2015
The future USS Zumwalt (DDG 1000) is underway for the first time conducting at-sea tests and trials in the Atlantic Ocean Dec. 7, 2015. The first of the Zumwalt class of US Navy guided missile destroyers, it is designed to be stealthy, and to be supported by a minimal crew. After the program experienced explosive cost growth, the class has been downsized from 32 ships to three, and complement increased from 95 to over 140 to reduce capital costs. The three vessels on order now have significantly reduced missions. As one might expect, the causes of these troubles are much debated. But it’s possible that the resilience error plays a role. Before the first of a new class of ships goes to sea, it exists as an abstraction—a collection of concepts, plans, promises, and technologies, tried and untried. Many elements of this collection have never inter-operated with other elements. The first ship represents the first opportunity to see how all the elements work together. Although troubles often appear even before the ship is fully assembled, anticipating all troubles is extraordinarily difficult.

Reification risk is the risk that an error of reasoning known as the reification error might affect decisions—in this case, decisions regarding technical debt. The reification error [Levy 2009] [Gould 1996] (also called the reification fallacy, concretism, or the fallacy of misplaced concreteness [Whitehead 1948]) is an error of reasoning in which we treat an abstraction as if it were a real, concrete, physical thing. Reification is useful in some applications, such as object-oriented programming and design.

But when we reify in the domain of logical reasoning, troubles can arise. For example, we can encounter trouble when we think of “measuring” technical debt. Strictly speaking, we cannot measure technical debt. It isn’t a real, physical thing that can be measured. What we can do is estimate the cost of retiring technical debt, but estimates are only approximations. And in the case of technical debt, the approximations are usually fairly rough—they have wide uncertainty bands. That’s one way for trouble to enter the scene. When we regard the estimate as if it were a measurement, we tend to think of it as more certain than it actually is. Technical debt retirement projects then overrun their budgets and schedules, and chaos reigns.

For example, if we think we’ve measured the MPrin (metaphorical principal) of a class of technical debt, rather than estimated it, we’re more likely to believe that one measurement will suffice, and that it will be valid for a long time (or indefinitely). On the other hand, if we think we’ve estimated the MPrin of a class of technical debt, we’re more likely to believe that obtaining a second independent estimate would be wise, and that the estimate we do have might not be valid for long. These are just some of the consequences of the reification error.
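To make the distinction concrete, here is a minimal Python sketch that records a retirement-cost figure as an estimate with an explicit uncertainty band rather than as a single "measured" number. The class name `CostEstimate`, the helper `combine`, and all of the amounts are hypothetical, chosen only for illustration. Merging two estimates by taking the widest band is one conservative convention among many, not a recommended method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostEstimate:
    """A retirement-cost estimate with an explicit uncertainty band (hypothetical)."""
    low: float     # optimistic cost
    likely: float  # most likely cost
    high: float    # pessimistic cost

    def __post_init__(self):
        # An estimate whose band is inverted is not a usable estimate.
        if not (self.low <= self.likely <= self.high):
            raise ValueError("expected low <= likely <= high")

    @property
    def spread(self) -> float:
        """Width of the band relative to the most likely value."""
        return (self.high - self.low) / self.likely


def combine(a: CostEstimate, b: CostEstimate) -> CostEstimate:
    """Merge two independent estimates: widest band, mean of the likely values."""
    return CostEstimate(
        low=min(a.low, b.low),
        likely=(a.likely + b.likely) / 2,
        high=max(a.high, b.high),
    )


# Illustrative numbers only; they come from no real project.
first = CostEstimate(low=40_000, likely=60_000, high=120_000)
second = CostEstimate(low=50_000, likely=90_000, high=150_000)
merged = combine(first, second)
print(f"likely {merged.likely:,.0f}, band {merged.low:,.0f} to {merged.high:,.0f}")
```

Carrying the band along with the figure keeps the uncertainty visible to whoever plans the retirement project, which is the habit that treating the figure as a measurement tends to erode.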

The resilience error

If the reification error is risky because it entails regarding an abstraction as a real, physical thing, we might postulate the existence of a resilience error that’s risky because it entails regarding an abstraction as more resilient, pliable, adaptable, or extensible than it actually is.

When we commit the resilience error with respect to an abstraction, we adopt a belief, usually without justification and possibly outside our awareness. We believe that if we change the abstraction without fully investigating the consequences, the familiar properties of the original will still apply, suitably modified, to its new form. Or we assume, incorrectly, that the abstraction will accommodate any changes we make to its environment.

Sometimes we benefit when we modify abstractions; usually we encounter unintended and unpleasant consequences. For example, unless we examine our modifications carefully, the implications of a modification might conflict with one or more of the fundamental assumptions of the abstraction.

Examples of the resilience error

Perhaps a (ahem) concrete example will illustrate. Consider the steel hull of an ocean liner. We can manufacture it more cheaply if we can devise a way to use less steel. So one approach to that goal is to remove a small portion of the bottom of the hull, say, a circular hole one meter in diameter. We send some people into the ship to do the work, and they return with panicky reports of water coming in. But the ship seems fine, so we reject the reports. Even a day later, all seems well. But by the end of the second day, the trouble is obvious. The ship is sinking.

The problem in our example is that the circular hole in the hull violated a fundamental assumption about how ship hulls work: they work by keeping all water out of the ship. We had extended the idea of hull to make it lighter, but in doing so, we encountered some unintended consequences because our extension violated a fundamental property of hulls.

Now for a less fanciful example.

Consider the fictitious company Alpha Properties LLC, which manages small condominium associations (from 25 to 100 units). Things have been going swimmingly at Alpha Properties, and they’ve decided to expand to handle large condominium associations. Their financial accounting software has worked well, and their employees have become quite expert in its use. Alpha management has heard good reports about the software from other management companies that deal with large client associations. So Alpha decides to use the same software for its larger accounts too. But things don’t work out so well.

The software is fine, but the processes used by the staff are cumbersome and slow. For example, setting up a new association requires too much manual data entry. For a 100-unit association, client setup isn’t a burden, but for a 900-unit association the data entry workload is unmanageable.

This is a fine example of the resilience error. When we make this error, we fail to appreciate how an abstraction can encapsulate assumptions that make for difficulties when we try to extend it or apply it in a new or altered context. In this example, Alpha’s data flow processes are the abstraction. The context is signing up a new client association. When the context (signing up a large new client) is different, it violates an internal assumption of the abstraction (the data flow process for signing up a new client).
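The assumption hidden in Alpha's setup process can be sketched in a few lines of Python. The per-unit entry time and the effort budget below are invented for illustration; the example itself gives no figures beyond the unit counts.

```python
MINUTES_PER_UNIT = 6      # hypothetical manual data-entry time per unit
SETUP_BUDGET_HOURS = 16   # hypothetical effort staff can absorb per new client


def manual_setup_hours(units: int) -> float:
    """Manual setup effort grows linearly with the number of units."""
    return units * MINUTES_PER_UNIT / 60


def fits_process(units: int) -> bool:
    """True if manual setup stays within the budget the process silently assumes."""
    return manual_setup_hours(units) <= SETUP_BUDGET_HOURS


for units in (25, 100, 900):
    print(units, manual_setup_hours(units), fits_process(units))
```

Nothing in the process states the budget explicitly. It holds for every association Alpha has handled so far, which is why the limit goes unnoticed until the 900-unit client arrives.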

How the resilience error leads to technical debt

In many cases, the resilience error is at the heart of the causes of technical debt. It works like this. We have an asset that works perfectly well for one set of applications or in one set of contexts. We want to apply that asset in a new way, which might (or might not) require some minor extensions. When we try it, we find that the asset incorporates some assumptions about the application or the context, and one or more of those assumptions are violated by the new application or the new context. Scrambling, we find some quick fixes that can get things working again, but those fixes usually aren’t well designed or easily maintained. The result is a trail of technical debt.
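The sequence above can be sketched in Python under some loudly stated assumptions. Every name here (`unit_label`, `parse_unit_label`, `parse_unit_label_quick_fix`) is hypothetical, and the two-digit label format is an invented stand-in for whatever assumption a real asset might embed.

```python
def unit_label(unit_number: int) -> str:
    """Original asset: labels units as 'U07'. Silently assumes at most 99 units."""
    return f"U{unit_number:02d}"


def parse_unit_label(label: str) -> int:
    """Downstream code that relies on the fixed three-character label."""
    if len(label) != 3:
        raise ValueError(f"unexpected label: {label!r}")
    return int(label[1:])


def parse_unit_label_quick_fix(label: str) -> int:
    """Quick fix made under deadline pressure: special-case longer labels.

    It gets things working again, but the width assumption is patched in
    one place rather than removed from the design. That is the debt.
    """
    if len(label) not in (3, 4):
        raise ValueError(f"unexpected label: {label!r}")
    return int(label[1:])


# Old context: small associations, everything works.
print(parse_unit_label(unit_label(42)))

# New context: a large association violates the hidden assumption.
try:
    parse_unit_label(unit_label(100))
except ValueError as err:
    print("assumption violated:", err)

# The quick fix restores service, and leaves other fixed-width code untouched.
print(parse_unit_label_quick_fix(unit_label(900)))
```

The quick fix handles the case that failed, but any other code that assumes three-character labels still carries the original assumption. Finding and retiring those remaining instances is the trail of technical debt.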

Acquiring companies is like that. Before the acquisition, we think we’ll be able to merge the IT operations to save some expenses in operations. When we actually try it, though, merging them proves to be far more expensive than we imagined. Ah, the resilience error.

What makes this situation so difficult is that often we’re unable to anticipate what assumptions we might be about to violate. That’s why we make the resilience error.

Spotting difficulties with adapting to new applications and new contexts isn’t so difficult with physical entities. For example, we can see in advance that a square peg won’t fit into a round hole. But with abstractions, we can’t always see the problems in advance. Piloting, prototypes, games, and simulations can help us avoid some trouble, but not all.

References

[Gould 1996] Stephen Jay Gould. The mismeasure of man (Revised & Expanded edition). W. W. Norton & Company, 1996.

[Levy 2009] David A. Levy, Tools of Critical Thinking: Metathoughts for Psychology (second edition). Long Grove, Illinois: Waveland Press, Inc., 2009.

[Whitehead 1948] Alfred North Whitehead. Science and the Modern World. New York: Pelican Mentor (MacMillan), 1948 [1925].

Team composition volatility

Last updated on February 1st, 2018 at 07:31 am

Team composition volatility can interfere with technical debt retirement. In many organizations, project team composition is rarely fixed from beginning to end. In most teams, people who have special knowledge cycle in and out as the work requires. Although these changes in team composition might not interfere with completing a team’s primary objectives, they can affect the team’s ability to retire technical debt that the team incurs over the life of the project. Changes in team composition can also limit the team’s ability to retire specified legacy technical debt that it encounters while working toward its primary objectives.

“Now we know what we should have done.” This is one kind of incremental technical debt. When the composition of a development team changes over the course of a project, recognizing how things should have been done can become more difficult.

Changes in team composition can increase the likelihood of incurring non-strategic incremental technical debt, and increase the likelihood of failing to retire all legacy debt specified in the team’s objectives.

Most product development, maintenance, and enhancement is carried out in groups we call teams. In this context, team is usually defined as, “a small group of interdependent individuals who share responsibility for outcomes.” [Hollenbeck 2012] However, as Hollenbeck et al. observe, teams vary widely in both skill differentiation and composition stability. My sense is that both factors can potentially influence a team’s ability to retire incremental technical debt. They also affect its ability to achieve its objectives with respect to retiring legacy technical debt.

For example, consider what Fowler calls the Inadvertent/Prudent class of technical debt — “Now we know how we should have done it.” [Fowler 2009] In a project of significant size, some might recognize that different approaches to all or parts of it would have been more effective than the ones that were chosen. The recognition might come several months, or even years, after the work affected was conceived or even completed.

But for the moment, consider only cases in which the recognition occurs during the project, or shortly after completion. In these cases, the people who performed that work might have moved on to other teams in need of their talents and abilities. The people who now realize “how we should have done it” might not themselves be capable of making the needed changes, even if they have the budget or time to do the work. Or worse, they might not have the knowledge needed to recognize that a different approach would have been more effective. In either case, recognized or not, the work performed by the people no longer on the team constitutes incremental technical debt. Because of team composition volatility, recognizing or retiring that incremental technical debt can be difficult.

Team composition volatility can also interfere with retiring legacy technical debt. Some projects are specifically charged with retiring a class or classes of legacy technical debt. But others with different objectives might also be charged with retiring instances of specific kinds of legacy technical debt as they encounter them. When team members with special knowledge required for the team’s primary objectives are reassigned, some legacy technical debt can remain un-retired, if retiring that debt from the context in which it occurs requires their special knowledge, and if the reassignment occurs before they can complete the legacy debt retirement. This mechanism is more likely to occur when the legacy debt retirement objective is viewed as subordinate to other business objectives.

Keeping team membership stable has big advantages for technical debt management. Said differently, organizations that must shuffle people from team to team as a consequence of controlling costs by reducing headcount can pay big penalties in the form of increasing loads of technical debt.

References

[Fowler 2009] Martin Fowler. “Technical Debt Quadrant.” Martin Fowler (blog), October 14, 2009. Retrieved January 10, 2016.

[Hollenbeck 2012] John R. Hollenbeck, Bianca Beersma, and Maartje E. Schouten. “Beyond Team Types and Taxonomies: A Dimensional Scaling Conceptualization for Team Description,” Academy of Management Review, 37:1, 82–106, 2012. doi:10.5465/amr.2010.0181. Retrieved July 8, 2017.