Introduces the three fundamental forms of breakdown that apply to all adaptive systems at all scales. Resilient systems develop means to mitigate the risk of these three patterns. Also introduces the puzzle that adaptive systems are always and simultaneously maladapted, well-adapted, and under-adapted.



Chapter 10: Basic Patterns in How Adaptive Systems Fail

David D. Woods and Matthieu Branlat

This chapter provides one input to resilience management

strategies in the form of three basic patterns in how adaptive

systems fail. The three basic patterns are (1) decompensation

when the system exhausts its capacity to adapt as disturbances /

challenges cascade; (2) working at cross-purposes when roles

exhibit behaviour that is locally adaptive but globally mal-adaptive;

and (3) getting stuck in outdated behaviours when the system

over-relies on past successes. Illustrations are drawn from urban

fire-fighting and crisis management. A working organisation needs

to be able to see and avoid or recognise and escape when the

system is moving toward one of the three basic adaptive traps.

Understanding how adaptive systems can fail requires contrasting

diverse perspectives.

The Optimist-Pessimist Divide on Complex Adaptive Systems

Adaptive System Sciences begin with fundamental trade-offs: optimality-brittleness (Csete and Doyle, 2002; Zhou, Carlson and Doyle, 2005) or efficiency-thoroughness (Hollnagel, 2009). As an entity, group, system, or organisation attempts to improve its performance, it

becomes better adapted to some things, factors, events, disturbances, or

variations in its environment (its ‘fitness’ improves). However, as a

consequence of improving its fitness with respect to some aspects of its

environment, that entity also must become less adapted to other events,

disturbances, or variations. As a result, when those ‘other’ events or

variations occur, the entity in question will be severely tested and may


fail (this dynamic is illustrated by the story of the Columbia space

shuttle accident; e.g., Woods, 2005a).

The driving question becomes whether (and how) an entity can

identify and manage its position in the trade-off space. In other words,

can an organisation monitor its position and trajectory in a trade-off

space and make investments to move its trajectory prior to crisis

events? The pessimists on complexity and adaptive systems (e.g.,

Perrow, 1984) see adaptive systems as trapped in a cycle of expansion,

saturation, and eventual collapse. The pessimist stance answers the

above questions with ‘No.’ Their response means that as a system

adapts to meet pressures to be ‘faster, better, cheaper,’ it will become

more complex and experience the costs associated with increasing

complexity with little recourse.

Resilience Engineering, on the other hand, represents the optimist

stance and its agenda is to develop ways to control or manage a

system’s adaptive capacities based on empirical evidence. Resilience

Engineering maintains that a system can manage brittleness trade-offs.

To achieve such resilient control and management, a system must have

the ability to reflect on how well it is adapted, what it is adapted to, and

what is changing in its environment. Armed with information about

how the system is resilient and brittle and what trends are under way,

managers can make decisions about how to invest resources in targeted

ways to increase resilience (Woods, 2006a; Hollnagel, 2009).

The optimist stance assumes that an adaptive system has some

ability to self-monitor its adaptive capacity (reflective adaptation) and

anticipate/learn so that it can modulate its adaptive capacity to handle

future situations, events, opportunities and disruptions. In other words,

the optimist stance treats human systems as able to examine, reflect on, anticipate, and learn about their own adaptive capacity.

The pessimist stance, on the other hand, sees an adaptive system as

an automatic built-in process that has very limited ability for learning

and self-management. Systems may vary in how they adapt and how

this produces emergent patterns but the ability to control these cycles is

very limited. It is ironic that the pessimist stance holds that people can study and learn about human adaptive systems, but that little can be done to change or design adaptive systems because new complexities and unintended consequences will sabotage the best-laid plans. Resilience


Engineering admits that changing/designing adaptive systems is hard,

but sees it as both necessary and possible. Resilience Engineering in

practice provides guidance on how to begin doing this.

This chapter provides one input to resilience management

strategies in the form of three basic patterns in how adaptive systems

fail. The taxonomy continues the line of work begun by Woods and

Cook (2006) who described one basic pattern in how adaptive systems

behave and how they fail. The chapter also illustrates these patterns in

examples drawn from urban fire-fighting and crisis management. To

develop resilience management strategies, organisations need to be able

to look ahead and either see and avoid or recognise and escape when they are

headed for adaptive traps of one kind or another. A taxonomy of

different maladaptive patterns is valuable input to develop these

strategies.

Assessing Future Resilience from Studying the History of Adaptation (and Maladaptation)

The resilience/brittleness of a system captures how well it can adapt to

handle events that challenge the boundary conditions for its operation.

Such ‘challenge’ events do occur (1) because plans and procedures have

fundamental limits, (2) because the environment changes over time and

in surprising ways, and (3) because the system itself adapts around successes

given changing pressures and expectations for performance. In large

part, the capacity to respond to challenge events resides in the expertise,

strategies, tools, and plans that people in various roles can deploy to

prepare for and respond to specific classes of challenge.

Resilience, as a form of adaptive capacity, is a system’s potential for

adaptive action in the future when information varies, conditions change,

or when new kinds of events occur, any of which challenge the viability

of previous adaptations, models, plans, or assumptions. However, the

data to measure resilience comes from observing/analysing how the

system has adapted to disrupting events and changes in the past (Woods,

2009a, p. 500). Past incidents provide information about how a system

was both brittle, by revealing how it was unable to adapt in a particular

evolving situation, and resilient, by revealing aspects of how it routinely

adapted to disruptions (Woods and Cook, 2006). Analysis of data about


how the system adapted and to what, can provide a characterisation of

how well operational systems are prepared in advance to handle

different kinds of challenge events and surprises (Hollnagel et al., 2006).

Patterns of failure arise due to basic regularities about adaptation in

complex systems. The patterns are generalisations derived from

analysing cases where systems were unable to prepare for and handle

new challenges. The patterns all involve dynamic interactions between

the system in question and the events that occur in its environment.

The patterns also involve interactions among people in different roles

each trying to prepare for and handle the events that occur within the

scope of their roles. The patterns apply to systems across different

scales – individuals, groups, organisations.

Patterns of Maladaptation

There are three basic patterns by which adaptive systems break down,

and within each, there is a variety of sub-patterns. The three basic

patterns are (1) decompensation, (2) working at cross-purposes, and (3)

getting stuck in outdated behaviours.

Decompensation: Exhausting Capacity to Adapt as Disturbances / Challenges Cascade

In this pattern, breakdown occurs when challenges grow and cascade

faster than responses can be decided on and effectively deployed. A

variety of cases from supervisory control of dynamic processes provide

the archetype for the basic pattern. Decompensation occurs in human

cardiovascular physiology, e.g., the Starling curve in cardiology. When

physicians manage sick hearts they can miss signals that the

cardiovascular system is running out of control capability and fail to

intervene early enough to avoid a physiological crisis (Feltovich, Spiro

and Coulson, 1989; Cook, Woods and McDonald, 1991; Woods and

Cook, 2006). Decompensation also occurs in human supervisory

control of automated systems, for instance in aviation. In cases of

asymmetric lift due to icing or slowly building engine trouble,

automation can silently compensate but only up to a point. Flight crews

may recognise and intervene only when the automation is nearly out of

capacity to respond and when the disturbances have grown much more


severe. At this late stage there is also a risk of a bumpy transfer of

control that exacerbates the control problem. Noticing early that the

automation has to work harder and harder to maintain control is

essential (Norman, 1990; Woods, 1994; Woods and Sarter, 2000

provide examples from cockpit automation). Figure 1 illustrates the

generic signature for decompensation breakdowns.

The basic decompensation pattern evolves across two phases. In

the first phase, a part of the system adapts to compensate for a growing

disturbance. Partially successful initially, this compensatory control

masks the presence and development of the underlying disturbance.

The second phase of a decompensation event occurs because the

automated response cannot compensate for the disturbance completely

or indefinitely. After the response mechanism’s capacity is exhausted,

the controlled parameter suddenly collapses (the decompensation event

that leads to the name).

Figure 1. The basic decompensation signature.

The question is whether a part of the system, a supervisory controller, can detect the developing problem during the first phase of the event pattern, or whether it misses the signs that the lower-order or base controllers (automated loops in the typical system analysis) are working harder and harder to compensate but getting nearer to their capacity limits as the external challenge persists or grows. This requires

discriminating between adaptive behaviour that is part of successful

control and adaptive behaviour that is a sign of incipient failure to

come.

In these situations, the critical information is not the abnormal

process symptoms per se but the increasing force with which they must

be resisted relative to the capabilities of the base control systems. For

example, when a human acts as the base control system, he or she would, as an effective team member, communicate to others that they need to exert unusual control effort (Norman, 1990). Such information

provides a diagnostic cue for the team and is a signal that additional

resources need to be injected to keep the process under control. If

there is no information about how hard the base control system is

working to maintain control in the face of disturbances, it is quite

difficult to recognise the seriousness of the situation during the phase 1

portion, and therefore to respond early enough to avoid the

decompensation collapse that marks phase 2 of the event pattern. The

key information is how hard control systems are working to maintain

control and the trend: are control systems running out of control

capability as disturbances are growing or cascading?
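The dynamic is easy to reproduce in a toy model. The sketch below is our own illustration, not drawn from the chapter's cases; the gain, capacity, and disturbance ramp are arbitrary assumptions. A proportional 'base controller' with a hard capacity limit compensates for a steadily growing disturbance: during phase 1 the controlled parameter stays near its setpoint while control effort climbs, and once effort saturates the parameter collapses. The monitor watches effort relative to capacity, the cue identified above, rather than the parameter itself.

```python
# Toy model of the two-phase decompensation signature. Gain, capacity,
# and the disturbance ramp are illustrative assumptions, not values
# from any of the cited cases.

def simulate(steps=400, dt=0.1, capacity=10.0, gain=5.0, warn_ratio=0.8):
    x = 0.0                                # controlled parameter; setpoint is 0
    warned = False
    for t in range(steps):
        disturbance = 0.05 * t             # challenge grows steadily
        effort = min(gain * x, capacity)   # compensation, limited by capacity
        x += dt * (disturbance - effort)   # disturbance pushes x off setpoint

        # Phase 1 cue: effort nears capacity while x still looks normal.
        if not warned and effort >= warn_ratio * capacity:
            print(f"t={t}: effort at {effort / capacity:.0%} of capacity, "
                  f"x still only {x:.2f} -- inject resources now")
            warned = True
        # Phase 2: compensation exhausted, the parameter collapses.
        if x > 50:
            print(f"t={t}: capacity exhausted, x has collapsed to {x:.1f}")
            break

simulate()
```

Run as is, the model prints the early warning long before the collapse; suppress the effort signal and only the collapse is visible, which is the predicament of the crew or physician described above.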

There are a number of variations on the decompensation pattern,

notably:

• Falling behind the tempo of operations (e.g., the aviation expression 'falling behind the power curve'; surges in demands in emergency rooms, Wears and Woods, 2007; bed crunches in intensive care units, Cook, 2006).

• Inability of an organisation to transition to new modes of functioning when anomalies challenge normal mechanisms or contingencies (e.g., a hospital's ability to manage mass casualty events, see Committee on the Future of Emergency Care in the US, 2006; Woods and Wreathall, 2008 provide a general description of this risk).


Working at Cross-purposes: Behaviour that is Locally Adaptive, but Globally Maladaptive

This refers to the inability to coordinate different groups at different

echelons as goals conflict. As a result of miscoordination the groups

work at cross-purposes. Each group works hard to achieve the local

goals defined for their scope of responsibility, but these activities make

it more difficult for other groups to meet the responsibilities of their

roles or undermine the global or long term goals that all groups

recognise to some degree.

The archetype is the tragedy of the commons (Ostrom, 1990, 1999)

which concerns shared physical resources (among the most studied

examples of common pools are fisheries management and water

resources for irrigation). The tragedy of the commons is a name for a

baseline adaptive dynamic whereby the actors, by acting rationally in the

short term to generate a return in a competitive environment, deplete

or destroy the common resource on which they depend in the long run.

In the usual description of the dynamic, participants are trapped in an

adaptive cycle that inexorably overuses the common resource (a

‘pessimist’ stance on adaptive systems); thus, from a larger systems view

the local actions of groups are counter-productive and lead them to

destroy their livelihood or way of life in the long run.
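The underlying dynamic can be sketched in a few lines (a toy model of our own; the growth rate, carrying capacity, and effort levels are invented for illustration and are not drawn from Ostrom's analyses). Several actors each harvest a fixed fraction of a shared, logistically regrowing stock. Raising one's own effort always pays this season, which is the locally rational move, yet beyond a modest level the aggregate take collapses the stock and with it every actor's future yield.

```python
# Toy commons: several actors harvest a shared stock that regrows
# logistically. All parameter values are illustrative assumptions,
# not data from any studied fishery or irrigation system.

def run(effort_per_actor, actors=5, seasons=60, r=0.3, K=100.0):
    stock, yields = K, []
    for _ in range(seasons):
        harvest = min(stock, actors * effort_per_actor * stock)
        stock -= harvest                          # each actor takes its share
        yields.append(harvest)
        stock += r * stock * (1 - stock / K)      # logistic regrowth
    return stock, yields

for effort in (0.02, 0.06, 0.12):
    stock, y = run(effort)
    print(f"effort per actor {effort:.2f}: season-1 yield {y[0]:5.1f}, "
          f"last-10-season yield {sum(y[-10:]):6.1f}, final stock {stock:5.1f}")
```

The first column rises with effort (each actor's short-term return improves) while the last two fall toward zero: the signature of behaviour that is locally adaptive but globally maladaptive.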

Organisational analyses of accidents like the Columbia space shuttle

accident put production/safety trade-offs in a parallel position to

tragedies of the commons. Despite the organisations’ attempts to

design operations for high safety and the large costs of failures in

money and in lives, line managers under production pressures make

decisions that gradually erode safety margins and thereby undermine

the larger common goal of safety. In other words, safety can be thought

of as an abstract common pool resource analogous to a fishery. Thus,

dilemmas that arise in managing physical common pool resources are a

specific example of a general type of goal conflict where different

groups are differentially responsible for and affected by different sub-

goals, even though there is one or only a couple of commonly held

over-arching goals (Woods et al., 1994, Chapter 4). When the activities

of different groups seem to advance local goals but undermine over-

arching or long term goals of the larger system that the groups belong


to, the system-level pattern is maladaptive as the groups work at cross-

purposes. Specific concrete stories that capture this pattern of adaptive

breakdown can be found in Brown (2005), who collected cases of safety

dilemmas and sacrifice judgments in health care situations.

There is a variety of sub-patterns to working at cross-purposes. Some of these concern vertical interactions, that is, across echelons or levels of control, such as the tragedy of the commons.

horizontal interactions when many different groups need to coordinate

their activities in time and space such as in disaster response and

military operations. This pattern can also occur over time. A sub-

pattern that includes a temporal component and is particularly

important in highly coupled systems is missing side effects of change

(Woods and Hollnagel, 2006). This can occur when there is a change

that disrupts plans in progress or when a new event presents new

demands to be handled, among other events. Other characteristic sub-

patterns are:

• Fragmentation over roles (stuck in silos; e.g., precursors to the Columbia space shuttle accident, Woods, 2005a).

• Failure to resynchronise following disruptions (Branlat et al., 2009).

• Double binds (Woods et al., in press).

Getting Stuck in Outdated Behaviours: The World Changes but the System Remains Stuck in What Were Previously Adaptive Strategies (Over-relying on Past Successes)

This pattern relates to breakdowns in how systems learn. What was

previously adaptive can become rigid at the level of individuals, groups,

or organisations. These behaviours can persist even as information

builds that the world is changing and that the usual

behaviours/processes are not working to produce desired effects or

achieve goals. One example is the description of the cycle of error as

organisations become trapped in narrow interpretations of what led to

an accident (Cook, Woods and Miller, 1998).
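A minimal learning model makes the trap concrete. The sketch below is our own illustration, not a model from the cited studies; the payoffs, shift point, and step sizes are arbitrary assumptions. Two agents face a two-option world whose payoffs reverse halfway through. One weighs all of its past experience equally, so its accumulated history of success keeps it locked onto the formerly better option; the other discounts old evidence and revises.

```python
import random

# Two-option world whose payoffs reverse at the midpoint. The sample-
# average agent stays stuck in its previously adaptive choice; the
# recency-weighted agent escapes. All values are illustrative.

def run(step_size=None, trials=1000, eps=0.1, seed=1):
    rng = random.Random(seed)
    q, n = [0.0, 0.0], [0, 0]             # value estimates and choice counts
    post_shift_hits = 0
    for t in range(trials):
        best = 0 if t < trials // 2 else 1        # the world changes here
        greedy = 0 if q[0] >= q[1] else 1
        choice = rng.randrange(2) if rng.random() < eps else greedy
        reward = 1.0 if choice == best else 0.0
        n[choice] += 1
        # None -> sample average (all history equal); else fixed step size.
        alpha = step_size if step_size is not None else 1 / n[choice]
        q[choice] += alpha * (reward - q[choice])
        if t >= trials // 2 and choice == best:
            post_shift_hits += 1
    return post_shift_hits / (trials // 2)

print("equal-weight history, post-shift success:", run(step_size=None))
print("recency-weighted,     post-shift success:", run(step_size=0.1))
```

The point is not the algorithm but the signature: evidence that the world has changed keeps arriving, yet behaviour tuned by past success barely moves.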

This pattern is also at play at more limited operational time scopes.

Domains such as military operations offer a rich environment for


studying the pattern. When conditions of operation change over time,

tactics or strategies need to be updated to match new challenges or

opportunities. While such decisions are made difficult by the uncertain

nature of the operations’ environment and of the outcome of actions,

missed opportunities to re-plan constitute sources of failure (Woods

and Shattuck, 2000). Mishaps in the nuclear industry have also

exemplified the pattern by showing the dangers of “rote rule following”

(ibid.). In all of these cases there was a failure to re-plan when the

conditions experienced fell outside of the boundaries the system and

plans were designed for. Some characteristic sub-patterns are:

• Oversimplifications (Feltovich, Spiro and Coulson, 1997).

• Failing to revise the current assessment as new evidence comes in (Woods and Hollnagel, 2006; Rudolph, 2009).

• Failing to revise a plan in progress when disruptions or opportunities arise (Woods and Hollnagel, 2006).

• Discounting discrepant evidence (e.g., precursors to Columbia, Woods, 2005a).

• Literal mindedness, particularly in automation failures (Woods and Hollnagel, 2006).

• Distancing through differencing (Cook and Woods, 2006).

• Cook's cycle of error (Cook et al., 1998).

The three basic patterns define kinds of adaptive traps. A reflective

adaptive system should be able to monitor its activities and functions

relative to its changing environment and determine whether it is likely

to fall into one or another of these adaptive traps. The three basic

patterns can be used to understand better how various systems are

vulnerable to failures, such as systems that carry out crisis management,

systems that respond to anomalies in space flights, and systems that

provide critical care to patients in medicine. In the next section, we test

the explanatory value of these three basic patterns by re-visiting a recent

analysis of critical incidents (Branlat et al., 2009) that provided markers

of both resilience and brittleness (Woods and Cook, 2006). Urban fire-

fighting provides a rich setting to examine aspects of resilience and

brittleness related to adaptation and coordination processes. Incident

command especially instantiates patterns generic to adaptive systems


and observed in other domains or at other scales (Bengtsson et al.,

2003; Woods and Wreathall, 2008).

The Basic Patterns Are Illustrated in Urban Fire-fighting Critical Incidents

High uncertainty and potential for disruptions, new events, and

surprises all pose challenges for fire-fighting operations. The fire-

fighting organisation needs to be able to adapt to new information

(whether a challenge or opportunity) about the situation at hand and to

ever-changing conditions. For example, consider this case from the

corpus (Branlat et al., 2009):

Companies arrive on the fire scene and implement standard

operating procedures for an active fire on the first floor of the

building. The first ladder company initiates entry to the apartment

on fire, while the second ladder gets to the second floor in order to

search for potentially trapped victims (the ‘floor above the fire’ is

an acknowledged hazardous position). In the meantime, engine

companies stretch hose-lines but experience various difficulties

delaying their actions, especially because they cannot achieve

optimal positioning of their apparatus on a heavily trafficked street.

While all units are operating, conditions are deteriorating in the

absence of water being applied to the fire. The Incident

Commander (IC) transmits an ‘all hands’ signal to the dispatcher,

leading to the immediate assignment of additional companies.

Almost at the same time, members operating above the fire

transmit a ‘URGENT’ message over the radio. Although the IC

tries to establish communication and get more information about

the difficulties encountered, he does not have uncommitted

companies to assist the members. Within less than a minute, a

back-draft-type explosion occurs in the apartment on fire,

engulfing the building’s staircase in flames and intense heat for

several seconds, and erupting through the roof. As the members

operating on the second floor had not been able to get access to

the apartment there due to various difficulties, they lacked both a


refuge area (apartment) and an egress route (staircase). The second

ladder company was directly exposed to life-threatening conditions.

The three basic patterns can all be seen at work in this case:

Decompensation. The situation deteriorated without companies being

able to address the problem promptly. The Incident Commander

(IC) recognised and signalled an ‘all hands’ situation, in order to

inform dispatchers that all companies were operating and to

promptly request additional resources. As there were no

uncommitted resources available, the fire companies were unable to

respond when an unexpected event occurred (the back-draft) which

created dangers and hindered the ability of others to assist. As a

result, team members were exposed to dangerous conditions.

Working at cross-purposes. Companies were pursuing their tasks and

experienced various challenges without the knowledge of other

companies’ difficulties. Without this information, actions on the

first floor worked against the actions and safety of operators on the

second floor. Goal conflict arose (1) between the need to provide

access to the fire and to contain it while water management was

difficult, and (2) between the need to address a deteriorating

situation and to rescue injured members while all operators were

committed to their tasks.

Getting stuck in outdated behaviour. The ladder companies continued to

implement standard procedures that assumed another condition

was met (water availability from the engine companies). They failed

to adapt the normally relevant sequence of activities to fit the

changing particulars of this situation: the first ladder company

gained access to the apartment on fire; but in the absence of water,

the opened door fuelled the fire and allowed flames and heat to

spread to the rest of the building (exacerbating how the fire

conditions were deteriorating). Similarly, the unit operating on the

second floor executed its tasks normally, but the difficulty it

encountered and the deteriorating situation required adaptation of

normal routines to fit the changing risks.


Urban Fire-fighting and the Dynamics of Decompensation

During operations, it is especially important for the Incident

Commander (IC) to assess progress constantly and correctly in terms of

trends in whether the fire is in or out of control. To do this, the IC

monitors (a) the operational environment including the evolution of the

fire and the development of additional demands or threats (e.g.,

structural damages or trapped victims) and (b) the effort companies are

exerting to try to accomplish their tasks as well as their capacity to

respond to additional demands. Based on such assessments, the IC

makes critical decisions related to the management of resources:

redeploying companies in support of a particular task; requesting

additional companies to address fire extensions or need to relieve

members; requesting special units to add particular forms of expertise

to handle unusual situations (e.g., presence of hazardous material).

ICs are particularly attentive to the risk of falling behind by

exhausting the system’s capacity to respond to immediate demands as

well as to new demands (Branlat et al., 2009). The ‘all-hands’ signal is a

recognition that the situation is precarious because the system is stretched close

to its maximum capacity and that current operations therefore are

vulnerable to any additional demands that may occur. The analysis of

the IC role emphasised anticipating trends or potential trends in

demands relative to how well operations were able to meet those

demands (see also Cook’s analysis of resource crunches in intensive

care units; Cook, 2006). For urban fire-fighting, given crucial time

constraints, resources are likely to be available too late if they are

requested only when the need is definitive. A critical task of the IC

therefore is to regulate adaptive capacity by providing 'tactical reserves' (Klaene and Sanders, 2008, p. 127), i.e., additional capacity to adapt tactics promptly to changing situations.

Equivalent processes also play out (a) at the echelon of fire-fighters or

fire teams, (b) in terms of the distributed activity (horizontal

interactions) across roles at broader echelons of the emergency

response system, and (c) vertically across echelons where information

about difficulties at one level changes decisions and responses at another

echelon.
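The timing argument, that resources requested only on definitive need arrive too late, can be shown with a toy queueing sketch (our construction; the demand ramp, lead time, and unit sizes are invented for illustration). Additional companies take a fixed lead time to arrive. A reactive commander requests them once demand already exceeds capacity; an anticipatory commander requests on the demand projected one lead time ahead.

```python
# Toy model of tactical reserves: help takes lead_time steps to arrive.
# Requesting on definitive need leaves recurring shortfalls; requesting
# on projected need keeps pace. All numbers are illustrative assumptions.

def run(anticipate, steps=60, lead_time=8, unit=4.0):
    capacity, arrival, shortfall = 10.0, None, 0.0
    for t in range(steps):
        demand = 0.5 * t                   # incident demands grow steadily
        if arrival == t:                   # requested companies arrive
            capacity += unit
            arrival = None
        trend = 0.5                        # observed growth in demand per step
        horizon = demand + trend * lead_time if anticipate else demand
        if horizon > capacity and arrival is None:
            arrival = t + lead_time        # one outstanding request at a time
        shortfall += max(0.0, demand - capacity)
    return shortfall

print("reactive (request on definitive need):", run(anticipate=False))
print("anticipatory (request on projection): ", run(anticipate=True))
```

The two policies differ only in their trigger, yet the reactive one accumulates a shortfall during every lead time while the anticipatory one buys that lead time back, which is why the IC's task is framed here as regulating adaptive capacity rather than reacting to demand.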


Urban Fire-fighting and Coordination over Multiple Groups and Goals

Fire-fighting exemplifies situations within which tasks and roles are

highly distributed and interdependent, exposing work systems to the

difficulty of maintaining synchronisation while providing flexibility to

address ever-changing demands. Interdependencies also result from the

fact that companies operate in a shared environment.

Several reports within the corpus described incidents where

companies opened hose-lines and pushed fire and heat in the direction

of others. These situations usually resulted from companies adapting

their plan because of difficulties or opportunities. If the shift in activity

by one group was not followed by a successful resynchronisation, it

created conditions for a coordination breakdown where companies

(and, importantly, the IC) temporarily lost track of each other’s position

and actions. In this context one group could adapt to handle the

conditions they face in ways that inadvertently created or exacerbated

threats for other groups. Another example in the corpus was situations

where companies’ capacity to fulfil their functions were impeded by

actions of others. One groups actions, though locally adaptive relative

to their scope, introduced new constraints which reduced another

company’s ‘margins of manoeuvre’ (Coutarel, Daniellou and Dugué,

2003). This notion refers to the range of behaviours a unit is able to deploy in order to fulfil its functions, and therefore to its capacity to adapt a course or plan of action in the face of new challenges. Such

dynamics might directly compromise members’ safety, for example

when the constrained functions were critical to egress route

management. In one case, a company vented a window adjacent to a

fire escape which had the consequence of preventing the members of

another company operating on the floor above from using the fire

escape as a potential egress route, should it have been needed.

Goal conflicts arise when there are trade-offs between achieving

the three fundamental purposes of urban fire-fighting: saving lives,

protecting property and ensuring personnel’s safety. This occurs when,

for example, a fire department forgoes the goal of protecting property

in order to minimise risk to fire-fighters. Incidents in the corpus vividly

illustrate the trade-offs that can arise during operations and require

adaptations to on-going operations. Under limited resources (time,


water, operators), the need to rescue a distressed fire-fighter introduces

a difficult goal conflict between rescue and fire operations. If members

pursue fire operations, the victim risks life-threatening exposure to the

dangerous environment. Yet by abandoning fire operations,

momentarily or partially, team members risk letting the situation

degrade until it becomes more difficult and more dangerous

to address. The analysis of the corpus of cases found that adaptations in

such cases were driven by local concerns, e.g., when members

suspended their current operations to assist rescue operations nearby.

The management of goal conflicts is difficult when operations are not

clearly synchronised, since decisions that are only locally adapted risk

further fragmenting operations.

Urban Fire-fighting and the Risk of Getting Stuck in Outdated Behaviours

As an instance of emergency response, urban fire-fighting is

characterised by the need to make decisions at high tempo and under

uncertainty. As fire-fighters discover and assess the problem to be

addressed during the course of operations, replanning is a central

process. It is critical that adaptations to the plan are made when

elements of the situation indicate that previous knowledge (on which

on-going strategy and tactics are based) is outdated. The capacity to

adapt is therefore highly dependent on the capacity correctly to assess

the situation at hand throughout the operations, especially at the level

of the IC. Accident cases show that the capacity of the IC to efficiently

supervise operations and modify the plan in progress is severely

impaired when this person only has limited information about and

understanding of the situation at hand and the level of control on the

fire.

Given the level of uncertainty, this also suggests the need for

response systems to be willing to devote resources to further assess

ambiguous signals, a characteristic of resilient and high-reliability

organisations (Woods, 2006a; Rochlin, 1999). This is nonetheless

challenging in the context of limited resources and high tempo, and

given the potential cost of replanning (risk of fragmenting operations,

cost of redeploying companies, coordination costs).


At a wider temporal and organisational scale, fire departments and

organisations are confronted with the need to learn from situations in

order to increase or maintain operations’ resilience in the face of

evolving threats and demands. The reports analysed resulted from

thorough investigation processes that aimed at understanding limits in

current practices and tools, and represented a process of learning and

transformation. However, it is limiting to assume that the events that

produce the worst outcomes are also the ones that will produce the

most useful lessons. Instances where challenging and surprising

situations are managed without leading to highly severe outcomes also

reveal interesting and innovative forms of adaptation (Woods and

Cook, 2006). As stated previously, many minor incidents also represent

warning signals about the (in)adequacy of responses to the situations

encountered. They are indicators of the system starting to stretch

before it collapses in the form of a dramatic event (Woods and

Wreathall, 2008). To be resilient, organisations must be willing to

pursue these signals (Woods, 2009a). Unfortunately, selecting the

experiences or events which will prove fruitful to investigate, and

allocating the corresponding resources, is a difficult choice when it has

to be made a priori (Hollnagel, 2007; Dekker, 2008, chapter 3).

Recognising What is Maladaptive Depends on Perspective Contrasts

The chapter has presented three basic patterns in how adaptive systems

fail. But it is difficult to understand how behaviours of people, groups,

and organisations are adapted to some factors and how those

adaptations are weak or strong, well or poorly adapted. One reason for

this is that what is well-adapted, under-adapted, or maladapted is a

matter of perspective. As a result, labelling a behaviour or process as

maladapted is conditional on specifying a contrast across perspectives.

First, adaptive decision-making exhibits local (though bounded)

rationality (regardless of scale). A human adaptive system uses its

knowledge and the information available from its field of view/focus of

attention to adapt its behaviour (given its scope of autonomy/authority)

in pursuit of its goals. As a result, adaptive behaviour is well-adapted


when examined locally, even though the system can learn and change to

become better adapted in the future (shifting temporal perspective).

Second, adaptive decision-making exists in a co-adaptive web

where adaptive behaviour by other systems horizontally or vertically (at

different echelons) influences (releases or constrains) the behaviour of

the system of interest. Behaviour that is adaptive for one unit or system

can produce constraints that lead to maladaptive behaviour in other

systems or can combine to produce emergent behaviour that is

maladaptive relative to criteria defined by a different perspective.

Working at cross-purposes happens when interdependent systems

do things that are all locally adaptive (relative to the role/goals set

up/pressured for each unit) but more globally maladaptive (relative to

broader perspectives and goals). This can occur horizontally across

units working at the same level as in urban fire-fighting (Branlat et al.,

2009). It can occur upward, vertically, where local adaptation at the

sharp end of a system is maladaptive when examined from a more

regional perspective that encompasses higher level or total system goals.

One example is ad hoc plan adaptation in the face of an impasse to a

plan in progress; in this case the adaptation works around the impasse

but fails to do so in a way that takes into account all of the relevant

constraints as defined from a broader perspective on goals (Woods and

Shattuck, 2000).

Working at cross-purposes can occur downward vertically too

(Woods et al., in press). Behaviour that is adaptive when considered

regionally can be seen as maladaptive when examined locally as the

regional actions undermine or create complexities that make it harder

for the sharp end to meet the real demands of situations (for example,

actions at a regional level can introduce complexities that force sharp

end operations to develop work-arounds and other forms of gap-filling

adaptations).

This discussion points to the finding in adaptive system science

that all systems face fundamental trade-offs. In particular, becoming

more optimal with respect to some aspects of the environment

inevitably leads that system to be less adapted to other aspects of the

environment (Doyle, 2000; Zhou et al., 2005; Woods, 2006a; Hollnagel,

2009). This leads us to a non-intuitive but fundamental conclusion that

all adaptive systems simultaneously are (Woods, 2009b):


• well-adapted to some aspects of its environment (e.g., the fluency law: 'well'-adapted cognitive work occurs with a facility that belies the difficulty of the demands resolved and the dilemmas balanced; see Woods and Hollnagel, 2006),

• under-adapted in that the system has some degree of drive to learn and improve its fitness relative to variation in its environment; this is related both to intrinsic properties of that agent or system and to the external pressures the system faces from stakeholders,

• maladapted or brittle in the face of events and changes that challenge its normal function.

This basic property of adaptive systems means that all forms of

linear causal analysis are inadequate for modelling and predicting the

behaviour of such systems. Adaptive systems’ sciences are developing

the new tools needed to accurately model, explain and predict how

adaptive systems will behave (e.g., Alderson and Doyle, in press), for

example, how to anticipate tipping points in complex systems (Scheffer

et al., 2009).

Working organisations need to be able to see and avoid or

recognise and escape when a system is moving toward one of the three

basic adaptive traps. Being resilient means the organisation can monitor

how it is working relative to changing demands and adapt in

anticipation of crunches, just as incident command should be able to do

in urban fire-fighting. Organisations can look at how they have adapted

to disruptions in past situations to estimate whether their system’s

‘margins of manoeuvre’ in the future are expanding or contracting.

Resilience Engineering is beginning to provide the tools to do this even

as more sophisticated general models of adaptive systems are being

developed.


References

Andersson, K. P. and Ostrom, E. (2008). Analyzing decentralized

resource regimes from a polycentric perspective. Policy Sciences,

41, 71-93.

Alderson, D. L. and Doyle, J. C. (in press). Contrasting views of

complexity and their implications for network-centric

infrastructures. IEEE Systems, Man and Cybernetics, Part A.

Bengtsson, J., Angelstam, P., Elmqvist, T., Emanuelsson, U., Folke, C.,

Ihse, M., Moberg, F. and Nyström, M. (2003). Reserves, Resilience

and Dynamic Landscapes. Ambio, 32(6), 389-396.

Branlat, M., Fern, L., Voshell, M. and Trent, S. (2009). Coordination in

Urban Firefighting: A Study of Critical Incident Reports. Proceedings

of the Human Factors and Ergonomics Society 53rd Annual Meeting, San

Antonio, TX.

Brown, J. P. (2005). Key themes in healthcare safety dilemmas. In M. S.

Patankar, J. P. Brown, & M. D. Treadwell (Eds.), Safety Ethics: Cases

from Aviation, Healthcare, and Occupational and Environmental Health

(pp. 103-148). Aldershot, UK: Ashgate.

Committee on the Future of Emergency Care in the US (2006).

Hospital-based Emergency Care: At the Breaking Point. National

Academies Press, Washington, DC.

Cook, R. I. (2006). Being bumpable: consequences of resource

saturation and near-saturation for cognitive demands on ICU

practitioners. In D. D. Woods & E. Hollnagel (Eds.), Joint Cognitive

Systems: Patterns in Cognitive Systems Engineering (pp. 23-35). Boca

Raton, FL: Taylor & Francis/CRC Press.

Cook, R. and Rasmussen, J. (2005). “Going Solid”: A model of system

dynamics and consequences for patient safety. Quality and Safety in

Health Care, 14, 130-134.

Cook, R. I., Woods, D. D. and McDonald, J.S. (1991). Human

Performance in Anesthesia: A Corpus of Cases. Cognitive Systems

Engineering Laboratory Report, prepared for Anesthesia Patient

Safety Foundation, April 1991.


Cook, R. I., Woods, D. D. and Miller, C. (1998). A Tale of Two Stories:

Contrasting Views of Patient Safety. Chicago, National Patient Safety

Foundation. (available at http://csel.eng.ohio-

state.edu/blog/woods/archives/000030.html )

Coutarel, F., Daniellou, F., & Dugué, B. (2003). Interroger

l'organisation du travail au regard des marges de manoeuvre en

conception et en fonctionnement [Examining Work Organization

in Relation to Margins of Maneuver in Design and in Operation].

Pistes, 5(2).

Csete, M.E. and Doyle, J.C. (2002). Reverse engineering of biological

complexity. Science, 295, 1664-1669.

Dekker, S. (2008). Just Culture: Balancing Safety and Accountability.

Aldershot, UK: Ashgate.

Doyle, J.C. (2000). Multiscale networking, robustness, and rigor. In T.

Samad and J. Weyrauch (Eds.), Automation, Control, and Complexity: An Integrated Approach (pp. 287-301). New York: John Wiley & Sons.

Feltovich, P. J., Spiro, R. J. and Coulson, R. L. (1989). The nature of

conceptual understanding in biomedicine: The deep structure of

complex ideas and the development of misconceptions. In D.

Evans and V. Patel (Eds.), The Cognitive Sciences in Medicine (pp. 113-

172). Cambridge MA: MIT Press.

Feltovich, P. J., Spiro, R. J., & Coulson, R. L. (1997). Issues of expert

flexibility in contexts characterized by complexity and change. In P.

J. Feltovich, K. M. Ford, & R. R. Hoffman (Eds.), Expertise in

context: Human and Machine. Menlo Park, CA: AAAI/MIT Press.

Hollnagel, E. (2007). Resilience Engineering: Why, What and How. In

NoFS 2007 - Nordic Research Conference on Safety, 13-15 June 2007,

Tampere, Finland.

Hollnagel, E. (2009). The ETTO Principle: Efficiency-Thoroughness Trade-Off:

Why Things That Go Right Sometimes Go Wrong. Aldershot, UK: Ashgate.

Klaene, B. J., & Sanders, R. E. (2008). Structural Firefighting: Strategies and

Tactics (2nd ed.). Sudbury, MA: Jones & Bartlett Publishers.

Ostrom, E. (1990). Governing the Commons: The Evolution of Institutions for

Collective Action. New York: Cambridge University Press.

Ostrom, E. (1999). Coping with Tragedies of the Commons. Annual

Review of Political Science, 2, 493-535.


Perrow, C. (1984). Normal Accidents: Living with High-Risk Technologies.

New York: Basic Books.

Rochlin, G.I. (1999). Safe operation as a social construct. Ergonomics,

42(11), 1549-1560.

Scheffer, M., Bascompte, J., Brock, W. A., Brovkin, V., Carpenter, S. R.,

Dakos, V., Held, H., van Nes, E. H., Rietkerk, M. and Sugihara, G.

(2009). Early-warning signals for critical transitions. Nature,

461(7260), 53-59.

Wears, R. L. and Woods, D. D. (2007). Always Adapting. Annals of

Emergency Medicine, 50(5), 517-519.

Woods, D. D. (2005). Creating Foresight: Lessons for Resilience from

Columbia. In W. H. Starbuck and M. Farjoun (eds.), Organization at

the Limit: NASA and the Columbia Disaster. Malden, MA: Blackwell,

pp. 289-308.

Woods, D. D. (2006). Essential characteristics of resilience. In E.

Hollnagel, D. D. Woods, & N. Leveson (Eds.), Resilience Engineering:

Concepts and Precepts (pp. 19-30). Aldershot, UK: Ashgate.

Woods, D. D. (2009a). Escaping Failures of Foresight. Safety Science,

47(4), 498-501.

Woods, D. D. (2009b). Fundamentals to Engineer Resilient Systems:

How Human Adaptive Systems Fail and the Quest for Polycentric

Control Architectures. Keynote presentation, 2nd International

Symposium on Resilient Control Systems, Idaho Falls, ID, August 11-13

2009 (https://secure.inl.gov/isrcs2009/default.aspx accessed

September 8, 2009).

Woods, D. D. and Cook, R. I. (2006). Incidents: Are they markers of

resilience or brittleness? In E. Hollnagel, D.D. Woods and N.

Leveson, eds., Resilience Engineering: Concepts and Precepts. Ashgate,

Aldershot, UK, pp. 69-76.

Woods, D. D., & Hollnagel, E. (2006). Joint Cognitive Systems: Patterns in

Cognitive Systems Engineering. Boca Raton, FL: Taylor & Francis/CRC

Press.

Woods, D. D. and Sarter, N. (2000). Learning from Automation

Surprises and Going Sour Accidents. In N. Sarter and R. Amalberti

(Eds.), Cognitive Engineering in the Aviation Domain, Erlbaum,

Hillsdale NJ, pp. 327-354.


Woods, D.D. and Shattuck, L. G. (2000). Distant supervision—local

action given the potential for surprise. Cognition, Technology and

Work, 2, 242-245.

Woods, D. D. and Wreathall, J. (2008). Stress-Strain Plot as a Basis for

Assessing System Resilience. In E. Hollnagel, C. Nemeth and S. W.

A. Dekker, eds., Resilience Engineering: Remaining sensitive to the

possibility of failure. Ashgate, Aldershot, UK, pp. 145-161.

Zhou, T., Carlson, J. M. and Doyle, J. (2005). Evolutionary dynamics

and highly optimized tolerance. Journal of Theoretical Biology, 236,

438-447.
