
When we observe the complex world around us, it becomes evident that everything is interconnected. From ecological systems to human societies, everything functions as a cohesive whole, comprising interdependent systems. Systems are ubiquitous. This includes departments within an organization, organizations themselves, modules of a software system, and software systems interconnectedly. Upon closer examination, nearly everything we perceive and utilize is constructed from interconnected systems, which are themselves composed of interconnected systems, ad infinitum.
‍
Given the pervasive nature of systems, this concept is also pertinent to crisis management. In this context, crisis management can be framed as the process of managing situations that arise when the continuity of a system is compromised. An example would be the continuity of production within an organization due to inventory issues.
‍
An inventory problem can arise from errors made in determining available stock quantities or from a supplier failing to adhere to agreements regarding new stock deliveries. A robust system is defined as one that can maintain certain crucial properties, irrespective of changes in its internal elements or environment. While not every system requires robustness, a certain degree of it is critical for essential components.
‍
The extent to which production can be sustained or rapidly restarted is highly dependent on the interdependencies within the system. Strong dependencies between elements can render a system vulnerable.
‍
In the world of software design, the concept of “loose coupling” is employed. This approach aims for system components to exist as independently as possible, minimizing the impact of changes or errors in one component on another. To ensure effective collaboration, it is crucial to clearly define the assumptions components can make about each other. A significant additional benefit is that an alternative solution can be easily implemented, provided it adheres to the defined specifications.
‍
The drawback of “loose coupling” and most other measures that enhance system robustness is that they tend to reduce system efficiency. Consider, for instance, Just-In-Time inventory management. While minimal stock reduces storage issues, it also renders the system more vulnerable to supply chain disruptions. Generally speaking, increased robustness often leads to greater system complexity.
‍
Therefore, when designing a system or during its organic growth, a continuous balance must be struck between efficiency and robustness. Insufficient attention to robustness renders the system vulnerable, while inadequate focus on efficiency makes the system impractical.
‍
During times of crisis, system dependencies often become acutely apparent. A crisis can cause disruptions in various parts of a system, rapidly propagating to other elements. Understanding these dependencies is crucial for crisis management, as it aids in identifying vulnerabilities, assessing the impact of changes, and implementing effective countermeasures. This applies equally to both the 'cold,' preparatory phase and the 'warm,' acute phase.
‍
It is worthwhile to analyze the dependencies of critical systems within your organization and apply risk management accordingly. This can extend to practices such as those at Netflix. Netflix developed the “Chaos Monkey”, a piece of software that, during normal operations in a production environment, randomly shuts down a server. The image evoked of a monkey pulling cables in a data center will likely remain vivid in my mind for some time. The existence of this Chaos Monkey and the disruptions it causes compel Netflix engineers to consider system dependencies and implement robustness. The accompanying motto is frequently heard in crisis management: “What you rarely do, you rarely do well”.
‍
While it might be excessive to disrupt a random process daily within your organization, if the organization can withstand it, I dare say it will become significantly more robust in the long run!
‍