Open Architecture (OA) Presents Challenges from a System Safety Perspective: Maybe?
Mike Allocco, Emeritus Fellow, ISSS
System Safety Engineering and Management of Complex Systems; Risk Management Advisor... Complex System Risks
System Safety Challenges
An open architecture (OA) presents challenges from a system safety perspective in that there may be inherent system risks because of the use of commercial-off-the-shelf, government-off-the-shelf, and complex legacy systems that may not have met safety requirements, or because intended use will vary depending on newly designed add-on systems. Regardless of these challenges, it is quite possible to apply system safety axioms to assure that the system risks have been eliminated or controlled to an acceptable level. (System risks are emphasized in that system-level safety and assurance requirements are needed to address not only software and firmware but also hardware, the environment, and human-factors interactions.)
Example OA Problem and System Safety Process
Consider that a safety-critical system is to be designed and integrated within an existing open architecture. There are several steps to be taken from a "system" safety perspective; consider the following:
1 – The integrated system must be described in order to conduct integrated hazard analysis. Define the system in various ways: functionally, architecturally, and operationally. Define all of the outputs and inputs that communicate between the OA and the new safety-critical system. Decompose the interface and the new system in various forms, and understand sequences, event progressions, and threads. Develop models such as state diagrams, network flows, and digraphs. Understand the intended use of the integrated system, and consider potential misuse of the system. Address software, firmware, hardware, human, and environmental interactions.
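To make step 1 concrete, the following is a minimal sketch, assuming a hypothetical laptop-display/automated-subsystem interface, of how interface flows might be captured as a simple digraph so that sequences and threads can be enumerated for later hazard analysis. All node and signal names are illustrative, not taken from any particular OA.

```python
# Minimal sketch: the integrated system as a digraph of interface flows,
# so event threads can be enumerated for hazard analysis.
# All node and signal names are hypothetical.

from collections import defaultdict

# Directed interface flows between the OA and the new safety-critical system.
flows = defaultdict(list)
for src, dst, signal in [
    ("oa_bus",     "display",    "status_message"),
    ("display",    "human",      "rendered_status"),
    ("human",      "oa_bus",     "contingency_command"),
    ("oa_bus",     "automation", "safe_command"),
    ("automation", "oa_bus",     "actuation_feedback"),
]:
    flows[src].append((dst, signal))

def threads(node, path=None):
    """Enumerate simple event threads (paths) through the interface digraph."""
    path = (path or []) + [node]
    if node not in flows:
        yield path
        return
    for dst, _signal in flows[node]:
        if dst in path:          # stop on a loop; the loop itself is a finding
            yield path + [dst]
        else:
            yield from threads(dst, path)

if __name__ == "__main__":
    for t in threads("oa_bus"):
        print(" -> ".join(t))
```

Enumerated threads of this kind give the analyst the event progressions against which scenario-driven hazards can be postulated in step 2.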
2 – Consider conducting integrated scenario-driven hazard/threat/vulnerability analysis, evaluating hardware, software, firmware, human, and environment elements, to identify system risks.
Consider an example: there is a human-computer subsystem and an automated subsystem. The human is to take contingency action based upon what is displayed on the safety-critical laptop, and the automated subsystem is also safety-critical. Several "system" risks come to mind (a sketch of how such scenarios might be recorded follows the list):
Undetected hazardous misleading information is presented to the human, who takes inappropriate action, hindering contingency action when it is needed. The situation results in a catastrophic outcome.
The automated subsystem takes inappropriate action and inadvertently operates. The outcome is also catastrophic.
Digital communication to the human is delayed during an emergency, and contingency action is hindered.
A safety-critical command input is needed to "safe" the automated subsystem, and the command signal is lost or delayed.
The system is spoofed, and an unauthorized threat or intruder gains access and control.
The system is jammed, or lockup occurs at a critical time.
The human introduces a real-time hazard via human error at any time during the life cycle of the system.
Decision errors are made, which introduce latent hazards within the system: a logic error, a timing error, or a sneak path, for example.
Product history is inadequate, and latent hazards are not detected.
Plug-in systems provide a use or function that was not foreseen, and real-time hazards occur.
Overcomplexity is introduced, and the system's state is not known.
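One way to keep such scenario-driven findings analyzable is to record each scenario as structured data rather than free text. The sketch below assumes an invented record layout and severity scale; it is illustrative, not a prescribed format.

```python
# Minimal sketch: recording scenario-driven hazard/threat/vulnerability
# findings as structured data so they can be filtered, ranked, and tracked.
# Field names and the severity scale are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Scenario:
    identifier: str
    element: str        # hardware, software, firmware, human, environment
    initiator: str      # what starts the adverse flow
    outcome: str        # the potential accident/outcome
    severity: str       # e.g., "catastrophic", "critical", "marginal"

scenarios = [
    Scenario("S-01", "human",    "undetected hazardous misleading information",
             "inappropriate action hinders contingency action", "catastrophic"),
    Scenario("S-02", "software", "automated subsystem inadvertently operates",
             "uncommanded hazardous operation", "catastrophic"),
    Scenario("S-03", "firmware", "safe command lost or delayed",
             "automated subsystem cannot be safed", "critical"),
]

# Example use: surface the catastrophic scenarios first for risk resolution.
for s in sorted(scenarios, key=lambda s: s.severity != "catastrophic"):
    print(s.identifier, s.element, "->", s.outcome)
```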
3 – Apply many different system analysis methods to understand potential accidents/outcomes (system risks). Consider that the potential accident/event is a form of adverse integration of many hazards: initiators, contributors, and primary hazards. These hazards may stem from specification errors, judgment errors within the design, mistakes in development and coding, compiling errors, upsets, adverse energy effects, hardware and firmware failures, inappropriate human action, anomalies, malfunctions, logic errors, and timing, scheduling, and sequencing errors [1].
Depending on the system risk, it may be appropriate to confine thinking to so-called software or hardware hazards. When applying functional approaches, understand how the functionally related hazard/threat/vulnerability may manifest by means of a logic error, a firmware failure, a hardware failure, or a human error. Appropriate detailed analyses are also required to evaluate subsystem risks, via subsystem hazard analyses. Consider classic approaches such as the following (a worksheet sketch follows the list):
Software and Hardware Failure Modes and Effects Analysis
Detailed Software Safety Analysis
Thread Analysis
Walkthrough Analysis
Access attempts
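As one illustration of the first approach on this list, a failure modes and effects worksheet can be kept as data and printed on demand. The items, failure modes, and effects below are invented for illustration.

```python
# Minimal sketch: a failure modes and effects worksheet as data.
# Items, failure modes, and effects are invented for illustration.

fmea_rows = [
    # (item,              failure mode,         system-level effect)
    ("display driver",    "stale frame buffer", "hazardous misleading information"),
    ("safe-command link", "message loss",       "automated subsystem not safed"),
    ("actuator relay",    "welded contact",     "inadvertent operation"),
]

def print_worksheet(rows):
    """Print the worksheet with columns sized to their widest entry."""
    header = ("Item", "Failure Mode", "System-Level Effect")
    widths = [max(len(str(r[i])) for r in rows + [header]) for i in range(3)]
    for row in [header] + rows:
        print("  ".join(str(c).ljust(w) for c, w in zip(row, widths)))

print_worksheet(fmea_rows)
```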
4 – Evaluate the human throughout the life cycle: the computer-human interface; command, control, and communication tasks and procedures; contingency actions; and maintenance actions.
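One concrete check such an evaluation might motivate, given the hazardous-misleading-information scenario above, is verifying that safety-critical display data is fresh and uncorrupted before the human acts on it. The message format, staleness limit, and CRC use in this sketch are assumptions.

```python
# Minimal sketch: guarding the computer-human interface against stale or
# corrupted status data before the human acts on it. The message format,
# staleness limit, and use of a CRC are illustrative assumptions.

import time
import zlib
from typing import Optional

MAX_AGE_S = 2.0   # assumed freshness limit for safety-critical display data

def displayable(payload: bytes, crc32: int, sent_at: float,
                now: Optional[float] = None) -> bool:
    """Return True only if the status message is fresh and uncorrupted."""
    now = time.time() if now is None else now
    if now - sent_at > MAX_AGE_S:
        return False                      # stale: flag rather than display
    return zlib.crc32(payload) == crc32   # corrupted data must not be shown

msg = b"PRESSURE NOMINAL"
print(displayable(msg, zlib.crc32(msg), sent_at=time.time()))        # True
print(displayable(msg, zlib.crc32(msg), sent_at=time.time() - 10))   # False
```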
5 – Apply systems and concurrent engineering concepts, assuring that requirements are in concert with system safety: security, availability, logistics, quality, maintainability, cyber safety, cyber security, human factors, and human reliability.
6 – Conduct threat and vulnerability analyses to assure that the security risks (those that adversely affect system safety) have been eliminated or controlled to an acceptable level. Such efforts are now integrated within system hazard, threat, and vulnerability analysis and risk assessment.
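Acceptability judgments of this kind are commonly made with a severity-by-likelihood risk matrix. The sketch below is in the general style of MIL-STD-882; the index values and acceptance thresholds are invented for illustration, not taken from any standard's tables.

```python
# Minimal sketch: a severity-by-likelihood risk assessment in the general
# style of MIL-STD-882. Index values and acceptance thresholds are invented
# for illustration, not taken from any standard's tables.

SEVERITY   = {"catastrophic": 1, "critical": 2, "marginal": 3, "negligible": 4}
LIKELIHOOD = {"frequent": "A", "probable": "B", "occasional": "C",
              "remote": "D", "improbable": "E"}

# Assumed mapping of risk codes to decisions.
HIGH    = {"1A", "1B", "1C", "2A", "2B", "3A"}
SERIOUS = {"1D", "2C", "3B"}

def decision(severity: str, likelihood: str) -> str:
    code = f"{SEVERITY[severity]}{LIKELIHOOD[likelihood]}"
    if code in HIGH:
        return "unacceptable: eliminate or control before fielding"
    if code in SERIOUS:
        return "serious: requires management risk acceptance"
    return "acceptable with monitoring"

print(decision("catastrophic", "remote"))    # e.g., the spoofing scenario
print(decision("marginal", "improbable"))
```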
7 – Develop mitigations by applying layered risk controls, which are to eliminate or control system risks; both engineering and administrative controls act as barriers that hinder or abate adverse flow (see the discussion on the automated safety monitor).
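As an illustration of one engineering layer, the sketch below shows a hypothetical automated safety monitor that safes the subsystem when the safety-critical command link goes quiet, acting as a barrier independent of the human contingency path. The timeout, the monitor interface, and the safe_subsystem() action are assumptions.

```python
# Minimal sketch: one engineering layer (an automated safety monitor) that
# safes the subsystem if the safety-critical command link goes quiet.
# The timeout, interface, and safe_subsystem() are illustrative assumptions.

import time

LINK_TIMEOUT_S = 1.0   # assumed maximum tolerable silence on the command link

class SafetyMonitor:
    def __init__(self, safe_action):
        self._safe_action = safe_action
        self._last_heard = time.monotonic()
        self._safed = False

    def heartbeat(self):
        """Call whenever a valid safety-critical command message arrives."""
        self._last_heard = time.monotonic()

    def poll(self):
        """Call periodically; trips the barrier on a lost or delayed link."""
        if not self._safed and time.monotonic() - self._last_heard > LINK_TIMEOUT_S:
            self._safed = True
            self._safe_action()   # engineering control: abate adverse flow

def safe_subsystem():
    print("automated subsystem commanded to SAFE state")

monitor = SafetyMonitor(safe_subsystem)
monitor.heartbeat()
time.sleep(1.2)       # simulate a lost/delayed command signal
monitor.poll()        # trips: commands the safing action
```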
8 – Consider past knowledge associated with similar systems: loss analyses, incidents, and accidents. Include an understanding of the service history of commercial-off-the-shelf, government-off-the-shelf, and complex legacy systems.
9 – Monitor the system to assure continued hazard/threat/vulnerability tracking and risk resolution.
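Continued tracking is easier when each hazard carries its resolution status and history. A minimal sketch follows, with invented statuses and records; a real program would use a controlled hazard tracking system under configuration management.

```python
# Minimal sketch: a hazard tracking and risk resolution log. Statuses and
# records are invented for illustration; real programs use a controlled
# hazard tracking system under configuration management.

from dataclasses import dataclass, field

OPEN, MONITORED, CLOSED = "open", "monitored", "closed"

@dataclass
class TrackedHazard:
    identifier: str
    description: str
    status: str = OPEN
    history: list = field(default_factory=list)

    def update(self, status: str, note: str):
        self.history.append((status, note))
        self.status = status

log = [TrackedHazard("H-07", "safe command lost or delayed")]
log[0].update(MONITORED, "safety monitor added; verify timeout in test")

open_items = [h for h in log if h.status != CLOSED]
print(f"{len(open_items)} hazard(s) awaiting risk resolution")
```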
10 – Evaluate any changes to the integrated system and conduct reevaluation.
[1] For further information see Chapters 5, 9, and 14 of Raheja, D.G. and M. Allocco, Assurance Technologies: Principles and Practices: A Product, Process, and System Safety Perspective, Second Edition, Wiley-Interscience, 2006.