Complex Systems and Safety Requirements Development

Mike Allocco, PE, CSP

This paper continues the discussion of safety requirements development.

Introduction

Complex systems require integrated system hazard analysis. This effort addresses the total complex system and how it fits into a larger system of systems or family of systems. The system hazard analysis can generally be considered a top-down analysis: high-level system hazards and system risks are identified throughout the life cycle of the complex system. These high-level system risks may be combinations of hardware, firmware, software, human, and environmental hazards. Further subsystem and detailed hazard analyses may be required, and subject matter experts are needed to get into the details. Bottom-up analyses provide the required detail: for example, a failure modes and effects analysis may be conducted by reliability engineering, a software failure modes and effects analysis by software safety, and a procedure analysis by human factors. The system safety engineer looks at the big picture and ensures that the system hazard analysis represents an integrated hazard analysis, in that system risks can be traced down through the detailed analyses.
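As an illustration of the traceability idea, the sketch below (in Python) records how hypothetical system-level hazards trace down to the detailed, bottom-up analyses that substantiate them; all hazard names, analysis identifiers, and disciplines are invented for the example.

    # Illustrative sketch: tracing a high-level system risk down to the
    # bottom-up analyses that substantiate it. All names are hypothetical.

    system_hazard_log = {
        "SH-01: inadvertent actuator firing": {
            "system_risk": "loss of vehicle",
            "traced_to": [
                ("FMEA-114", "reliability engineering"),   # hardware failure modes
                ("SFMEA-27", "software safety"),           # software failure modes
                ("PROC-09", "human factors"),              # procedure analysis
            ],
        },
    }

    # An integrated hazard analysis is one in which every system-level
    # risk traces to at least one detailed, bottom-up analysis.
    def is_integrated(log):
        return all(entry["traced_to"] for entry in log.values())

    print(is_integrated(system_hazard_log))  # True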

Needless to say, it is very important to do good hazard analysis so that safety efforts are conducted efficiently and effectively. Excessive wheel spinning can occur during inappropriate analysis efforts. The task is to concentrate on fixing the potentially big safety problems: the high- to lower-level system risks and system hazards.

Depending on the design method applied, there are particular considerations to be addressed from a system and software safety perspective. Some design methods use abstract models to illustrate how the system is to perform. These models may or may not reflect reality. It is very important that any depiction used within safety be validated and verified from a system safety view. Mistakes and errors can be made through assumptions that degrade safety analyses and introduce further risk. For example, functional hazard analysis is a popular method; keep in mind that a particular function may not be easily segregated, since it may be manifested via software, firmware, hardware, the human, and/or the environment. Consequently, the physics of failure must be addressed. It is very important to understand how, for example, an error in code can propagate throughout the system and result in an adverse outcome, such as physical harm. Expect that system and software safety may use models to show the adverse sequences possible within complex systems; examples of such methods are fault tree analysis, event tree analysis, and digraph analysis.
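As a minimal sketch of one such model, the fragment below evaluates a small fault tree with AND and OR gates over independent events; the events, probabilities, and tree structure are hypothetical, chosen only to show how an error in code can combine with other conditions to reach a top-level adverse outcome.

    # Minimal fault tree sketch. AND gates multiply independent event
    # probabilities; OR gates combine them via the complement product.
    # All events and probabilities are hypothetical.

    def and_gate(*probs):
        result = 1.0
        for p in probs:
            result *= p
        return result

    def or_gate(*probs):
        complement = 1.0
        for p in probs:
            complement *= (1.0 - p)
        return 1.0 - complement

    p_code_error   = 1e-4   # software function emits an erroneous command
    p_monitor_miss = 1e-2   # independent monitor fails to detect it
    p_wiring_fault = 1e-5   # hardware contribution to the same top event

    # Top event: hazardous command reaches the actuator.
    p_top = or_gate(and_gate(p_code_error, p_monitor_miss), p_wiring_fault)
    print(f"P(top event) ~ {p_top:.2e}")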

A design is a meaningful engineering representation of something that is to be built. It is a higher-level interpretation of what will actually be implemented in the source code. Designs should be traceable back to the customer's and other stakeholders' requirements. They should also be assessed for quality against a set of predefined safety criteria for a good design.

Analysis and design methods for software have been evolving over the years, each with its own approach to modeling the needed worldview into software. The following categories are the most commonly used; the specific methodologies listed under each are a sample of those available.

Structured Analysis and Structured Design (SA/SD)…

SA/SD methods were among the first to be developed. They provided means to create and evaluate a “good” design. Prior to the introduction of SA/SD processes, “code and debug” was the normal way to go from requirements to source code. Even in this “object-oriented” time, SA/SD is still used by many.


  • Functional Decomposition
  • Data Flow (also called Structured Analysis)
  • Information Modeling


Object Oriented Analysis and Object Oriented Design (OOA/OOD)…

OOA/OOD breaks the world into abstract entities called objects, which can contain information (data) and have associated behavior. OOA/OOD has been around for nearly 30 years, and in the last decade the majority of development projects have shifted to this collection of methodologies. Object-orientation has brought real benefits to software development, but it is not a silver bullet; a minimal sketch of the object idea follows the list below.


  • Object-Oriented Analysis and Design (OOA/OOD) method
  • Object Modeling Technique (OMT)
  • Object-Oriented Analysis and Design with Applications (OOADA)
  • Object-Oriented Software Engineering (OOSE)
  • Unified Modeling Language (UML)
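As a loose illustration of the object idea, the sketch below bundles data (state) and behavior (methods) in a single class; the interlock, its states, and its methods are invented for the example and do not come from any particular methodology above.

    # Minimal OO sketch: an object couples data with the behavior
    # allowed to change that data. The interlock is hypothetical.

    class SafetyInterlock:
        def __init__(self):
            self.engaged = True               # data: current interlock state

        def release(self, operator_confirmed: bool):
            # behavior: state may change only under a defined condition
            if operator_confirmed:
                self.engaged = False

        def permits_motion(self) -> bool:
            return not self.engaged

    interlock = SafetyInterlock()
    interlock.release(operator_confirmed=True)
    print(interlock.permits_motion())  # True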


Formal Methods (FM) and Model-based Development…

FM is a set of techniques and tools based on mathematical modeling and formal logic that are used to specify and verify requirements and designs for computer systems and software. FM is also a process that allows the logical properties of a computer system (primarily software) to be predicted, in a process similar to numerical calculation, from a mathematical model of the system by means of a logical calculation. An illustrative sketch follows the list below.

  • Formal Specification
  • Formal Verification
  • Software models (with automatic code generation)
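As a loose, minimal illustration of the formal specification idea (not a formal proof), the sketch below states a required property as explicit pre- and postconditions and checks it at run time; the function, its limits, and the property are hypothetical. A true formal method would discharge the same obligation for all possible inputs with a theorem prover or model checker rather than a run-time check.

    # Illustrative sketch only: a required property written as explicit
    # pre/postconditions. Formal verification would prove the property
    # holds for all inputs instead of checking it during execution.

    def limit_command(raw: float, lo: float, hi: float) -> float:
        assert lo <= hi                        # precondition
        out = min(max(raw, lo), hi)
        assert lo <= out <= hi                 # postcondition: output in range
        return out

    print(limit_command(120.0, 0.0, 100.0))    # 100.0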

In complex automated designs, software safety engineers implement software safety programs to assure that software hazards are eliminated or controlled. A software hazard presents a circumstance that initiates, contributes to, or presents an adverse outcome within a potential system accident. Almost any aspect of software engineering that affects an automated system can potentially have an adverse effect, and software hazards can be introduced, go unidentified, and remain unmitigated.

In evaluating complex software systems, beware of the simple assumptions generally made about failures and hazards; here are examples:


  • A software failure is a hazard. This statement may or may not be appropriate; it depends on the definition of a failure. A failure could mean an inadvertent termination of the capability of a functional unit to perform its required operation. In this situation any deviation from a required operation is a failure. Such failures may or may not be a hazard: it may be appropriate for a system to fail rather than result in a hazardous condition; consider a fail-safe design (a sketch of this pattern follows the list).


  • An overgeneralization of failure is an oversimplification when evaluating complex systems. Human errors have been considered failures; however, the failed human task may not have been considered within the required operation, so such a human error may be a hazard. Systems may be operating within required parameters (operations) and hazardous situations can still occur.


  • In a software safety context, a more concise definition of a failure allows for a more specific, detailed understanding of a software hazard. A physical condition that adversely affects hardware or firmware may be a more exact way of thinking about a failure: a switch fails to enable the system, a relay contact freezes, a short develops in a connector, wires chafe, a bit flips in firmware. Physical failures can have an adverse effect on the digital design, and hazards may result.


  • Think about software as instruction to an automated system. The instruction is very complicated, with complex tasks, processes, sequences, and logic. The human developer/designer conveys this instruction via a form of coded communication in an attempt to define those complex tasks, processes, sequences, and logic. All of this information is then compiled and converted from a higher-order language to lower-level machine language (assembly realized in digital logic). The digital logic resides in an electromagnetic state in firmware, and some of this programming is also automated. Software does not physically fail; hardware will, and humans may make errors while creating the instruction to the automated system. There may be sneak paths in threads or logic, and there may be anomalies or malfunctions that are apparent hazards.


  • Decision errors can be made at any time in the life cycle of the system. Such errors can introduce latent and real-time hazards.


  • It is important to understand the differences between failures and hazards that may manifest in the sequences within the various phases of the life cycle of the complex system. A so-called software hazard could be the result of combinations of errors in the instruction, coding, logic, compiling and converting, and failures that affect firmware and hardware.
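As the sketch promised in the first bullet, the fragment below shows the fail-safe pattern: on any detected deviation or unexpected error, the output is driven to a safe state rather than allowed to continue operating. The sensor range, limits, and safe state are hypothetical.

    # Minimal fail-safe sketch: any deviation from required operation
    # commands the safe state. All values are hypothetical.

    SAFE_STATE = "VALVE_CLOSED"

    def control_step(sensor_reading):
        try:
            if sensor_reading is None or not (0.0 <= sensor_reading <= 150.0):
                return SAFE_STATE              # detected failure: fail safe
            return "VALVE_OPEN" if sensor_reading < 100.0 else "VALVE_THROTTLED"
        except Exception:
            return SAFE_STATE                  # unexpected error: fail safe

    print(control_step(None))   # VALVE_CLOSED
    print(control_step(42.0))   # VALVE_OPEN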



Software risk and control…

Software risk is dependent upon the software (safety) application or its safety criticality. Software can have positive and negative effects on system risk. From a positive view, software can mitigate risks when it is used as a hazard control: providing system monitoring, failure or fault detection and isolation, alarms or alerts, or safe shutdown capabilities. Software can also have negative effects and increase system assurance risk, as when a software hazard control malfunctions when needed, or when hazardous misleading information is presented during a safety-critical decision due to a software error.
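As a minimal sketch of software in the hazard-control role just described, the fragment below monitors a parameter, raises an alert, and commands a safe shutdown; the parameter, thresholds, and actions are hypothetical.

    # Illustrative monitor sketch: detection, alert, and safe shutdown
    # as software hazard controls. Thresholds are hypothetical.

    ALERT_LIMIT    = 90.0    # degrees C, illustrative
    SHUTDOWN_LIMIT = 110.0   # degrees C, illustrative

    def monitor(temperature_c: float) -> str:
        if temperature_c >= SHUTDOWN_LIMIT:
            return "SAFE_SHUTDOWN"         # control: safe shutdown capability
        if temperature_c >= ALERT_LIMIT:
            return "ALERT_OPERATOR"        # control: alarm or alert
        return "NOMINAL"

    print(monitor(95.0))     # ALERT_OPERATOR
    print(monitor(120.0))    # SAFE_SHUTDOWN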


The degree of effort (or rigor) associated with software safety activities is directly related to software risk, and because of complexity there are many factors to consider when addressing software risk: contribution to system risk, the degree of software control over the system, the software (safety) application or its safety criticality, the size and complexity of the software, the use of legacy or commercial software, the programming languages and techniques, and the latent errors or mistakes in the software. Matrices have been designed to integrate these many factors; a number of examples are discussed below. There are typical methods of determining the software's influence or importance on system-level hazards and risks. Two of the most popular methods use software control and behavior categories, which are discussed in MIL-STD-882C and RTCA DO-178B and listed below.


MIL-STD-882C Software Control Category


(I) Software exercises autonomous control over potentially hazardous hardware systems, subsystems, or components without the possibility of intervention to preclude the occurrence of a hazard. Failure of the software, or a failure to prevent an event, leads directly to a hazard's occurrence.

(IIa) Software exercises control over potentially hazardous hardware systems, subsystems, or components allowing time for intervention by independent safety systems to mitigate the hazard. However, these systems by themselves are not considered adequate.

(IIb) Software item displays information requiring immediate operator action to mitigate a hazard. Software failure will allow or fail to prevent the hazard’s occurrence.

(IIIa) Software item issues commands over potentially hazardous hardware systems, subsystems, or components requiring human action to complete the control function. There are several redundant, independent safety measures for each hazardous event.

(IIIb) Software generates information of a safety critical nature used to make safety critical decisions. There are several redundant, independent safety measures for each hazardous event.

(IV) Software does not control safety critical hardware systems, subsystems, or components and does not provide safety critical information.
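As an illustration of how such control categories feed a risk matrix, the sketch below combines a control category with a hazard severity to yield a software hazard risk index; the index values here are invented for the example and are not the MIL-STD-882C table.

    # Illustrative sketch only: software control category crossed with
    # hazard severity to give a risk index (1 = highest rigor required).
    # These index values are invented; consult MIL-STD-882C directly.

    RISK_INDEX = {
        "I":    {"CATASTROPHIC": 1, "CRITICAL": 1, "MARGINAL": 3, "NEGLIGIBLE": 5},
        "IIa":  {"CATASTROPHIC": 1, "CRITICAL": 2, "MARGINAL": 4, "NEGLIGIBLE": 5},
        "IIb":  {"CATASTROPHIC": 2, "CRITICAL": 3, "MARGINAL": 4, "NEGLIGIBLE": 5},
        "IIIa": {"CATASTROPHIC": 3, "CRITICAL": 4, "MARGINAL": 5, "NEGLIGIBLE": 5},
        "IIIb": {"CATASTROPHIC": 3, "CRITICAL": 4, "MARGINAL": 5, "NEGLIGIBLE": 5},
        "IV":   {"CATASTROPHIC": 4, "CRITICAL": 5, "MARGINAL": 5, "NEGLIGIBLE": 5},
    }

    def software_risk_index(control_category: str, severity: str) -> int:
        return RISK_INDEX[control_category][severity]

    print(software_risk_index("IIa", "CATASTROPHIC"))  # 1: highest rigor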


RTCA DO-178B Software Anomalous Behavior Category


(A) Software whose anomalous behavior, as shown by the system safety assessment process, would cause or contribute to a failure of system function resulting in a catastrophic failure condition for the aircraft.

(B) Software whose anomalous behavior, as shown by the system safety assessment process, would cause or contribute to a failure of system function resulting in a hazardous/severe-major failure condition for the aircraft.

(C) Software whose anomalous behavior, as shown by the system safety assessment process, would cause or contribute to a failure of system function resulting in a major failure condition for the aircraft.

(D) Software whose anomalous behavior, as shown by the system safety assessment process, would cause or contribute to a failure of system function resulting in a minor failure condition for the aircraft.

(E) Software whose anomalous behavior, as shown by the system safety assessment process, would cause or contribute to a failure of function with no effect on aircraft operational capability or pilot workload. Once software has been confirmed as level E by the certification authority, no further guidelines of this document apply.

              

Generic requirements…

There is also a generic set of requirements suitable for automated elements: the digital computer, firmware, and software. Numerous checklists have been developed.[1]
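As an illustration only, such a checklist can be captured in a reviewable form; the items below are hypothetical examples of generic software safety requirements, not a complete or authoritative list (see the reference for developed checklists).

    # Hypothetical generic software safety checklist sketch; items are
    # illustrative only, not drawn verbatim from any published list.

    generic_requirements = [
        "Safety-critical outputs shall initialize to a known safe state.",
        "The system shall detect and annunciate watchdog timer expiration.",
        "Safety-critical commands shall require two independent actions.",
        "Memory holding safety-critical data shall be checked for corruption.",
    ]

    def open_items(verified: set) -> list:
        # return the requirements not yet verified for this design
        return [r for r in generic_requirements if r not in verified]

    print(len(open_items(set())))  # 4 items remain open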


[1] For further information refer to: Raheja, D.G. and Allocco, M., Assurance Technologies Principles and Practices: A Product, Process, and System Safety Perspective, Second Edition, Wiley, 2006, Chapters 9 and 14.


