
A new safety-guided design methodology to complement model-based safety analysis for safety assurance

Minghui Sun, University of Virginia, Charlottesville, USA ([email protected]); Cody H. Fleming, Iowa State University, Ames, USA ([email protected])
Abstract

With the rapid advancement of Formal Methods, Model-based Safety Analysis (MBSA) has been gaining tremendous attention for its ability to rigorously verify whether the safety-critical scenarios are adequately addressed by the design solution of a cyber-physical human system. However, there is a gap: if specific safety-critical scenarios are not included in the given design solution (i.e., the model) in the first place, the results of MBSA cannot be trusted for safety assurance. To tackle this problem, we propose a new safety-guided design methodology (called STPA+) to complement MBSA. Inspired by STPA, STPA+ treats a system as a control structure, which is particularly fit for systems with complex interactions between human, machine, and automation. Three methods are developed in STPA+ to tackle the possible omission of safety-critical scenarios caused by incorrectly defined safety constraints, an improperly constrained process model, and an inadequately designed controller. In this way, STPA+ directly derives an adequately defined design solution as the input to an MBSA verification program and bridges the gap between current MBSA approaches and safety assurance.

keywords:
Safety-guided design, hazard without failure, MBSA, STPA+.
This work is sponsored by NASA Grant 80NSSC19K1702, with Natasha Neogi and Jon Holbrook acting as technical monitors.

1 Introduction

Cyber-physical human systems are pervasive in today’s world. For example, a modern commercial airplane heavily depends on the collaboration of human pilots and cockpit automation to safely and efficiently transport passengers or cargo from Point A to Point B. Certifying a commercial airplane requires, by law, evidence that all the possible safety-critical scenarios the airplane will encounter in actual operation are adequately addressed, which is especially challenging for new aircraft entrants with novel features (e.g., the 737 MAX) and for future advanced air mobility concepts. Similar challenges abound in many other cyber-physical human system domains, such as automated driving Kramer et al. (2020) and nuclear power plants Ahn et al. (2015).

With the rapid advancement of Formal Methods, Model-based Safety Analysis (MBSA) Sun et al. (2021) has been gaining tremendous attention for its ability to rigorously verify whether the safety-critical scenarios are properly addressed by a design solution. Many MBSA approaches have been developed over the past two decades. The prominent ones include AADL Feiler et al. (2006), AltaRica Prosvirnova (2014), and HiP-HOPS Kabir et al. (2018), which, according to Bozzano et al. (2015), are the only three languages that “have matured beyond the level of research prototypes”. In general (Fig.1), these mainstream MBSA approaches start with a given design solution and formalize it into a design model in the target language; faults and failures are then modeled to obtain a safety model; finally, verification is conducted to check whether the given properties (functional or safety, deterministic or probabilistic) are satisfied.

Figure 1: Given the function and the hazard, the mainstream MBSA approaches (in the dotted box) contribute to safety assurance by verifying the given design solution against the given properties.

However, there is a gap. Even assuming MBSA (or formal verification in a more general sense) can exhaustively check all the possible scenarios captured by the design solution, there are still open challenges. First, verification is already challenging for complex hybrid systems. More importantly, if certain safety-critical scenarios are not included or considered in the given design solution (the left shoulder of Fig.1), or the properties do not correctly reflect the hazard (the right shoulder of Fig.1), the results of MBSA cannot be fully trusted for safety assurance.

In fact, such a gap has already been identified by an FAA report Butka et al. (2015), which found that systems that pass formal verification can still be incorrect. Leveson (2020) further argues that “formal verification can only show consistency between two formal models” and “cannot be used as a validation of acceptable safety”. Instead, Leveson proposes Systems Theoretic Process Analysis (STPA). STPA is known for its ability to identify hazardous scenarios associated with control Steck (2018) and to define safety properties from the hazard Hobbs et al. (2021). Furthermore, STPA is also designed for analyzing systems with complex human-automation interactions Fleming and Leveson (2016), while the more traditional Fault Tree Analysis, as prescribed by ARP 4761 SAE (1996) for aviation certification, does not even consider humans or software in the fault tree, let alone analyze the interaction between them. All these traits make STPA an ideal alternative to bridge this gap for cyber-physical human systems.

However, STPA has two weaknesses. First, STPA has to model an existing design solution as a control structure, which ultimately relies on the “art of modeling” and leaves room for inconsistency between the design model and the STPA model. Second, STPA provides limited support for identifying the causes of the hazardous scenarios associated with control Leveson and Thomas (2013), which is why many approaches Kramer et al. (2020); Steck (2018) only use STPA to identify the safety-critical scenarios for the system as a whole, similar to, e.g., HAZOP. However, it is equally important to decompose these scenarios into more refined scenarios that the design solution can directly act upon. According to a recent review Zhang et al. (2021) on identifying safety-critical scenarios in the automated driving community, no work was found that identifies the safety-critical scenarios to support defining the design solution.

Therefore, this paper proposes a new safety-guided design methodology, called STPA+, based on STPA to (1) derive the design solution directly and (2) provide better methodological support to identify and then address the safety-critical scenarios that the design solution should directly act upon. Because the safety-critical scenarios caused by failure have been addressed by MBSA (the bottom arrow in Fig.1), STPA+ focuses only on the safety-critical scenarios that happen without the presence of component-level failures. In fact, an ISO standard under development (ISO/PAS 21448, SOTIF) specifically focuses on this type of scenario. As a result, STPA+ strengthens the two shoulders of Fig.1, and “STPA+MBSA” provides more comprehensive safety assurance for cyber-physical human systems.

Finally, because STPA+ does not need an existing design solution to identify the safety-critical scenarios that a design solution should directly act upon, STPA+ can also potentially be used to design a run-time safety monitor for systems over whose operation one has limited control, such as a legacy system or an AI-based system. The run-time monitor may function as a modular safety feature to capture hazardous scenarios and make decisions when the original system fails to act.

2 Overview of STPA+

STPA+ shares the same system-theoretic view as STPA: a system can be abstracted as a (hierarchical) control structure (the left of Fig.2) regardless of whether the controller is a human, automation, or a team of both, which makes it an ideal candidate for designing cyber-physical human systems.

Figure 2: An overview of STPA+. Left is the control structure, which operates counter-clockwise and is designed clockwise; right is the proposed STPA+, comprising three methods.

A control structure operates in the following order (the red circle of Fig.2): (1) the controller observes the controlled process; (2) the controller issues control actions to manipulate the evolution of the controlled process; (3) the output behavior of the controlled process (called the output behavior hereafter) achieves the functional goal and avoids the hazard.

Designing a control structure follows the opposite direction (the blue circle of Fig.2), which translates into the three methods of STPA+ below (the right of Fig.2).

  • Method 1: Translating the function/hazard into constraints on the output behavior. Because these constraints are the “desired end results” of the control structure as a whole, they are prescriptive constraints.

  • Method 2: Properly constraining a model of the controlled process. Because these constraints represent what the controlled process can possibly do, they are descriptive constraints.

  • Method 3: A reference architecture for the controller to issue the right control action at the right time so that the constraints from both directions will be respected.

3 Deriving the prescriptive constraints (Method 1)

The prescriptive constraints are defined on the output behavior. The output behavior comprises three elements: a start time $st$, a stop time $sp$, and the trajectory evolution over $[st,sp]$. Accordingly, constraints must be derived from the hazard for all three elements. Safety constraints in current practice are mostly restrictions on the trajectory evolution. For example, a minimal distance from the terrain is defined to constrain the descent of an airplane while it is descending. Such a constraint on the trajectory evolution while the output behavior is ongoing is called a performance constraint in this paper. However, the performance constraint does not prescribe when the output behavior should start or stop. The lack of constraints on the start/stop time may lead to the omission of safety-critical scenarios, especially for time-sensitive output behaviors. Therefore, the question is: how can the safety constraints on the start/stop time be derived from the hazard?

Figure 3: An example of the constraints on the start time.

First, consider the start time $st$. For example (Fig.3), a time-sensitive output behavior is safe to start only within $ST_1$ and $ST_2$, which divides the timeline into different sections with different meanings. The output behavior “must start” when the timeline comes to $ST_1$, otherwise the hazard will surely happen after $ST_1$; the output behavior “can start” when the timeline comes within $ST_2$, because if it does not, it can still start in $ST_1$; the output behavior “must not start” in the rest of the sections. Clearly, there are three types of constraints on the start time:

  • Must-start time window $mst\_T$: the output behavior must start within this time window, otherwise the hazard will happen after the time window expires.

  • Must-not-start time window $nst\_T$: the output behavior must not start within this time window, otherwise the hazard will happen immediately.

  • Can-start time window $cst\_T$: it is safe whether the intended output behavior starts or not when the time comes to $cst\_T$. Mathematically, the can-start time window is complementary to $mst\_T$ and $nst\_T$, and can be calculated by (1), where $\mathcal{T}$ is the time before $mst\_T$ expires.

    cst\_T=\mathcal{T}\cap\neg(mst\_T\cup nst\_T) (1)
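For concreteness, the following is a minimal computational sketch of (1), assuming each time window can be represented as a closed interval $[lo, hi]$ on a shared timeline; the interval representation and helper names are our own illustration, not part of the method.

```python
# A minimal sketch of Eq. (1); time windows are lists of closed intervals.
from typing import List, Tuple

Interval = Tuple[float, float]

def union(windows: List[Interval]) -> List[Interval]:
    """Merge a list of intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for lo, hi in sorted(windows):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def can_start(T: Interval, mst: List[Interval], nst: List[Interval]) -> List[Interval]:
    """cst_T = T ∩ ¬(mst_T ∪ nst_T): the part of the timeline (up to when
    mst_T expires) that is neither must-start nor must-not-start."""
    result, cursor = [], T[0]
    for lo, hi in union(mst + nst):
        if cursor < lo:
            result.append((cursor, min(lo, T[1])))
        cursor = max(cursor, hi)
        if cursor >= T[1]:
            break
    if cursor < T[1]:
        result.append((cursor, T[1]))
    return result

# Example: must-not-start in [0, 10] s, must-start in [40, 60] s, horizon 60 s.
print(can_start((0.0, 60.0), mst=[(40.0, 60.0)], nst=[(0.0, 10.0)]))
# -> [(10.0, 40.0)]: the behavior may, but need not, start between 10 s and 40 s.
```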

Second, we reason about the time windows. For the must-start condition, how can the hazard happen if the intended output behavior has not even started yet? It must be caused by the output behavior temporally immediately before the intended output behavior (called the in-behavior hereafter) violating its own performance constraints (denoted $pc^i$). For the must-not-start condition, the hazard is caused by the started intended output behavior violating its own performance constraints (denoted $pc$). Therefore, the must-start time window is defined to prevent the in-behavior from stopping at the wrong time and violating $pc^i$; the must-not-start time window is defined to prevent the intended output behavior from starting at the wrong time and violating $pc$.

Figure 4: The merging example.

Take merging into traffic as an example, with the in-behavior, the intended output behavior, and their respective performance constraints defined in Fig.4. The “must-start” time window is determined by the time left before the car breaches the minimal distance from the end of the merge lane (i.e., the in-behavior violating $pc^i$). The “must-not-start” time window is determined by the times at which there is no safe opening to merge into the traffic (i.e., the intended output behavior violating $pc$).
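As a hedged sketch, the end of the must-start window in this example might be computed under a constant-speed assumption; all parameter names and numbers below are illustrative.

```python
# Illustrative only: derive the must-start deadline from the time left
# before the in-behavior (driving down the merge lane) violates pc^i,
# i.e., breaches the minimal distance from the end of the lane.

def must_start_deadline(dist_to_lane_end_m: float, speed_mps: float,
                        min_margin_m: float, merge_duration_s: float) -> float:
    """Latest time from now at which the merge must begin (assumes
    constant speed while still in the merge lane)."""
    time_to_breach = (dist_to_lane_end_m - min_margin_m) / speed_mps
    return max(0.0, time_to_breach - merge_duration_s)

# 200 m of merge lane left at 20 m/s, 30 m minimal margin, 4 s to merge:
print(must_start_deadline(200.0, 20.0, 30.0, 4.0))  # 4.5 s from now
```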

Figure 5: Deriving the constraints on the start/stop time. The out-behavior is the output behavior temporally immediately after the intended output behavior.

As a result, the constraints on the start/stop time can be summarized as in Fig.5. The must-start, must-not-start, and can-start time windows ($mst\_T$, $nst\_T$, and $cst\_T$) are derived from the performance constraints of the in-behavior $pc^i$ and the performance constraints of the intended output behavior $pc$. Similarly, the must-stop, must-not-stop, and can-stop time windows ($msp\_T$, $nsp\_T$, and $csp\_T$) are derived from $pc$ and the performance constraints of the out-behavior $pc^o$. Eventually, the prescriptive constraints on the output behavior $y(t)$, where $t\in[st,sp]$, can be defined as in (2), where $Y(t)$, $ST$, and $SP$ are the final prescriptive constraints.

\begin{cases}y(t)\in Y(t)\subseteq pc,\text{ where }t\in[st,sp]\\ st\in ST\subseteq(mst\_T\cup cst\_T)\cap\neg nst\_T\\ sp\in SP\subseteq(msp\_T\cup csp\_T)\cap\neg nsp\_T\end{cases} (2)
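To illustrate how (2) might be checked mechanically, the sketch below treats the windows as interval lists (as above) and $Y(t)$ as a predicate; the sampling-based trajectory check and all names are our assumptions, not part of the method.

```python
# A hedged sketch of checking a candidate (st, sp, y) against Eq. (2).
# Windows are lists of (lo, hi) intervals; Y is a predicate Y(t, value)
# standing in for y(t) ∈ Y(t) ⊆ pc; y is a callable y(t) -> value.

def in_any(t, windows):
    return any(lo <= t <= hi for lo, hi in windows)

def satisfies_prescriptive(st, sp, y, Y, mst_T, cst_T, nst_T,
                           msp_T, csp_T, nsp_T, n_samples=100):
    # st ∈ ST ⊆ (mst_T ∪ cst_T) ∩ ¬nst_T
    if not ((in_any(st, mst_T) or in_any(st, cst_T)) and not in_any(st, nst_T)):
        return False
    # sp ∈ SP ⊆ (msp_T ∪ csp_T) ∩ ¬nsp_T
    if not ((in_any(sp, msp_T) or in_any(sp, csp_T)) and not in_any(sp, nsp_T)):
        return False
    # y(t) ∈ Y(t) on a sampled grid over [st, sp]
    for k in range(n_samples + 1):
        t = st + (sp - st) * k / n_samples
        if not Y(t, y(t)):
            return False
    return True
```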

4 Deriving the descriptive constraints (Method 2)

Method 2 is to properly constrain a model of the controlled process. An incorrect model can lead to hazardous scenarios that will escape the scrutiny of any model-based analysis. General System Theory Mesarovic and Takahara (1975) provides a generic construct $(u(t),x(t),p)\xrightarrow{f}(\dot{x},y(t))^{T}$ to model the controlled process, where $u$ is the control, $p$ is the set of parameters, $x$ is the system state, and $y$ is the output. Since the basic construct can only be determined case by case, the question is: how to properly constrain a given construct $(f,u,x,p,\dot{x},y)$ for a controlled process?

Fig.6 provides a map to identify the constraints. Each dotted arrow may yield both constraints on $(u,x,p,\dot{x},y)$ and the associated assumptions on the environment and the system/human.

First, $f$ represents the concrete mechanisms (engineered or natural) of the actual process. It may only be able to process a finite set of inputs and hence impose constraints on the inputs (Arrow 1).

Second, the controlled process operates on actual components. Those out of the design scope (i.e., over which the designer has no control) are called the “environment”. Components within the design scope are the “system” to be built or further refined, or the “human” to be trained. The environment may have a set that includes all its possible inputs (Arrow 2) and another set in which all its possible outputs must be included (Arrow 3). The system/human is always subject to finite design/manufacture/natural capacities that may yield constraints on both inputs and outputs (Arrow 4).

Third, $(u,x,p,\dot{x},y)$ is internally constrained by $f$. Therefore, the constraints on $(u,x,p,\dot{x},y)$ also need to satisfy $f$ mathematically (Arrow 5).

Finally, each constraint may have assumptions on the environment and the system/human for the constraint to be valid in the first place. Therefore, constraints and assumptions should always appear in pairs, and any isolated constraint or assumption must be justified.

Figure 6: A map to identify the constraints of the controlled process. Each dotted arrow is a type of constraint from the source to the destination.

Therefore, all the constraint-assumption pairs (denoted $\{$constraint $|$ assumption on system/human $|$ assumption on environment$\}$) need to be identified based on Fig.6 to properly constrain the model of the controlled process. For example (Fig.7), Vehicle A (an eVTOL) is instructed to descend to Point W with a constant descent angle $\gamma$. We treat $\gamma$ as a system state and derive its constraint-assumption pairs based on Fig.6. Note that Arrow 2 and Arrow 5 do not apply to this example.

Figure 7: An example of eVTOL descent. The grey landing pad is a backup for emergency use only.

Arrow 1. Here $f$ is to maintain a constant descent angle, which can only be achieved when $\gamma$ is within $\Gamma_1$, provided the battery level is more than $BL$ percent and the crosswind is less than $CW$ knots. The constraint-assumption pair hence is $\{\Gamma_1|BL|CW\}$.
Arrow 3. To avoid interference with the adjacent landing pad, the vertiport (i.e., the environment) requires $\gamma$ to be greater than $\Gamma_3$. However, when the backup landing pad is activated in an emergency, the vehicle will need to descend at a greater angle. Therefore, nominal vertiport operation is an assumption on the environment. The constraint-assumption pair hence is $\{\Gamma_3|NA|Nom\}$.
Arrow 4. The vehicle can afford a descent angle within $\Gamma_4$ if the local elevation is lower than $EL$ and the vehicle payload weighs less than $PD$. The constraint-assumption pair hence is $\{\Gamma_4|PD|EL\}$.

As a result, the constraint on $\gamma$ is $\Gamma_1\cap\Gamma_3\cap\Gamma_4$, and the associated assumptions are $BL\wedge CW\wedge Nom\wedge PD\wedge EL$.
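This composition can be written out directly. In the sketch below, the interval bounds and assumption labels are placeholders, since the paper leaves $\Gamma_1$, $\Gamma_3$, $\Gamma_4$, $BL$, $CW$, $PD$, and $EL$ symbolic; only the structure (intersect constraints, conjoin assumptions) follows the text.

```python
# Illustrative composition of the eVTOL constraint-assumption pairs.
# The numeric descent-angle bounds are made up for the example.

pairs = [  # {constraint | assumption on system/human | assumption on environment}
    {"gamma": (4.0, 9.0),  "sys": "battery > BL %", "env": "crosswind < CW kt"},  # Arrow 1
    {"gamma": (5.0, 90.0), "sys": None,             "env": "nominal vertiport"},  # Arrow 3
    {"gamma": (3.0, 8.0),  "sys": "payload < PD",   "env": "elevation < EL"},     # Arrow 4
]

# Γ1 ∩ Γ3 ∩ Γ4 as an interval intersection
lo = max(p["gamma"][0] for p in pairs)
hi = min(p["gamma"][1] for p in pairs)
gamma_range = (lo, hi) if lo <= hi else None  # None would signal conflicting constraints

# AS: the conjunction of every non-trivial assumption
assumptions = [a for p in pairs for a in (p["sys"], p["env"]) if a]
print(gamma_range, "valid only while", " AND ".join(assumptions))
# (5.0, 8.0) valid only while battery > BL % AND crosswind < CW kt AND ...
```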

Finally, the same procedure is applied to each element of $(u,x,p,\dot{x},y)$ to obtain its constraint-assumption pairs. Eventually, the descriptive constraints can be represented as in (3), where $AS$ is the set of assumptions.

\begin{cases}(u(t),x(t),p)\xrightarrow{f}(\dot{x},y(t))^{T}\\ (u,x,p,\dot{x},y)\in(U,X,P,\dot{X},Y)\\ AS=AS_{x}\wedge AS_{u}\wedge AS_{p}\wedge AS_{\dot{x}}\wedge AS_{y}\end{cases} (3)
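One possible encoding of (3) is shown below as a schematic container rather than a prescribed data model; the class and field names are ours.

```python
# A schematic of the constrained construct in Eq. (3): f maps (u, x, p) to
# (x_dot, y), each set membership is a predicate, and `assumptions` collects
# the AS under which the constraints are valid. Illustrative only.
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class ControlledProcessModel:
    f: Callable[[float, float, float], Tuple[float, float]]  # (u, x, p) -> (x_dot, y)
    in_U: Callable[[float], bool]      # u ∈ U
    in_X: Callable[[float], bool]      # x ∈ X
    in_P: Callable[[float], bool]      # p ∈ P
    in_Xdot: Callable[[float], bool]   # x_dot ∈ X_dot
    in_Y: Callable[[float], bool]      # y ∈ Y
    assumptions: List[str] = field(default_factory=list)  # AS = AS_x ∧ AS_u ∧ ...

    def admissible(self, u: float, x: float, p: float) -> bool:
        """Check one point against the descriptive constraints; note that
        the outputs of f must respect X_dot and Y as well (Arrow 5)."""
        if not (self.in_U(u) and self.in_X(x) and self.in_P(p)):
            return False
        x_dot, y = self.f(u, x, p)
        return self.in_Xdot(x_dot) and self.in_Y(y)
```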

5 Reference architecture for a safe controller (Method 3)

Mathematically, the next task is to design a controller that finds control actions satisfying the constraints in (2) and (3). However, a safe controller in an actual system is more than a mathematical problem solver. It is a decision-maker that constantly monitors the environment and the controlled process, watches for hazards, reacts to changes, and adjusts its decisions in real time. To design such a safe controller is to identify all the safety-critical scenarios and design mechanisms into the controller to address them accordingly.

First, a controller, in the most general sense, must make the following three decisions.

  1. Decide prescriptive constraints: In a dynamic environment, what is safe/unsafe usually changes in real time. The controller must adjust the prescriptive constraints accordingly.

  2. Decide control reference: Because the controller needs control references to generate the control action, this decision derives the control references from the prescriptive constraints.

  3. Decide control action: This decision is akin to the control algorithm in traditional feedback control: generating control actions from given control references.

Taking the “lane merging” of Fig.4 as an example: Decision (1) finds the road section and time window to merge safely by monitoring the traffic; Decision (2) selects the specific position and time to merge; Decision (3) operates the steering and throttle to execute the merge.
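Read as code, the three decisions suggest a controller skeleton of the following shape; this is our schematic reading, not an architecture prescribed by the paper, and all method names are illustrative.

```python
# A schematic three-decision controller loop. The concrete sub-decisions
# are deferred to the reference architecture discussed later in this section.

class SafeController:
    def decide_prescriptive_constraints(self, observations):
        """Decision 1: re-derive the Eq. (2) windows from what is observed
        (e.g., find the road section and time window safe to merge)."""
        ...

    def decide_control_reference(self, prescriptive, descriptive):
        """Decision 2: pick a reference satisfying both Eq. (2) and Eq. (3)
        (e.g., the specific merge position and time)."""
        ...

    def decide_control_action(self, reference, state):
        """Decision 3: the classical control law that tracks the reference
        (e.g., steering and throttle commands)."""
        ...

    def step(self, observations, state, descriptive):
        prescriptive = self.decide_prescriptive_constraints(observations)
        reference = self.decide_control_reference(prescriptive, descriptive)
        return self.decide_control_action(reference, state)
```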

Figure 8: The five types of unsafe scenarios (not involving failure) when making a decision. The “previously safe decision becomes unsafe” scenario is abbreviated as “previously safe”.

Second, regardless of the decision being made, there are five types of unsafe scenarios (without failure/fault) when making a decision (Fig.8). (i) No-decision refers to scenarios where no decision can be found. (ii) Previously safe refers to scenarios where an already generated decision becomes unsafe due to change. (iii) Unsafe timing refers to scenarios where the decision is made too late, so much so that the timing constraints in (2) cannot be met, as the operation of the rest of the system also takes time. (iv) Time coupling refers to scenarios where the result of a decision is affected by the time delay of the system, i.e., time and value are coupled, as in the phenomenon of pilot-induced oscillations. (v) Wrong math refers to scenarios where the algorithm that calculates the decision is wrong or inadequate. Most work in theoretical control and formal methods aims to get the math right; therefore, we do not consider this scenario in the reference architecture. All the scenarios above, if not appropriately addressed, can cause a hazard.
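The first three scenario types lend themselves to run-time guards. The sketch below assumes a decision object carrying a validity predicate and a deadline; every name is illustrative, not prescribed by STPA+.

```python
# Hedged sketch: classify failure-free unsafe scenarios around one decision.
from enum import Enum, auto

class UnsafeScenario(Enum):
    NO_DECISION = auto()      # no decision can be found
    PREVIOUSLY_SAFE = auto()  # an already generated decision becomes unsafe
    UNSAFE_TIMING = auto()    # decided too late for downstream execution
    TIME_COUPLING = auto()    # the system's time delay couples into the value
    WRONG_MATH = auto()       # wrong/inadequate algorithm (out of scope here)

def guard(decision, world_now, t_now, deadline, downstream_latency):
    """Return the detected scenario, or None if the decision still stands.
    `decision.still_valid` is an assumed revalidation predicate."""
    if decision is None:
        return UnsafeScenario.NO_DECISION
    if not decision.still_valid(world_now):    # re-check after every change
        return UnsafeScenario.PREVIOUSLY_SAFE
    if t_now + downstream_latency > deadline:  # the rest of the system needs time
        return UnsafeScenario.UNSAFE_TIMING
    return None
```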

Third, we consider how the safety-critical scenarios explained above (except “wrong math”) can happen to each of the three decisions that a controller makes. We will continue using “lane merging” as a running example for the first two decisions.

(1) Decide prescriptive constraints. This decision observes information from outside the controller and generates the prescriptive constraints in (2) based on the procedure of Method 1. Time coupling is not applicable.

No-decision: There are internal conflicts between the constraints, e.g., the entire must-start time window is contained in the must-not-start time window (i.e., $mst\_T\subseteq nst\_T$). For example, the traffic is so dense that no opening can be found to merge safely before the end of the merge lane.

Previously safe: The prescriptive constraints become stricter due to a change in the environment, or the prescriptive constraints are violated by the output behavior due to uncertainty. For the former, the traffic becomes tighter after the safe merging section and time window are found, requiring the merging section and time window to be adjusted accordingly. For the latter, the car merges at a point outside the selected merging section because a hole in the road disrupts the heading, bringing the car dangerously close to the traffic.

Unsafe timing: The prescriptive constraints are decided too late. For example, the driver finds the safe merging section just before the car hits the end of the merge lane, which is too late because the car needs more time to initiate and accomplish the merge.

(2) Decide control reference. This decision generates the control reference so that the output behavior will satisfy the prescriptive constraints in (2), and the trajectory evolution will respect the descriptive constraints in (3). Note that the trajectory evolution of the controlled process can be determined by the control reference, given the dynamics, the control algorithm, and the initial condition.

No-decision: The controller cannot find a control reference to satisfy the prescriptive constraints and the descriptive constraints at the same time. For example, the only available merging opening requires an acceleration rate that exceeds the car’s designed performance. Hence, no target merging point is found.

Previously safe: The prescriptive or descriptive constraints can be violated due to the change of the constraints or the deviation of the actual controlled process from the planned trajectory evolution.

  • The prescriptive constraints are going to be violated by the output behavior. For example, the traffic suddenly becomes tighter (i.e., the prescriptive constraints change) when the car is actively merging. The car is now predicted to be unsafely close to the traffic after the merge, and therefore should stop merging.

  • The descriptive constraints are going to be violated by the trajectory evolution. For example, the car enters a road section with a lower speed limit (i.e., the descriptive constraints change) when it is speeding up to merge. The eventual merging speed that is acceptable for the previous speed limit is predicted to violate the new speed limit.

  • The descriptive constraints are violated by the current states of the controlled process. For example, the car enters a downhill stretch while speeding up to merge, making it faster than planned (i.e., a deviation from the planned trajectory evolution). As a result, the speed limit is violated before the merge. The car must slow down immediately instead of speeding up to merge.

Unsafe timing: The control reference may be issued too late without considering the fact that the execution of the rest of the system takes time, which eventually leads to a violation of the timing constraint in (2).

Time coupling: In the most general sense (Fig.9), the control reference starts to be generated at $t_1$ and the controlled process starts to seek the control reference at $t_3$. When this event sequence operates fast enough, the states observed at $t_1$ can be considered the same as the actual states at $t_3$. However, when the time delay between $t_1$ and $t_3$ is not negligible, the initial states used to generate the control reference should not be the states observed at $t_1$, but rather the states at $t_3$ predicted from the observation at $t_1$. This coupling between the time delay and the control reference must be addressed properly, as sketched after Fig.9, to avoid the “wrong value” scenario in Fig.8.

Figure 9: The event sequence from generating the control reference at $t_1$, to generating the control action at $t_2$, to the controlled process actively seeking the control reference at $t_3$. CR and CA stand for control reference and control action, respectively.
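A minimal sketch of the fix implied by Fig.9 follows: seed the reference computation with the state predicted at $t_3$ rather than the state observed at $t_1$. The constant-velocity predictor is an assumption for illustration only.

```python
# Illustrative time-coupling compensation: propagate the observation at t1
# forward by the known delay so the reference is generated from the state
# the controlled process will actually have at t3.

def predict_state(x_t1: float, v_t1: float, delay_t1_to_t3: float) -> float:
    """Constant-velocity prediction of the state at t3 (an assumption;
    any better process model can replace it)."""
    return x_t1 + v_t1 * delay_t1_to_t3

def generate_reference(x_obs: float, v_obs: float, delay_t1_to_t3: float):
    x0 = predict_state(x_obs, v_obs, delay_t1_to_t3)  # initial state at t3, not t1
    # ... solve for the control reference starting from x0 ...
    return x0
```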

(3) Decide control action. This decision is akin to the control algorithm in theoretical control: generating the control action from the given control reference.

No-decision: As long as the control reference is found, a control action is also available, because Decision 2 ensures the descriptive constraints can always be satisfied. However, if the generated control reference needs to be updated, only the control reference after $t_c+t_3-t_1$ can be updated (where $t_c$ is the current time), because of the time delay $t_3-t_1$ in Fig.9. As a result, the control action within $[t_c,t_c+t_3-t_1]$ must be updated such that the constraints (prescriptive and descriptive) are still satisfied. The no-decision scenario occurs when no such control action can be found.

Previously safe: For systems that operate at a slower pace, control actions may be issued some time before being executed. In this scenario, the issued control actions may lead to a violation of the constraints (prescriptive and descriptive) due to either a change of the constraints or a deviation of the controlled process from the predicted trajectory evolution. For example, an air traffic controller recalls a previous instruction when the spacing constraints for the airspace ahead change (i.e., the constraints change); or the airplane deviates from its planned course due to a wind gust (i.e., deviation), causing the air traffic controller to change the previous instruction.

Unsafe timing: The control action may be issued too late without considering the fact that the execution of the controlled process takes time.

Time coupling: Similar to the previous decision. When the time delay between $t_2$ and $t_3$ (Fig.9) is not negligible, it must be accounted for when generating the control action.

Finally, the scenarios identified above can already be used to examine whether an existing controller design is safe. However, STPA+ goes one step further by defining sub-decisions within each decision to address all the safety-critical scenarios. Although the details are beyond the scope of this paper, we define 33 main sub-decisions, 12 enabling sub-decisions, and their interactions as a reference architecture. The reference architecture works like a class (as in object-oriented programming): in specific design applications, engineers only need to instantiate the reference architecture and tailor it to their own problems, yielding a controller in which all the safety-critical scenarios (without component-level failure) are automatically addressed. We will explain the details of the reference architecture in an extension of this work.

6 Conclusion

MBSA has gained tremendous traction over the past two decades due to its ability to rigorously prove whether all the safety-critical scenarios captured in the design solution are addressed correctly. However, to apply MBSA for certification of safety-critical systems, one still needs to demonstrate that the model used in MBSA includes all the possible safety-critical scenarios that the actual system will encounter in actual operation. Current MBSA approaches are not equipped for this task.

To tackle this problem, we propose a new safety-guided design methodology, called STPA+, to complement MBSA for safety assurance. STPA+ treats a system as a control structure, which is particularly fit for systems with complex interactions between human, machine, and automation. Furthermore, STPA+ comprises three methods. Method 1 derives safety constraints from the hazard, making sure the safety constraints adequately reflect the hazard under study. Method 2 defines the model of the controlled process, making sure the model is properly constrained both explicitly and implicitly. Method 3 defines a safe controller (human, automation, or a team of both) through a reference architecture, making sure a controller defined on that basis has all the safety-critical scenarios (without component-level failure) addressed. In summary, Method 1 strengthens the right shoulder of Fig.1; Methods 2 and 3 strengthen the left shoulder. Together, “STPA+MBSA” provides a strong and comprehensive argument for the safety assurance of a cyber-physical human system.

In the future, we will demonstrate specifically how STPA+ can address the concerns in ISO/PAS 21448 (Safety of the intended functionality). Moreover, because STPA+ does not require knowledge about the design solution to identify the safety-critical scenarios that a controller must directly act on, we will apply STPA+ to design a run-time monitor as a modular safety feature to assist the human/AI-based controller or the automation controller of a legacy system to capture and act on hazardous scenarios when necessary.

References

  • Ahn et al. (2015) Ahn, W., Chung, M., Min, B.G., and Seo, J. (2015). Development of cyber-attack scenarios for nuclear power plants using scenario graphs. International Journal of Distributed Sensor Networks, 11(9), 836258.
  • Bozzano et al. (2015) Bozzano, M., Cimatti, A., Lisagor, O., Mattarei, C., Mover, S., Roveri, M., and Tonetta, S. (2015). Safety assessment of AltaRica models via symbolic model checking. Science of Computer Programming, 98, 464–483.
  • Butka et al. (2015) Butka, B., Mandalapu, S., and Kilgore, C. (2015). Advanced verification methods for safety-critical airborne electronic hardware. Federal Aviation Administration William J. Hughes Technical Center.
  • Feiler et al. (2006) Feiler, P.H., Gluch, D.P., and Hudak, J.J. (2006). The architecture analysis & design language (aadl): An introduction. Technical report, Carnegie-Mellon Univ Pittsburgh PA Software Engineering Inst.
  • Fleming and Leveson (2016) Fleming, C.H. and Leveson, N.G. (2016). Early concept development and safety analysis of future transportation systems. IEEE Transactions on Intelligent Transportation Systems, 17(12), 3512–3523.
  • Hobbs et al. (2021) Hobbs, K.L., Collins, A.R., and Feron, E.M. (2021). Risk-based formal requirement elicitation for automatic spacecraft maneuvering. In AIAA Scitech 2021 Forum.
  • Kabir et al. (2018) Kabir, S., Walker, M., and Papadopoulos, Y. (2018). Dynamic system safety analysis in HiP-HOPS with Petri nets and Bayesian networks. Safety Science, 105, 55–70.
  • Kramer et al. (2020) Kramer, B., Neurohr, C., Büker, M., Böde, E., Fränzle, M., and Damm, W. (2020). Identification and quantification of hazardous scenarios for automated driving. In International Symposium on Model-Based Safety and Assessment, 163–178. Springer.
  • Leveson (2020) Leveson, N. (2020). Are you sure your software will not kill anyone? Communications of the ACM, 63(2), 25–28.
  • Leveson and Thomas (2013) Leveson, N. and Thomas, J. (2013). An STPA primer. Cambridge, MA.
  • Mesarovic and Takahara (1975) Mesarovic, M.D. and Takahara, Y. (1975). General systems theory: mathematical foundations. Academic press.
  • Prosvirnova (2014) Prosvirnova, T. (2014). AltaRica 3.0: a model-based approach for safety analyses. Ph.D. thesis, Ecole Polytechnique.
  • SAE (1996) SAE (1996). Guidelines and methods for conducting the safety assessment process on civil airborne systems and equipment arp 4761. Technical report, SAE International.
  • Steck (2018) Steck, J. (2018). Methodological approach to identify automation risks of highly automated vehicles using STPA.
  • Sun et al. (2021) Sun, M., Fleming, C.H., and Milich, M. (2021). Defining and reasoning about model-based safety analysis: A review.
  • Zhang et al. (2021) Zhang, X., Tao, J., Tan, K., Törngren, M., Sánchez, J.M.G., Ramli, M.R., Tao, X., Gyllenhammar, M., Wotawa, F., Mohan, N., et al. (2021). Finding critical scenarios for automated driving systems: A systematic literature review. arXiv preprint arXiv:2110.08664.