# Fault Tree Analysis

Fault Tree Analysis (FTA) is a technique for determining the chain of events that causes a disruption to IT services. Combined with calculation methods, FTA can offer detailed models of availability, which can be used to assess the availability improvement achievable through individual technology component design options. Using FTA:

• Information can be provided that can be used for availability calculations
• Operations can be performed on the resulting fault tree; these operations correspond with design options
• The desired level of detail in the analysis can be chosen.

FTA represents a chain of events using Boolean notation. Figure 4.19 gives an example of a fault tree.

Figure 4.19 Example Fault Tree Analysis

Essentially FTA distinguishes the following events:

• Basic events – terminal points for the fault tree, e.g. power failure, operator error. Basic events are not investigated in great depth; once a basic event is investigated further, it becomes a resulting event.
• Resulting events – intermediate nodes in the fault tree, resulting from a combination of events. The highest point in the fault tree is usually a failure of the IT service.
• Conditional events – events that only occur under certain conditions, e.g. failure of the air-conditioning equipment only affects the IT service if equipment temperature exceeds the serviceable values.
• Trigger events – events that trigger other events, e.g. power failure detection equipment can trigger automatic shutdown of IT services.

These events can be combined using logic operators, i.e.:

• AND-gate – the resulting event only occurs when all input events occur simultaneously
• OR-gate – the resulting event occurs when one or more of the input events occurs
• Exclusive OR-gate – the resulting event occurs when one and only one of the input events occurs
• Inhibit gate – the resulting event only occurs when the input condition is not met.
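As a minimal sketch of how these gates combine events, the Python below evaluates a small fault tree for a given set of basic-event states. The tree shape, the event names (`mains_failure`, `ups_failure`, `operator_error`) and the `occurs` helper are illustrative assumptions, not taken from Figure 4.19. Only the AND, OR and Exclusive OR gates are modelled; the inhibit gate is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Basic:
    """A basic event: a terminal point of the fault tree."""
    name: str


@dataclass
class Gate:
    """A resulting event: a logic gate over one or more input events."""
    kind: str  # "AND", "OR" or "XOR"
    inputs: List["Node"]


Node = Union[Basic, Gate]


def occurs(node: Node, state: Dict[str, bool]) -> bool:
    """Return True if the event at `node` occurs for the given basic-event states."""
    if isinstance(node, Basic):
        return state[node.name]
    results = [occurs(child, state) for child in node.inputs]
    if node.kind == "AND":
        return all(results)        # all input events occur
    if node.kind == "OR":
        return any(results)        # one or more input events occur
    if node.kind == "XOR":
        return sum(results) == 1   # one and only one input event occurs
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical tree: the service fails if power is lost (mains AND UPS both
# fail) OR an operator error occurs.
power_lost = Gate("AND", [Basic("mains_failure"), Basic("ups_failure")])
service_failure = Gate("OR", [power_lost, Basic("operator_error")])

state = {"mains_failure": True, "ups_failure": False, "operator_error": False}
print(occurs(service_failure, state))  # the UPS holds, so this prints False
```

Representing gates as data rather than hard-coded conditions means a design option, such as adding a second power feed, can be tried by editing the tree and re-evaluating it.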

This is the basic FTA technique. This technique can also be refined, but complex FTA and the mathematical evaluation of fault trees are beyond the scope of this publication.

# Modelling

To assess whether new components within a design can meet the stated requirements, the testing regime must verify that the expected availability can be delivered. Simulation, modelling or load testing tools that generate the expected user demand for the new IT service should be seriously considered, to ensure components continue to operate under anticipated volume and stress conditions.

Modelling tools are also required to forecast availability and to assess the impact of changes to the IT infrastructure. Inputs to the modelling process include descriptive data on component reliability, maintainability and serviceability. A spreadsheet package is usually sufficient to perform the calculations. If more detailed and accurate data is required, a more complex modelling tool may need to be developed or acquired. Because few availability modelling tools are readily available in the marketplace, such a tool may have to be developed and maintained ‘in-house’, but this is a very expensive and time-consuming activity that should only be considered where the investment can be justified. Unless there is a clearly perceived benefit that outweighs the development and ongoing maintenance costs, existing tools and spreadsheets should be sufficient. However, some System Management tools do provide modelling capability and can provide useful information on trending and forecasting availability needs.
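The kind of calculation a spreadsheet would perform can be sketched in a few lines of Python. The sketch assumes that reliability is expressed as mean time between failures (MTBF), maintainability as mean time to restore (MTTR), and that components fail independently; the function names and all figures are made up for illustration.

```python
from math import prod
from typing import Iterable


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of one component: uptime / (uptime + downtime)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def serial(availabilities: Iterable[float]) -> float:
    """Availability of a chain in which every component must be up."""
    return prod(availabilities)


def parallel(availabilities: Iterable[float]) -> float:
    """Availability of redundant components where one working unit is enough."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical component data (hours).
server = availability(mtbf_hours=990, mttr_hours=10)
network = availability(mtbf_hours=1990, mttr_hours=10)

# Current design: one server behind one network link.
current = serial([server, network])

# Design option: add a second, identical server as a redundant pair.
proposed = serial([parallel([server, server]), network])

print(f"current design:  {current:.4%}")
print(f"proposed design: {proposed:.4%}")
```

Comparing `current` with `proposed` gives a first estimate of the availability gained from a change. A dedicated modelling tool becomes worth considering only when assumptions such as independent failures no longer hold.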