Determining Reliability for Complex Systems
Part 1 - Analytical Techniques
A complex system is one that cannot be broken down into groups of series and parallel components. In many cases it is not easy to recognize which components are in series and which are in parallel in a complex system. The following network is a good example of such a complex system:
As the figure illustrates, this system cannot be broken down into a group of series and parallel systems. This complicates the problem of determining the system's reliability. If the system can be broken down into series/parallel configurations, it is a relatively simple matter to determine the mathematical or analytical formula that describes the system's reliability. However, for a complex system, determination of the system reliability becomes more involved.
In this article, we will look at some of the techniques that can be employed to determine the mathematical expression that gives the reliability of the system in terms of the reliabilities of its components. It is assumed that the reliability values for the components have been determined using standard (or accelerated) life data analysis techniques, so that the reliability function for each component is known. With this component-level reliability information available, it then becomes necessary to determine how these component reliability values are combined to obtain the reliability function for the overall system.
There are a number of advantages to using analytical techniques to determine system reliability, as opposed to the more common method of using simulation. The primary advantage of the analytical solution is that a mathematical expression that describes the reliability of the system is obtained. Once the system's reliability function has been determined, other calculations on the system can be performed. Such calculations include:
- Determination of the system's pdf.
- Determination of the warranty period.
- Determination of the system's failure rate.
- Determination of the system's MTTF.
In addition, optimization and reliability allocation techniques can be utilized to aid engineers in their design improvement efforts. Another advantage of using analytical techniques is the ability to perform static calculations and analyze systems with a mixture of static and time-dependent components. Finally, the reliability importance of components over time can be calculated with this methodology.
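As a rough illustration of how these quantities follow from a system reliability function, the sketch below assumes a hypothetical system of two independent units in parallel with exponential lives. The failure rates, function names, and numerical settings are illustrative assumptions, not values from this article. It approximates the pdf as -dR/dt, the failure rate as f(t)/R(t), the MTTF as the integral of R(t), and the warranty period as the time at which R(t) falls to a target reliability.

```python
import math

# Hypothetical failure rates (per hour) for two independent exponential units.
RATE_1 = 0.001
RATE_2 = 0.002


def system_reliability(t):
    """R_s(t) for two units in parallel: 1 - (1 - R1)(1 - R2)."""
    r1 = math.exp(-RATE_1 * t)
    r2 = math.exp(-RATE_2 * t)
    return 1.0 - (1.0 - r1) * (1.0 - r2)


def system_pdf(t, h=1e-3):
    """f(t) = -dR/dt, approximated with a central difference."""
    return -(system_reliability(t + h) - system_reliability(t - h)) / (2.0 * h)


def system_failure_rate(t):
    """Failure rate: f(t) / R(t)."""
    return system_pdf(t) / system_reliability(t)


def system_mttf(upper=20000.0, steps=200000):
    """MTTF as the integral of R(t), truncated at `upper` (trapezoid rule)."""
    dt = upper / steps
    total = 0.5 * (system_reliability(0.0) + system_reliability(upper))
    for i in range(1, steps):
        total += system_reliability(i * dt)
    return total * dt


def warranty_time(target, lo=0.0, hi=20000.0):
    """Time at which R(t) drops to `target`, by bisection (R is decreasing)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if system_reliability(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `warranty_time(0.95)` returns the time by which 5% of such systems would be expected to fail, and `system_mttf()` can be compared with the closed-form value 1/RATE_1 + 1/RATE_2 - 1/(RATE_1 + RATE_2) for this particular configuration.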
Several methods exist for analytically obtaining the reliability of a complex system:
- Decomposition method
- Event space method
- Path-tracing method
We will examine each of these methods, illustrating the techniques involved with simple system examples.
Decomposition Method
The decomposition method is an application of the law of total probability. It involves choosing a "key" component and then calculating the reliability of the system twice: once as if the key component failed (R=0) and once as if the key component succeeded (R=1). These two probabilities are then combined to obtain the reliability of the system, since at any given time the key component is either failed or operating. Using probability theory, with s denoting system success and K denoting success of the key component, the equation is:
$$P(s) = P(s \cap K) + P(s \cap \bar{K})$$
Assuming that the components are statistically independent, this reduces to:
$$P(s) = P(s \mid K)\,P(K) + P(s \mid \bar{K})\,P(\bar{K})$$
Consider three units in series.
- A is the event of Unit 1 success
- B is the event of Unit 2 success
- C is the event of Unit 3 success
- s is the event of system success
First select a "key" component for the system. Selecting Unit 1, the probability of success of the system is:
$$P(s) = P(s \mid A)\,P(A) + P(s \mid \bar{A})\,P(\bar{A})$$
If Unit 1 survives, then:
$$P(s \mid A) = P(BC)$$
That is, if Unit 1 is operating, the probability of the success of the system is the probability of Units 2 and 3 succeeding.
If Unit 1 fails, then:
$$P(s \mid \bar{A}) = 0$$
That is, if Unit 1 is not operating, the system has failed since a series system requires all of the components to be operating for the system to operate.
Thus the reliability of the system is:
$$P(s) = P(BC)\,P(A) + 0 \cdot P(\bar{A}) = P(A)\,P(B)\,P(C)$$
Another Illustration of the Decomposition Method
Consider the following system, in which Units 1 and 2 are in series and that pair is in parallel with Unit 3:
- A is the event of Unit 1 success
- B is the event of Unit 2 success
- C is the event of Unit 3 success
- s is the event of system success
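Selecting Unit 3 as the key, the system reliability is:
$$P(s) = P(s \mid C)\,P(C) + P(s \mid \bar{C})\,P(\bar{C})$$
If Unit 3 survives, then:
$$P(s \mid C) = 1$$
That is, since Unit 3 represents half of the parallel section of the system, as long as it is operating, the entire system operates.
If Unit 3 fails, then the system is reduced to Units 1 and 2 in series, so that:
$$P(s \mid \bar{C}) = P(AB)$$
The reliability of the system is given by:
$$P(s) = P(C) + P(AB)\,P(\bar{C})$$
or, writing $R_1 = P(A)$, $R_2 = P(B)$, and $R_3 = P(C)$:
$$R_s = R_3 + R_1 R_2 (1 - R_3) = R_1 R_2 + R_3 - R_1 R_2 R_3$$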
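The two decomposition examples above can be checked numerically with a minimal sketch such as the following. The function names are hypothetical and the units are assumed to be statistically independent. Each system supplies its two conditional probabilities (key component up, key component down), and `decompose` combines them using the law of total probability.

```python
def decompose(p_success_given_key_up, p_success_given_key_down, r_key):
    """Law of total probability, conditioning on the key component."""
    return (p_success_given_key_up * r_key
            + p_success_given_key_down * (1.0 - r_key))


def series_of_three(r1, r2, r3):
    """Three units in series, with Unit 1 as the key component."""
    # Unit 1 up: Units 2 and 3 must both work. Unit 1 down: the system is down.
    return decompose(r2 * r3, 0.0, r1)


def series_pair_parallel_with_third(r1, r2, r3):
    """Units 1 and 2 in series, in parallel with Unit 3; Unit 3 is the key."""
    # Unit 3 up: the system works. Unit 3 down: Units 1 and 2 must both work.
    return decompose(1.0, r1 * r2, r3)
```

Calling `series_of_three(0.9, 0.8, 0.7)` reproduces the product of the three reliabilities, and `series_pair_parallel_with_third(r1, r2, r3)` agrees with R1·R2 + R3 − R1·R2·R3. For a larger system, the conditional probabilities passed to `decompose` would themselves come from decomposing the two reduced systems.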
Event Space Method
The event space method is an application of the mutually exclusive events axiom. All mutually exclusive events are determined, and those which result in system success are considered. The reliability of the system is simply the probability of the union of all mutually exclusive events that yield a system success. Similarly, the unreliability is the probability of the union of all mutually exclusive events that yield a system failure. This is illustrated in the following example.
Consider the same system as in the previous example (Units 1 and 2 in series, with that pair in parallel with Unit 3), with reliabilities R1, R2, and R3 for a given time:
- A is the event of Unit 1 success
- B is the event of Unit 2 success
- C is the event of Unit 3 success
The mutually exclusive system events are:
- $X_1 = ABC$: all units succeed
- $X_2 = \bar{A}BC$: only Unit 1 fails
- $X_3 = A\bar{B}C$: only Unit 2 fails
- $X_4 = AB\bar{C}$: only Unit 3 fails
- $X_5 = \bar{A}\bar{B}C$: Units 1 and 2 fail
- $X_6 = \bar{A}B\bar{C}$: Units 1 and 3 fail
- $X_7 = A\bar{B}\bar{C}$: Units 2 and 3 fail
- $X_8 = \bar{A}\bar{B}\bar{C}$: all units fail
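System events X6, X7, and X8 result in system failure. Thus the probability of failure of the system is:
$$P_f = P(X_6 \cup X_7 \cup X_8)$$
Since events X6, X7, and X8 are mutually exclusive, then:
$$P_f = P(X_6) + P(X_7) + P(X_8)$$
And, for independent units:
$$P(X_6) = P(\bar{A}B\bar{C}) = (1 - R_1)\,R_2\,(1 - R_3)$$
$$P(X_7) = P(A\bar{B}\bar{C}) = R_1\,(1 - R_2)\,(1 - R_3)$$
$$P(X_8) = P(\bar{A}\bar{B}\bar{C}) = (1 - R_1)(1 - R_2)(1 - R_3)$$
Combining terms yields:
$$P_f = (1 - R_3)\left[(1 - R_1)R_2 + R_1(1 - R_2) + (1 - R_1)(1 - R_2)\right] = (1 - R_3)(1 - R_1 R_2)$$
Since:
$$R_s = 1 - P_f$$
then:
$$R_s = 1 - (1 - R_3)(1 - R_1 R_2) = R_1 R_2 + R_3 - R_1 R_2 R_3$$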
This is of course the same result as the one obtained previously using the decomposition method.
If R1 = 99.5%, R2 = 98.7%, and R3 = 97.3%, then:
$$R_s = (0.995)(0.987) + 0.973 - (0.995)(0.987)(0.973) = 0.982065 + 0.973 - 0.955549 = 0.999516$$
or Rs = 99.95%.
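The enumeration above can be reproduced with a short sketch like the one below. It assumes independent units and the structure described in this example (Units 1 and 2 in series, in parallel with Unit 3). The function names are hypothetical. Every combination of unit states is one mutually exclusive event, and the probabilities of the events in which the system works are summed.

```python
from itertools import product


def system_works(unit1_up, unit2_up, unit3_up):
    """Assumed structure: Units 1 and 2 in series, in parallel with Unit 3."""
    return (unit1_up and unit2_up) or unit3_up


def event_space_reliability(reliabilities, works=system_works):
    """Sum the probabilities of all mutually exclusive states where `works` is true."""
    total = 0.0
    for states in product([True, False], repeat=len(reliabilities)):
        # Probability of this exact combination of up/down units.
        probability = 1.0
        for up, r in zip(states, reliabilities):
            probability *= r if up else (1.0 - r)
        if works(*states):
            total += probability
    return total
```

With the values from the example, `event_space_reliability([0.995, 0.987, 0.973])` evaluates to roughly 0.9995. The number of events doubles with every added unit, which is why this method is practical only for small systems.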
Path-Tracing Method
With this method, every path from a starting point to an ending point is considered. System success requires at least one available path from one end of the Reliability Block Diagram (RBD) to the other; as long as one such path exists, the system has not failed. One could consider the RBD to be a plumbing schematic. If a component in the system fails, the "water" can no longer flow through it. As long as there is at least one path for the "water" to flow from the start to the end of the system, the system is successful. This method involves identifying all of the paths the "water" could take and calculating the reliability of each path based on the components that lie along it. The reliability of the system is simply the probability of the union of these paths. In order to maintain consistency of the analysis, starting and ending blocks for the system must be defined.
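Consider the following system, in which Unit A is followed by Units B and C in parallel, which are followed in turn by Unit D:
The successful paths for this system are X1 = ABD and X2 = ACD. The reliability of the system is simply the probability of the union of these paths:
$$R_s = P(X_1 \cup X_2) = P(ABD) + P(ACD) - P(ABCD)$$
Thus, for independent units:
$$R_s = R_A R_B R_D + R_A R_C R_D - R_A R_B R_C R_D$$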
In the following system, a starting and an ending node must be defined.
Assume the following starting and ending nodes:
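The paths for this system are X1 = 1,2 and X2 = 3. The probability of success for the system is given by:
$$R_s = P(X_1 \cup X_2) = P(X_1) + P(X_2) - P(X_1 \cap X_2)$$
or:
$$R_s = R_1 R_2 + R_3 - R_1 R_2 R_3$$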
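For more than two paths, the probability of the union can be expanded by inclusion-exclusion. The sketch below is one way to do that, assuming independent components and a list of paths supplied by hand. The function name and data layout are hypothetical, and this is not a description of how any particular software package performs the calculation.

```python
from itertools import combinations


def path_tracing_reliability(paths, reliabilities):
    """P(union of paths) by inclusion-exclusion, for independent components."""
    total = 0.0
    for k in range(1, len(paths) + 1):
        # Odd-sized intersections are added, even-sized ones are subtracted.
        sign = 1.0 if k % 2 == 1 else -1.0
        for subset in combinations(paths, k):
            # All paths in the subset are available only if every component
            # appearing on any of them works, so each component counts once.
            components = set().union(*subset)
            term = 1.0
            for name in components:
                term *= reliabilities[name]
            total += sign * term
    return total
```

For the two systems in this section, the calls would be `path_tracing_reliability([{"A", "B", "D"}, {"A", "C", "D"}], r)` with `r` mapping each unit name to its reliability, and `path_tracing_reliability([{1, 2}, {3}], {1: 0.995, 2: 0.987, 3: 0.973})`. The number of terms grows exponentially with the number of paths, so this direct expansion suits only small diagrams.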
A modified version of this method is used by ReliaSoft BlockSim to calculate the analytical solution to system reliability diagrams.
The examples used to illustrate these techniques used fairly simple systems to simplify the mathematics involved. The same techniques can be used to determine the reliability of more complex systems. It should be fairly obvious that the expressions for the system reliability will get larger as the number of components in the system increases. The way the components are arranged reliability-wise will also have an effect on the size of the final system reliability term. In fact, even moderately-sized complex systems can prove to be too unwieldy to solve by hand. Computer programs can be employed to solve these large complex systems, but to the best of our knowledge, BlockSim is the only software package available that is capable of this type of analysis.
While these analytical techniques for determining system reliability can yield results not available with other techniques, there are also some drawbacks. The biggest disadvantage of the analytical method is that formulations can become very complicated. The more complicated a system is, the larger and more difficult it will be to analytically formulate an expression for the system's reliability. For particularly detailed systems, this process can be quite time-consuming, even with the use of computers. Furthermore, when the maintainability of the system or some of its components must be taken into consideration, an analytical solution may be impossible to compute. In these situations, the use of simulation methods may be more advantageous than attempting to develop a solution analytically. We will take a look at these simulation methods in the next article in this series.