RCM Background

Reliability Centered Maintenance (RCM) analysis provides a structured framework for analyzing the functions and potential failures of a physical asset (such as an airplane or a manufacturing production line) in order to develop a scheduled maintenance plan that provides an acceptable level of operability, with an acceptable level of risk, in an efficient and cost-effective manner.

According to the SAE JA1011 standard, which describes the minimum criteria that a process must comply with to be called "RCM," a reliability centered maintenance process answers the following seven questions:

  • What are the functions and associated desired standards of performance of the asset in its present operating context (functions)?
  • In what ways can it fail to fulfill its functions (functional failures)?
  • What causes each functional failure (failure modes)?
  • What happens when each failure occurs (failure effects)?
  • In what way does each failure matter (failure consequences)?
  • What should be done to predict or prevent each failure (proactive tasks and task intervals)?
  • What should be done if a suitable proactive task cannot be found (default actions)?

Although there is a great deal of variation in the application of RCM, this topic provides a brief general overview of common RCM techniques and requirements.

Prepare for the Analysis

As with almost any project, some preliminary work will be required to prepare for the RCM analysis. Some important up-front activities include:

  • Assemble Analysis Team: One of the first steps in performing an RCM analysis is to assemble a cross-functional team of knowledgeable individuals to perform the analysis. The team should be large enough to represent the relevant viewpoints and knowledge, but no larger. An oversized team makes it difficult to have productive discussions during meetings and wastes an extremely valuable resource: the time and patience of your organization’s subject matter experts.

The composition of the team at any particular meeting may vary depending on the focus of the discussion. Team members should be familiar with the RCM analysis process as it is practiced by the organization. In addition, a skilled facilitator can help to make sure that meeting time is used effectively and that the analysis is performed correctly.

  • Establish Ground Rules and Assumptions: Identifying and documenting the ground rules and assumptions that will be followed can facilitate the analysis process by making sure that all members of the team understand and accept the conditions of the analysis. Issues to be discussed when preparing for the analysis may include:
    • Identify the scope of the analysis project and address other project management issues, such as schedule, budget, meeting procedures, etc.
    • Define the expected operational environment for the equipment and any other assumptions that may affect the analysis. For example, if the equipment is expected to be operated within a specific temperature/humidity range, will potential failures that occur beyond these ranges be considered in the analysis?
    • Agree on the definitions of failure that will be followed during the analysis.
  • Gather and Review Relevant Documentation: The analysis team may identify existing references that will provide valuable input to the RCM analysis activity, such as operation manuals, previous maintenance plans, prior failure reports, etc.

Tip: You can use the analysis plan feature for a variety of project planning tasks, including documenting the members of the analysis team and the ground rules and assumptions.

Select the Equipment to Be Analyzed

Because RCM analysis requires an investment of time and resources, the organization may wish to focus analysis resources on selected pieces of equipment based on safety, legal, economic and other considerations. Two methods of equipment selection that are commonly employed are directly supported within the software.

The Selection Questions method consists of a series of yes/no questions that are designed to identify whether analysis is indicated for a particular piece of equipment. For example, the MSG-3 guideline, which is used to develop initial scheduled maintenance plans for the aircraft industry, proposes four questions (from the ATA MSG-3 Operator/Manufacturer Scheduled Maintenance Development guidelines, Revision 2003.1, Air Transport Association of America, 2003):

  • Could failure be undetectable or not likely to be detected by the operating crew during normal duties?
  • Could failure affect safety (on ground or in flight), including safety/emergency systems or equipment?
  • Could failure have significant operational impact?
  • Could failure have significant economic impact?

If the analysts answer "yes" for at least one of these questions, then detailed analysis is indicated for the equipment.
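The "at least one yes" rule can be sketched in a few lines of Python. This is only an illustration of the decision rule described above, not the software's implementation. The short keys (hidden, safety, operational, economic) and the function name are hypothetical labels introduced here.

```python
# The four MSG-3 selection questions quoted above, keyed by hypothetical short names.
SELECTION_QUESTIONS = {
    "hidden": "Could failure be undetectable or not likely to be detected "
              "by the operating crew during normal duties?",
    "safety": "Could failure affect safety (on ground or in flight), "
              "including safety/emergency systems or equipment?",
    "operational": "Could failure have significant operational impact?",
    "economic": "Could failure have significant economic impact?",
}


def analysis_indicated(answers: dict[str, bool]) -> bool:
    """Return True if at least one selection question was answered "yes"."""
    missing = SELECTION_QUESTIONS.keys() - answers.keys()
    if missing:
        # Refuse to decide until every question has been answered.
        raise ValueError(f"unanswered questions: {sorted(missing)}")
    return any(answers[key] for key in SELECTION_QUESTIONS)


# Example: only the operational question is answered "yes".
answers = {"hidden": False, "safety": False, "operational": True, "economic": False}
print(analysis_indicated(answers))
```

Requiring an answer to every question keeps an incomplete review from being silently treated as "no analysis needed."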

The Criticality Factors method consists of a set of rating scales designed to evaluate the criticality of the equipment in terms of relevant factors, such as safety, maintenance, operations and environmental impact. Each factor is rated according to a predefined scale where the higher ratings indicate higher criticality. The equipment’s overall criticality value can then be used as a ranking and/or as a threshold. For example, the analysis team may choose to start on the equipment with the highest criticality and proceed down the list as resources allow. Alternatively, the team may agree to perform detailed analysis for equipment with a criticality value higher than a specified threshold or to perform detailed analysis for all equipment with a criticality value in the top 20%.
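As a rough sketch of how a criticality value can serve as a ranking or as a threshold, the Python below rates five hypothetical pieces of equipment. The 1 to 5 rating scale, the equipment names and the use of a plain sum as the overall criticality value are all assumptions made for illustration. An organization may weight or combine the factors differently.

```python
# Hypothetical ratings on an assumed 1-5 scale (higher = more critical).
EQUIPMENT = {
    "Pump A":       {"safety": 4, "maintenance": 2, "operations": 3, "environmental": 1},
    "Conveyor B":   {"safety": 1, "maintenance": 3, "operations": 2, "environmental": 1},
    "Compressor C": {"safety": 3, "maintenance": 4, "operations": 4, "environmental": 2},
    "Fan D":        {"safety": 1, "maintenance": 1, "operations": 2, "environmental": 1},
    "Valve E":      {"safety": 2, "maintenance": 2, "operations": 2, "environmental": 2},
}


def criticality(ratings: dict[str, int]) -> int:
    """Overall criticality as a plain sum of the factor ratings (assumed rule)."""
    return sum(ratings.values())


def ranked(equipment: dict[str, dict[str, int]]) -> list[tuple[str, int]]:
    """Equipment sorted from highest to lowest criticality."""
    scores = {name: criticality(ratings) for name, ratings in equipment.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def above_threshold(equipment: dict[str, dict[str, int]], threshold: int) -> list[str]:
    """Equipment whose criticality is strictly higher than the threshold."""
    return [name for name, score in ranked(equipment) if score > threshold]


def top_percent(equipment: dict[str, dict[str, int]], percent: int = 20) -> list[str]:
    """Equipment in the top `percent` of the ranking (at least one item)."""
    ordered = ranked(equipment)
    count = max(1, len(ordered) * percent // 100)
    return [name for name, _ in ordered[:count]]


print(ranked(EQUIPMENT))              # ranking, highest criticality first
print(above_threshold(EQUIPMENT, 8))  # threshold approach
print(top_percent(EQUIPMENT, 20))     # top 20% approach
```

The three functions correspond to the three uses described above: working down a ranked list, applying a fixed threshold, and selecting a top percentage.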

Other methods may also be applied, such as Pareto analysis of equipment based on downtime, unreliability or another relevant metric. Whichever method (or combination of methods) is employed, the goal is to focus RCM analysis resources on the equipment that will provide the maximum benefit to the organization in terms of safety, legal, operational, economic and related priorities.
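A Pareto analysis of downtime can be sketched as follows. The downtime figures are invented for illustration, and the 80% cutoff is a common convention assumed here rather than a requirement of any RCM standard.

```python
def pareto_selection(downtime_hours: dict[str, float], cutoff: float = 0.8) -> list[str]:
    """Return the largest contributors that together reach `cutoff` of total downtime."""
    total = sum(downtime_hours.values())
    if total <= 0:
        raise ValueError("total downtime must be positive")
    selected = []
    cumulative = 0.0
    # Walk through the equipment from the largest contributor to the smallest.
    for name, hours in sorted(downtime_hours.items(), key=lambda kv: kv[1], reverse=True):
        if cumulative / total >= cutoff:
            break
        selected.append(name)
        cumulative += hours
    return selected


# Hypothetical annual downtime per piece of equipment, in hours.
downtime = {
    "Compressor C": 120.0,
    "Pump A": 60.0,
    "Valve E": 10.0,
    "Conveyor B": 6.0,
    "Fan D": 4.0,
}
print(pareto_selection(downtime))
```

The same function works for any other metric (for example, unreliability or repair cost) as long as larger values indicate a stronger case for analysis.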

Tip: You can use the Risk Discovery (Equipment Selection) feature to apply the Selection Questions method or the Criticality Factors method.

Identify the Functions

One of the primary tenets of the RCM approach is that maintenance activities should focus on preserving equipment functionality. It follows that the first step in analyzing a particular piece of equipment is to identify the functions it is intended to perform.

Many RCM references recommend including specific performance requirements in function descriptions, which will help to specifically identify functional failures. For example, "To provide hydraulic pressure of 3000 psi +/- 200 psi."

Identify the Functional Failures

Functional failures describe ways that the equipment may fail to perform its intended functions. This may include failure to perform a function, poor performance of a function, over-performance of a function, performing an unintended function, etc. As mentioned above, the performance limits that have been identified for the function may provide a guide to the functional failure description. For example, if the function is "To provide hydraulic pressure of 3000 psi +/- 200 psi" then the functional failures might include: "Provides hydraulic pressure of more than 3200 psi," "Provides hydraulic pressure of less than 2800 psi," etc.
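The link between a function's performance limits and its functional failures can be sketched as shown below, using the hydraulic pressure example from the text. The class and function names are hypothetical. The sketch assumes that readings exactly at a limit still fulfill the function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceStandard:
    """A function's required performance, e.g. 3000 psi +/- 200 psi."""
    nominal: float
    tolerance: float
    unit: str

    @property
    def lower(self) -> float:
        return self.nominal - self.tolerance

    @property
    def upper(self) -> float:
        return self.nominal + self.tolerance


def classify_pressure(standard: PerformanceStandard, measured: float) -> str:
    """Map a measured value to a functional failure description, if any."""
    if measured > standard.upper:
        return f"Provides hydraulic pressure of more than {standard.upper:g} {standard.unit}"
    if measured < standard.lower:
        return f"Provides hydraulic pressure of less than {standard.lower:g} {standard.unit}"
    return "Function fulfilled"


hydraulic = PerformanceStandard(nominal=3000, tolerance=200, unit="psi")
for reading in (3300, 2700, 3000):
    print(reading, "->", classify_pressure(hydraulic, reading))
```

Stating the limits in the function description is what makes the two failure descriptions mechanical to derive.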

Identify and Categorize the Effects of Failure

Identifying and evaluating the effects of failure will help the team to prioritize and choose the appropriate maintenance strategy to address a potential failure. Many RCM references contain logic diagrams that can be used to categorize the effects of failure. The following logic diagram is provided as an example in SAE JA1012, “A Guide to the Reliability-Centered Maintenance (RCM) Standard,” issued in January 2002.

Note: Only the Failure Effect Categorization portion of the logic is presented here. For the Task Selection portion, see Select the Appropriate Maintenance Tasks.

This diagram has 5 questions with 6 failure effect categories. Other published logic diagrams may consist of 3 or 4 questions with 4 or 5 failure effect categories.

Tip: You can use any of the predefined logic diagrams that are shipped with the software or define your own custom logic diagram with 4, 5 or 6 categories.

Identify the Causes of Failure (Failure Modes)

The cause of failure (sometimes also called failure mode) represents the specific reason for the functional failure at the actionable level (i.e., the level at which it will be possible to apply a maintenance strategy to address the potential failure). This determination is based on engineering judgment and relies on the team’s experience and skill with the RCM analysis process.

The SAE JA1012 guideline presents a useful demonstration of the many levels of detail that can be used to describe failure modes. For example (from SAE JA1012 "A Guide to the Reliability-Centered Maintenance (RCM) Standard," issued in January 2002):

  • Pump set fails
    • Pump fails
      • Impeller fails
        • Impeller comes adrift
          • Mounting nut undone
            • Nut not tightened correctly
              • Assembly error

The recommendation states that "failure modes should be described in enough detail for it to be possible to select an appropriate failure management policy, but not in so much detail that excessive amounts of time are wasted on the analysis process itself."

Select the Appropriate Maintenance Tasks

Once you have identified the ways that the equipment might fail to perform its intended functions and evaluated the consequences of these failures, the next step is to define the appropriate maintenance strategy for the equipment. Although there is variation among practitioners regarding the terminology used to describe the available maintenance techniques, the typical options that the RCM analysis team may recommend include:

  • Run-to-Failure – fix the equipment when it fails but do not perform any scheduled maintenance actions.
  • Scheduled Inspections
    • Failure Finding Inspections – inspect the equipment on a scheduled basis to discover hidden failures. If the equipment is found to be failed, initiate corrective maintenance.
    • On-Condition Inspections – inspect the equipment on a scheduled or ongoing basis (condition monitoring) to discover conditions that indicate a failure is about to occur. If the equipment is found to be about to fail, initiate preventive maintenance.
  • Scheduled Preventive Maintenance
    • Service – perform lubrication or other minor servicing actions on a scheduled basis. These actions may renew the equipment to some extent but they are not expected to have the same effect as a full repair or replacement.
    • Repair – repair or overhaul the equipment on a scheduled basis.
    • Replace – replace the equipment on a scheduled basis.
  • Design Change – re-design the equipment, select different equipment or make some other one-time change to improve the reliability/availability of the equipment.

The RCM analysis team’s decision of which strategy (or strategies) to employ for each potential failure may be based on judgment/experience, a predefined logic diagram (often connected to the failure effect categorization), cost comparisons, or some combination of factors. To continue with the SAE JA1012 example introduced in Identify and Categorize the Effects of Failure, the following picture shows the questions to be considered for failures with effects that have been categorized as "Hidden Operational." The full logic diagram also includes a separate set of questions for each of the other five categories.

Tip: You can use one of the predefined sets of task selection questions that are shipped with the software or define your own custom questions and task types.

Another approach is to compare normalized cost values and select the maintenance task that provides the desired level of availability for the minimum cost. For example, the team may recommend a run-to-failure maintenance strategy if a) the issue does not have an impact on safety, b) the run-to-failure approach provides an acceptable level of equipment availability (uptime) and c) the cost per unit of uptime is less than it would be with a scheduled repair/replacement.
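The three conditions in this example can be sketched as a simple comparison. All names and figures below are hypothetical. The sketch assumes that "cost per uptime" means total cost divided by hours of uptime over the same evaluation period for both strategies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrategyEstimate:
    """Estimated results of one maintenance strategy over an evaluation period."""
    name: str
    total_cost: float    # all maintenance and failure costs in the period
    uptime_hours: float  # hours the equipment was available
    period_hours: float  # length of the evaluation period

    @property
    def availability(self) -> float:
        return self.uptime_hours / self.period_hours

    @property
    def cost_per_uptime(self) -> float:
        return self.total_cost / self.uptime_hours


def recommend_run_to_failure(run_to_failure: StrategyEstimate,
                             scheduled: StrategyEstimate,
                             affects_safety: bool,
                             required_availability: float) -> bool:
    """Apply conditions a), b) and c) from the text; all three must hold."""
    return (
        not affects_safety                                               # a)
        and run_to_failure.availability >= required_availability        # b)
        and run_to_failure.cost_per_uptime < scheduled.cost_per_uptime  # c)
    )


# Hypothetical one-year (8760 hour) estimates for the two strategies.
rtf = StrategyEstimate("Run-to-Failure", total_cost=12000, uptime_hours=8500, period_hours=8760)
pm = StrategyEstimate("Scheduled Replace", total_cost=15000, uptime_hours=8650, period_hours=8760)
print(recommend_run_to_failure(rtf, pm, affects_safety=False, required_availability=0.95))
```

Because the safety condition is checked first, a failure with safety impact is never recommended for run-to-failure on cost grounds alone.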

Tip: You can simulate the operation of the equipment for a specified period of time in order to make estimates about the cost and availability that you can expect from potential maintenance strategies.
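To give a sense of what such a simulation involves, the following is a minimal Monte Carlo sketch and not the method used by the software. It assumes Weibull-distributed times to failure, "as good as new" restoration after every repair or replacement, and fixed costs and downtimes. The function name and every parameter value are hypothetical.

```python
import random


def simulate(horizon, scale, shape, corrective_cost, corrective_downtime,
             preventive_cost=0.0, preventive_downtime=0.0,
             replace_age=None, runs=2000, seed=0):
    """Estimate (mean cost, availability) over `horizon` hours.

    replace_age=None models run-to-failure; otherwise the item is replaced
    preventively whenever it reaches that age without failing.
    """
    rng = random.Random(seed)  # seeded so that results are repeatable
    total_cost = 0.0
    total_uptime = 0.0
    for _ in range(runs):
        clock = 0.0
        while clock < horizon:
            life = rng.weibullvariate(scale, shape)  # time to failure of a new item
            failed = replace_age is None or life < replace_age
            operating = life if failed else replace_age
            if clock + operating >= horizon:
                # The item is still running when the simulated period ends.
                total_uptime += horizon - clock
                break
            total_uptime += operating
            clock += operating
            if failed:
                total_cost += corrective_cost
                clock += corrective_downtime
            else:
                total_cost += preventive_cost
                clock += preventive_downtime
    return total_cost / runs, total_uptime / (runs * horizon)


# Hypothetical comparison: run-to-failure versus replacement every 150 hours.
common = dict(horizon=1000.0, scale=200.0, shape=2.0,
              corrective_cost=500.0, corrective_downtime=10.0)
print("run-to-failure:", simulate(**common))
print("scheduled replace:", simulate(**common, preventive_cost=100.0,
                                     preventive_downtime=2.0, replace_age=150.0))
```

The two results can then be compared on cost and availability, as in the cost comparison approach described above. A real study would also need failure data to justify the assumed distribution and its parameters.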

RCM References

  • Reliability-Centered Maintenance by F. Stanley Nowlan and Howard F. Heap of United Airlines, issued in December 1978.
  • ATA MSG-3 "Operator/Manufacturer Scheduled Maintenance Development," updated in March 2003.
  • NAVAIR 00-25-403 "Guidelines for the Naval Aviation Reliability-Centered Maintenance Process," issued in February 2001.
  • SAE JA1011 "Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes," issued in August 1999.
  • SAE JA1012 "A Guide to the Reliability-Centered Maintenance (RCM) Standard," issued in January 2002.
  • Reliability-Centered Maintenance (2nd Edition) by John Moubray, published in 1997.
  • Reliability Centered Maintenance: Gateway to World Class Maintenance by Anthony M. Smith, published in 1993.
  • "Practical Application of Reliability-Centered Maintenance" by the Reliability Analysis Center, issued in 2003.
  • MIL-STD-2173(AS) "Reliability-Centered Maintenance Requirements for Naval Aircraft, Weapons Systems and Support Equipment," issued in January 1986.
  • "NASA Reliability Centered Maintenance Guide for Facilities and Collateral Equipment," issued in February 2000.