Analysis to determine the Root Causes of Failures is an important procedure that should be used to improve your maintenance and operations organizations and processes. Using your Computerized Maintenance Management System (CMMS) or Enterprise Asset Management System (EAMS) to capture useful data in support of this analysis is far more effective when you follow some basic principles.
We have been working with customers for over 35 years to implement CMMS / EAMS solutions worldwide. During this time, we have seen many different approaches to integrating these systems into existing or new processes and procedures in the surrounding business environment. This series of “Success Topics” is our attempt to share the recurring topics that we feel provide the best return on the time and effort invested in addressing them.
Root Cause Failure Analysis is an “engineering” procedure used to determine what actions should be taken to improve equipment reliability or reduce maintenance costs on a case-by-case basis. With this approach, problem equipment or processes are examined in depth to determine why they have unexpectedly high rates of failure.
Best Approaches to Root Cause Failure Analysis
Typically, a team is designated to perform these analyses on a periodic basis. Often the team is composed of Operations, Maintenance and Engineering representatives. The process often looks like this:
- Bad Actors (equipment with higher rates of failure than expected) are identified,
- Work History is assembled or maintenance staff interviewed to determine actions taken,
- One or more Root Causes are proposed and a plan is made to investigate, if required,
- A decision is made on the Root Cause and actions to be taken are decided upon to correct the situation.
Correcting the situation will result in one or more actions, depending upon the Root Cause. In general, these fall under the following categories:
- Operator Procedures improved or changed,
- Preventive Maintenance or Predictive Maintenance activities modified or new ones put in place,
- Corrective Maintenance procedures for the equipment are modified,
- Equipment is torn down and rebuilt (overhauled),
- Or the Equipment or Process is reengineered.
This article deals with how you should use your CMMS / EAMS to better support this analysis.
How is the CMMS / EAMS Used?
Initially, your CMMS/EAMS system would be used to help identify the bad actors, as well as provide a history of actions taken. Once actions to improve the situation have been proposed, the system would be used to support new PM or Predictive actions. But most important is its ability to support the identification of bad actors and the review of work history.
Bad Actor Identification
In general, operations will have their list of equipment or processes that are most disruptive. Maintenance staff should also generate lists of equipment that they feel needs help. These lists should also include equipment with high levels of downtime, or whose maintenance costs are increasing. The best ways to identify these are via reports from your CMMS/EAMS, such as:
- Ranking equipment by downtime over the past year or two.
- Ranking equipment by maintenance costs over the past year or two.
- Listing equipment whose maintenance costs exceed 30% of replacement or purchase cost.
In order for your system to accurately produce such reports, you must be tracking corrective maintenance actions and capturing solid downtime and cost information. (Click here to read our CMMS Success Topic – Corrective Work Orders)
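The three reports listed above can be sketched in a few lines of Python. This is only an illustration, not a feature of any particular CMMS/EAMS: the `EquipmentSummary` record, its field names, and the sample figures are assumptions, and the 30% threshold is the one quoted in the list. In practice the totals would come from your own work order history.

```python
from dataclasses import dataclass


@dataclass
class EquipmentSummary:
    """Hypothetical per-equipment totals for the review period."""
    equipment_id: str
    downtime_hours: float     # total downtime over the period
    maintenance_cost: float   # total maintenance cost over the period
    replacement_cost: float   # replacement or purchase cost


def rank_by(summaries, field_name, top_n=10):
    """Return up to top_n records, highest value of field_name first."""
    ordered = sorted(summaries, key=lambda s: getattr(s, field_name), reverse=True)
    return ordered[:top_n]


def cost_exceeds_threshold(summaries, threshold=0.30):
    """List equipment whose maintenance cost exceeds threshold x replacement cost."""
    return [
        s for s in summaries
        if s.replacement_cost > 0
        and s.maintenance_cost > threshold * s.replacement_cost
    ]


# Made-up sample data, for illustration only.
summaries = [
    EquipmentSummary("PUMP-101", 120.0, 18000.0, 40000.0),
    EquipmentSummary("LABELER-7", 310.5, 9500.0, 85000.0),
    EquipmentSummary("MOTOR-22", 45.0, 2200.0, 6000.0),
]

worst_downtime = [s.equipment_id for s in rank_by(summaries, "downtime_hours")]
worst_cost = [s.equipment_id for s in rank_by(summaries, "maintenance_cost")]
over_threshold = [s.equipment_id for s in cost_exceeds_threshold(summaries)]

print(worst_downtime)   # ['LABELER-7', 'PUMP-101', 'MOTOR-22']
print(worst_cost)       # ['PUMP-101', 'LABELER-7', 'MOTOR-22']
print(over_threshold)   # ['PUMP-101', 'MOTOR-22']
```

Note that the rankings can disagree: in this sample the labeler has the most downtime but is nowhere near the cost threshold, which is why it helps to run more than one report.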
Support for Root Cause Failure Analysis
Your CMMS/EAMS system should be able to provide useful and accurate information for the history of actions taken on the equipment. This is the basis for the analysis. There are usually four pieces of information relevant to this analysis:
- Symptom or Problem Code or Description – operator designation of symptom of problem.
- Failure or Reason Code or Description – maintenance’s designation of why the problem occurred.
- Work Performed and Action Taken Codes – maintenance’s designation of what was done to fix the problem.
- Spare Parts used – what components failed.
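As a rough sketch, the four pieces of information can be pictured as fields on a work order record, with a simple check for which ones were left blank when the work order was closed. The `WorkOrder` class, its field names, and the `missing_history_fields` helper are hypothetical and do not reflect the schema of any specific product.

```python
from dataclasses import dataclass, field


@dataclass
class WorkOrder:
    """Hypothetical corrective work order carrying the four history fields."""
    number: str
    equipment_id: str
    symptom_code: str = ""      # entered by the person reporting the problem
    failure_code: str = ""      # entered by maintenance
    work_performed: str = ""    # technician's free-text description
    parts_used: list = field(default_factory=list)  # part numbers issued


def missing_history_fields(wo):
    """Return the names of the history fields that are empty on this work order."""
    checks = {
        "symptom_code": wo.symptom_code.strip(),
        "failure_code": wo.failure_code.strip(),
        "work_performed": wo.work_performed.strip(),
        "parts_used": wo.parts_used,
    }
    return [name for name, value in checks.items() if not value]


wo = WorkOrder(
    number="WO-1001",
    equipment_id="PUMP-101",
    symptom_code="NO_PRESSURE",
    work_performed="Replaced mechanical seal.",
)
print(missing_history_fields(wo))   # ['failure_code', 'parts_used']
```

Not every repair consumes parts, so an empty `parts_used` is better treated as a prompt for review than as an error.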
Symptom or Problem Designation
This is usually a coded entry made by the person reporting the problem. This should be a limited, simple list. Examples are: unable to maintain pressure, won’t start, excess noise or vibration, overheating, etc. This list should be tailored to your operations and equipment types. If the list is more than 10 or 12 items, users often just pick the most generic one at the top of the list.
This information is often used to separate out work orders unrelated to the failure being investigated. For example, work orders responding to a “won’t start” problem may not be relevant for a problem of misaligned labeling by a bottle labeling process.
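A minimal sketch of that filtering step, using the bottle labeler example: work orders are shown as plain dictionaries, and the symptom codes (`WONT_START`, `MISALIGNED_LABEL`, `EXCESS_NOISE`) and work order numbers are made up for illustration.

```python
def filter_by_symptom(work_orders, relevant_symptoms):
    """Keep only the work orders whose symptom code is in relevant_symptoms."""
    wanted = set(relevant_symptoms)
    return [wo for wo in work_orders if wo["symptom_code"] in wanted]


# Hypothetical work history for one bottle labeler.
history = [
    {"number": "WO-2001", "equipment_id": "LABELER-7", "symptom_code": "WONT_START"},
    {"number": "WO-2002", "equipment_id": "LABELER-7", "symptom_code": "MISALIGNED_LABEL"},
    {"number": "WO-2003", "equipment_id": "LABELER-7", "symptom_code": "EXCESS_NOISE"},
    {"number": "WO-2004", "equipment_id": "LABELER-7", "symptom_code": "MISALIGNED_LABEL"},
]

relevant = filter_by_symptom(history, ["MISALIGNED_LABEL"])
print([wo["number"] for wo in relevant])   # ['WO-2002', 'WO-2004']
```

The team can widen the set of symptoms if it suspects that a different reported symptom is related to the failure under review.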
Failure or Reason Code
Maintenance typically enters this code to help explain the reason for the problem. This should not be an attempt at root cause identification, but an attempt to define what was encountered during the repair. Again, the list should be kept short, or limited by the equipment type, so the user need only pick from 10 or 12 items. Examples are: lubrication issue, power surge, corrosion, failed component, out of calibration, operator error, etc. The technician should provide more detail in the Work Performed text.
More elaborate reporting can be implemented by identifying specific failure modes for the equipment type. For example, a butterfly valve will have different modes of failure than an electric motor. High-end CMMS/EAMS systems will support this setup and filtering of selections that the user then picks from, based upon the equipment on the identified work order.
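One way to picture this filtering is a lookup keyed by equipment type that falls back to the short generic list when no specific failure modes have been defined. The sketch below is an assumption about how such a setup could work, not a description of any product. The generic codes are the examples given above, while the valve and motor failure modes are illustrative entries that you would replace with your own.

```python
# Generic reason codes, taken from the examples earlier in this section.
GENERIC_FAILURE_CODES = [
    "lubrication issue",
    "power surge",
    "corrosion",
    "failed component",
    "out of calibration",
    "operator error",
]

# Illustrative failure modes per equipment type (assumed, not from any standard list).
FAILURE_MODES_BY_TYPE = {
    "butterfly valve": ["seat leakage", "stem seized", "actuator failure"],
    "electric motor": ["bearing failure", "winding insulation failure", "overheating"],
}


def failure_choices(equipment_type):
    """Return the pick list for an equipment type, or the generic list if none is defined."""
    key = equipment_type.strip().lower()
    return FAILURE_MODES_BY_TYPE.get(key, GENERIC_FAILURE_CODES)


print(failure_choices("Butterfly Valve"))
print(failure_choices("conveyor"))   # no specific list defined, so the generic codes are shown
```

Keeping each list to 10 or 12 entries applies here too, since the goal is still a selection the technician can make quickly.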
Work Performed text
This is where the technician should provide good information about what was done and any comments that will help with root cause analysis. Users should be trained in how best to use this field to support the reporting of history and subsequent analysis.
Spare Parts used
Parts should be issued to work orders so that it is easy to identify failed components. This process will be greatly facilitated by the creation of parts lists. (Click here to read our CMMS Success Topic – Spare Parts Inventory Control)
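When parts are issued to work orders, finding the components that fail most often on a piece of equipment becomes a simple tally. The sketch below assumes each parts issue is available as a record with `equipment_id`, `part_number`, and `quantity` fields. Those names and the sample data are hypothetical.

```python
from collections import Counter


def parts_usage(issues, equipment_id):
    """Total the quantity of each part issued to work orders for one piece of equipment.

    Returns (part_number, total_quantity) pairs, most-used part first.
    """
    counts = Counter()
    for issue in issues:
        if issue["equipment_id"] == equipment_id:
            counts[issue["part_number"]] += issue["quantity"]
    return counts.most_common()


# Made-up parts issue history, for illustration only.
issues = [
    {"work_order": "WO-1001", "equipment_id": "PUMP-101", "part_number": "SEAL-25", "quantity": 1},
    {"work_order": "WO-1017", "equipment_id": "PUMP-101", "part_number": "BRG-6205", "quantity": 2},
    {"work_order": "WO-1033", "equipment_id": "PUMP-101", "part_number": "SEAL-25", "quantity": 1},
    {"work_order": "WO-1040", "equipment_id": "MOTOR-22", "part_number": "BRG-6205", "quantity": 2},
    {"work_order": "WO-1058", "equipment_id": "PUMP-101", "part_number": "SEAL-25", "quantity": 1},
]

print(parts_usage(issues, "PUMP-101"))   # [('SEAL-25', 3), ('BRG-6205', 2)]
```

A part that shows up repeatedly, like the seal in this sample, is a natural starting point for the team's questions, although the tally alone does not establish the root cause.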
Root Cause Failure Analysis is a critical tool that should be used by all organizations to optimize their equipment availability and maintenance costs. This article provides some suggestions on how best to use your CMMS/EAMS to support this process. The actual Root Cause Failure Analysis process is outside the scope of this article. But feel free to contact us and we will recommend some references with more details.
To request a free live demo to learn more about how GP MaTe can assist with Root Cause Failure Analysis click here.
About GP MaTe
GP MaTe is a user-friendly maintenance and material management system that facilitates maintenance planning and inventory control. Our product has many optional modules that support Safety (PSM, MOC and LOTO), Budgeting, Multi-plant information sharing, and Operator Tours and Data Collection. The system is available in many languages and supports vendor currency conversions.