Industrial Machinery Failure Types and Implications for Maintenance Approaches

Industrial machinery can fail in many different ways and for many different reasons. For critical and/or expensive equipment, it is a major challenge to find a way to detect potential failures before they happen and to take action to prevent or minimize the effects. Closely tied to this is the tradeoff between the cost of detection and the cost of failure. We discussed some of these tradeoffs in the blog “Condition Monitoring & Predictive Maintenance: Cost-Benefit Tradeoffs.”

When assessing how equipment might fail, several industry studies* have identified six primary failure types which may be considered:

    • Type A: Lower probability of failure in early- and mid-life of the asset, with a dramatic increase in probability of failure in late-life. This is typical for mechanical devices, such as engines, fans, compressors, and machines.
    • Type B: Higher initial probability of failure when the asset is new, with a much lower/steady failure probability over the rest of the asset’s life. This is often the profile for electronic devices such as computers, PLCs, etc.
    • Type C: Lower initial probability of failure when the asset is new, with an increase to a steady failure probability in mid- and late-life. These are often devices with high stress work conditions, such as pressure relief valves.
    • Type D: Consistent probability of failure throughout the asset life, similar failure probability in early-, mid- and late-life. This can cover many types of industrial machines, often with stable, proven design and components.
    • Type E: Higher probability of failure in early- and late-life, a lower and constant probability of failure in mid-life (often called a “bathtub curve”). This can be devices that “settle in” after a wear-in period and then experience higher failures at the end of life, such as bearings.
    • Type F: Lower probability of failure when new, with a gradual increase over time and without the steep increase in failure probability at the end of life of Type A. This is often typical where age-based wear is steady and gradual in equipment such as turbine engines and structural components (pressure vessels, beams, etc.).

Age-related and non-age-related failures

These six failure types fall into two categories: age-related and non-age-related failures. The studies show that 15-30% of failures are age-related (Types A, E & F) and 70-85% of failures are non-age-related (Types B, C & D). The age-related failures have a clear correlation between the age of the asset and the likelihood of failure. In these cases, preventative maintenance at regular time-based intervals may be appropriate and cost-effective. The non-age-based failures are more “random,” due to improper design/installation, operator error, quality issues, machine overuse, etc. In these cases, preventative maintenance will likely not prevent failure and may waste time and money on unnecessary maintenance.
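The grouping above can be sketched in a few lines of Python. This is only an illustration: the 1-to-3 levels for early-, mid-, and late-life failure probability are our own rough encoding of the six descriptions, not values from the studies, and the rule "late-life probability is higher than mid-life" is an assumed shorthand for "age-related."

```python
# Rough ordinal encoding (1 = lower, 2 = moderate, 3 = higher) of the failure
# probability in (early-, mid-, late-life). These levels are illustrative
# assumptions, not data from the cited studies.
PROFILES = {
    "A": (1, 1, 3),  # low, low, dramatic late-life increase
    "B": (3, 1, 1),  # high when new, then lower/steady
    "C": (1, 2, 2),  # low when new, rising to a steady level
    "D": (2, 2, 2),  # consistent throughout
    "E": (3, 1, 3),  # "bathtub curve"
    "F": (1, 2, 3),  # gradual increase over time
}


def is_age_related(profile):
    """Assumed rule: failure probability is higher in late-life than mid-life."""
    _early, mid, late = profile
    return late > mid


age_related = sorted(t for t, p in PROFILES.items() if is_age_related(p))
non_age_related = sorted(t for t, p in PROFILES.items() if not is_age_related(p))

print("Age-related:", age_related)
print("Non-age-related:", non_age_related)
```

With this encoding, the rule reproduces the split described above: Types A, E, and F on one side and Types B, C, and D on the other.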

The failure-type percentages above are based on data from studies conducted by United Airlines (1978), Broberg (1973), the U.S. Navy (1993 MSDP and 2001 SUBMEPP), and ARC Advisory Group.

The fact that approximately 80% of failures are non-age-related has major implications for manufacturers trying to decide on a maintenance approach. The traditional preventative-maintenance approach is not likely to address these failures and may even cause failures when improperly done. It is therefore important to consider a more proactive approach, such as condition-based monitoring or predictive maintenance, especially for assets that are critical to the process and/or expensive.

Preventative maintenance and regular inspection may be a good approach for assets more likely to experience age-related failures (Types A, E, and F). These include fans, bearings, and structural components, and in many cases the added expense of condition monitoring or predictive maintenance may not be justified. But for critical components or equipment, such as bearings on an expensive milling machine or transfer line, it may be worthwhile to apply condition monitoring or predictive maintenance.

When assets are more likely to experience non-age-related failures (Types B, C, and D), the proactive approaches are a better fit. Many industrial machines and industrial control/motion components fall into this category, and condition monitoring or predictive maintenance can significantly reduce preventative maintenance costs and unplanned failures while improving machine uptime and Overall Equipment Effectiveness (OEE).

You can use this information to improve your maintenance operations. Start by considering your maintenance approach(es), especially for your most critical assets:

    • Are they more likely to experience age-related failures or non-age-related failures?
    • Should you change your maintenance approach to be more proactive?
    • What components and indicators should you measure?

We’ll discuss ideas on how to assess your equipment for condition monitoring/predictive maintenance and what you might measure in separate blogs.

* Studies conducted by United Airlines (1978), Broberg (1973), U.S. Navy (1993 MSDP) and U.S. Navy (2001 SUBMEPP)

Condition Monitoring & Predictive Maintenance: Cost-Benefit Tradeoffs

In a previous blog post, we discussed the basics of the Potential-Failure (P-F) curve, which refers to the interval between the detection of a potential failure and occurrence of a functional failure. In this post we’ll discuss the cost-benefit tradeoffs of various maintenance approaches.

In general, the goal is to maximize the P-F interval. In other words, you want to become aware of an impending failure as early as possible to allow more time for action. This, however, must be balanced against the cost of the methods of prevention, inspection, and detection.

There is a trade-off between the cost of systems to detect and predict the failures and how soon you might detect the condition. Generally, the earlier the detection/prediction, the more expensive it is. However, the longer it takes to detect an impending failure (i.e., the more the asset’s condition degrades), the more expensive it is to repair it.

Every asset will have a unique trade-off between the cost of failure prevention (detection/prediction) and the cost of failure. This means some assets probably call for earlier detection methods that come with higher prevention costs, like condition monitoring and analytics systems, due to the high cost to repair (see the Prevention-1 and Repair-1 curves in the Cost-Failure/Time chart). And some assets may be better suited for more cost-efficient but delayed detection or even a “run-to-failure” model due to lower cost to repair (the Prevention-2 and Repair-2 curves in the Cost-Failure/Time chart).
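To make this trade-off concrete, here is a minimal sketch that adds the prevention cost and the repair cost for each detection option and picks the lowest total. Every figure is hypothetical and chosen only for illustration, and the `DetectionOption` and `cheapest` names are our own. A real assessment would also weigh downtime, safety, and failure likelihood.

```python
from dataclasses import dataclass


@dataclass
class DetectionOption:
    name: str
    prevention_cost: float  # hypothetical cost of the detection/prediction method
    repair_cost: float      # hypothetical repair cost at the point of detection

    @property
    def total_cost(self):
        return self.prevention_cost + self.repair_cost


def cheapest(options):
    """Return the option with the lowest combined prevention + repair cost."""
    return min(options, key=lambda option: option.total_cost)


# Hypothetical critical asset: late detection makes repair very expensive.
critical_asset = [
    DetectionOption("predictive analytics", 20_000, 5_000),
    DetectionOption("condition monitoring", 8_000, 15_000),
    DetectionOption("run to failure", 0, 120_000),
]

# Hypothetical low-value asset: even a full failure is cheap to fix.
low_value_asset = [
    DetectionOption("predictive analytics", 20_000, 200),
    DetectionOption("condition monitoring", 8_000, 500),
    DetectionOption("run to failure", 0, 2_000),
]

print(cheapest(critical_asset).name)
print(cheapest(low_value_asset).name)
```

With these made-up numbers, the critical asset favors an earlier detection method while the low-value asset favors running to failure, which is the same pattern the two pairs of curves describe.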

 

There are four basic maintenance approaches:

Reactive

The Reactive approach has low or even no cost to implement but can result in a high repair/failure cost because no action is taken until the asset has reached a fault state. This approach might be appropriate when the cost of monitoring systems is very high compared to the cost of repairing or replacing the asset. As a general guideline, the Reactive approach is not a good strategy for critical and/or high-value assets, because the cost of a failure is high.

Reactive approaches:

      • Offer no visibility
      • Fix only if it breaks – low overall equipment effectiveness (OEE)
      • High downtime
      • Uncertainty of failures

Preventative

The Preventative approach (maintenance at time-based intervals) may be appropriate when failures are age-related and maintenance can be performed at regular intervals before anticipated failures occur. This approach has two drawbacks: 1) the cost and time of preventative maintenance can be high; and 2) studies show that only 18% of failures are age-related (source: ARC Advisory Group), while the other 82% are “random,” due to improper design/installation, operator error, quality issues, machine overuse, etc. As a result, the Preventative approach may spend time and money on unnecessary work, and it may not prevent expensive failures in critical or high-value assets.

Preventative approaches:

      • Scheduled tune ups
      • Higher equipment longevity
      • Reduced downtime compared to reactive mode

Condition-Based

The Condition-Based approach attempts to address failures regardless of whether they are age-based or random. Assets are monitored for one or more potential failure indicators, such as vibration, temperature, current/voltage, pressure, etc. The data is often sent to a PLC, local HMI, special processor, or the cloud through an edge gateway. Predefined limits are set and alerts (alarm, operator message, maintenance/repair) are only sent when a limit is reached. This approach avoids unnecessary maintenance and can give warning before a failure occurs. Condition-based monitoring can be very cost-effective, though very sophisticated solutions can be expensive. It is a good solution when the cost of failure is medium or high and known indicators provide a reliable warning of impending failure.
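The limit-and-alert logic described above might look something like the following sketch. The indicator names and limit values are hypothetical, and in practice this check would typically run in a PLC, an edge gateway, or a monitoring platform rather than in a standalone script.

```python
# Hypothetical predefined limits for two indicators on one asset.
LIMITS = {
    "vibration_mm_s": 7.1,
    "temperature_c": 85.0,
}


def check_reading(reading, limits=LIMITS):
    """Return an alert message for each indicator at or above its limit."""
    alerts = []
    for indicator, limit in limits.items():
        value = reading.get(indicator)
        # Indicators missing from the reading are skipped, not treated as faults.
        if value is not None and value >= limit:
            alerts.append(f"{indicator} = {value} reached limit {limit}")
    return alerts


normal = {"vibration_mm_s": 3.2, "temperature_c": 60.0}
degraded = {"vibration_mm_s": 7.5, "temperature_c": 60.0}

print(check_reading(normal))    # no alerts while readings stay under the limits
print(check_reading(degraded))  # one alert for the vibration indicator
```

Nothing is sent while readings stay below their limits, which is how this approach avoids unnecessary maintenance.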

Condition-based approaches:

      • Based on asset condition
      • Enables predictive maintenance (PdM)
      • Improves OEE, equipment longevity
      • Drastically reduces unplanned downtime

Predictive Analytics

Predictive Analytics is the most sophisticated approach and attempts to learn from machine performance to predict failures. It utilizes data gathered through Condition Monitoring, and then applies analysis or AI/Machine Learning to uncover patterns to predict failures before they occur. The hardware and software to implement Predictive Analytics can be expensive, and this method is best for high-value/critical assets and expensive potential failures.
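As a deliberately simple stand-in for the pattern-finding step, the sketch below fits a straight-line trend to past readings and estimates how long until a limit would be crossed. Real predictive analytics systems use far richer models than a linear fit. The readings, the limit, and the function names here are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_limit(days, values, limit):
    """Estimate days from the last reading until the trend reaches the limit.

    Returns None when the trend is flat or falling (no crossing predicted).
    """
    slope, intercept = fit_line(days, values)
    if slope <= 0:
        return None
    crossing_day = (limit - intercept) / slope
    return max(0.0, crossing_day - days[-1])


# Hypothetical daily vibration readings that rise by 0.5 per day.
days = [0, 1, 2, 3, 4]
vibration = [2.0, 2.5, 3.0, 3.5, 4.0]

print(days_until_limit(days, vibration, limit=7.0))
```

The point of the example is the shift in question: instead of asking "has a limit been reached?", the analysis asks "when is it likely to be reached?", which can lengthen the time available to plan a repair.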

Predictive Analytics approaches:

      • Based on patterns – stored information
      • Based on machine learning
      • Improves OEE, equipment longevity
      • Avoids downtime

Each user will need to evaluate the unique attributes of their assets and weigh the cost of prevention (detection of potential failure) against the cost of repair/failure to decide on the best approach. In general, a Reactive approach is best only when the cost of failure is very low. Preventative maintenance may be appropriate when failures are clearly age-related. Advanced approaches such as Condition-Based monitoring and Predictive Analytics are best when the cost of repair or failure is high.
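These general guidelines can be written down as a rough rule of thumb. The sketch below is only one possible reading: the three cost levels, the order in which the rules are applied, and the function name are our own assumptions, and a real decision would take more factors into account.

```python
def recommend_approach(failure_cost, age_related):
    """Suggest a maintenance approach from two coarse inputs.

    failure_cost: "very_low", "medium", or "high" (assumed categories)
    age_related:  True if failures clearly correlate with asset age
    """
    if failure_cost not in ("very_low", "medium", "high"):
        raise ValueError(f"unknown failure_cost: {failure_cost!r}")
    if failure_cost == "very_low":
        return "Reactive"
    if failure_cost == "high":
        # Assumption: high failure cost outweighs the age-related question.
        return "Condition-Based or Predictive Analytics"
    if age_related:
        return "Preventative"
    return "Condition-Based"


print(recommend_approach("very_low", age_related=False))
print(recommend_approach("medium", age_related=True))
print(recommend_approach("high", age_related=True))
```

Note that the high-cost rule is checked before the age-related rule, which reflects the idea that critical assets may justify condition monitoring even when their failures are age-related.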

Also note that technology providers are continually improving condition monitoring and predictive solutions. As these systems become less expensive and easier to set up and use, users can cost-effectively move from Reactive or Preventative approaches to Condition-Based or Predictive approaches.