Industrial Machinery Failure Types and Implications for Maintenance Approaches

Industrial machinery can fail in many different ways and for many different reasons. For critical and/or expensive equipment, it is a major challenge to find a way to detect potential failures before they happen and to take action to prevent or minimize the effects. Closely tied to this is the tradeoff between the cost of detection and the cost of failure. We discussed some of these tradeoffs in the blog “Condition Monitoring & Predictive Maintenance: Cost-Benefit Tradeoffs.”

When assessing how equipment might fail, several industry studies* have identified six primary failure types which may be considered:

    • Type A: Lower probability of failure in early- and mid-life of the asset, with a dramatic increase in probability of failure in late-life. This is typical for mechanical devices, such as engines, fans, compressors, and machines.
    • Type B: Higher initial probability of failure when the asset is new, with a much lower/steady failure probability over the rest of the asset’s life. This is often the profile for electronic devices such as computers, PLCs, etc.
    • Type C: Lower initial probability of failure when the asset is new, with an increase to a steady failure probability in mid- and late-life. These are often devices with high stress work conditions, such as pressure relief valves.
    • Type D: Consistent probability of failure throughout the asset life, similar failure probability in early-, mid- and late-life. This can cover many types of industrial machines, often with stable, proven design and components.
    • Type E: Higher probability of failure in early- and late-life, a lower and constant probability of failure in mid-life (often called a “bathtub curve”). This can be devices that “settle in” after a wear-in period and then experience higher failures at the end of life, such as bearings.
    • Type F: Lower probability of failure when new, with a gradual increase over time and without the steep increase in failure probability at the end of life of Type A. This is often typical where age-based wear is steady and gradual in equipment such as turbine engines and structural components (pressure vessels, beams, etc.).

Age-related and non-age-related failures

These six failure types fall into two categories: age-related and non-age-related failures. The studies show that 15-30% of failures are age-related (Types A, E & F) and 70-85% of failures are non-age-related (Types B, C & D). The age-related failures have a clear correlation between the age of the asset and the likelihood of failure. In these cases, preventative maintenance at regular time-based intervals may be appropriate and cost-effective. The non-age-based failures are more “random,” due to improper design/installation, operator error, quality issues, machine overuse, etc. In these cases, preventative maintenance will likely not prevent failure and may waste time and money on unnecessary maintenance.

Table is based on data from studies conducted by United Airlines (1978), Broberg (1973), U.S. Navy (1993 MSDP) and U.S. Navy (2001 SUBMEPP) and ARC Consulting

The fact that approximately 80% of failures are non-age-related has major implications for manufacturers trying to decide on a maintenance approach. The traditional preventative-maintenance approach is not likely to address these failures and may even cause failures when improperly done. It is therefore important to consider a more proactive approach, such as condition-based monitoring or predictive maintenance, especially for assets that are critical to the process and/or expensive.

Preventative maintenance and regular inspection may be a good approach for assets more likely to experience age-related failures (Types A, E, and F). These include fans, bearings, and structural components, and in many cases condition monitoring or predictive maintenance may not be worth the cost. But for critical components or equipment, such as bearings on an expensive milling machine or transfer line, it may be worthwhile to apply condition monitoring or predictive maintenance.

And when the assets are more likely to experience non-age-related failures (Types B, C, and D), the proactive approaches are better. Many industrial machines and industrial control/motion components fall into this category, and condition monitoring or predictive maintenance can significantly reduce preventative maintenance costs and unplanned failures while improving machine uptime and Overall Equipment Effectiveness.
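The guidance in the last few paragraphs can be sketched as a small lookup. This is only an illustration of the reasoning: the function name `suggest_approach` and the `critical` flag are assumptions made for the example, and a real decision would also weigh the cost of detection against the cost of failure.

```python
# Hypothetical helper: maps the six failure types described above to the
# maintenance approach the text suggests. Names are illustrative only.
AGE_RELATED = {"A", "E", "F"}
NON_AGE_RELATED = {"B", "C", "D"}

PROACTIVE = "condition monitoring / predictive maintenance"
TIME_BASED = "time-based preventative maintenance"


def suggest_approach(failure_type: str, critical: bool) -> str:
    """Return a starting-point maintenance approach for an asset."""
    t = failure_type.strip().upper()
    if t in NON_AGE_RELATED:
        # "Random" failures: scheduled maintenance is unlikely to prevent them.
        return PROACTIVE
    if t in AGE_RELATED:
        # Age-related failures: time-based maintenance is often enough,
        # unless the asset is critical or expensive.
        return PROACTIVE if critical else TIME_BASED
    raise ValueError(f"unknown failure type: {failure_type!r}")


if __name__ == "__main__":
    print(suggest_approach("A", critical=False))
    print(suggest_approach("E", critical=True))
    print(suggest_approach("D", critical=False))
```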

You can use this information to improve your maintenance operations. Start by considering your maintenance approach(es), especially for your most critical assets:

    • Are they more likely to experience age-related failures or non-age-related failures?
    • Should you change your maintenance approach to be more proactive?
    • What components and indicators should you measure?

We’ll discuss ideas on how to assess your equipment for condition monitoring/predictive maintenance and what you might measure in separate blogs.

* Studies conducted by United Airlines (1978), Broberg (1973), U.S. Navy (1993 MSDP) and U.S. Navy (2001 SUBMEPP)

Predictive Maintenance vs. Predictive Analytics, What’s the Difference?

With more and more customers getting onboard with IIoT applications in their plants, a new era of efficiency is lurking around the corner. Automation for maintenance is on the rise thanks to a shortage of qualified maintenance techs coinciding with a desire for more efficient maintenance, reduced downtime, and the inroads IT is making on the plant floor. Predictive Maintenance and Predictive Analytics are part of almost every conversation in manufacturing these days, and often the words are used interchangeably.

This blog draws a clear distinction between the two phrases and puts into perspective the benefits that maintenance automation offers plant managers and decision-makers, so they can bring focused innovation to their plants and boost efficiency throughout.

Before we jump into the meat of the topic, let’s quickly review the earlier stages of the maintenance continuum.

Reactive and Preventative approaches

The Reactive and Preventative approaches are most commonly used in the maintenance continuum. With a Reactive approach, we basically run the machine or line until a failure occurs. This is the most efficient approach with the least downtime while the machine or line runs. Unfortunately, when the machine or line comes to a screeching stop, it presents us with the most costly of downtimes in terms of time wasted and the cost of machine repairs.

The Preventative approach calls for scheduled maintenance on the machine or line to avoid impending machine failures and reduce unplanned downtimes. Unfortunately, the Preventative maintenance strategy does not catch approximately 80% of machine failures. Of course, the Preventative approach is not a complete waste of time and money; regular tune-ups help the operations run smoother compared to the Reactive strategy.

Predictive Maintenance vs. Predictive Analytics

As more companies implement IIoT solutions, data has become exponentially more important to the way we automate machines and processes within a production plant, including maintenance processes. The idea behind Predictive Maintenance (PdM), aka condition-based maintenance, is that by frequently monitoring critical components of the machine, such as motors, pumps, or bearings, we can predict the impending failures of those components over time. Hence, we can prevent the failures by scheduling planned downtime to service machines or components in question. We take action based on predictive conditions or observations. The duration between the monitored condition and the action taken is much shorter here than in the Predictive Analytics approach.

Predictive Analytics, the next higher level on the maintenance continuum, refers to collecting the condition-based data over time, marrying it with expert knowledge of the system, and finally applying machine learning or artificial intelligence to predict the event or failure in the future. This can help avoid the failure altogether. Of course, it depends on the data sets we track, for how long, and how good our expert knowledge systems are.

So, the difference between Predictive Maintenance and Predictive Analytics, among other things, is the time between condition and action. In short, Predictive Maintenance is a stepping-stone to Predictive Analytics. Once in place, the system monitors and learns from the patterns to provide input on improving the system’s longevity and uptime. Predictive Maintenance or Preventative Maintenance does not add value in that respect.
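To make the "time between condition and action" distinction concrete, here is a minimal sketch. The first function acts on the current reading only, as in Predictive Maintenance. The second fits a trend to the history and estimates how long until a limit is reached. A plain least-squares line stands in for the machine learning the text mentions, and all names, units, and numbers are assumptions made for the example.

```python
from typing import Optional, Sequence


def threshold_alarm(reading: float, limit: float) -> bool:
    """Condition-based check: act as soon as the current reading hits the limit."""
    return reading >= limit


def hours_until_limit(hours: Sequence[float],
                      readings: Sequence[float],
                      limit: float) -> Optional[float]:
    """Fit a straight line to past readings and estimate the hours remaining
    until the limit is crossed. Returns None if the trend is flat or falling."""
    n = len(hours)
    if n < 2 or n != len(readings):
        raise ValueError("need at least two (hour, reading) pairs")
    mean_t = sum(hours) / n
    mean_y = sum(readings) / n
    sxx = sum((t - mean_t) ** 2 for t in hours)
    if sxx == 0:
        raise ValueError("all samples share the same timestamp")
    slope = sum((t - mean_t) * (y - mean_y)
                for t, y in zip(hours, readings)) / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_t
    crossing = (limit - intercept) / slope
    # Clamp at zero: the limit may already have been passed.
    return max(0.0, crossing - hours[-1])


if __name__ == "__main__":
    # Made-up vibration readings (mm/s) taken at machine operating hours.
    t = [0.0, 100.0, 200.0, 300.0]
    v = [2.0, 2.5, 3.0, 3.5]
    print(threshold_alarm(v[-1], limit=5.0))   # no alarm yet
    print(hours_until_limit(t, v, limit=5.0))  # roughly 300 hours of margin
```

The threshold check says nothing until the limit is reached, while the trend estimate gives a planning horizon well before that point.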

While Preventative Maintenance and Predictive Maintenance promise shorter unplanned downtimes, Predictive Analytics promises avoidance of unplanned downtime and the reduction of planned downtime.

The first step to improving your plant floor OEE is to monitor the condition of the critical assets in the factory and collect data on their failures.


How IO-Link Sensors With Condition Monitoring Features Work With PLCs

As manufacturers continually look for ways to maximize productivity and eliminate waste, automation sensors are taking on a new role in the plant. Once, sensors were used only to provide detection or measurement data so the PLC could process it and run the machine. Today, sensors with IO-Link measure environmental conditions like temperature, humidity, ambient pressure, vibration, inclination, operating hours, and signal strength. By setting alarm thresholds, it’s possible to program the PLC to use the resulting condition monitoring data to keep machines running smoothly.

Real-time data for real-time response

A sensor with condition monitoring features allows a PLC to use real-time data with the same speed it uses a sensor’s primary process data. This typically requires setting an alarm threshold at the sensor and a response to those alarms at the PLC.

When a vibration threshold is set up on the sensor and vibration occurs, for example, the PLC can alert the machine operator to quickly check the area, or even stop the machine, to look for a product jam, incorrect part, or whatever may be causing the vibration. By reacting to the alarm immediately, workers can reduce product waste and scrap.

Inclination feedback can provide diagnostics in troubleshooting. Suppose a sensor gets bumped and no longer detects its target, for example. The inclination alarm set in the sensor will indicate after a certain degree of movement that the sensor will no longer detect the part. The inclination readout can also help realign the sensor to the correct position.

Detection of other environmental factors, including humidity and higher-than-normal internal temperatures, can also be set, providing feedback on issues such as the unwanted presence of water or the machine running hotter than normal. Knowing these things in real-time can stop the PLC from running, preventing the breakdown of other critical machine components, such as motors and gearboxes.

These alarm bits can come from the sensor individually or be combined inside the sensor. Simple logic, like OR and AND statements, can be set on the sensor: for example, in the case of a vibration OR inclination OR temperature OR humidity alarm, the sensor outputs a discrete signal on pin 2. Pin 2 can then be fed back through the same sensor cable as a discrete alarm signal to the PLC. A single bit showing when an alarm occurs can alert the operator to look into the alarm condition before running the machine. Alternatively, a simple ladder rung can be added in the PLC to look at the single discrete alarm bit and put the machine into a safe mode if conditions require it.
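The OR combination described above can be sketched in a few lines. This models the logic only and is not how a sensor or PLC is actually programmed: the class `SensorAlarms` and the function `machine_may_run` are names invented for the illustration, and on real hardware the combination would be configured in the sensor and the reaction written as a ladder rung.

```python
from dataclasses import dataclass


@dataclass
class SensorAlarms:
    """Individual alarm bits a condition monitoring sensor might raise."""
    vibration: bool = False
    inclination: bool = False
    temperature: bool = False
    humidity: bool = False

    def combined(self) -> bool:
        """Single discrete alarm bit (the 'pin 2' signal): OR of all alarms."""
        return (self.vibration or self.inclination
                or self.temperature or self.humidity)


def machine_may_run(alarm_bit: bool) -> bool:
    """PLC-side reaction to the single bit: hold the machine while it is set."""
    return not alarm_bit


if __name__ == "__main__":
    healthy = SensorAlarms()
    bumped = SensorAlarms(inclination=True)
    print(machine_may_run(healthy.combined()))
    print(machine_may_run(bumped.combined()))
```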

In a way, the sensor monitors itself for environmental conditions and alerts the PLC when necessary. The PLC does not need to create extra logic to monitor the different variables.

Other critical data points, such as operating hours, boot cycle counters, and current and voltage consumption, can help establish a preventative and predictive maintenance schedule. These data sets are available internally on the sensors and can be read out to help develop maintenance schedules and cut down on surprise downtimes.

Beyond the immediate benefits of the data, it can be analyzed and trended over time to see the best use cases of each. Just as a PLC shouldn’t be monitoring each alarm condition individually, this data must not be gathered in the PLC, as there is typically only a limited amount of memory, and the job of the PLC is to control the machines.

This is where the IT world of high-level supervision of machines and processes comes into play. Part two of my blog will explore how to integrate this sensor data into the IT level for use alongside the PLC.

Condition Monitoring & Predictive Maintenance: Addressing Key Topics in Packaging

A recent study by the Packaging Machinery Manufacturers Institute (PMMI) and Interact Analysis takes a close look at packaging industry interest and needs for Condition Monitoring and Predictive Maintenance. Customer feedback reveals interesting data on packaging process pain points and the types of machines and components which are best monitored, the data which should be gathered, current maintenance approaches, and the opportunity for a better way: Condition Monitoring and Predictive Maintenance.

What keeps customers awake at night?

The PMMI survey indicates that form, fill & seal machines are very critical to packaging processes and more likely to fail than many other machines. Also critical to the process and a common failure point are filling & dosing machines, and labeling machines.

These three categories of machines are used in primary packaging and are often the key components in the production line; the downstream processes are usually less critical. They often process a lot of perishable products at high speeds, so any downtime is a big problem for overall equipment effectiveness (OEE), quality, and profitability.

The components on these machines that are most likely to fail are pneumatic systems, gearboxes, motors/drives, and sensors.

How can customers reduce unplanned downtime and improve OEE?

Our data shows that the top customer issue is unplanned machine breakdowns, but many packaging firms use reactive or preventative maintenance approaches, which may not be effective for most failures. An ARC study found that only about 20% of failures are age-related. The 80% of failures that are non-age-related would likely not be addressed by reactive or preventative maintenance programs.

A better way to address these potential failures is to monitor the condition of critical machines and components. Condition monitoring can provide early detection of machine deterioration or impending failure and the data can be used for predictive maintenance. Many “smart sensors” can now measure vibration, temperature, humidity, pressure, flow, inclination, and many other attributes which may be helpful in notifying users of emerging problems. And some of these “smart sensors” can also “self-monitor” and help alert users to potential failures in the sensor itself.

What are packaging customers actually doing?

The good news is that the packaging industry is moving forward to find a better way and users understand that Condition Monitoring/Predictive Maintenance gives them the opportunity to prevent unplanned failures, reduce unplanned downtime, and improve OEE, quality and profitability. About 25% of customers have already implemented some sort of Condition Monitoring / Predictive Maintenance, while about 20% are piloting it and 30% plan to implement it. This means that 75% of customers are very interested in Condition Monitoring/Predictive Maintenance, by far the most interest in any technology discussed in the PMMI survey.

Where do you start?

    • Look for the machines which cause you the most frustration. PMMI identified form, fill & seal, filling & dosing, and labeling machines, but there are other machines, including bottling, cartoning, and case/tray handling, that could fail and cause production downtime or damaged product.
    • Consider where, when, and how equipment can fail. Look to your own experience, ask partners with similar machines or perhaps the equipment supplier to help you determine the most common failure points and modes.
    • Analyze which parts of the machine fail. Moving parts are usually the highest potential failure point. On packaging machines, these include motors, gearboxes, fans, pumps, bearings, conveyors, and shafts.
    • Consider what to measure. Vibration is common, and often assessed in combination with temperature and humidity. On some machines, pressure, flow, or amperage/voltage should be measured.
    • Determine the most appropriate maintenance program for each machine. Consider the costs/benefits of reactive, preventative, condition-based monitoring or predictive approaches. In some cases, it may be OK to let a non-critical, low-value asset “run-to-failure,” while in other cases it might be worth investing in Condition Monitoring or Predictive Maintenance to prevent a critical machine’s costly failure.
    • Start small by implementing condition monitoring on one or two machines, and then scaling up once you’ve learned what does and doesn’t work. Using a low-cost sensor, which can be easily integrated with existing controls architectures or added on externally, is also a great way to start.

Condition Monitoring and Predictive Maintenance offer packaging firms a “better way” to address key topics including machine downtime, failures, and OEE. Users can move from a reactive to a proactive maintenance approach by monitoring attributes such as vibration and temperature on critical machines and then analyzing the data. This will allow them to detect and predict potential failures before they become critical, and thereby, reduce unplanned downtime, improve OEE, and save money.

Condition Monitoring & Predictive Maintenance: Machine Failure Indicators & Detection Methods

In our previous blogs, we discussed the basics of the P-F (Potential – Functional Failure) curve and the cost-benefit tradeoffs of various maintenance approaches. We’ll now describe the measures that can be taken to discover failure indicators along the P-F curve.

The basic concept of the P-F curve is that as a machine or asset deteriorates, various symptoms/indicators emerge. The early-stage indicators may be harder to detect and may require more sophisticated and expensive systems to analyze, but they give you more time to take action to prevent a catastrophic failure. They allow users to choose times to service a machine when it’s less disruptive to the manufacturing process and when only minor maintenance actions, such as changing lubricant, replacing a filter or balancing a fan, are needed rather than major parts repair/replacement. The later-stage indicators may be more obvious and simpler to notice, but they may require extensive and expensive maintenance since greater deterioration has taken place.

Some monitoring methods can be done on a continual basis by using a permanently mounted sensor that takes samples at intervals of once an hour or more often. Others can only be done on a one-time or periodic basis, as when a sensor is brought in for special analysis, perhaps once a month or less often.

Common indicators and detection methods

This version of the P-F curve lists several common indicators and detection methods, in rough order of when they might start to reveal deterioration in an asset:

    • Ultrasonic Spike Energy. Ultrasonic condition monitoring sensors are often expensive and used in portable systems to take one-time readings, but they can provide very early potential failure detection.
    • Vibration Analysis. Sensors and evaluation tools can range from very simple and low cost to sophisticated and expensive. The vibration analysis is done on either a one-time, periodic, or continual basis and often gives an early insight into emerging problems.
    • Oil Analysis. An oil analysis may signal the need for additional, relatively simple maintenance actions, such as lubricating bearings, changing lubricant, or scheduling maintenance. This is usually done on a one-time basis, but it can also be done periodically, such as monthly or annually.
    • Temperature Analysis. This analysis can indicate emerging “hot spots” on a machine, such as bad bearings or excessive friction that signal a future failure. Depending on the measurement system and asset, it can be an early or a late indicator of impending failure.
    • Pressure & Flow. These indicators can fall into either the predictive or the fault domain, depending on implementation. If a proactive approach is taken, they might be condition indicators that can provide an early indication of potential failure; if a reactive approach is taken, they might be indicators of a functional failure (failure already occurring).
    • Audible Noise. Noise is often an indicator of deterioration moving into the fault domain, and requiring more immediate action than vibration, temperature, or ultrasonic indicators.
    • Hot to Touch. Generally, once bearings, motors, or shafts become hot to the touch, failure is imminent and quick action is needed to avoid catastrophic failure.
    • Mechanically Loose. This indicator may fall into preventative maintenance (maintenance performed at time-based intervals rather than based on need) and may not catch impending failures until it is too late. Parts that are obviously loose can indicate a deeper problem, often close to failure.
    • Ancillary Damage. This detects when other parts of the machine/assets are being damaged prior to a catastrophic failure (for example, a damaged belt due to belt misalignment caused by a failing bearing). Generally, when this is found, it is too late to prevent the failure of the asset.

This list does not cover all possible indicators. Machine users and builders may have others depending on their unique application – other potential methods to detect asset deterioration include monitoring of current, corrosion, or leaks.
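One way to use the ordering above is to ask which observed indicator sits furthest along the P-F curve, since that one sets the urgency. The sketch below assumes the list order is the rough order in which the indicators appear, as the text states. The function name and the lower-case labels are illustrative choices, not part of any standard.

```python
from typing import Iterable, Optional

# Indicators in the rough order listed above, earliest warning first.
PF_ORDER = [
    "ultrasonic spike energy",
    "vibration analysis",
    "oil analysis",
    "temperature analysis",
    "pressure & flow",
    "audible noise",
    "hot to touch",
    "mechanically loose",
    "ancillary damage",
]


def latest_stage_indicator(observed: Iterable[str]) -> Optional[str]:
    """Return the observed indicator that appears latest on the P-F curve,
    or None if nothing has been observed."""
    rank = {name: i for i, name in enumerate(PF_ORDER)}
    names = [o.strip().lower() for o in observed]
    unknown = [n for n in names if n not in rank]
    if unknown:
        raise ValueError(f"indicators not in the list: {unknown}")
    if not names:
        return None
    return max(names, key=rank.__getitem__)


if __name__ == "__main__":
    print(latest_stage_indicator(["Vibration Analysis", "Audible Noise"]))
```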

The “best” indicator and approach will depend on each user’s and each asset’s unique risk/cost/benefit profile. Machine builders and users should work closely with an experienced condition monitoring solution provider who provides multiple solutions to help consider and assess the tradeoffs associated with various approaches.

Does Your Stamping Department Need a Checkup? Try a Die-Protection Risk Assessment

If you have ever walked through a stamping department at a metal forming facility, you have heard the rhythmic sound of the press stamping out parts, thump, thump. The stamping department is the heart of the manufacturing facility, and the noise you hear is the heartbeat of the plant. If it stops, the whole plant comes to a halt. With increasing demands for higher production rates, less downtime, and reduction in bad parts, stamping departments are under ever-increasing pressure to optimize the press department through die protection and error-proofing programs.

The die-protection risk assessment team

The first step in implementing or optimizing a die protection program is to perform a die-protection risk assessment. This is much like risk assessments conducted for safety applications, except they are done for each die set. To do this, build a team of people from various positions in the press department like tool makers, operators, and set-up teams.

Once this team is formed, they can help identify any incidents that could occur during the stamping operations for each die set and determine the likelihood and the severity of possible harm. With this information, they can identify which events have a higher risk/severity and determine what additional measures they should implement to prevent these incidents. An audit is possible even if there are already some die protection sensors in place to determine if there are more that should be added and verify the ones in place are appropriate and effective.

The top 4 die processes to check

The majority of quality and die protection problems occur in one of these four areas: material feed, material progression, part-out detection, and slug-out detection. It’s important to monitor these areas carefully with various sensor technologies.

Material feed

Material feed is perhaps the most critical area to monitor. You need to ensure the material is in the press, in the correct location, and feeding properly before cycling the press. The material could be feeding as a steel blank, or it could come off a roll of steel. Several errors can prevent the material from advancing to the next stage or out of the press: the feed can slip, the stock material feeding in can buckle, or scrap can fail to drop and block the strip from advancing, to name a few. Inductive proximity sensors, which detect iron-based metals at short distances, are commonly used to check material feeds.

Material progression

Material progression is the next area to monitor. When using a progressive die, you will want to monitor the stripper to make sure it is functioning and the material is moving through the die properly. With a transfer die, you want to make sure the sheet of material is nesting correctly before cycling the press. Inductive proximity sensors are the most common sensor used in these applications, as well.

Here is an example of using two inductive proximity sensors to determine if the part is feeding properly or if there is a short or long feed. In this application, both proximity sensors must detect the edge of the metal. If the alignment is off by just a few millimeters, one sensor won’t detect the metal. You can use this information to prevent the press from cycling to the next step.

Short feed, long feed, perfect alignment

Part-out detection

The third critical area that stamping departments typically monitor is part-out detection, which makes sure the finished part has come out of the stamping area after the cycle is complete. Cycling the press and closing the tooling on a formed part that failed to eject can result in a number of undesirable events, like blowing out an entire die section or sending metal shards flying into the room. Optical sensors are typically used to check for part-out, though the type of photoelectric sensor needed depends on the situation. If the part consistently comes out of the press at the same position every time, a through-beam photo-eye would be a good choice. If the part is falling at different angles and locations, you might choose a non-safety rated light grid.

Slug-ejection detection

The last event to monitor is slug ejection. A slug is a piece of scrap metal punched out of the material. For example, if you needed to punch some holes in metal, the slug would be the center part that is knocked out. You need to verify that the scrap has exited the press before the next cycle. Sometimes the scrap will stick together and fail to exit the die with each stroke. Failure to make sure the scrap material leaves the die could affect product quality or cause significant damage to the press, die, or both. Various sensor types can ensure proper scrap ejection and prevent crashes. The picture below shows a die with inductive ring sensors mounted in it to detect slugs as they fall out of the die.

Just like it is important to get regular checkups at the doctor, performing regular die-protection assessments can help you make continuous improvements that can increase production rates and reduce downtime. Material feed, material progression, part-out and slug-out detection are the first steps to optimize, but you can expand your assessments to include areas like auxiliary equipment. You can also consider smart factory solutions like intelligent sensors, condition monitoring, and diagnostics over networks to give you more data for preventative maintenance or more advanced error-proofing. The key to a successful program is to assemble the right team, start with the critical areas listed above, and learn about new technologies and concepts that are becoming available to help you plan ways to improve your stamping processes.
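The four checks named above can be pictured as a single permissive for the next press stroke. The sketch below is a logic illustration only, not press-control or safety code: the signal names are invented, and it assumes each check has already been reduced to one true/false signal by the sensors described earlier.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DieSignals:
    """Hypothetical sensor states sampled before the next stroke."""
    feed_sensor_a: bool   # first inductive sensor sees the material edge
    feed_sensor_b: bool   # second inductive sensor sees the material edge
    progression_ok: bool  # strip or blank is seated correctly in the die
    part_out: bool        # finished part was seen leaving the die
    slug_out: bool        # slug was seen leaving the die


def blocked_reasons(s: DieSignals) -> List[str]:
    """List every check that currently prevents the press from cycling."""
    reasons = []
    if not (s.feed_sensor_a and s.feed_sensor_b):
        # Both sensors must detect the edge; otherwise the feed is off.
        reasons.append("material feed")
    if not s.progression_ok:
        reasons.append("material progression")
    if not s.part_out:
        reasons.append("part-out")
    if not s.slug_out:
        reasons.append("slug-out")
    return reasons


def press_may_cycle(s: DieSignals) -> bool:
    return not blocked_reasons(s)


if __name__ == "__main__":
    misfeed = DieSignals(True, False, True, True, True)
    print(press_may_cycle(misfeed), blocked_reasons(misfeed))
```

Returning the list of failed checks, rather than a bare yes/no, gives the operator a starting point for troubleshooting.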

Maximize the Benefits of Open-Source Code in Manufacturing Software

The rise of many players in manufacturing automation, along with factories’ growing adoption of Industrial Internet of Things (IIoT) and automation solutions, presents a suitable environment for open-source software. This software is a value-adding solution for manufacturers, regardless of their operation technology and management requirements, due to the customization, resiliency, scalability, accessibility, cost-effectiveness, and quality it allows.

Customization

Software developers who use open-source code provide software with a core code that establishes specific features and allows users to access it and make changes as necessary. The process is much like being able to complete an author’s writing prompt or change the end of a story. Unlike a closed system that locks users in, open-source allows them to adapt and modify the code to meet a particular need or application.

This add-on coding system provides endless customization. It enables communities (i.e., users) to add or remove features beneficial in an integration phase, such as features for user testing or to find the best solution for a machine.

Customization is also valuable regarding data visualizations; users can develop dashboards and visuals that best describe their operations. Suppose a sensor provides real-time condition monitoring data over a particular machine. In that case, it’s possible to customize the code supporting the software that gathers and processes the data for specific parameters or to calculate specific values.

Resiliency

Additionally, open-source code is resilient to change because it can be modified quickly. The ability to quickly add or remove features and adapt to cyber environments or specific applications also makes it versatile. Just as exposure to pathogens can strengthen an immune response to those pathogens, open-source code can be made stronger by exposure to different environments and applications, leaving it better prepared to face cybersecurity threats. Because of the testing and enhancements made by so many programmers, implementing open-source code isn’t any riskier (cybersecurity-wise) than implementing closed-source code. However, it is up to the implementer to apply the same rules that apply to closed-source software. The implementer must be aware of the code’s source and avoid code from non-reputable sources who could have modified it with negative intentions. Overall, the code is resilient, adaptable, and agile in a new environment.

Scalability

The add-on and customization aspects of open-source also allow the code to be highly scalable. This scalable implementation happens in two dimensions: adoption timeline and application-based. Both are important to guarantee user acceptance and that it meets the operation and application requirements. Regarding the adoption timeline, scalability allows modification of the software and code to meet users’ expectations. Open-sourced code enables the implementation of features for user testing and feedback. The ultimate solution will include multiple iterations to meet the users’ needs and fulfill operation expectations.

On the other hand, this code is scalable based on the application(s), such as working on different machines, multiples of the same machine with different purposes, or adding/dropping features for specific uses. Say, for example, there are three of the same machine (A, B, and C), but they are in different environments. Machine A is in an environment that is 28°F, B is at room temperature, and C is exposed to constant wash-down. In this case, the condition monitoring software defines the acceptable parameters for each scenario, avoiding false alarms from erroneous triggers. In this example, the base code is adapted to include specific features based on the application.
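A minimal sketch of the three-machine example follows, assuming the software keeps one set of limits per machine. Every number here is made up for illustration, as are the names `Limits`, `PROFILES`, and `alarms`. The point is only that the same base code evaluates each machine against its own environment.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Limits:
    min_temp_f: float
    max_temp_f: float
    max_humidity_pct: float


# Hypothetical per-machine limits; the values are illustrative only.
PROFILES = {
    "A": Limits(20.0, 40.0, 60.0),   # cold environment, around 28°F
    "B": Limits(60.0, 85.0, 60.0),   # room temperature
    "C": Limits(60.0, 85.0, 100.0),  # constant wash-down: moisture expected
}


def alarms(machine: str, temp_f: float, humidity_pct: float) -> List[str]:
    """Evaluate readings against the limits for that machine only."""
    lim = PROFILES[machine]
    raised = []
    if not lim.min_temp_f <= temp_f <= lim.max_temp_f:
        raised.append("temperature")
    if humidity_pct > lim.max_humidity_pct:
        raised.append("humidity")
    return raised


if __name__ == "__main__":
    # The same 28°F reading is normal for A but an alarm for B.
    print(alarms("A", 28.0, 40.0))
    print(alarms("B", 28.0, 40.0))
    # High humidity is expected on C but not on B.
    print(alarms("C", 70.0, 95.0))
    print(alarms("B", 70.0, 95.0))
```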

Accessibility

In general, cost-effective and high-quality open-source code is available online. There are additional resources as well, such as free coding tutorials that don’t require any licenses. Moreover, under many open-source licenses, programmers who modify and redistribute the code must make the new version available again, which helps keep the code accessible and up to date.

Cost-effectiveness and quality

Regarding cost-effectiveness, using community open-source code significantly reduces the cost of developing, integrating, and testing software built in-house. It also reduces the implementation time and makes for better production operations. Essentially, it is high-quality, reliable code created by trusted sources for multiple coders and users.

“The application drives the technology” mantra is at the heart of open-source software development—a model where source code is available for community members to use, modify, and share. IIoT enablers and providers in the manufacturing industry own a particular solution that is then available for manufacturers to adapt to their specific operational requirements. With the increasing adoption of data-collecting technologies, it is in manufacturers’ best interest to seek software providers who grant them the flexibility to adjust software solutions to meet their specific needs. Automation is a catalyst for data-driven operation and maintenance.

Condition Monitoring & Predictive Maintenance: Cost-Benefit Tradeoffs

In a previous blog post, we discussed the basics of the Potential-Failure (P-F) curve, which refers to the interval between the detection of a potential failure and occurrence of a functional failure. In this post we’ll discuss the cost-benefit tradeoffs of various maintenance approaches.

In general, the goal is to maximize the P-F interval, which is the time between the first symptoms of impending failure and the functional failure taking place. In other words, you want to become aware of an impending failure as soon as possible to allow more time for action. This, however, must be balanced with the cost of the methods of prevention, inspection, and detection.

There is a trade-off between the cost of systems to detect and predict the failures and how soon you might detect the condition. Generally, the earlier the detection/prediction, the more expensive it is. However, the longer it takes to detect an impending failure (i.e., the more the asset’s condition degrades), the more expensive it is to repair it. Every asset will have a unique trade-off between the cost of failure prevention (detection/prediction) and the cost of failure. This means some assets probably call for earlier detection methods that come with higher prevention costs, like condition monitoring and analytics systems, due to the high cost to repair (see the Prevention-1 and Repair-1 curves in the Cost-Failure/Time chart). And some assets may be better suited for more cost-efficient but delayed detection or even a “run-to-failure” model due to lower cost to repair (the Prevention-2 and Repair-2 curves in the Cost-Failure/Time chart).
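
To make the trade-off concrete, here is a small Python sketch that compares the expected annual cost of two strategies for one asset. The cost model is a deliberate simplification and an assumption of this sketch, not a formula from the chart: expected cost is the prevention cost, plus failures caught early and repaired at a planned cost, plus failures missed and paid for at the full unplanned cost. All dollar figures, failure rates, and detection fractions are made-up illustration values.

```python
def expected_annual_cost(prevention_cost, failures_per_year, detected_fraction,
                         planned_repair_cost, unplanned_failure_cost):
    """Prevention spend plus the cost of caught and missed failures."""
    caught = failures_per_year * detected_fraction
    missed = failures_per_year - caught
    return (prevention_cost
            + caught * planned_repair_cost
            + missed * unplanned_failure_cost)


def compare(strategies, failures_per_year, planned_repair_cost, unplanned_failure_cost):
    """Return the cheapest strategy name and the cost of every strategy."""
    costs = {
        name: expected_annual_cost(prevention_cost, failures_per_year,
                                   detected_fraction, planned_repair_cost,
                                   unplanned_failure_cost)
        for name, (prevention_cost, detected_fraction) in strategies.items()
    }
    return min(costs, key=costs.get), costs


# Hypothetical strategies: (annual prevention cost, fraction of failures caught early).
STRATEGIES = {
    "run-to-failure": (0.0, 0.0),
    "condition monitoring": (5000.0, 0.9),
}

# A critical asset with an expensive unplanned failure (illustrative numbers).
critical_best, critical_costs = compare(STRATEGIES, 0.5, 2000.0, 50000.0)

# A cheap, easily replaced asset (illustrative numbers).
cheap_best, cheap_costs = compare(STRATEGIES, 0.5, 500.0, 1500.0)
```

Under these assumed numbers, condition monitoring comes out cheaper for the critical asset and run-to-failure comes out cheaper for the inexpensive one. Real decisions would need the asset's own failure history and cost data.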

 

There are four basic maintenance approaches:

Reactive

The Reactive approach has low or even no cost to implement but can result in a high repair/failure cost because no action is taken until the asset has reached a fault state. This approach might be appropriate when the cost of monitoring systems is very high compared to the cost of repairing or replacing the asset. As a general guideline, the Reactive approach is not a good strategy for critical and/or high-value assets because the cost of a failure is high.

Reactive approaches:

      • Offer no visibility
      • Fix only if it breaks – low overall equipment effectiveness (OEE)
      • High downtime
      • Uncertainty of failures

Preventative

The Preventative approach (maintenance at time-based intervals) may be appropriate when failures are age related and maintenance can be performed at regular intervals before anticipated failures occur. Two drawbacks to this approach are: 1) the cost and time of preventative maintenance can be high; and 2) studies show that only 18% of failures are age related (source: ARC Advisory Group). 82% of failures are “random” due to improper design/installation, operator error, quality issues, machine overuse, etc. This means that taking the Preventative approach may be spending time and money on unnecessary work, and it may not prevent expensive failures in critical or high value assets.

Preventative approaches:

      • Scheduled tune ups
      • Higher equipment longevity
      • Reduced downtime compared to reactive mode

Condition-Based

The Condition-Based approach attempts to address failures regardless of whether they are age-based or random. Assets are monitored for one or more potential failure indicators, such as vibration, temperature, current/voltage, pressure, etc. The data is often sent to a PLC, local HMI, special processor, or the cloud through an edge gateway. Predefined limits are set and alerts (alarm, operator message, maintenance/repair) are only sent when a limit is reached. This approach avoids unnecessary maintenance and can give warning before a failure occurs. Condition-based monitoring can be very cost-effective, though very sophisticated solutions can be expensive. It is a good solution when the cost of failure is medium or high and known indicators provide a reliable warning of impending failure.

Condition-based approaches:

      • Based on condition (PdM)
      • Enables predictive maintenance
      • Improves OEE, equipment longevity
      • Drastically reduces unplanned downtime
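
The limit-and-alert behavior described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the indicator name, the limit value, and the Alert record are hypothetical, and suppressing repeat alerts while a reading stays above its limit is a design choice of this sketch, not something the approach requires.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Alert(NamedTuple):
    sample_index: int
    indicator: str
    value: float
    limit: float


def alerts(samples: Iterable[Dict[str, float]],
           limits: Dict[str, float]) -> Iterator[Alert]:
    """Yield an alert each time an indicator newly reaches its upper limit."""
    in_alarm = set()
    for index, sample in enumerate(samples):
        for name, value in sample.items():
            limit = limits.get(name)
            if limit is None:
                continue  # no predefined limit for this indicator
            if value >= limit:
                if name not in in_alarm:
                    in_alarm.add(name)
                    yield Alert(index, name, value, limit)
            else:
                in_alarm.discard(name)  # back in range, so re-arm the alert


# Hypothetical vibration readings and limit, for illustration only.
readings = [
    {"vibration_mm_s": 2.0},
    {"vibration_mm_s": 4.6},
    {"vibration_mm_s": 4.8},
    {"vibration_mm_s": 3.0},
    {"vibration_mm_s": 5.0},
]
raised = list(alerts(readings, {"vibration_mm_s": 4.5}))
```

Here the second and fifth samples raise alerts. The third does not, because the reading was already above the limit when it arrived.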

Predictive Analytics

Predictive Analytics is the most sophisticated approach and attempts to learn from machine performance to predict failures. It utilizes data gathered through Condition Monitoring, and then applies analysis or AI/Machine Learning to uncover patterns to predict failures before they occur. The hardware and software to implement Predictive Analytics can be expensive, and this method is best for high-value/critical assets and expensive potential failures.

Predictive Analytics approaches:

      • Based on patterns – stored information
      • Based on machine learning
      • Improves OEE, equipment longevity
      • Avoids downtime
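
As a very simple stand-in for the pattern-finding step, the Python sketch below fits a straight line to a history of readings and estimates how long until the trend reaches a limit. This is plain least-squares extrapolation, not AI/Machine Learning, and real predictive analytics systems are far more sophisticated. The function names, the readings, and the limit are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hours_until_limit(hours, values, limit):
    """Estimate hours from the last sample until the trend reaches the limit.

    Returns None when the fitted trend is flat or falling.
    """
    slope, intercept = fit_line(hours, values)
    if slope <= 0:
        return None
    crossing_hour = (limit - intercept) / slope
    return max(0.0, crossing_hour - hours[-1])


# Hypothetical vibration history: operating hours and readings.
operating_hours = [0, 100, 200, 300]
vibration = [2.0, 2.5, 3.0, 3.5]
remaining = hours_until_limit(operating_hours, vibration, limit=5.0)
```

With this made-up history, the trend reaches the limit roughly 300 operating hours after the last sample, which is the kind of lead time a maintenance planner could act on.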

Each user will need to evaluate the unique attributes of their assets and decide on the best approach, weighing the cost of prevention (detection of potential failure) against the cost of repair/failure. In general, a Reactive approach is only best when the cost of failure is very low. Preventative maintenance may be appropriate when failures are clearly age-related. And advanced approaches such as Condition-Based monitoring and Predictive Analytics are best when the cost of repair or failure is high.

Also note that technology providers are continually improving condition monitoring and predictive solutions. As providers lower the cost of condition monitoring systems and make them easier to set up and use, users can cost-effectively move from Reactive or Preventative approaches to Condition-Based or Predictive approaches.

IO-Link Event Data: How Sensors Tell You How They’re Doing

I have been working with IO-Link for more than 10 years, so I’ve heard lots of questions about how it works. One line of questions I hear from customers is about the operating condition of sensors. “I wish I knew when the IO-Link device loses output power,” or, “I wish my IO-Link photoelectric sensor would let me know when the lens is dirty.” The good news is that it does give you this information by sending Event Data. That’s a type of data that is usually not a focus of users, although it is available in JSON format from the REST API.

There are three types of IO-Link data:

      • Process Data – updated cyclically, it’s important to users because it contains the data for use in the running application, like I/O change of states or measurement values like temperature and position, etc.
      • Parameter Data – updated acyclically, it’s important to users because it’s the mechanism to read and write parameter values like setpoints, thresholds, and configuration settings to the sensor, and for reading non-time critical values like operating hours, etc.
      • Event Data – updated acyclically, it’s important to users because it provides immediate updates on device conditions.

Let’s dig deeper into Event Data. An Event is a status update from the IO-Link device when a condition is out of its normal range. The Event is labeled as a Warning or Error based on the severity of the condition change.

When an Event occurs on the IO-Link device, the device sets the Event Flag bit in the outgoing data packet to the IO-Link Master. The Master receives the Event Flag and then queries the IO-Link device for the Event information.

It is important to note that this is a one-time data message. The IO-Link device only sends the Event Flag at the moment the condition is out of range, and then again when the condition is back in range.

Event Data Types, Modes, and Codes

Event Data has the following three components:

      • Event Type – categorized in three ways
        • Notification – a simple event update; nothing is abnormal with the IO-Link device
        • Warning – a condition is out of range and risks damaging the IO-Link device
        • Error – a condition is out of range and is affecting the device negatively to the point that it may not function as expected
      • Event Mode – categorized in three ways
        • Event notice – usually associated with Event Type notifications, message will not be updated
        • Event appears – the condition is now out of range
        • Event disappears – the condition is now back in range
      • Event Code
        • A two-byte Hex code that represents the condition that is out of range

IO-Link condition monitoring sensor

To bring all these components together, let’s look at a photoelectric IO-Link sensor with internal condition monitoring functions and see which Events its device manual lists. This device has Events for temperature (both warning and error), voltage, inclination (sensor angle is out of range), vibration, and signal quality (dirty lens).

By monitoring these events, you have a better feel for the condition of your IO-Link device. Along with helping you identify immediate problems, this can help you plan preventive maintenance.

An IO-Link condition monitoring sensor uses Event Data similarly to report when conditions exceed the thresholds that you have set. For example, when the vibration level exceeds the threshold value, the IO-Link device sends the Warning event flag and the IO-Link Master queries for the event data. The event data consists of an Event Type, an Event Mode, and an Event Code that represents the specific alarm condition that is out of range. Remember this is a one-time action; the IO-Link sensor will not report this again until the value is in an acceptable range.

When the vibration level is back in range and the alarm condition is no longer present in the IO-Link device, the process repeats itself. In this case the Event Type and Event Code will be the same. The only change is that the Event Mode will report Event Disappears.
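
Because each Event is a one-time message, whatever receives it has to remember which conditions are currently out of range. Here is a minimal Python sketch of that bookkeeping. The class and field names are my own, the example Event Code is a placeholder rather than a value from the IO-Link Specification or any vendor manual, and the sketch assumes the Event has already been read from the IO-Link Master and decoded.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    NOTIFICATION = "notification"
    WARNING = "warning"
    ERROR = "error"


class EventMode(Enum):
    NOTICE = "event notice"
    APPEARS = "event appears"
    DISAPPEARS = "event disappears"


@dataclass(frozen=True)
class Event:
    type: EventType
    mode: EventMode
    code: int  # two-byte Event Code


class ActiveEvents:
    """Tracks which Event Codes are currently out of range."""

    def __init__(self):
        self._active = {}

    def handle(self, event: Event) -> None:
        if event.mode is EventMode.APPEARS:
            self._active[event.code] = event.type
        elif event.mode is EventMode.DISAPPEARS:
            self._active.pop(event.code, None)
        # Event notices are one-shot updates, so there is nothing to track.

    def active(self) -> dict:
        return dict(self._active)


# Placeholder code for illustration only; look up real codes in the device manual.
HYPOTHETICAL_VIBRATION_CODE = 0x1234

tracker = ActiveEvents()
tracker.handle(Event(EventType.WARNING, EventMode.APPEARS, HYPOTHETICAL_VIBRATION_CODE))
while_out_of_range = tracker.active()
tracker.handle(Event(EventType.WARNING, EventMode.DISAPPEARS, HYPOTHETICAL_VIBRATION_CODE))
after_recovery = tracker.active()
```

After the first message the vibration warning is recorded as active, and after the matching Event Disappears message the tracker is empty again. If the application misses a message, its view of the device goes stale, so logging every Event as it arrives is a sensible companion to this kind of tracker.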

Within the IO-Link Specification there is a list of defined Event Codes that are common across all vendors. There is also a block of undefined Event Code values that allow vendors to create Event Codes that are unique to their specific device.

“I wish the IO-Link device would let me know….” In the end, the device might be telling you what you want to know, especially if the device has condition monitoring functions built into it. If you want to know more about condition monitoring in your IO-Link devices, check out the Event section in the vendor’s manuals so you can learn how to use this information.

Control Meets IIoT, Providing Insights into a New World

In manufacturing and automation control, the programmable logic controller (PLC) is an essential tool. And since the PLC is integrated into the machine already, it’s understandable that you might see the PLC as all that you need to do anything in automation on the manufacturing floor.

Condition monitoring in machine automation

For example, process or condition monitoring is emerging as an important automation feature that can help ensure that machines are running smoothly. This can be done by monitoring motor or mechanical vibration, temperature or pressure. You can also add functionality for a machine or line configuration or setup by adding sensors to verify fixture locations for machine configuration at changeovers.

One way to do this is to wire these sensors to the PLC and modify its code and use it as an all-in-one device. After all, it’s on the machine already. But there’s a definite downside to using a PLC this way. Its processing power is limited, and there are limits to the number of additional processes and functions it can run. Why risk possible complications that could impact the reliability of your control systems? There are alternatives.

External monitoring and support processes

Consider using more flexible platforms, such as an edge gateway, Linux, and IO-Link. These external platforms open a whole new world of alternatives that provide better reliability and more options for today and the future. They also make it easier to access and integrate condition monitoring and configuration data into enterprise IT/OT (information technology/operational technology) systems, which PLCs are not well suited to interface with, if they can be integrated at all.

Here are some practical examples of this type of augmented or add-on/retrofit functionality:

      • Motor or pump vibration condition monitoring
      • Support-process related pressure, vibration and temperature monitoring
      • Monitoring of product or process flow
      • Portable battery based/cloud condition monitoring
      • Mold and Die cloud-based cycle/usage monitoring
      • Product changeover, operator guidance system
      • Automatic inventory monitoring warehouse system

Using external systems for these additional functions means you can readily take advantage of the ever-widening availability of more powerful computing systems and the simple connectivity and networking of smart sensors and transducers. Augmenting and improving your control systems with external monitoring and support processes is one of the notable benefits of employing Industrial Internet of Things (IIoT) and Industry 4.0 tools.

The ease with which you can integrate these systems into IT/OT systems, even including cloud-based access, can dramatically change what is now available for process information-gathering and monitoring, and can augment processes without touching or affecting the basic control system of new or existing machines or lines. In many cases, external systems can even be added at lower price points than PLC modification, which means they can be more easily justified for their ROI and functionality.