3 Most Common Troubleshooting Mistakes (And How To Avoid Them)
Equipment troubleshooting is both an art and a science. A systematic approach elevates it beyond trial-and-error into a streamlined process for identifying and correcting adverse conditions. When executed correctly, troubleshooting helps plant maintenance operations deal with work order backlogs, lost production, and safety and compliance issues much more efficiently.
What is troubleshooting?
Most asset-intensive businesses take steps to minimize unexpected equipment failure by implementing a combination of robust preventive, predictive, and condition-based maintenance strategies. A proactive approach to routine asset maintenance such as this is widely agreed to be the best way to address equipment issues before catastrophic breakdowns occur.
However, it is impractical to expect that such an approach will prevent every possible equipment breakdown - it is nearly impossible to understand or detect every possible failure mode on today’s complex machines. Therefore, companies need a systematic approach to troubleshooting equipment deficiencies to minimize unplanned downtime.
Troubleshooting is the process of identifying or narrowing down the cause of an equipment malfunction when the problem itself is not immediately apparent. The process relies heavily on collecting as much relevant data as possible from various sources to help identify the most likely cause of the breakdown.
That being said, efficient and effective troubleshooting can be very challenging, given the high number of potential unknowns involved.
Common troubleshooting mistakes
Let's take a look at 3 of the most common troubleshooting mistakes prevalent across most industries.
Insufficient maintenance and component history
Insufficient history on the equipment can be the Achilles' heel of troubleshooting. This includes missing information on prior operating or commissioning history, which helps establish baselines, as well as missing maintenance history on the equipment.
For example, if the commissioning, factory testing, or normal operating data are not available for a piece of equipment, it may be extremely difficult to know what results to expect when you obtain data during troubleshooting. Likewise, if recent maintenance history details are not available, decision-makers may be blind to any recently implemented changes on the equipment or to the sub-components that were worked on. After all, one of the most common aspects of a systematic approach to problem-solving is asking questions such as “What has changed since the equipment last performed within normal parameters?” and “What was the last major intrusive change on the equipment?”
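To illustrate why baselines matter, here is a minimal Python sketch that compares readings taken during troubleshooting against commissioning values. The parameter names, expected values, and tolerances are all hypothetical, made up for illustration; real baselines would come from your own commissioning or factory test records.

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    parameter: str        # hypothetical parameter name
    expected: float       # value recorded at commissioning (assumed non-zero)
    tolerance_pct: float  # allowed deviation, as a percentage of expected


def find_deviations(baselines, readings):
    """Return (parameter, reading, deviation_pct) for readings outside tolerance."""
    deviations = []
    for b in baselines:
        if b.parameter not in readings:
            continue  # nothing was captured for this parameter
        value = readings[b.parameter]
        deviation_pct = abs(value - b.expected) / abs(b.expected) * 100
        if deviation_pct > b.tolerance_pct:
            deviations.append((b.parameter, value, round(deviation_pct, 1)))
    return deviations


# Illustrative numbers only, not taken from any real equipment.
baselines = [
    Baseline("motor_current_a", 42.0, 10.0),
    Baseline("bearing_temp_c", 65.0, 15.0),
    Baseline("discharge_pressure_kpa", 800.0, 5.0),
]
readings = {
    "motor_current_a": 44.0,
    "bearing_temp_c": 82.0,
    "discharge_pressure_kpa": 790.0,
}

for parameter, value, pct in find_deviations(baselines, readings):
    print(f"{parameter}: {value} is {pct}% away from baseline")
```

Without the baseline table, the bearing temperature reading is just a number; with it, the reading stands out as the one parameter well outside its normal range.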
Not capturing enough data
The easiest, most reliable method to determine the apparent or root causes of equipment malfunction involves having good data on your equipment. For example, when using a datalogger to capture non-instrumented signals, if the troubleshooting process does not capture the right range of parameters for a long-enough period, it may be challenging to pinpoint the anomalies or see a trend that may span across multiple parameters.
Also, the process of eliminating non-plausible or low probability causes depends on the availability and accuracy of the associated data. Combining details from operators regarding the circumstances of the failure with information on the equipment’s maintenance and inspection history makes it easier for the technicians and engineers to zero in on the cause of the problem.
Inadequate troubleshooting procedures
Even the most knowledgeable and seasoned maintenance technicians may make errors; after all, humans by nature are error-prone to some extent. Inadequate troubleshooting procedures that lack depth and specificity increase the likelihood of human performance errors, and may result in the intended scope of troubleshooting not being performed.
Troubleshooting best practices
Now, let’s run through some best practices that help you avoid these common mistakes or minimize their impact.
Increase accessibility to maintenance history and documents
Maintenance software such as a Computerized Maintenance Management System (CMMS) enables you to measure maintenance performance by tracking data such as the Preventive Maintenance (PM) completion percentage, frequency of breakdowns, and scope of prior repairs on specific equipment. A CMMS automatically records work completed on equipment and, importantly, gives maintenance technicians the opportunity to add comments about what they found during an inspection, in the form of work reports or as-found conditions. It provides a simple method for cross-referencing symptoms of the current adverse conditions with elements of past issues.
Additionally, using a CMMS, work planners can link to important documents such as maintenance manuals, drawings, videos, photos, OEM websites, and the troubleshooting procedures themselves for future reference. All the information linked to specific equipment can be stored so that it is easily accessible the next time around.
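As a rough picture of what cross-referencing current symptoms with past issues can look like, the Python sketch below searches as-found comments for symptom keywords. The record layout, work order numbers, and asset IDs are hypothetical and do not reflect the data model or API of any particular CMMS.

```python
def match_past_issues(history, asset_id, symptoms):
    """Return (work_order, matched_symptoms) for past records on the same asset."""
    wanted = {s.lower() for s in symptoms}
    matches = []
    for record in history:
        if record["asset_id"] != asset_id:
            continue  # only look at history for the asset being investigated
        text = record["as_found"].lower()
        hits = sorted(s for s in wanted if s in text)
        if hits:
            matches.append((record["work_order"], hits))
    return matches


# Hypothetical work order history, for illustration only.
history = [
    {"work_order": "WO-1001", "asset_id": "PUMP-07",
     "as_found": "Bearing noise and high vibration on drive end"},
    {"work_order": "WO-1017", "asset_id": "PUMP-07",
     "as_found": "Replaced mechanical seal after leak"},
    {"work_order": "WO-1020", "asset_id": "FAN-02",
     "as_found": "High vibration, belt misaligned"},
]

for work_order, hits in match_past_issues(history, "PUMP-07", ["vibration", "Leak"]):
    print(work_order, hits)
```

A plain keyword match like this is only as good as the comments technicians leave behind, which is one more reason to record as-found conditions consistently.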
Use a systematic approach
Troubleshooting usually follows a systematic, four-step approach: investigating the problem, planning a response, testing the solution, and resolving the issue. The first three steps are often iterated multiple times before a successful resolution can be reached.
Effective troubleshooting starts with eliminating ambiguity. Finding the apparent or root cause of an issue quickly, and then resolving it effectively, is the winning formula over the longer term. Root cause analysis (RCA) is one such tool that will help achieve this goal.
RCA is a widely adopted technique that helps pinpoint the most likely reason behind a failure. One common RCA method, often called the 5 Whys, consists of asking a series of “Why” questions until you get to the core of the problem. RCA has two benefits - first, it helps identify the immediate cause of failure and fix it quickly; second, it leads to the core of the issue and a long-term solution.
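The chain of “Why” questions can be recorded so that the reasoning survives beyond the troubleshooting session. The Python sketch below is one simple way to do that; the function name and the example failure are hypothetical and only meant to show the shape of the record.

```python
def why_chain(problem, answers):
    """Pair each answer with the 'Why' question it responds to.

    Returns the full chain and the last answer, which is treated as the
    candidate root cause. It still needs to be verified on the equipment.
    """
    chain = []
    current = problem
    for answer in answers:
        chain.append((f"Why: {current}?", answer))
        current = answer  # the next question asks why this answer happened
    return chain, current


# Hypothetical example, for illustration only.
chain, root_cause = why_chain(
    "the pump tripped on overload",
    [
        "the motor drew excessive current",
        "the bearing seized",
        "the bearing ran without lubrication",
        "the lubrication PM was skipped",
    ],
)

for question, answer in chain:
    print(question, "->", answer)
print("Candidate root cause:", root_cause)
```

Here the first answer points to the immediate fix, while the last one points to the long-term corrective action.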
Build a detailed task list
Creating detailed task lists is one way to bolster troubleshooting execution. A task list, as part of the troubleshooting process, outlines a series of tasks required to complete a larger job. This is a common CMMS feature that helps ensure crucial steps aren’t missed.
For example, the larger scope of the troubleshooting process may include multiple instrumentation tie-ins with prerequisites and post-requisites for each. Additionally, the troubleshooting process itself may include additional verifications and inspections. If you want the job done right, you need to be as specific as possible, including pass/fail criteria, back-out conditions, contingency plans, condition monitoring, etc. The task list acts as a guide when testing possible solutions - maintenance technicians can either double down on a smoking gun or disqualify a diagnosis as quickly as possible.
Additionally, a comprehensive task list can act as a great place-keeping tool like a checklist, especially for troubleshooting evolutions that span multiple shifts or days with turnovers between maintenance crews.
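A task list with explicit pass/fail criteria, used as a place-keeping tool across shifts, might be modeled along the lines of the Python sketch below. The field names and the example steps are hypothetical; a CMMS would typically provide its own task list feature instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    step: int
    description: str
    pass_criteria: str
    result: Optional[str] = None  # "pass", "fail", or None if not yet performed


def next_open_task(tasks):
    """Return the lowest-numbered task with no result yet (where to resume)."""
    for task in sorted(tasks, key=lambda t: t.step):
        if task.result is None:
            return task
    return None


def failed_steps(tasks):
    """Return step numbers that did not meet their pass criteria."""
    return [t.step for t in tasks if t.result == "fail"]


# Hypothetical troubleshooting steps, for illustration only.
tasks = [
    Task(1, "Verify supply voltage at motor terminals", "within 5% of nameplate", "pass"),
    Task(2, "Measure bearing temperature at full load", "below alarm setpoint", "fail"),
    Task(3, "Inspect coupling alignment", "within OEM tolerance"),
    Task(4, "Check lubrication level", "between min and max marks"),
]

resume_at = next_open_task(tasks)
print("Resume at step", resume_at.step, "-", resume_at.description)
print("Failed steps so far:", failed_steps(tasks))
```

At turnover, the incoming crew can see at a glance that steps 1 and 2 are done, that step 2 failed its criteria, and that work resumes at step 3.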
Conclusion
Because equipment failures cannot be avoided altogether, you need to develop a process for dealing with them. The most effective one is systematic troubleshooting supported by a modern CMMS. By centralizing asset data, CMMS solutions make it easy for you to implement maintenance troubleshooting that identifies the cause of sudden equipment failures, and then mobilize the required resources to get equipment up and running as quickly as possible.
Author Bryan Christiansen is the founder and CEO of Limble CMMS.