Reliability: What Is It?
Concepts used throughout the world to describe improvements to production throughput are often termed “reliability”. But just what does the term really mean? What work performed by a plant’s staff is considered “reliability”, and how does this work actually improve the profitability of a company? What issues will make plant management embark on a reliability project? Who should be involved in the reliability project? The long answers to these questions will vary from facility to facility; however, the driving factor will always be to reduce costs.
Reliability has to do with the elimination of failure modes, and the management of resources to minimise the frequency of unavoidable failures. If a plant decides to introduce a reliability plan, the actual steps agreed on are affected by the definitions used. If the definitions are not well thought through, a high degree of confusion is likely to occur. One definition can lead the site in a totally different direction from another. It is therefore important to challenge and agree on the definition used. These definitions must then be structured in such a way that all people in the plant understand what they mean and what role they play in achieving them.
In many facilities manufacturing is seen as the customer of maintenance. This thinking will, in my opinion, work against the overall objective: the only customers that exist in reality are those who purchase the end product, and delivering a quality product to the customer is key for survival. Reliability requires a partnership approach, that is, everybody working together for the common goal. Senior management must subscribe to and be a part of this.
Definition of Reliability
The definition that follows is the definition that I prefer; it is one that has been derived from many definitions that I have seen over time:
“The probability that a component or system will perform a required function for a given time when used under stated operating conditions.”
Reliability and maintenance improvement initiatives can deliver substantial financial as well as other benefits to a company. Many initiatives fail to deliver and sustain the expected results due to a lack of long-term top management support. Because maintenance is seen as a cost within the organisation, maintenance management is often under such high pressure to control expenditure that it is unable to carry out the work required to deliver the possible breakthrough results.
Much of the work we carry out every day in maintaining our plant and equipment is accepted by all without question: “that’s the way it has always been carried out”, implying that it must be the correct and the best way. However, if we were to look around we would probably see that the production requirements have changed drastically since the maintenance strategy was first formulated. So is it reasonable to assume that the maintenance strategy of yesterday will deliver the production needs of today? On the one hand we continue to unquestioningly stick to the “tried and trusted”, yet on the other hand we cannot stop equipment failing. There is a subtle connection between the two: maintenance is seen as a FIXER of equipment and not as an equal partner in the overall management of the asset. It is also part of the plant’s culture.
In the current economic climate, managers of manufacturing facilities are in a constant battle to increase profitability and minimise costs. Increasing production is often one of the options taken to decrease the unit cost. This move is often taken without any consideration of maintenance requirements or the impact it will have on plant reliability. The usual question is “will the pump deliver the flow?” No consideration is given to how long it will sustain the increased flow. This is part of the “customer” mentality in the plant’s culture mentioned previously. Then, when downtime starts to increase, resulting in production loss, it is seen as a maintenance problem, whereas the issue has its origins in management’s failure to assess the risk and take action accordingly. Whether senior managers realise it or not, the delivery of that “minimum cost / maximum profit” is dependent on equipment operating safely and being reliable. Hence maintenance has a big part to play in delivering the end product.
Take the definition of reliability already mentioned: “The probability that a component or system will perform a required function for a given time when used under stated operating conditions.” The most important parts of this statement are “required function”, “given time” and “stated operating conditions”. If we begin to change any of these, then the strategy to maintain the equipment must change if the equipment is to remain healthy and deliver the quality product required. Matching the strategy to the new requirements will minimise the need for maintenance intervention on an emergency basis.
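To make the definition concrete, the short sketch below shows how the probability moves when the “given time” or the “stated operating conditions” change. It assumes a constant failure rate (the exponential model), which is a simplification and not something the definition itself requires. The pump, its failure rate, and the function name `reliability` are hypothetical.

```python
import math

def reliability(failure_rate_per_hour: float, mission_hours: float) -> float:
    """Probability of running mission_hours without failure.

    Assumes a constant failure rate (exponential model), an illustrative
    simplification rather than a general law.
    """
    return math.exp(-failure_rate_per_hour * mission_hours)

# Hypothetical pump: on average one failure per 10,000 running hours
# at the stated duty.
base_rate = 1 / 10_000

# Required function sustained for 2,000 hours at the stated conditions.
r_base = reliability(base_rate, mission_hours=2_000)

# Same conditions, but the pump must now run twice as long between interventions.
r_longer = reliability(base_rate, mission_hours=4_000)

# Same 2,000 hours, but a harsher duty that is assumed to double the failure rate.
r_harsher = reliability(2 * base_rate, mission_hours=2_000)

print(f"base: {r_base:.3f}  longer mission: {r_longer:.3f}  harsher duty: {r_harsher:.3f}")
```

Under these assumptions, doubling either the mission time or the failure rate reduces the probability of a failure-free run by the same amount, which is why a change in production demand calls for a review of the maintenance strategy.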
If we as humans live a healthy lifestyle, it helps us to stay away from the doctor (major breakdown). However, this healthy lifestyle has a cost in terms of time and what we might eat and drink. We have to take some exercise, have time to relax and eat the correct foods, (maintenance), if we are to give ourselves the best chance of staying healthy (reliable). To achieve the required result we must match the exercise and diet to our lifestyle. If we are young, athletic and involved in sport our needs (maintenance) will be vastly different from someone who is retired. So the message is that strategy must match your needs and your operating conditions.
“Prevent the event” is smart maintenance, whether of yourself or of that critical piece of production equipment, and is more commonly referred to as “proactive maintenance”. Just as we spend money in our effort to maintain ourselves in a healthy and productive state, it is important to understand the effort and requirements needed to keep your plant and equipment in a reliable condition. Healthy equipment will provide the means to great business success. Without reliable equipment, business failure eventually awaits. Reliability for any business begins with the management and is hugely dependent on how they communicate the need for a failure-free environment.
Reliability engineering is a strategic task concerned with predicting and avoiding failures. To quantify reliability issues it is important to know why, how, and how often failures occur, and what they cost. Reliability issues are bound to the physics of failure mechanisms, so that the failure mechanisms can be mitigated. In the real world many potential failures are seldom well known or well understood, which makes failure prediction a very difficult task indeed.
A current management fad is to take an organisation with a maintenance mentality and change job titles to include the word reliability. This provides style but no substance, as the tools and approaches for reliability and maintenance are as different as night and day. Then management wonders why the new “reliability” organisation continues to function as before, when the maintenance approach was fast repairs, which were considered the key to success.
Both reliability engineering and maintenance engineering have roots in each other’s territory and thus must know about each other’s roles, responsibilities, and tools. Reliability technology predicts failures and with the use of longer-range tools reduces the cost of failures.
The prevention of failure has a cost, just as repairing failures has a cost. Thus both reliability and maintenance activities are ruled by finance just as improvement decisions are always about finance and alternatives. Reliability projects require financial analysis so as to correctly define cash outflows. This is the language understood best in the boardroom.
Engineering is responsible for defining when failures will occur so that the cost can be presented in net present value (NPV) terms. This relies on predictions from reliability engineers, because the mode of failure also provides information about failure severity. The cost of failures must also include gross margin losses from production outages and lost opportunity production. This is particularly true for continuous process operations when the production is sold beyond the current manufacturing level.
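As a rough illustration of presenting failure costs in NPV terms, the sketch below discounts the yearly saving from a reliability project, counting both repair spend and gross margin lost during outages. Every figure, the discount rate, and the helper names `npv` and `yearly_failure_cost` are assumptions chosen for the example, not data from any plant.

```python
def npv(cash_flows, discount_rate):
    """Present value of yearly cash flows, with the first flow at the end of year 1."""
    return sum(cf / (1 + discount_rate) ** year
               for year, cf in enumerate(cash_flows, start=1))

def yearly_failure_cost(failures_per_year, repair_cost, outage_hours, margin_per_hour):
    """Repair spend plus gross margin lost while production is down."""
    return failures_per_year * (repair_cost + outage_hours * margin_per_hour)

# Hypothetical figures for one critical machine.
current = yearly_failure_cost(failures_per_year=4, repair_cost=20_000,
                              outage_hours=12, margin_per_hour=5_000)
improved = yearly_failure_cost(failures_per_year=1, repair_cost=20_000,
                               outage_hours=12, margin_per_hour=5_000)

years = 5             # assumed evaluation horizon
discount_rate = 0.10  # assumed cost of capital
project_cost = 400_000  # assumed up-front spend on the reliability project

saving_npv = npv([current - improved] * years, discount_rate)
net_benefit = saving_npv - project_cost

print(f"yearly failure cost now: {current:,.0f}, after project: {improved:,.0f}")
print(f"NPV of savings: {saving_npv:,.0f}, net of project cost: {net_benefit:,.0f}")
```

In this example the lost gross margin is three times the repair cost per failure, so leaving it out would badly understate the case for the project.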
Start With a Business Need
Reliability practices begin in the boardroom! Key financial variables that affect the site should be identified; these are the ones that will make the largest difference to the profit margin. It is important to understand clearly, for example, whether it is total maintenance costs that need to be decreased, or the maintenance cost per unit produced. Many examples are available of a reduction in spending that led to a loss several times that amount through lower production caused by decreased reliability. The site strategy needs to emphasise the few variables where the focus should be kept.
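The difference between cutting total maintenance cost and cutting maintenance cost per unit produced can be shown with a few lines of arithmetic. The sketch below uses invented figures for a single site; the budget, output, and margin values are assumptions for illustration only.

```python
def cost_per_unit(maintenance_cost: float, units_produced: float) -> float:
    """Maintenance cost divided by the units actually produced."""
    return maintenance_cost / units_produced

# Hypothetical year before the budget cut.
budget_before, units_before = 1_000_000, 100_000

# Hypothetical year after a 10% cut, with output lost to poorer reliability.
budget_after, units_after = 900_000, 85_000

margin_per_unit = 30  # assumed gross margin per unit sold

saving = budget_before - budget_after
lost_margin = (units_before - units_after) * margin_per_unit

unit_cost_before = cost_per_unit(budget_before, units_before)
unit_cost_after = cost_per_unit(budget_after, units_after)

print(f"maintenance saving: {saving:,.0f}, gross margin lost: {lost_margin:,.0f}")
print(f"cost per unit before: {unit_cost_before:.2f}, after: {unit_cost_after:.2f}")
```

With these assumed numbers the spend falls, yet the cost per unit rises and the margin lost is several times the saving, which is the pattern the text warns about.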
Once reliability is adopted as a measure of overall business loss, it becomes much easier to get management support for reliability initiatives. This support is critical for project success. By having a clear definition of reliability, it is easy to see how the lack of reliability of equipment or processes contributes to the different key performance indicators – e.g., overall equipment effectiveness (OEE) or cost.
One of the key issues is the difficulty in defining a reliability problem. The first question asked is typically, “What is causing the gap in OEE or cost?” The first-level answer to this is usually easy to see, e.g., it may be that availability is the reason OEE is low. The lower and more detailed the level at which a response is required, however, the more difficult providing that response gets – due to lack of information. Now most sites have many systems for gathering data, so much data in fact that they do not know what they have. Data on its own is useless; it must be refined into information if it is to be of any real benefit.
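A first-level answer to “what is causing the gap in OEE?” can be sketched as follows, using the commonly quoted form of OEE as the product of availability, performance, and quality. The three factor values are hypothetical, and a real site would still need the more detailed data the text describes to go below this first level.

```python
def oee(availability: float, performance: float, quality: float) -> float:
    """Overall equipment effectiveness as the product of its three factors."""
    return availability * performance * quality

# Hypothetical monthly figures for one production line.
factors = {"availability": 0.80, "performance": 0.95, "quality": 0.98}

overall = oee(**factors)

# The lowest factor is the first place to look for the gap.
weakest = min(factors, key=factors.get)

print(f"OEE: {overall:.1%}, weakest factor: {weakest} ({factors[weakest]:.0%})")
```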
Have a Systematic Approach to Improvement
A great misunderstanding exists in industry: that having a process in place to do a thing guarantees control over its outcomes. Being able to show a documented procedure is no guarantee that it will produce the required result. Changes in process introduce variability, the cause of most of our operating and business problems.
A business process includes people, the materials worked on, documents, selection process, training carried out, learning achieved, and the work environment; in fact everything within a business can affect the outcome. Variability is the range of possible outcomes. Many businesses still use processes and practices long believed to be suitable, not comprehending that these processes naturally contain inherent volatility that makes their equipment fail. Are you trying to achieve impossible results using engineering and maintenance strategies not capable of delivering the performance required? Trying to improve production equipment reliability using old maintenance and engineering strategy is an exercise in futility. It will cause great waste, produce distress and lead to emotional burnout for all involved. The only approach that can work is to change the strategy and culture to match the requirements of now.
Management Support for the Concept
Unless the senior managers are sold on the fact that these processes are critical to achieving the objectives, little focus will be put on them and they will ultimately become discredited and not fully implemented. This is a major financial loss because the people tied up in them are not available to work on other initiatives. It is for this reason that only a small number of initiatives should be implemented at any one time. Any strategy requires four critical success factors: People, Process (management), Technology and Information. While every facility will have these ingredients, it is only by integrating them closely together that success will be achieved.
The first few chosen should aim for a quick return and be issues that the managers drive and show a personal interest in. Success will breed success and commitment. These successes will help drive the culture change as it goes through the normal phases.
The strategy should be in place from the beginning and communicated to all. However, it is important to keep working on the culture change, as it is this change that will bring about the full realisation of benefits. By recognising which of the following phases the site is in, it is possible to find a way forward:
- don’t accept the need for change
- accept need, but don’t know where to go
- know where to go, but not how to get there
- know how, but doubt it can be achieved
- make changes, but they are cosmetic only
- make changes, but no benefits – model doesn’t align with real world
- model aligns today, but not tomorrow
- successful transition – model keeps in step with a changing world.
Reliability policies must integrate safety, quality, risk, and financial requirements for the company to achieve the business objectives. Reliability policies must be understandable to the common person and come from top levels of management for credibility, legitimacy, constancy of purpose for improvements, and setting the organisation to work for a common objective.
Management has a big role in reliability issues, which start with the design of equipment and continue through the maintenance of equipment and systems. Management must address the issues and state the general requirements so that everyone understands. The aim of reliability is to provide an environment where:
- safety performance and awareness are increased
- production targets are consistently achieved
- the cost of maintenance per unit of production is reduced
- reliability continues on an upward trend – success breeds success
- morale is hugely improved – celebrate success
- a sense of calm exists
- equipment is “owned”.
Management must also communicate the cost of failure. It cannot be kept secret from those who need to make financial decisions, and procedures must be established to communicate the assumed or calculated cost of certain failures to the organisation. A human life is priceless and cannot be computed, but society accepts certain risks, and this allows calculated values to be used for communication purposes, as time, cost, and finance are the language of commerce, decisions, and action.
Define Reliability in a Business Needs Context
Keep a clear focus always on the fact that there is a distinct difference between reliability (reducing the need for intervention) and minimising the consequences (fixing it faster / reducing product loss). Many people are passionate about repairing, so focus can easily turn to these issues and reliability gets neglected – this is the real culture change. It is critical to understand that improving reliability reduces both time and cost to repair. Most other initiatives address only one or the other at a time.
Reliability initiatives are best focused on the most frequent issues, as these are the ones that will give the fastest evidence of improvement. Addressing an issue that only occurs every three to five years will need another three years minimum to see any benefit. Some reliability benefits are:
- greater job satisfaction
- less repetition
- safer working environment
- lower maintenance costs
- more time to do the “right things”
- reduced production variables
- reduced business risk.
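The advice to focus first on the most frequent issues can be put into practice with a simple Pareto count of failure records. The sketch below ranks a hypothetical work-order log by how often each equipment and failure-mode pair occurs; the tags, failure modes, and log format are invented for illustration.

```python
from collections import Counter

# Hypothetical work-order log: (equipment tag, failure mode) per breakdown.
work_orders = [
    ("P-101", "seal leak"), ("P-101", "seal leak"), ("C-204", "belt slip"),
    ("P-101", "seal leak"), ("M-310", "bearing failure"), ("C-204", "belt slip"),
    ("P-101", "seal leak"), ("V-412", "actuator stuck"), ("P-101", "seal leak"),
    ("C-204", "belt slip"),
]

counts = Counter(work_orders)
total = sum(counts.values())

# Print issues from most to least frequent with a running share of all failures.
cumulative = 0
for (tag, mode), n in counts.most_common():
    cumulative += n
    print(f"{tag:6} {mode:16} {n:2d}  {cumulative / total:5.0%}")

# The single most frequent issue is the natural first reliability target.
top_issue = counts.most_common(1)[0]
```

Because the most frequent issue recurs quickly, any improvement made to it shows up in the same log within a short time, which gives early evidence that the initiative is working.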
Bringing potential problems to leaders’ attention is a key to success in any organisation. Leaders need to have a clear view of how their organisations conduct maintenance. If you have inherited equipment with low reliability and many failures, you may need improvements to gain competitive advantage. Remember, the best performers face brutal facts. They deal with:
- inherited poor practices
- resistance to change – “That’s the way we have always done it”
- poor purchasing specifications
- training – lack of or poorly delivered
- working relationships that are counterproductive
- performance indicators that drive the wrong behaviour
- the wrong people.
You must have a path forward to successfully achieve reliability. It is possible to achieve the reliability required to sustain your process at the rate you require.
Who is John Coleman? John Coleman BSc. (Hons) MEng. MIEI, MIAM, M.I. Ref. Eng. is Maintenance Facilitator at Rusal’s Aughinish Alumina Refinery Central Workshop. He is also the Chairman of MEETA, the Irish Maintenance Society.