Connecting reliability improvement to the business goals - don’t just focus on risk
Most maintenance and reliability professionals understand the benefits of improved reliability. Reduced downtime, fewer costly failures and improved safety immediately come to mind. But do they truly understand how reliability measures contribute to reaching the business goals that have been set?
Do maintenance and reliability professionals truly understand why the organization invests in a reliability improvement programme? Do they measure the success of the programme via KPIs that reflect the business goals? Do they establish asset strategies that focus on achieving the business goals? Can they (and do they) make it clear to senior management how their programme helps the management team achieve its goals?
It could be argued that if the answer to those questions is “no” then there is a problem with the programme, and serious questions about the sustainability of the programme should be asked.
In the author’s experience, most maintenance and reliability people, including condition monitoring technicians and analysts, have a laser focus on mitigating risk. They consider most assets “critical” and do their utmost to ensure that they do not fail unexpectedly. That may result in the following:
- Performing “preventive maintenance” whether those tasks add value or not
- Performing extended “planned maintenance” outages regardless of whether the maintenance work performed is necessary or not
- Requiring equipment to be shut down for maintenance whether that is actually the best strategy or not
- Holding spares whether it is cost-effective or not
- Performing condition monitoring tests on machines whether they can be justified or not
- Measuring success based on schedule compliance, percentage of condition-based versus reactive maintenance tasks, reduction in unplanned downtime, and other metrics that relate more to the execution of the programme than to the goals of the business
A person who did not know any better might conclude that the asset strategy was designed purely to cover their backsides. Failure cannot occur on their watch, therefore failure must be avoided at all costs. Yes, failure will harm the business, and must be avoided. But in the process of reducing those risks we may also increase our maintenance costs and reduce our capacity. We must balance the cost of avoiding failure against the cost of failure to the business.
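The balance described above can be sketched as a simple comparison of the total annual cost of competing asset strategies. This is only an illustration: the strategy names, the cost model (maintenance cost plus production given up for planned work plus the expected cost of failure) and every figure are assumptions invented for the example, not data from a real plant.

```python
from dataclasses import dataclass


@dataclass
class Strategy:
    """A hypothetical asset strategy; all fields are illustrative assumptions."""
    name: str
    maintenance_cost: float        # annual labour, parts and monitoring cost
    planned_downtime_hours: float  # annual production hours given up for maintenance
    failure_probability: float     # chance of an unexpected failure in a year (0 to 1)


def total_annual_cost(s: Strategy, downtime_cost_per_hour: float, failure_cost: float) -> float:
    """Cost of avoiding failure plus the expected cost of failure."""
    planned_loss = s.planned_downtime_hours * downtime_cost_per_hour
    expected_failure_loss = s.failure_probability * failure_cost
    return s.maintenance_cost + planned_loss + expected_failure_loss


# Assumed figures, chosen only to show the trade-off.
strategies = [
    Strategy("run to failure", 0.0, 0.0, 0.50),
    Strategy("condition-based", 20_000.0, 8.0, 0.10),
    Strategy("extended planned outages", 60_000.0, 80.0, 0.05),
]

costs = {
    s.name: total_annual_cost(s, downtime_cost_per_hour=1_000.0, failure_cost=200_000.0)
    for s in strategies
}
best = min(costs, key=costs.get)

for name, cost in costs.items():
    print(f"{name}: {cost:,.0f}")
print("lowest total cost:", best)
```

With these assumed numbers the strategy with the lowest failure probability is also the most expensive overall, because the planned outages cost the business more than the failures they prevent. A real comparison would also have to weigh safety, environmental and brand consequences, which a single cost figure does not capture.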
What would you do if you were CEO?
If you were to put yourself in the shoes of a CEO of a manufacturing business, what do you think your priorities would be?
- The CEO will have performance targets. Whether those targets are in place to meet the objectives of the private owners of the business, or of public investors and shareholders, the CEO must achieve them. Their future (and the value of their bonuses) depends on it.
- The CEO must protect the brand of the organization. That includes keeping key customers happy and staying out of the news because of some form of disaster.
- The CEO must appease the regulators and insurers.
When asked, a C-level manager said the goal of their mining company was to “turn as much rock into as much money as possible”. Maintenance and reliability-improvement activities can help the company achieve that goal, but only with the right asset strategy.
Any proposal to the CEO (or other senior management) will be measured against these goals. What will it cost, what is the likelihood of success, and what are the benefits? To win their support you have to show that improved reliability is a good, safe investment. And at least annually, you need to remind them how you are delivering on that investment and how you are helping them achieve their goals.
The same questions need to be asked of senior management of public utilities, senior military officials, and so on. You need to know what they hope to achieve and make it crystal clear how improving reliability helps them achieve their goals.
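The three questions any proposal will face (cost, likelihood of success, benefits) can be combined into a single expected figure. The sketch below is one simple way to do that, assuming a one-off cost and a constant annual benefit over a fixed horizon; the function name and the figures are hypothetical, and a real business case would normally also discount future benefits.

```python
def expected_net_benefit(cost: float, success_probability: float,
                         annual_benefit: float, years: int) -> float:
    """Probability-weighted benefit over the horizon, minus the up-front cost.

    Assumes the benefit is the same every year and is not discounted.
    """
    return success_probability * annual_benefit * years - cost


# Assumed proposal: 250,000 up front, 70% chance of success,
# 200,000 per year in benefits if it succeeds, judged over 3 years.
result = expected_net_benefit(cost=250_000.0, success_probability=0.7,
                              annual_benefit=200_000.0, years=3)
print(f"expected net benefit: {result:,.0f}")
```

A positive result suggests the proposal is worth putting forward; the same calculation repeated each year with actual figures is one way to show senior management how the investment is performing.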
Isn’t it all about risk?
Reliability improvement programmes, and the design of asset strategies, typically focus on the need to understand and mitigate risk. An asset criticality ranking will be developed so that priority can be given to the assets that pose the greatest risk. And there is no doubt that risk management is important. However, in the author’s opinion, too much focus is given to avoiding the “bad things” and too little focus is on achieving the “good things”.
“Good things” and “bad things”?
If we step back for a moment, still wearing the shoes of the CEO, we could perhaps consider four points of pressure – four aspects that affect the business. Let’s call it the “business process review”.
First, we have the performance targets that drive the business: achieving shareholder value, delivering a return to the business owners, etc. It is essential that you understand what those targets are. The true measure of success of your reliability improvement programme is whether the business can achieve these targets.
Second, we have to consider the constraints on the business: what stops the business from being able to achieve those targets. Those constraints may include the availability of capital, cash flow, availability of raw materials, etc. We have to keep those constraints in mind. When we choose our asset strategies we may need to consider which have the lowest costs, which conserve cash, which result in the least waste, etc.
Third, now we can consider risk. What are all the “bad things” that can harm the business? Equipment failure can result in safety incidents, harm to the environment, extended periods of downtime, expensive repairs, and harm to our customers. We have to consider these risks and prioritize our asset strategy according to the severity and consequences of these risks.
And fourth, we have to consider how improved reliability presents an opportunity for us to achieve those targets, something that is rarely done in the author’s experience. If the goal of the business is to produce high-quality product at a defined production volume, then we need to turn our minds to how our asset strategy will enable those goals to be met. Rather than just thinking about how we can stop the “bad things” from happening, we have to think about how we can make the “good things” happen.
You may be thinking that the opportunities are just the opposite of the risks. They are and they aren’t.
When we focus on the risks associated with downtime, for example, we will consider the types of failure that will result in extended downtime. We should consider proactive tasks that reduce the likelihood of developing the fault conditions, and we will monitor the equipment so that we are forewarned. But now we need to turn our minds to everything that can be done so that the equipment delivers peak performance: the highest quality, highest throughput, most efficient start-up, minimal interruption at shift changeover or product changeover, and yes, the fewest breakdowns.
Imagine sitting in a room with operators, maintenance people, and engineering, and focusing your attention on what makes the plant achieve its best performance. In a separate meeting you can discuss how you can avoid failure. In this meeting the focus will be achieving the targets. You are wearing a different hat now. It is an important perspective, and it is a perspective that enables the business to maximize its opportunities.
What can we do with this information?
The bottom line is that we need an asset strategy that considers these four areas. That strategy will need to change with time (as the constraints on the business change). And that asset strategy will definitely be different in different areas of the business, and in different areas of the plant. For example, some areas of the production process can tolerate small amounts of downtime, whereas in final assembly, production lost due to downtime, slowdowns and minor stoppages can never be recovered.
Why is all this important?
It is important for two reasons. First, you need the support of senior management because ultimately, they control funding. While a lot of good reliability improvement tasks can be performed “under the radar” with budgets managed at a much lower level, the true benefits cannot be gained in this “stealth” mode.
Second, and much more importantly, you can’t improve reliability unless everyone in the organization is involved. You must have a “culture of reliability”. Everyone involved in design, procurement, engineering, spares management, work management, and especially operations/production must understand what they can do to improve reliability so that the business achieves its goals. If senior management is not fully on board with the concept of reliability improvement, you will never achieve that culture. And senior management will only be fully on board if they see how a reliable plant helps them achieve their goals.
If you understand how reliability improvement enables the business to achieve its goals, and you can quantify those benefits, and you can articulate those benefits to senior management, then you will gain their support. With that support you will have the essential ingredient for developing the culture of reliability, without which the programme cannot achieve its full potential.
Now you must maintain that focus so that your asset strategy, and every decision you make (and dollar you spend), adds value to the business.
If you continuously communicate the value you are delivering, you will maintain the senior management support, you will continue to improve the culture, and the reliability improvement initiative will be sustained for many years.