Reliability metrics are crucial to keeping your equipment running and production goals met. Learn about MTBF and MTTR.
For most businesses, meeting production goals is the highest, all-consuming priority. One key strategy for meeting those goals is to track reliability metrics, which measure the reliability of a system or machine. Two important reliability metrics are mean time between failures (MTBF) and mean time to repair (MTTR or MTR).
In this piece, we’ll dive into what these metrics mean, how to calculate them, and how to make them an effective part of your company’s asset management strategy.
But first, let’s discuss why these metrics are so critical to your business.
Running equipment until it breaks down is up to 10 times as costly as utilizing preventive maintenance programs, according to industry experts. This is why reliability metrics are so crucial. They offer vital insights into the health of your equipment, help extend the time between machine failures, and can even help predict downtime in your equipment or services.
When downtime is more predictable, time and money can be better budgeted, projected production targets can be more realistic, and the cost of replacing machinery can be outright avoided for years to come.
Now, let’s delve into the differences between the two metrics.
While MTBF and MTTR often go hand-in-hand, they are very different reliability metrics.
Mean time between failures is the average amount of time that a piece of equipment operates between unscheduled failures. For instance, over a period of three months, a snack company’s industrial packaging machine runs as needed for 40 consecutive days, and has three unscheduled failures. The machine’s total operating time and its number of failures over that period are the information needed to calculate the MTBF.
On the other hand, mean time to repair determines the average amount of time it takes to repair a piece of equipment. Let’s say it takes the snack company’s technician an average of 34.7 minutes to fix the packaging machine when it fails—this is the MTTR.
Though MTBF and MTTR have different applications, both calculations are important for predicting future equipment downtime and improving overall reliability.
MTBF measures the average time between failures of a repairable product, excluding planned downtime. This measurement gives a company valuable insight into how reliable a product is.
The higher the MTBF, the more reliable the product. The lower the MTBF, the less reliable the product and the more frequently failures occur. The goal is for MTBF to be as high as possible.
For instance, an Operations Manager of a root beer company is questioning whether to replace their industrial bottle capper machine, which would be an expensive, but necessary investment for the company’s production process. After reviewing the MTBF stats (and other reliability metrics), the manager realizes that the machine has a higher reliability than he initially thought, and it could be improved if they were to replace one particular part, instead of the entire machine. This saves the company thousands of dollars and lost production time, while improving the functionality of their existing machinery.
You can calculate MTBF by dividing total operational time by the total number of failures or breakdowns.
MTBF = total uptime / total number of failures
For example, an automotive robotic painter runs for 90 work days and has two total failures over that time period. So, the calculation would be:
MTBF = 90 days / 2 failures
MTBF = 45 days
This means that on average, the automotive robotic painter runs for 45 days without failure.
It's important to note that MTBF does not account for the time it takes to repair the issue, only the time between failures.
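The MTBF formula above can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name `mtbf` and its parameter names are assumptions made for this example, and the input is the automotive robotic painter from the worked example (90 days of uptime, two failures).

```python
def mtbf(total_uptime: float, failures: int) -> float:
    """Return mean time between failures, in the same unit as total_uptime.

    total_uptime should exclude repair time and planned downtime,
    since MTBF only measures the time between failures.
    """
    if failures <= 0:
        # With no recorded failures the average is undefined.
        raise ValueError("MTBF needs at least one recorded failure")
    return total_uptime / failures


# Robotic painter example: 90 work days of uptime, 2 failures.
painter_mtbf = mtbf(total_uptime=90, failures=2)
print(f"MTBF = {painter_mtbf} days")  # MTBF = 45.0 days
```

The result is in whatever unit the uptime was given in, so it helps to pick one unit (hours or days) and use it consistently across machines.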
MTBF is extremely helpful because manufacturers can use this metric to plan production of products. While exact failures can’t be perfectly predicted, the frequency and necessity of scheduled maintenance can.
Of course, the MTBF of your equipment won’t always be high—machines age, parts break down, technicians come and go. If you’re less than satisfied with a component’s MTBF, there are steps you can take to improve it. Let’s discuss a few ways to increase the time between unscheduled failures, or increase MTBF.
Use trained, skilled employees and follow best practices: Make sure employees are skilled and experienced. Furthermore, provide proper equipment-specific training for new employees and new products. Using the product as intended is vital in reducing failures.
Use quality replacement parts: Cheap replacement parts will often yield cheap fixes. Use quality, recommended replacement parts to extend the time between failures and maintain the overall health of your machinery.
Follow routine maintenance recommendations: Staying on top of routine maintenance will cut down on unexpected downtime. Routine maintenance is ideal because it can be scheduled and worked into the planned downtime.
MTTR, though it can have multiple meanings, is usually the average amount of time needed to repair a system after an unscheduled breakdown.
The MTTR (or MTR) includes the entire repair process. It begins when the repairs start and it ends when they are tested and the component is fully operational.
Imagine that an injection molding company installs a new thermal heater. The MTTR is unusually high, meaning that repairs are taking an extremely long time. The operations managers cannot figure out why, since the machine is new. After doing some digging, they realize it’s because their technician isn’t familiar with this new model. Once they secure proper training for their technician, the MTTR drops immediately. Tracking MTTR helped the managers quickly troubleshoot an issue while keeping production on track.
It's important to make sure you know which metric is being discussed when dealing with MTTR, since the acronym's meaning can differ from one organization to another. Mean time to repair and mean time to recovery are among the more common meanings, but MTTR can also stand for mean time to resolve or mean time to respond.
You can calculate MTTR by dividing the total maintenance time by the total number of maintenance actions over a period of time.
MTTR = total maintenance time / total number of maintenance actions
For example, say a commercial oven breaks down twice during hours of operation. One repair takes 20 minutes to fix; the second takes 40. So, the calculation would be:
MTTR = (20 minutes + 40 minutes) / 2 repairs
MTTR = 30 minutes
So on average, it takes 30 minutes to perform a repair on the commercial oven.
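The same calculation can be sketched in Python. This is only an illustration: the function name `mttr` and the idea of passing repair times as a list of minutes are assumptions for this example, and the input is the commercial oven from above.

```python
def mttr(repair_minutes: list[float]) -> float:
    """Return mean time to repair: total repair time / number of repairs."""
    if not repair_minutes:
        # With no recorded repairs the average is undefined.
        raise ValueError("MTTR needs at least one recorded repair")
    return sum(repair_minutes) / len(repair_minutes)


# Commercial oven example: one 20-minute repair and one 40-minute repair.
oven_mttr = mttr([20, 40])
print(f"MTTR = {oven_mttr} minutes")  # MTTR = 30.0 minutes
```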
MTTR is designed to provide an average repair time, so results in the field may vary based on factors like severity, availability of parts and qualified repair personnel, and more. However, MTTR is useful in pointing out issues, like when it’s time to replace a component, implement more training for staff, or update standard operating procedures (SOPs).
The goal is always to have a low MTTR, since that means it takes less time to perform repairs. A higher MTTR means more downtime, which affects availability of the component and the production of goods or services. A high MTTR can negatively impact your business.
Here are some best practices to help reduce MTTR:
Implement proper training: Make sure all employees responsible for repairs are trained properly on each specific type of equipment. Proper training can improve efficiency of repairs.
Optimize repair processes: Is the repair process as streamlined as possible? Cut out unnecessary steps or make changes to ensure the process is comprehensive and as efficient as possible. Checklists can be helpful in optimizing repair processes as well.
Track machine performance: Tracking machine performance can cut down on the assessment phase of the repair process since the analyzed data can point to specific issues and help employees reserve resources for future repairs.
Utilize tools to help streamline maintenance deployment: Maintenance and reliability applications help streamline the process and provide a command center for maintenance management.
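As a rough sketch of what tracking machine performance can look like in practice, the snippet below computes MTTR per asset from a small repair log, so the slowest-to-repair equipment stands out. The log format, the function name `mttr_by_asset`, and the mixer entries are hypothetical and exist only for illustration; the oven entries reuse the earlier example.

```python
from collections import defaultdict

# Hypothetical repair log: (asset name, repair duration in minutes).
repair_log = [
    ("oven", 20),
    ("oven", 40),
    ("mixer", 90),   # made-up data for illustration
    ("mixer", 70),
    ("mixer", 110),
]


def mttr_by_asset(log):
    """Group repair durations by asset and return each asset's average."""
    durations = defaultdict(list)
    for asset, minutes in log:
        durations[asset].append(minutes)
    return {asset: sum(times) / len(times) for asset, times in durations.items()}


result = mttr_by_asset(repair_log)

# List assets from highest to lowest MTTR.
for asset, value in sorted(result.items(), key=lambda item: item[1], reverse=True):
    print(f"{asset}: {value:.1f} min")
```

An asset whose MTTR stays well above the others may be a candidate for extra technician training or a revised repair checklist.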
Reliability metrics aid in the availability and productivity of components vital to a company’s growth. UpKeep is an essential tool to help manage and even improve these metrics.
UpKeep is an easy-to-use, mobile-first maintenance and reliability application designed to improve the deployment of technicians and efficiency in handling work orders. This platform includes a centralized command center that allows technicians to manage and exchange the necessary information on work orders in one place. It also enables users to create custom dashboards and reporting to track important maintenance data that will help improve reliability.
Furthermore, UpKeep also utilizes IoT sensors to offer real-time monitoring. This IoT data can play a major role in improving reliability. If that’s not enough, UpKeep has a variety of preventive maintenance checklists to help you start or improve your preventive maintenance program, and ultimately reduce equipment downtime. Get a free UpKeep product tour today!