Primer on Managing Maintenance in a Chemical Process Plant

By Joel Levitt


We build a chemical plant to produce a particular product, in a particular quantity at a particular quality. Once the plant is built we need three items to converge for the manufacturing to take place.


  1. Equipment Custody (control) in good working order and up to specifications
  2. Bill of Materials (all of the resources such as raw materials, specialized tooling and consumables)
  3. Labor with the requisite competencies.


Whenever a product changes (in color, size or specifications, for example), one or more of the three elements has to change as well. If the plant makes latex paint and wants to change colors, it has to issue a new Bill of Materials for the new batch. Everything that goes on in the plant with regard to production can be traced back to these three elements.


These elements are dealt with extensively by the plant’s MRP (Material Requirements Planning) system, which is driven by the marketing forecast. The plant we are discussing makes a ‘product.’ All of the processes to make and deliver the product are intensively studied, discussed and pondered by top management. Extensive thought and billions of dollars have been spent developing systems and procedures to ensure that these three elements converge at the right time, in the right sequence and at the right place.


For the plant to make product it must have machine capacity capable of delivering the marketing forecast as defined by the MRP. Within the design limits of the equipment chosen, maintenance delivers that capacity. Good maintenance practices deliver that capacity consistently without surprises for as long as the company needs the output.


Maintenance operates in a very similar way and also has a product. The product of maintenance is the capacity or uptime to make the ultimate product. So production makes products and maintenance makes capacity to make production.


Not surprisingly the same three elements must be present for maintenance to ‘make’ its product.

  1. Custody (control) of the equipment to make repairs and for preventive activities
  2. Bill of Materials (all of the resources such as parts, tools, equipment and consumables necessary for all maintenance activity)
  3. Labor with the requisite competencies, either employees or contractors. In this case the competencies are repair, troubleshooting and installation.


All of the maintenance management effort is geared toward managing one or more of these elements. As an aside, one continuing issue in managing maintenance is that, unlike the production process, the maintenance process gets little attention: top management spends very little effort, time or funds to optimize, study or even understand it.


Maintenance Management Effort: How it is related to the three elements necessary for successful maintenance

PM (Preventive Maintenance) schedule: The PM schedule needs to be rolled out for an entire year (or other long period) and meshed with the production schedule. The MRP schedule for production must be synchronized with the PM schedule.

PM activity: Preventive maintenance activity assures capacity by detecting impending failure and by life extension. The activity itself (being maintenance activity) requires equipment custody, bill of materials fulfillment and competence.

PdM (Predictive Maintenance): Predictive maintenance predicts impending breakdowns and alerts maintenance managers. This gives the maintenance department time to plan the job, order materials and schedule the job when it will impact production the least.

PCR (Planned Component Replacement): A technique to improve the reliability of a system that has significant failure consequences. PCR minimizes future unscheduled custody changes.

Inventory management: The second of the three elements is the bill of materials, and inventory management assures that the material is available.

Planning identifies the resources for the maintenance job so that everything is ready and available to minimize the impact on the production schedule. The resources comprise both the bill of materials (parts, tools) and labor requirements (competence).

The schedule is designed to minimize the impact of maintenance activity on productive output. Keep in mind the product of maintenance is equipment availability, so the maintenance schedule should minimize disruption to production (for example by scheduling work during product changeovers or color changes).

The third element is competence. Training helps align the competence of the maintenance crew with the competency needed to take care of the equipment.

RCM (Reliability Centered Maintenance): A technique to maximize equipment availability and increase safety. An overall approach to maximizing uptime.

TPM (Total Productive Maintenance): Spreads the competence to perform basic maintenance to operations. Increases the amount of basic life-extension maintenance to improve equipment capacity. In its best implementations TPM is a holistic focus on improving Overall Equipment Effectiveness.

Use of outsiders is indicated when there are not enough available hours of competent personnel, or no personnel with the needed competence at all. It is also used to minimize the overall cost to deliver the maintenance product.

Quality of maintenance work is related to proper training (competence), proper tools and parts (bill of materials) and enough time (custody).

Maintenance Technical Library: Where all the detailed maintenance information is stored.


PM is a series of tasks performed at a frequency dictated by the passage of time, the amount of production (cases of beer made), machine hours, mileage, or condition (differential pressure across a filter) that either:


1. Extend the life of an asset. Example: Greasing a gearbox will extend its life. All the tasks with ‘E’ in the box in the chart that follows are life extension type tasks.


2. Detect that an asset has had critical wear and is about to fail or break down. Example: A quarterly inspection shows a small leak from a pump seal. Finding this leak allows you to repair it before a catastrophic breakdown. All the tasks with ‘D’ in the box are impending breakdown detection type tasks.


These tasks should be directed at how the asset will fail. The rule is the tasks should repair, delay or detect the unit's most dangerous, most expensive or most likely failure modes (in that order).


Caveat: Even with the best PM systems there will still be failures and breakdowns. Your goal is to reduce the breakdowns to levels that are financially and operationally appropriate, consistent with a safe and secure environment. In a hazardous chemical environment the threshold might be significantly higher (much more intensive PM) than in a non-hazardous environment. Through early detection, the breakdowns that occur will be of a reduced size and scale. Ideally we would also convert the breakdowns that are left into learning experiences to improve the delivery of maintenance service (and of course improve the PM system itself).


2nd Caveat (and this is a big one): Ensure that, whatever PM is performed, you factor in iatrogenic failures. Iatrogenic is a fancy word that formally means a symptom or illness brought on unintentionally by something that a doctor does or says. In our case it means any breakdown or service call caused by an action of the mechanic or by the service itself. In fact, every interaction with a machine or other asset has some probability of going badly. The more PM activity on any asset, the higher the probability of iatrogenic failure. Training, proper tools, adequate time, limiting distraction and good working conditions reduce these kinds of failures.

PM and PdM definitions courtesy of The Complete Guide to Preventive and Predictive Maintenance by Joel Levitt, published by Industrial Press.


Chart: flow of information around PM (D = detect impending failure, E = extend life)

Row 1: Task list (several types), driven by frequency or event

Rows 2 and 3: Adjust (E); Interview operator (D); Inspect (D), either low tech (human senses) or high tech (PdM); Make notes about machine condition (D); Take readings (D); Any other tasks (D or E)

Row 4: Everything OK, no action needed; Deferred maintenance that you hope doesn’t bite you; Corrective maintenance added to backlog that could be planned; Short repair

Row 5: Data from all breakdowns and disruptive events; Data from manufacturer and other plants; Design and engineering review (can be PMO or RCM style)


The flow of all information in a maintenance department can be organized around PM as shown above.


Description of the details (from the chart above).

Row 1

Task List: a list of all the tasks or actions to be performed at that time (there are 4 major types of task lists- unit, string, standing, and future benefit).


Types of clocks and frequency: PM is periodic. The clocks determine how often or when to perform the tasks on the task list. Clocks can be measured in days, units, tonnage, cycles, miles, or even readings (such as temperature), changes to readings, or findings (an oil slick on the floor under a truck). Almost any trigger can be designed into a PM clock.
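
The idea of a PM clock can be illustrated with a short Python sketch. It is only an illustration: the class name PMClock, the function tripped_clocks, the intervals and the "whichever clock trips first" rule are all assumptions made for the example, not part of any particular CMMS.

```python
from dataclasses import dataclass


@dataclass
class PMClock:
    """One hypothetical PM clock: a quantity that accumulates until the PM is due."""
    name: str             # what is being counted (days, machine hours, a reading...)
    interval: float       # amount allowed to accumulate between PMs (assumed value)
    since_last_pm: float  # amount accumulated since the last PM was completed

    def is_due(self) -> bool:
        # The clock trips when the accumulated amount reaches the interval.
        return self.since_last_pm >= self.interval


def tripped_clocks(clocks):
    """Return the names of the clocks that have tripped.

    Assumption for this sketch: the task list is due as soon as ANY clock trips.
    """
    return [clock.name for clock in clocks if clock.is_due()]


# Illustrative numbers only.
clocks = [
    PMClock("calendar days", interval=90, since_last_pm=45),
    PMClock("machine hours", interval=500, since_last_pm=512),
    PMClock("filter differential pressure (psi)", interval=10, since_last_pm=6),
]
due = tripped_clocks(clocks)
print(due)
```

In this example only the machine-hours clock has tripped, so the task list would come due on hours even though the calendar clock is only half way through its interval.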


Row 2 and 3 (D-detect failure, E-extend life, these rows represent possible types of tasks)

Inspect: Stop, look, and listen, using human senses or instruments (PdM) (also row 3)

PdM (Predictive maintenance): Any inspection carried out with high technology tools to detect when failures will occur. Such tools can increase your returns and give you more time to intervene before failure. (also row 3)

TLC (tighten, lube, clean): Start with the basics. Caring for your equipment is the core of the PM approach. This care does not require any fancy equipment or techniques, just basic care. Much of the benefit from PM flows from TLC.

Adjust: Making the equipment work optimally by tightening, changing, fine-tuning, or modifying the machine set-up or operation.

PCR (Planned Component Replacement), also called scheduled replacement: One of the tools in your pouch is PCR. This technique has been made popular by the airlines. PCR can improve reliability in many circumstances.

Readings: Writing down or entering data concerning measurements of pressure, temperature or other parameters. Spotting trends in these readings can frequently uncover problems before they impact production or safety.

Interview operator: Ask questions about machine operation and note the answers. Many problems are apparent to the operator or driver before they are obvious to anyone else.

Notes about machine condition: These notes are related to readings and will tell the skilled observer of any subtle changes taking place in the asset.


Row 4

There are four outcomes from a PM inspection:

  1. Everything okay: no action needed.
  2. Deferred maintenance item: You will ignore this problem and hope the unit doesn’t fail. The problem is that these deferred items have a way of coming back to haunt you. They only rarely go away by themselves. Deferred maintenance items have been studied, and when looked at economically they tend to deteriorate at a rate far greater than the interest rate that could be earned on the money not spent by deferring them in the first place. In plain language, it rarely pays to defer an action unless the machine or process is being closed down.
  3. Corrective maintenance: Any item found by inspection that you plan to schedule. We call this plannable maintenance (it can be planned). The goal of the inspection process is plannable maintenance. With this kind of work you have the lead-time to work efficiently. We say plannable, not planned, because not every firm is committed to planning maintenance activity (for details see Maintenance Planning, Scheduling and Coordination, Industrial Press).
  4. Short repairs: Repairs done by the PM person while doing the PM, that is, repairs of short duration using the tools and materials that the PM person carries. These actions are different from temporary repairs. A short repair is a complete repair that can be accomplished in a short time. This subject is discussed in depth in a later chapter. Short repairs are an easy way to improve productivity.


Row 5

All data flows to the design and engineering review. One of the primary reasons for collecting data is to use it in the review (and redesign) of breakdowns and disruptive events. The data include data from breakdowns, data from manufacturers, readings, reports of machine condition, and all work orders. An RCM or PMO style design and engineering review uses the structures of RCM to manage the process.


RCM (Reliability Centered Maintenance): One of the most important approaches to PM, developed in the aviation industry. One result of a review of what happened is feedback to the task list in the form of increased (or decreased) frequency, depth, or technology. Additional details about RCM can be found further along in this section.


PMO (PM Optimization) is an offshoot of RCM and recognizes the difficulty (and sometimes futility) of RCM in a mature operational plant. PMO embodies techniques to optimize the PMs that are done to get the most reliability from the least resources.


Tasks (row 2 and 3) are organized into Task Lists

Each task is marked off when it is complete. Some tasks require readings or measurements. There should always be room on the bottom or side of the task list to note comments and readings. Actionable items should be highlighted to make them easy for the reader to find.


PM systems also include:

1. A record keeping system to track PM, failures and equipment utilization. Part of the job of the PM effort is creating baselines for other analysis activity.


2. All types of predictive activities. These include both human sense inspection and the use of instruments for taking measurements and readings. Included is inspection of production for quality. PM includes the recording of all data for statistical and trend analysis.


3. Short or minor repairs.  In this sense short repairs refer to repairs that can be done completely and properly in a short time. Short repairs are a great boost to productivity since there is little or no lost time. Short repairs are to be written up for equipment history because minor or small problems often signal larger underlying problems.


4. Writing up any conditions that require attention (conditions which will lead or potentially lead to a failure). Specifically this refers to write-up of corrective maintenance action items onto work orders or work requests. This also includes reports about machine condition. All PM systems have to effectively manage the feedback cycle from conditions written up and work order creation to successful work completion and possibly re-inspection.


5. Scheduling and actually doing these corrective repairs written up by PM inspectors within a reasonable timeframe (before they fail for starters). A necessary step before scheduling is planning each job so that maintenance resources are used in an optimum way.


6. Keep the PM process going and refine the task lists and task frequencies. One way to accomplish this is to use the frequency and severity of failures to refine the PM task list. In other words, when you experience too many failures on equipment under the PM program, examine the task list and add items, add depth, increase frequency or add technology.


7. Management of and investigation into trends uncovered by inspections and an intelligent determination of who should do what analysis and when they should do it.


8. Continual training and upgrading of inspectors' skills, and improvements to PM technology.


Of course the goal is not PM! The goal is the results from PM, higher reliability. This seems to be a contradiction. It is not. PM requires both equipment custody and money for parts and labor. Both can be scarce resources and are extremely expensive. The goal is the highest stable long term output. In other words the day we can run 100% uptime with 100% safety without PM, will be the last day for PM.


Introducing and Managing Predictive Maintenance (PdM)


The ideal situation in maintenance is to be able to peer inside your components and replace them right before they fail. Technology has been improving significantly in this area. Tools are available that can predict corrosion failure on a transformer, can thread through, examine and videotape boiler tubes, or detect a bearing failure weeks before it happens.


Scientific application of proven predictive techniques increases equipment reliability and decreases the costs of unexpected failures. Predictive Maintenance is a maintenance activity geared to indicating where a piece of equipment is on the P-F (performance-failure) curve and predicting its useful life.


The nature of Failure

The original authors of the RCM methods thought long and hard about how machines fail. It is essential to understand failure to effectively design tasks that will either detect failure or tasks that will extend life.  In the P-F curve shown below the nature of failure is dissected.


One key is that these engineers determined that for each failure mode there was a precipitating event that starts the train of events that ends in failure. In this text the event is called the critical event (CE). This event could be damage, contamination, heating, overloading, corrosion, even operations abuse.


The P-F (performance-failure) curve describes the performance characteristics from the CE point to the ultimate total loss of performance, which we generally call failure. Note that the decay in performance (the slope of the curve) starts slowly and then accelerates. This means that there may be little or no discernible decay in performance in the beginning (near the CE). As time goes on, once you get over the hump, the performance decays rapidly.


The curve shows two facts that are essential for properly determining the task frequency. The first fact is how long it takes from when an event happens to ultimate failure. This determines how frequent the task must be to be effective: as a rule, the task must be done about twice within this interval, that is, at an inspection interval of about half of it. The second fact is what inspection and PdM buy you. In other words, what depth of task is necessary to achieve the reliability goal?


D: The amount of time or use between the CE and the ultimate failure. Generally this point cannot be detected with existing technology. It is a limit, the maximum theoretical time to failure.


C: A point in the decay where very high tech PdM technology can detect either the decay in performance or the unfolding failure itself. The C point is the maximum practical time between inspection and failure. If PdM is used, the PM inspection has to occur sometime between points A and C. Generally, if you pick an inspection interval of ½ to ¾ of the time between C and A, at least one inspection will fall between A and C. Using a PdM task gives you the greatest amount of time between detection and failure.


B: A point where skilled technicians without technology can detect impending failure. No technology is needed at this point. This is the greatest interval without using PdM.


A: A point where even unskilled workers, if told what to look for, will be able to detect the failure. Although anyone can detect the failure, there is not much time to intervene.
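
The rule of thumb given for point C (an inspection interval of ½ to ¾ of the time between C and A) can be worked through numerically. The Python sketch below is only an illustration under stated assumptions: times are measured in operating hours from the critical event, the function names are hypothetical, and the 200 hour and 1,000 hour figures are invented for the example.

```python
def inspection_interval_range(time_at_c, time_at_a):
    """Return (shortest, longest) inspection interval using the 1/2 to 3/4 rule.

    Both arguments are times measured from the critical event (assumed units:
    operating hours). The detection window is the span between C and A.
    """
    window = time_at_a - time_at_c
    if window <= 0:
        raise ValueError("point A must come after point C")
    return 0.5 * window, 0.75 * window


def inspections_in_window(interval, time_at_c, time_at_a, first_inspection=0.0):
    """List the inspection times that land between points C and A."""
    hits = []
    t = first_inspection
    while t <= time_at_a:
        if t >= time_at_c:
            hits.append(t)
        t += interval
    return hits


# Invented example: PdM can first see the defect 200 h after the critical
# event (point C); anyone can see it at 1,000 h (point A).
shortest, longest = inspection_interval_range(200.0, 1000.0)
print(shortest, longest)

# Even at the longest recommended interval, with the schedule starting at an
# arbitrary 150 h, one inspection still lands inside the window.
print(inspections_in_window(longest, 200.0, 1000.0, first_inspection=150.0))
```

Because the interval is shorter than the window, at least one inspection falls inside it no matter where the schedule happens to start. The shorter end of the range simply buys more margin.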








Figure: P-F (Performance-Failure) curve, showing the performance curve plotted against time or usage.







There are several ways to measure the P-F interval

The PM inspection routines are designed to detect the failure modes unfolding and determine when failure will take place so an effective intervention can be made. Since we cannot yet see the failure modes directly (point D), the goal is to find a measure that is easy to use and is more directly proportional to wear.


How is PdM different from PM inspection?

While any inspection activity on the PM task list is predictive, the term PdM is reserved for inspections where instruments are used for the readings.


Condition based maintenance

Condition based maintenance is related to predictive maintenance. In condition based maintenance the equipment is inspected, and based on a condition further work or inspections are done. For example, in traditional PM a filter is changed monthly. In condition based maintenance the filter is changed when the difference in pressure between a gauge before the filter and a gauge after the filter reaches a particular value. The differential pressure indicates the condition of the filter.


The most sophisticated application of condition based maintenance has a computer processor monitoring the asset on an on-going basis. When the measured parameter exceeds the engineering limit the computer issues an alarm. If it exceeds a higher second limit the computer might issue an emergency shutdown sequence.
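
The filter example and the two-limit monitoring scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not a control-system implementation: the names Action, evaluate_reading and filter_needs_change, and all of the limit values, are assumptions made for the example.

```python
from enum import Enum


class Action(Enum):
    NONE = "no action"
    ALARM = "issue alarm"
    SHUTDOWN = "start emergency shutdown sequence"


def evaluate_reading(value, alarm_limit, shutdown_limit):
    """Compare one measured value against an engineering limit and a higher second limit."""
    if shutdown_limit <= alarm_limit:
        raise ValueError("the shutdown limit must be higher than the alarm limit")
    if value > shutdown_limit:
        return Action.SHUTDOWN
    if value > alarm_limit:
        return Action.ALARM
    return Action.NONE


def filter_needs_change(pressure_before, pressure_after, max_differential):
    """Condition based filter change: act on differential pressure, not on the calendar."""
    return (pressure_before - pressure_after) >= max_differential


# Illustrative values only (for example, vibration in mm/s and pressure in psi).
print(evaluate_reading(5.0, alarm_limit=8.0, shutdown_limit=12.0))
print(evaluate_reading(9.5, alarm_limit=8.0, shutdown_limit=12.0))
print(evaluate_reading(14.0, alarm_limit=8.0, shutdown_limit=12.0))
print(filter_needs_change(60.0, 48.0, max_differential=10.0))
```

A real installation would also have to deal with sensor noise, time delays and latching of alarms, which this sketch leaves out.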


All of the predictive techniques we are going to discuss (except condition based methods) should be on a PM task list and controlled by the PM scheduling system. The PdM tasks should be coordinated with other PM activity.  Most PdM tasks are handled in string based PMs (short activities on many assets).


PdM Checklist

1. What is our objective for a predictive maintenance program?  Do we want to reduce downtime, maintenance costs or the stock level in storerooms?  What is the most important objective?

2. Are we, as an organization, ready for predictive maintenance?

     A. Do we have piles of data that we already don't have time to look at?

     B. If one of the PM mechanics comes to us asking for a machine to be rebuilt, do we have time to rebuild a machine that is not already broken?

     C. Could we get downtime on a critical machine on the basis that it might break down?

     D. Are we willing to invest significant time and money in training? Do we have the patience to wait out a long learning curve?

3. Is (are) the specific technique(s), the right technique(s)?

     A. Does the return justify the extra expense?

     B. Do you have existing information systems to handle, store and act on the reports?

     C. Is it easy and convenient to integrate the predictive activity and information flow with the rest of the PM system?

     D. Is there a less costly technique to get the same information?

     E. Will the technique minimize interference to our users?

     F. Exactly what critical wear are we trying to locate?

4. Is this the right vendor?

     A. Will they train you and your staff?

     B. Do they have an existing relationship with your organization?

     C. Is the equivalent equipment available elsewhere?

     D. In the case of a service company, are they accurate?

     E. How do their prices compare with the value received and with the marketplace?

     F. Can the vendor provide rental equipment (to try before you buy)? Can they provide a turnkey service giving you reports and hot line service for urgent problems?

5. Is there any other way to handle this instead of purchase?

     A. Can we rent the equipment?

     B. Can we use an outside vendor for the service?


An excellent treatment of the whole field of predictive maintenance can be found in John Moubray's book RCM II, published by Industrial Press, NY.


Technologies are grouped around detecting deterioration in the 6 effects: dynamic (vibration), particle (ferrography), chemical (water analysis), physical (crack detection), temperature (infrared), electrical (ampere monitoring).






Chemical analysis
Primary situations: Analyze oil, other lubricant or water.
Much knowledge is needed to analyze the results. Equipment costs are heavy, so contractors are used for their expertise and specialized equipment. Widest variety of tests for specialized needs.

Vibration analysis
Primary situations: Rotating equipment.
Highest learning curve. Equipment has come down in cost. Can be the first way to detect critical wear.

Temperature measurement
Primary situations: Electrical, process heat, roofing, energy efficiency.
Easy to use and very easy to understand. Costs for low-end units have plummeted. High-end units are coming down nicely.

Ultrasonic inspection
Primary situations: Air/vacuum leaks, high voltage lines, thickness, cracks.
Not expensive. Easy to use. Multi-use for predictive work and troubleshooting.

Advanced visual techniques
Primary situations: Anything you would like to be able to see inside of, like pipes, engines, gear boxes, etc.
Used mostly when there is a known problem. An extension of the mechanic's most developed sense.


Chemical analysis     

One of the most popular families of techniques to predict current internal condition and impending failures is chemical analysis. There are 7 basic types of chemical analysis. The first two are related to particle size and composition:


Type                                     Material

1. Atomic Emission (AE) Spectrometry     all materials
2. Atomic Absorption (AA) Spectrometry   all materials
3. Gas chromatography                    gases emitted by faults
4. Liquid chromatography                 lubricant degradation
5. Infra-red spectroscopy                similar to AE
6. Fluorescence spectroscopy             assessment of oxidation products
7. Thin layer activation                 uses radioactivity to measure wear


Oil analysis is a significant subset of all of the chemical analysis that is used for maintenance. The two spectrographic techniques are commonly used to look at the whole picture. They report all metals and contamination. This is based on the fact that different materials give off different characteristic spectra when burned. The results are expressed in PPT, PPM or PPB (PPT: parts per thousand, PPM: parts per million, PPB: parts per billion).




Vibration Analysis               


A widely used method in plant/machinery maintenance. A study in the city of Houston's wastewater treatment department showed $3.50 return on investment for every $1.00 spent on vibration monitoring.  The same study showed that a private company might get as much as $5.00 return per dollar spent.


Each element of a rotating asset vibrates at characteristic frequencies. A bent shaft will always peak at twice the frequency of the rotation speed. A ball bearing, on the other hand, might vibrate at 20 times the frequency of rotation.


There are over 9 different types of vibration analysis. Each individual technique focuses on one aspect of the way assets deteriorate that is detectable by vibration. Techniques include octave band analysis, narrow band frequency analysis, real time analysis, proximity analysis, shock pulse monitoring, kurtosis, acoustic emission, and others.


The most popular is broadband analysis. This analysis measures the changes in amplitude of the vibration by frequency over time. This amplitude by frequency is plotted on an XY chart and is called a signature (for a given service load). A change to the vibration signature of a unit means that one of the rotating elements has changed characteristics. These elements include all rotating parts such as shafts, bearings, motors, and power transmission components. Also included are anchors, resonating structures and indirectly connected equipment.
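
To make the idea of a signature concrete, the Python sketch below builds an amplitude-by-frequency signature from a vibration signal and compares it with a baseline. Everything in it is an assumption made for illustration: the signals are synthetic, the shaft is taken to turn at 30 revolutions per second, the function names are hypothetical, and a plain discrete Fourier transform stands in for whatever a commercial analyzer actually uses.

```python
import cmath
import math


def synthesize(sample_rate, seconds, components):
    """Build a test signal from (frequency_hz, amplitude) sine components."""
    n = int(sample_rate * seconds)
    return [
        sum(a * math.sin(2 * math.pi * f * i / sample_rate) for f, a in components)
        for i in range(n)
    ]


def amplitude_spectrum(samples, sample_rate):
    """Return {frequency_hz: amplitude} computed with a plain discrete Fourier transform."""
    n = len(samples)
    spectrum = {}
    for k in range(1, n // 2):  # skip the constant (0 Hz) term and the top bin
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        spectrum[k * sample_rate / n] = 2 * abs(total) / n
    return spectrum


def changed_frequencies(baseline, current, threshold):
    """Frequencies whose amplitude moved by more than the threshold."""
    return [f for f in baseline if abs(current[f] - baseline[f]) > threshold]


RATE = 256         # samples per second (assumed)
SHAFT_HZ = 30      # assumed rotation speed: 30 rev/s (1800 rpm)

# Baseline signature: mostly 1x running speed, a little 2x.
baseline = amplitude_spectrum(
    synthesize(RATE, 1, [(SHAFT_HZ, 1.0), (2 * SHAFT_HZ, 0.2)]), RATE)
# Later reading: the 2x running speed component has grown.
current = amplitude_spectrum(
    synthesize(RATE, 1, [(SHAFT_HZ, 1.0), (2 * SHAFT_HZ, 1.5)]), RATE)

changed = changed_frequencies(baseline, current, threshold=0.5)
dominant_now = max(current, key=current.get)
print(changed, dominant_now)
```

Here the only change in the signature is at twice running speed, which the text above associates with a bent shaft. Real signals are noisier and usually need windowing, averaging and a faster transform than this direct calculation.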



Temperature Measurement            

Friction (or electrical resistance) creates heat. Temperature is the single greatest enemy for lubrication oils and for the power transmission components. Advanced technologies in detection, imaging, and chemistry allow us to use temperature as a diagnostic tool.


Today, there is technology to photograph by heat rather than reflected light. Hotter parts show up as redder (or darker). Changes in heat will graphically display problem areas where wear is taking place or where there is excessive resistance in an electrical circuit. Infrared is unique since it is almost entirely non-interruptive. Most inspections can be safely completed from 10 or more feet away and out of danger.


Ultrasonic Inspection

One of the most exciting families of technologies is based on ultrasonics. It is widely used in medicine and has moved into factory inspection and maintenance. There are four or five techniques that make up this family.


In one of the most common and inexpensive techniques, an ultrasonic transducer transmits high frequency sound waves and picks up the echo (pulse-echo). Echoes are caused by changes in the density of the material tested. The echo is timed and the processor of the scanner converts the pulses to useful information such as density changes and distance.
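
The conversion from echo time to distance is simple arithmetic: the pulse travels to the reflecting surface and back, so the distance is the sound velocity multiplied by the round-trip time, divided by two. The Python sketch below shows the calculation. The function name is hypothetical, and the velocity used for steel (about 5,900 m/s) is an assumed approximate value; a real gauge is calibrated for the specific material.

```python
def thickness_from_echo(round_trip_seconds, sound_velocity_m_per_s):
    """Distance to the reflecting surface, in metres, from a timed pulse-echo.

    The pulse goes out and comes back, so only half of the travel is thickness.
    """
    if round_trip_seconds < 0 or sound_velocity_m_per_s <= 0:
        raise ValueError("time must be non-negative and velocity positive")
    return sound_velocity_m_per_s * round_trip_seconds / 2.0


# Assumed approximate longitudinal sound velocity in steel.
STEEL_VELOCITY_M_PER_S = 5900.0

# Illustrative reading: the back-wall echo returns after 4 microseconds.
thickness_mm = thickness_from_echo(4.0e-6, STEEL_VELOCITY_M_PER_S) * 1000.0
print(round(thickness_mm, 2))
```

With these assumed numbers the wall is about 11.8 mm thick. An echo that returns sooner than expected from part of the wall suggests metal loss or an internal reflector at that spot.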


Ultrasonics can determine the thickness of paint, metal, piping, corrosion and almost any homogeneous material. New thickness gauges (using continuous transmission techniques) will show both a digital thickness and a time based scope trace. The trace will identify corrosion or erosion with a broken trace showing the full thickness and an irregular back wall. A multiple echo trace shows any internal pits, voids and occlusions (which cause the multiple echoes).


Advanced Visual Techniques                                  

The first applications of advanced visual technology used fiber optics in bore scopes. In fiber optics, fibers of highly pure glass are bundled together. Some of the instruments can articulate to see the walls of a boiler tube. The focus on some of the advanced models is 1/3" to infinity. The limitation of fiber optics is length. The advantages are cost (about 50% or less of equivalent video technology) and level of technology (they don't require large amounts of training to support).


Another visual technology gaining acceptance is ultra-small video cameras. These are used for inspection of the interior of large equipment, boiler tubes, and pipelines. They use a miniature television camera smaller than a pencil (about 1/4" in diameter and 1" long) with a built-in light source. Some models allow small tools to be manipulated at the end; others can snake around obstacles. They are used extensively to inspect pipes and boiler tubes.


Other instruments that are not discussed here, but should be considered part of your predictive maintenance tool box, are meggers, pyrometers, VOM meters, strain gauges, temperature sensitive tapes and chalk.


PCR (Planned Component Replacement)

One strategy formerly widely recommended by the aircraft manufacturers and used by the airlines is scheduled component replacement (and return of the component to the depot for rebuild).


Using statistics, if we look at the failure interval we should be able to predict when the next failure will take place. In fact, if we had enough equipment of the same type in similar service, we should be able to assign probabilities to the likelihood of failure. The field of statistics, when applied to failures of like equipment in like service, can do just that. One outcome of this approach is PCR (planned component replacement).
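
As a rough illustration of assigning probabilities to failure, the Python sketch below uses recorded failure ages of like components in like service to estimate the chance of failing before a given age, and then picks the latest replacement age that keeps that chance within a chosen risk. It is only a sketch: the failure ages are invented, the function names are hypothetical, and a simple empirical fraction stands in for the fitted distributions (Weibull, for example) that a proper reliability study would use.

```python
import math


def probability_of_failure_before(age, failure_ages):
    """Empirical chance that a component fails before reaching the given age."""
    if not failure_ages:
        raise ValueError("need at least one recorded failure age")
    return sum(1 for a in failure_ages if a < age) / len(failure_ages)


def replacement_age(failure_ages, max_risk):
    """Latest observed age at which the empirical failure risk is still within max_risk."""
    if not failure_ages:
        raise ValueError("need at least one recorded failure age")
    ages = sorted(failure_ages)
    # Number of early failures we are prepared to accept out of the sample.
    allowed_failures = math.floor(max_risk * len(ages) + 1e-9)
    return ages[min(allowed_failures, len(ages) - 1)]


# Invented history: operating hours at failure for ten like pumps.
history = [4200, 5100, 3900, 6100, 4800, 5500, 4500, 5900, 5200, 4700]

pcr_age = replacement_age(history, max_risk=0.10)
print(pcr_age, probability_of_failure_before(pcr_age, history))
```

With this invented history, replacing at 4,200 hours would have let one pump in ten fail first. Ten data points are far too few for a real decision, which is the point made above about needing enough equipment of the same type in similar service.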


PCR is an option on the PM task list. The novelty of this option is the elimination of failure because components are removed and replaced after so many hours or cycles but before failure. Depending on the sub-strategy some of the components are then returned for inspection, rebuilding, or remanufacturing, and others are discarded. The result of this strategy is controlled maintenance costs and low downtime. The strategy does not work when the new component experiences high initial ‘burn-in’ type failures.


For example, fleets with time sensitive loads realized that breakdown costs with downtime are sufficiently high to justify PCR.  It is standard procedure in some fleets to replace hoses, tires, belts, filters, and some hard components, well before failure on a scheduled basis. These soft items (belts, hoses) are called planned discard since there is no intention of using them elsewhere.


PCR is an expensive option. Even in the aircraft industry, significant effort has gone into improving reliability so that fewer components would be in the periodic rebuild program. According to John Moubray in RCM II, after an extensive RCM analysis the number of overhaul items (planned rebuild items) went from 339 on the Douglas DC-8 to just 7 items on the larger and more complex DC-10. Although the number has dropped dramatically, PCR is still an important tool for the maintenance professional.


PCR is divided into two sub-strategies called planned discard (where you throw away the component)  and planned rebuild (for rebuildable components like truck engines).


Planned discard is where a component is removed before failure and discarded. Common examples would include belts, filters, small bearings, inexpensive wear parts, etc. One fleet replaces hoses every two years during its major rebuild cycle to reduce the number of unscheduled hose failures.


Planned rebuild is for major components that are rebuildable such as engines, transmissions, gearboxes, pumps, compressors, etc. Components on aircraft are the best examples of this strategy. The items are removed after a fixed number of operating hours or take-off/landing cycles. They are sent to a certified rebuilder, brought back to specification, and returned to stock to be put on another aircraft.




MRO-Inventory management

The second element of managing maintenance is the Bill of Material. Having the whole Bill of Material work smoothly is essential to delivering maintenance to customers. The storeroom or maintenance warehouse is often considered the single largest barrier to productive maintenance delivery.


Goal of storeroom: Have the right part available in time to support essential maintenance activity while using the least organizational resources.


While you are doing that, also:

  • Give maintenance customers the parts they need quickly.
  • Know where everything is.
  • Keep track of quantities.
  • Protect parts from spoilage.
  • Report facts to management.


Why bother managing the storeroom?

The biggest reason is the cost and consequence of downtime. A related but opposite reason is the high cost of money tied up in the wrong maintenance inventory. Ideally, management balances downtime costs against the cost of the inventory.
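One simple way to picture that balance for a single spare part is sketched below. The cost model is an assumption made for illustration, not a method from this primer: holding the part costs a carrying percentage of its price each year, while not holding it costs the extra downtime spent waiting for delivery whenever the part fails. All names and figures are hypothetical.

```python
def annual_cost_of_stocking(part_price, carrying_rate):
    """Yearly cost of holding one unit (assumed: price times a carrying-cost rate)."""
    return part_price * carrying_rate


def annual_cost_of_not_stocking(failures_per_year, lead_time_hours, downtime_cost_per_hour):
    """Expected yearly downtime cost of waiting for delivery after each failure."""
    return failures_per_year * lead_time_hours * downtime_cost_per_hour


def should_stock(part_price, carrying_rate, failures_per_year,
                 lead_time_hours, downtime_cost_per_hour):
    """True when holding the part is cheaper than the expected downtime without it."""
    holding = annual_cost_of_stocking(part_price, carrying_rate)
    waiting = annual_cost_of_not_stocking(
        failures_per_year, lead_time_hours, downtime_cost_per_hour)
    return holding < waiting


# Hypothetical gearbox: $8,000 part, 25% carrying rate, one failure every ten years,
# 48-hour delivery lead time, $1,500 per hour of lost production.
stock_gearbox = should_stock(8000, 0.25, 0.1, 48, 1500)
print(stock_gearbox)  # True
```

With these assumed numbers, holding the part costs about $2,000 a year against roughly $7,200 of expected downtime, so stocking it is justified; a cheap part with a short lead time could easily come out the other way.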


Managing the inventory does not presume that the inventory is too high; it might be too low. Management should respond to real business conditions.




Typical Maintenance Worker’s Day: Reactive versus Proactive

Comparison of reactive (without planning and scheduling) versus proactive (with planning and scheduling), by activity:

  • Receiving instructions
  • Obtaining tools and materials
  • Travel to and from job (both with and without tools and materials)
  • Coordination delays
  • Idle at job site
  • Late starts and early quits
  • Authorized breaks and relief
  • Excess personal time (extra breaks, phone calls, smoke breaks, slow return from lunch and breaks, etc.)

Direct actual work accomplished (as a percentage of the whole day): 35% reactive, 65% proactive



The 35% direct work in reactive mode versus the 65% direct work in proactive mode provides a simple but clear justification for establishing planning, coordination, and scheduling.


Simple Justification for a Planner Position

Comparison of two crews without planning versus two crews with planning:

  • Total craftspeople, both crews
  • Direct work percentage
  • Equivalent full-time workers

Improved output from planning and scheduling: 77%


The same two supervisors with planner support (even if the planner position was established without a headcount increase) will have 12.4 full-time equivalents pulling wrenches. This is a net productivity improvement of 77%, which is a fine return on any investment.
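The arithmetic behind these figures can be sketched as follows. The crew sizes are an assumption, since they are not stated here: 20 craftspeople across the two crews, one of whom becomes the planner, together with the 35% and 65% direct work percentages quoted earlier. Under that assumption 19 × 0.65 = 12.35, which rounds to the 12.4 quoted above, and 12.4 / 7.0 ≈ 1.77 gives the 77% improvement (the unrounded figures give about 76%).

```python
def equivalent_full_time(craftspeople, direct_work_fraction):
    """Number of people's worth of direct (wrench) time the crews actually deliver."""
    return craftspeople * direct_work_fraction


def output_improvement(fte_before, fte_after):
    """Fractional gain in delivered direct work."""
    return fte_after / fte_before - 1


# Assumed: 20 craftspeople in total, one converted to planner (no headcount increase)
fte_without_planning = equivalent_full_time(20, 0.35)  # about 7.0
fte_with_planning = equivalent_full_time(19, 0.65)     # about 12.35

gain = output_improvement(fte_without_planning, fte_with_planning)
print(f"{fte_without_planning:.2f} -> {fte_with_planning:.2f} FTE, gain {gain:.1%}")
```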


Another way to appreciate the advantage of job planning is to depict what happens within an individual job without planning. Technicians jump into the work without forethought. Shortly they encounter a delay for lack of a spare part, tool, or authorization. This sequence may be repeated several times before the job is completed. In the planned mode, the needs are anticipated and provided for before a technician is assigned. The comparison is graphically presented below.


Figure: Professional Planning Versus Planning on the Run (panels: planning on the run; professional planning)




Each dollar invested in planning typically saves three to five dollars during work execution and the duration of a planned job is commonly only half as long as that of an unplanned job. 



The nature of maintenance work enables planning when crews are set up correctly

  • The PPM Group provides reliable service: This group concentrates on preventive/predictive maintenance (PM/PdM, together called PPM) work. An optimized PM/PdM process requires 15% to 20% of maintenance labor. Neither emergency nor scheduled backlog work should be allowed to interrupt this group.

  • The Emergency Group provides prompt response to true urgent needs: It is responsible for handling essentially all urgent demands, requesting assistance only when necessary. In other words, this group protects the other two groups from interruption. It cannot fulfill this objective 100% of the time unless it is staffed for peak demand, which is not the optimization sought. Therefore, it needs to request help about 10% of the time. Because the PPM Group should not be interrupted, the Planned Backlog Relief Group provides the assistance. In a proactive maintenance environment, prompt response to urgent work requires approximately 10% of maintenance labor resources, assuming multi-skilled personnel staff the Emergency Group. Urgent work, by its nature, offers little opportunity to plan or schedule except in a most rudimentary way.

  • The Planned Backlog Relief Group provides timely relief of those work requests with adequate lead time to be planned: Backlog consists of all plannable work (not emergency or urgent). In a proactive environment, the bulk of the maintenance workload should be plannable. Sixty-five to seventy-five percent (65% to 75%) of maintenance resources should work in this mode. As stated previously, the Planned Group is called upon to support the Emergency Group whenever it encounters peak demands (10% of the time by design). This still allows the Planned Group (constituting up to 75% of the maintenance workforce) to be scheduled and assigned to well-prepared jobs without interruption 90% of the time. This is far more proactive than most maintenance departments anywhere.
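The three-way split described above can be sketched as a small allocation helper. The function name, the default shares (15% PPM, 10% emergency, the remainder to planned backlog relief), and the 800-hour week are illustrative assumptions chosen from within the ranges given in the text.

```python
def allocate_crews(total_weekly_hours, ppm_share=0.15, emergency_share=0.10):
    """Split weekly labor hours among the PPM, Emergency, and Planned Backlog groups."""
    planned_share = 1.0 - ppm_share - emergency_share
    if planned_share <= 0:
        raise ValueError("PPM and emergency shares leave nothing for planned work")
    return {
        "ppm": total_weekly_hours * ppm_share,
        "emergency": total_weekly_hours * emergency_share,
        "planned_backlog": total_weekly_hours * planned_share,
    }


# Hypothetical department with 800 labor hours per week
allocation = allocate_crews(800)
for group, hours in allocation.items():
    print(f"{group}: {hours:.0f} hours")
```

With these assumed shares, about 120 hours go to PPM, 80 to emergency response, and 600 to planned backlog relief each week.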



Steps in planning


Development of Work Programs

Work Programs are the vehicles whereby maintenance resources are perpetually balanced with maintenance workload. Without this balance, deferred maintenance increases progressively and the benefits of preventive/predictive maintenance (PPM) cannot be realized.   If deferred maintenance is significant, equipment obviously is not in proper condition.   PPM is not conceived to put equipment in proper condition but to keep it in that condition once it is achieved.


Because maintenance is managed by controlling backlog within established limits, the current backlog facing each maintenance crew (measured in weeks) must be calculated and analyzed. A Work Program should be developed for each maintenance team at least monthly.


Work Program


Work Programs consist of four sections, which together determine the resources available for backlog relief:

  • Gross labor hours authorized each week, including budgeted overtime and contract support.

  • Resources committed to indirect activities, including vacation, absenteeism, training, meetings, etc. This subtotal is subtracted from gross resources to calculate the resources available during a typical week for direct work.

  • Average weekly consumption of direct work resources for response to urgent conditions, such as equipment failures causing production downtime. This is deducted next.

  • All projected PPM work for the next week, which is also deducted.

Finally, the resources available for backlog relief are identified. This is the number the planner takes into the coordination meeting. The current "Ready" and "Total" backlog hours are each divided by these available hours to express the backlog in weeks, which is then compared to established benchmarks.
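The Work Program calculation can be sketched as below. The class, its field names, and every figure are hypothetical, chosen only to show the order of the deductions and how backlog hours become backlog weeks.

```python
from dataclasses import dataclass


@dataclass
class WorkProgram:
    gross_hours: float     # authorized weekly hours, incl. overtime and contract support
    indirect_hours: float  # vacation, absenteeism, training, meetings, etc.
    urgent_hours: float    # average weekly response to urgent conditions
    ppm_hours: float       # projected PPM work for next week

    @property
    def direct_hours(self):
        """Hours available in a typical week for direct work."""
        return self.gross_hours - self.indirect_hours

    @property
    def backlog_relief_hours(self):
        """Hours left for backlog relief after urgent and PPM work."""
        return self.direct_hours - self.urgent_hours - self.ppm_hours

    def backlog_weeks(self, backlog_hours):
        """Express a backlog (in labor hours) as weeks of available relief capacity."""
        if self.backlog_relief_hours <= 0:
            raise ValueError("no hours are available for backlog relief")
        return backlog_hours / self.backlog_relief_hours


# Hypothetical ten-person crew
program = WorkProgram(gross_hours=400, indirect_hours=60, urgent_hours=40, ppm_hours=70)
print(program.backlog_relief_hours)  # 230
print(program.backlog_weeks(690))    # 3.0 weeks of "Ready" backlog
print(program.backlog_weeks(1150))   # 5.0 weeks of "Total" backlog
```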



Steps of the Planning Process

  • After defining and describing the job objectives and scope, list the steps to be performed; thus defining “what” is to be done. Determine if the job or a substantially similar job has been done before and you have a job plan already on the shelf. If the job is not familiar to the planner, a visit to the job site and a discussion with the requestor might be the next step. Pinning down exactly what is to be done is sometimes half the battle.
  • A preliminary go-ahead is usually given at some point early in the process.  If the job is a routine repair (even a large one) the level of approval might be casual. If the job is new, special, or outside the regular business, approval might be very formal, with top management signatures required. Summarize the required budget for the job, thus defining “how much” it will cost and obtain whatever authorizations will be required.
  • Taking the physical location and spaces around the equipment into consideration, plan the manner in which the work is to be accomplished, thus defining “how” the job will be performed. Include ideas about how many people can work efficiently and consider the movement of materials into and out of the area.
  • Establish the duration and manpower required to “perform” the work. Determine the skill sets needed and any special licensing requirements (e.g., certified welder). Clarify the sequence of skills required throughout job performance.
  • Will the work be done in-house, will a contractor be called in, or will some combination of the two do the job? These steps define “who” is to do the “what.”
  • Next, identify and arrange for delivery of all spare parts, materials, consumables, special tools, PPE (personal protective equipment) and equipment necessary to do the job. Determine if the items are in stock.  If so, reserve them.  If not, determine where and when they can be obtained.  The latter step should be accomplished with the support of Purchasing.
  • Then, determine essential reference materials and include them in the Planned Job Package.
  • Finally, planning is not completed until everyone knows what is going to happen. Communicate to all parties involved as well as to management, the plan of WHO is to do WHAT, HOW, by WHEN and for HOW MUCH.
  • The last step moves you out of planning and into coordination and scheduling. Coordinate with the asset custodian to select a mutual time when maintenance can have the resources available and operations can release the asset.  Schedule the work; thus defining “when” the job will be performed.
  • When all these steps have been accomplished, a job plan has been prepared with a schedule for execution according to priorities and equipment availability established with Operations. Supervisors are relieved of preparatory activity, so that they can devote effort to leading job execution. Craftsmen will be far more productive because delays and conflicts are resolved on paper rather than on the shop floor.


A planning package may include (larger and high hazard jobs might have more of these items):

Work order

Work planning sheet

Job plan with details by task with step-by-step procedures. Time for each step (task), summarized by resource group and for the total job

(very large jobs)

Labor deployment plan by craft and skill, including labor-hour estimates. Consider contract as well as in-house resources. A Gantt bar chart or PERT network chart is important for planning the sequence of tasks and assigning them to crews. Do everything possible to minimize the time the asset is out of service.

Job Hazard Safety Analysis looks at all hazards of the job and seeks to remove or mitigate the hazards.

Pre-shutdown work list (prefabricated parts)

Bill of Material.  List all materials needed for the job, including an acquisition plan for major items. Determine if the material is authorized inventory or a direct purchase item. The planning package should include spares reservation and staged location. 


Materials control list

Shutdown protocol and start-up protocol


Executive Summary


A chemical plant is built to produce a particular product, in a particular quantity at a particular quality. Once the plant is built, three items must converge for manufacturing to take place: Equipment Custody (control) in good working order and up to specifications, the Bill of Materials, and Labor with the requisite competencies. Whenever a product changes (such as color, size, specifications), one or more of the three elements has to change also. Good maintenance practices deliver the capacity to produce consistently, without surprises, for as long as the company needs the output. The product of maintenance is the capacity or uptime to make the ultimate product. So production makes products and maintenance makes capacity to make production.