FAILURE MODE AND EFFECT ANALYSIS - METHODOLOGY
Failure mode and effect analysis (FMEA)
is a systematic way of assuring that every conceivable potential failure
of a design/process has been considered with the objective of minimising
the probability of failure.
1 INTRODUCTION
- Definition
- Timing
- Preparation
- Method
2 OCCURRENCE
3 SEVERITY OF EFFECT
4 DETECTION
1 INTRODUCTION
Definition
Failure mode and effect analysis (FMEA) is a systematic way of assuring
that every conceivable potential failure of a design/process has been
considered. The object of using FMEA is to minimise the probability of
failure. More precisely, IEEE Std 352-1975 (Guide for General Principles
of Reliability Analysis of Nuclear Power Generating Station Protection
Systems), one of the definitive works on FMEA, defines the purposes of
an FMEA as being to:
- assist in selecting design alternatives with high reliability and
  high safety potential during early design phase
- ensure that all conceivable failure modes and their effects on
  operational success of the system have been considered
- list potential failures and identify the magnitude of their effects
- develop early criteria for test planning and the design of the test
  and check-out systems
- provide a basis for quantitative reliability and availability analyses
- provide historical documentation for future reference to aid in
  analysis of field failures and consideration of design changes
- provide input data for trade off studies
- provide basis for establishing corrective action priorities
- assist in the objective evaluation of design requirements related to
  redundancy, failure detection systems, fail-safe characteristics and
  automatic and manual override
(IEEE Std 352-1975)
Timing
The FMEA should be an integral part of the early design evaluation and
should be periodically updated to reflect changes in design or
application. An updated FMEA should be a major consideration in design
reviews, inspections, or other major system review points in the
program. During the design phase, an FMEA should be performed or updated
at the following program stages:
- Concept formulation or selection
- Preliminary design or layout
- Completion of detail part design
- Design improvement programs
The FMEA may also be performed with limited design information, in
which case the basic questions to be answered by an FMEA are as follows:
- How can each part conceivably fail?
- What mechanisms might produce these modes of failure?
- What could the effects be if these failures did occur?
- Is the failure in the safe or unsafe direction?
- How is the failure detected?
- What inherent provisions are provided in the design to compensate for
  the failure?
Preparation
Before undertaking an FMEA it is essential to complete certain
preparatory steps, the scope of which will depend on the complexity of
the system/article being studied:
- Definition of the system/article to be analyzed and its mission.
- Description of the operation of the system.
- Identification of failure categories.
- Description of the environmental conditions.
Method
Data is entered into a table (see below) under the following headings:
- Part; each system component or part being analyzed is named (or
  referenced by another appropriate designator, such as a circuit
  reference).
- Function; brief note as to the function of the part.
- Potential failure mode; this should cover every way in which the part
  could fail and should include random and degradation failures. Ask
  'How could it fail?' not 'Will it fail?'
- Potential effect of failure; brief description of the consequences
  of failure.
- Severity; see the section on severity below.
- Potential causes of failure; what could cause this failure mode.
- Occurrence; see the section on occurrence below for guidance.
- How will the potential failure be detected? Some failures are obvious
  to the person using the subject of the FMEA, but if this is not the
  case, the means by which the failures can be detected should be
  listed.
- Detection; see the section on detection below for guidance.
- Risk Priority Number (RPN) = Severity * Occurrence * Detection
- Actions; detail recommended actions.
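As a rough sketch of how the worksheet headings and the RPN formula fit
together, the Python below models one worksheet row and lists rows in
descending order of RPN. The class and field names (FmeaRow,
detection_method and so on) are illustrative assumptions, not part of
any standard, and sorting by RPN is only one possible way of setting
priorities. The three rows reuse the ratings from the ball-point pen
example in this section.

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    """One worksheet row; field names are hypothetical, chosen to mirror the headings."""
    part: str
    function: str
    failure_mode: str
    effect: str
    severity: int          # rank 1-10
    cause: str
    occurrence: int        # rank 1-10
    detection_method: str
    detection: int         # rank 1-10
    action: str = ""

    def __post_init__(self):
        # Each of the three rankings runs from 1 to 10.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be ranked 1-10, got {value}")

    @property
    def rpn(self) -> int:
        # Risk Priority Number = Severity * Occurrence * Detection
        return self.severity * self.occurrence * self.detection


rows = [
    FmeaRow("Ink", "Provide writing medium", "Incorrect viscosity", "Low flow",
            4, "Too little solvent", 2, "QC on ink supply", 3,
            "No action required"),
    FmeaRow("Outer tube", "Provides grip for writer", "Hole gets blocked",
            "Vacuum on ink supply stops flow",
            7, "Debris ingress into hole", 3, "Check clearance of hole", 5,
            "Make hole larger"),
    FmeaRow("Ink", "Provide writing medium", "Incorrect viscosity", "High flow",
            4, "Too much solvent", 2, "QC on ink supply", 4,
            "Introduce more rigid QC"),
]

# Highest RPN first, so the rows needing most attention head the list.
ranked = sorted(rows, key=lambda row: row.rpn, reverse=True)
for row in ranked:
    print(f"{row.rpn:>4}  {row.part}: {row.failure_mode} ({row.effect})")
```

Because each ranking lies between 1 and 10, the RPN always falls
between 1 and 1000.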
Example failure mode and effect analysis table for ball-point pen
Part | Function | Potential failure mode | Potential effects of failure | Severity | Potential causes of failure | Occurrence | How will the potential failure be detected? | Detection | RPN | Actions |
Outer tube | Provides grip for writer | Hole gets blocked | Vacuum on ink supply stops flow | 7 | Debris ingress into hole | 3 | Check clearance of hole | 5 | 105 | Make hole larger |
Ink | Provide writing medium | Incorrect viscosity | High flow | 4 | Too much solvent | 2 | QC on ink supply | 4 | 32 | Introduce more rigid QC |
Ink | Provide writing medium | Incorrect viscosity | Low flow | 4 | Too little solvent | 2 | QC on ink supply | 3 | 24 | No action required |
2 OCCURRENCE
Occurrence is the assessment of the probability that the specific cause
of the Failure mode will occur. It is partly subjective, but the wording
should describe the probability. Failure history helps to make the
probability estimate more accurate. Questions of the following type are
helpful:
- What statistical data is available from previous or similar process
  designs?
- Is the process a repeat of a previous design, or have there been some
  changes?
- Is the process design completely new?
- Is the environment in which the process is to operate changeable?
- Have mathematical or engineering studies been used to predict failure?
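Where failure history is available, the suggested criteria for this
section pair each Occurrence rank with an evaluated failure rate, from
1 in 1,500,000 (rank 1) to 1 in 2 (rank 10). The sketch below turns an
observed failure count into a rank using those rates. The function name
occurrence_rank is hypothetical, and treating each tabulated rate as the
upper limit of its rank (so that in-between rates round up to the more
pessimistic rank) is an assumption, since the criteria list only single
values.

```python
from bisect import bisect_left
from fractions import Fraction

# "1 in N" evaluated failure rates for ranks 1..10, taken from the
# suggested Occurrence criteria.
ONE_IN = [1_500_000, 150_000, 15_000, 2_000, 400, 80, 20, 8, 3, 2]
RATES = [Fraction(1, n) for n in ONE_IN]  # ascending failure rates


def occurrence_rank(failures: int, opportunities: int) -> int:
    """Return the Occurrence rank (1-10) for an observed failure rate.

    Assumption: each tabulated rate is the upper limit of its rank, so a
    rate between two table entries takes the higher rank.
    """
    if opportunities <= 0 or not 0 <= failures <= opportunities:
        raise ValueError("need 0 <= failures <= opportunities, opportunities > 0")
    rate = Fraction(failures, opportunities)
    # Index of the first tabulated rate that is >= the observed rate.
    index = bisect_left(RATES, rate)
    # Rates above 1 in 2 still map to the top rank.
    return min(index + 1, 10)


print(occurrence_rank(1, 15_000))  # exactly the rank 3 rate
print(occurrence_rank(1, 1_000))   # between the rank 4 and rank 5 rates
```

Exact fractions are used instead of floating point so that a rate equal
to a tabulated value always lands on that value's rank.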
The Ranking and suggested criteria are:
Notional probability of failure | Evaluated failure rates | Cpk | Rank |
Remote: Failure is unlikely. No Failures ever associated with almost identical processes | 1 in 1,500,000 | >1.67 | 1 |
Very Low: Only Isolated Failures associated with almost identical processes | 1 in 150,000 | 1.50 | 2 |
Low: Isolated Failures associated with similar processes | 1 in 15,000 | 1.33 | 3 |
Moderate: Generally associated with processes similar to previous processes that have experienced occasional Failures, but not in 'major' proportions | 1 in 2,000 | 1.17 | 4 |
Moderate (as above) | 1 in 400 | 1.00 | 5 |
Moderate (as above) | 1 in 80 | 0.83 | 6 |
High: Generally associated with processes similar to previous processes that have often failed | 1 in 20 | 0.67 | 7 |
High (as above) | 1 in 8 | 0.51 | 8 |
Very High: Failure is almost inevitable | 1 in 3 | 0.33 | 9 |
Very High (as above) | 1 in 2 | <0.33 | 10 |
3 SEVERITY OF EFFECT
Severity is an assessment of the seriousness of the Effect and refers
directly to the potential failure mode being studied. The Customer in
process FMEA is both the internal and, where appropriate, the external
Customer. The severity ranking is also an estimate of how difficult it
will be for the subsequent operations to be carried out to specification
in Performance, Cost, and Time.
The Ranking and suggested criteria are:
Effect | Criteria | Severity of Effect | Rank |
None | | No Effect | 1 |
Very Minor | Minor disruption to production line. | A portion of the product may have to be reworked. Defect not noticed by average customers; cosmetic defects. | 2 |
Minor | Minor disruption to production line. | A portion of the product may have to be reworked. Defect noticed by average customers; cosmetic defects. | 3 |
Very Low | Minor disruption to production line. | The product may have to be sorted and reworked. Defect noticed by average customers; cosmetic defects. | 4 |
Low | Some disruption to production line. | 100% of product may have to be reworked. Customer has some dissatisfaction. Item is fit for purpose but may have reduced levels of performance. | 5 |
Moderate | Some disruption to production line. | A portion of the product may have to be scrapped. Customer has some dissatisfaction. Item is fit for purpose but may have reduced levels of performance. | 6 |
High | Some disruption to production line. | Product may have to be sorted and a portion scrapped. Customer dissatisfied. Item is useable but at reduced levels of performance. | 7 |
Very High | Major disruption to production line. | 100% of product may have to be scrapped. Loss of primary function. Item unusable. Customer very dissatisfied. | 8 |
Hazard with warning | May endanger machine or operator. | Failure occurs with warning. The failure mode affects safe operation and involves noncompliance with regulations. | 9 |
Hazard without warning | May endanger machine or operator. | Failure occurs without warning. The failure mode affects safe operation and involves noncompliance with regulations. | 10 |
4 DETECTION
Detection is an assessment of the probability that the proposed Process
Controls will detect a potential cause of Failure or a Process weakness.
Assume the Failure has occurred and then assess the ability of the
Controls to prevent shipment of the part with that defect. A low
Occurrence ranking does not imply a low Detection ranking: the Control
should still detect a Failure that occurs only rarely. Statistical
sampling is an acceptable Control. Improving the Product and/or Process
design is the best strategy for reducing the Detection ranking;
improving the means of Detection still requires improved designs, with
the subsequent improvement of the basic design. A high ranking should
prompt a review of the method of Control.
The Ranking and suggested criteria are:
Detection | The likelihood the Controls will detect a Defect | Rank |
Almost Certain | Current controls are almost certain to detect the Failure Mode. Reliable detection controls are known with similar processes. | 1 |
Very High | Very High likelihood the current controls will detect the Failure Mode. | 2 |
High | High likelihood that the current controls will detect the Failure Mode. | 3 |
Moderately High | Moderately high likelihood that the current controls will detect the Failure Mode. | 4 |
Moderate | Moderate likelihood that the current controls will detect the Failure Mode. | 5 |
Low | Low likelihood that the current controls will detect the Failure Mode. | 6 |
Very Low | Very Low likelihood that the current controls will detect the Failure Mode. | 7 |
Remote | Remote likelihood that the current controls will detect the Failure Mode. | 8 |
Very Remote | Very Remote likelihood that the current controls will detect the Failure Mode. | 9 |
Almost Impossible | No known controls available to detect the Failure Mode. | 10 |