Vibration-based fault diagnosis of spur bevel gear box
using fuzzy technique
Department of Mechanical Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, Tamil Nadu 641105, India
Abstract
To determine the condition of an inaccessible gear in an operating machine, the vibration signal of the machine can be continuously monitored by placing a sensor close to the source of the vibrations. These signals can be further processed to extract features and identify the status of the machine. The vibration signal acquired from the operating machine has been used to effectively diagnose the condition of inaccessible moving components inside the machine. Suitable sensors are kept at various locations to pick up the signals produced by machinery, and these signals are very meaningful in condition diagnosis and surveillance. To determine the important characteristics and to unravel the significance of these signals, further analysis or processing is required.
This paper presents the use of a decision tree for selecting the best statistical features that will discriminate the fault conditions of the gear box from the signals extracted. These features are extracted from vibration signals. A rule set is formed from the extracted features and fed to a fuzzy classifier. The rule set necessary for building the fuzzy classifier is usually obtained largely by intuition and domain knowledge. This paper also presents the use of a decision tree to generate the rules automatically from the feature set. The vibration signal from a piezo-electric transducer is captured for the following conditions: good bevel gear, bevel gear with tooth breakage (GTB), bevel gear with crack at root of the tooth (GTC), and bevel gear with face wear of the teeth (TFW), for various loading and lubrication conditions. The statistical features were extracted, and good features that discriminate the different fault conditions of the gearbox were selected using a decision tree. The rule set for the fuzzy classifier is obtained by using the decision tree once again. A fuzzy classifier is built and tested with representative data. The results are found to be encouraging.
© 2008 Elsevier Ltd. All rights reserved.
Keywords: Feature selection; Statistical features; Decision tree; Gear box; Fuzzy; Fault detection
1 Introduction
A faulty gear system could result in serious damage if defects occur in one of the gears during operation. Early detection of the defects, therefore, is crucial to prevent the system from malfunctioning in a way that could cause damage or a complete system halt. Diagnosing a gear system by examining the vibration signals is the most commonly used method for detecting gear failures. The conventional methods for processing measured data include the frequency domain technique, the time domain technique, and the time–frequency domain technique. These methods have been widely employed to detect gear failures. The use of vibration analysis for gear fault diagnosis and monitoring has been widely investigated and its application in industry is well established (Cameron & Stuckey, 1994; Gadd &
is particularly reflected in the aviation industry, where helicopter engines, drive trains and rotor systems are fitted with vibration sensors for component health monitoring. The raw vibration signal in any mode from a single point on a machine is not a good indicator of the health or condition of a machine. Vibration is a vectorial parameter with three dimensions and needs to be measured at several carefully selected points.
0957-4174/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2008.01.010
* Corresponding author. Tel.: +91 4222656422; fax: +91 4222656274.
E-mail addresses: n_saravanan@ettimadai.amrita.edu, nsaro_2000@yahoo.com (N. Saravanan).
www.elsevier.com/locate/eswa Expert Systems with Applications 36 (2009) 3119–3135
Vibration analysis can be carried out using Fourier transform techniques like Fourier series expansion (FSE), Fourier integral transform (FIT) and discrete Fourier transform (DFT) (Collacott, xxxx). After the development of large-scale integration (LSI) and the associated microprocessor technology, fast Fourier transform (FFT) analyzers became cost effective for general applications. The raw signatures acquired through a vibration sensor need further processing and classification of the data for any meaningful surveillance of the condition of the system being monitored.
machine (SVM) and Fuzzy classifier are widely used as
1998; Jack & Nandi, 2000a; Nandi, 2000; Samanta &
Al-Baulshi, 2003; Samanta, Al-Al-Baulshi, & Al-Araimi, 2003;
generalization of the results in models that can over fit
the data (Samanta et al., 2003) SVM has high classification
accuracy and good generalization capabilities for crisp data
the problem at hand, the nature of the fault itself is fuzzy. A fuzzy classifier models the physical problem under study more closely. The flow chart of the fault diagnostic system is shown in Fig. 1.
1.1 Different phases of present work
The signals obtained are processed further for machine condition diagnosis as explained in the flow chart (Fig. 1).
2 Experimental studies
DC motor (0.5 hp) with speed up to 3000 rpm is the basic drive. A short shaft of 30 mm diameter is attached to the shaft of the motor through a flexible coupling; this is to minimize the effects of misalignment and the transmission of vibration from the motor. The shaft is supported at its ends through two roller bearings. From this shaft the drive is transmitted to the bevel gear box by means of a belt drive
and the full lubrication level is 110 mm and half lubrication level is 60 mm.
SAE 40 oil was used as a lubricant. An electromagnetic spring-loaded disc brake was used to load the gear wheel. A torque level of 8 N-m was applied at the full-load condition. The various defects are created in the pinion wheels, and the mating gear wheel is not disturbed. With the sensor mounted on top of the gear box, vibration signals are obtained for various conditions. The selected area on the top of the gearbox for mounting the sensor is made flat and smooth to ensure effective coupling between the sensor and the gearbox. The sensor used is a piezoelectric accelerometer (Dytran model), which is mounted on the flat
Fig. 1. Flowchart for bevel gear box health diagnosis. [Blocks: vibration signals; feature selection using J48 algorithm; rule generation; test data set; modeling fuzzy system; fuzzy inference engine; fuzzy output; machine condition diagnosis.]
Fig. 2. Fault simulator setup.
Fig. 3. Inner view of the bevel gear box. [Labels: pinion wheel; gear wheel; electromagnetic spring loaded disc brake.]
surface using the direct adhesive mounting technique. The accelerometer is connected to the signal-conditioning unit (DACTRAN FFT analyzer), where the signal goes through the charge amplifier and an analogue-to-digital converter (ADC). The vibration signal in digital form is fed to the computer through a USB port. The software RT Pro-series that accompanies the signal-conditioning unit is used for recording the signals directly in the computer’s secondary memory. The signal is then read from the memory and processed to extract different features.
2.1 Experimental procedure
In the present study, four pinion wheels whose details
wheel and was assumed to be free from defects. In the other three pinion wheels, defects were created using electrical discharge machining (EDM) in order to keep the size of the defect under control. The details of the various defects are depicted in Table 2 and their views are shown in Fig. 4. The size of the defects is in line with work reported in the literature (Gadd & Mitchell, 1984). The vibration signal from the piezoelectric pickup mounted on the test bearing was taken after allowing initial running of the bearing for some time. The sampling frequency was 12,000 Hz and the sample length was 8192 for all speeds and all conditions. The sample length was chosen arbitrarily; however, the following points were considered. Statistical measures are more meaningful when the number of samples is larger. On the other hand, as the number of samples increases, the computation time increases. To strike a balance, a sample length of around 10,000 was chosen. In some feature extraction techniques, which will be used with the same data, the number of samples must be 2^n. The nearest 2^n to 10,000 is 8192 and hence it was taken as the sample length. Many trials were taken at the set speed and the vibration signal was stored
Table 1
Details of faults under investigation
Gears    Fault description                Dimension (mm)
G2       Gear tooth breakage (GTB)        8
G3       Gear with crack at root (GTC)    0.8 0.5
Table 2
Gear wheel and pinion details
Chordal tooth thickness (mm) 3.93 0.150 3.92 0.110
Chordal tooth height (mm) 2.53 2.55
Fig. 4. (a) View of good pinion wheel. (b) View of pinion wheel with face wear (GFW). (c) View of pinion wheel with tooth breakage (GTB).
in the data file. The raw vibration signals acquired from the gearbox for various experimental conditions using the FFT analyzer are shown in Fig. 5.
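The sample-length reasoning in Section 2.1 (a target of about 10,000 samples, rounded to the nearest power of two) can be checked with a short sketch. The function name `nearest_power_of_two` and the derived record duration are illustrative additions, not part of the original study; only the 12,000 Hz sampling frequency and the 10,000-sample target come from the text.

```python
def nearest_power_of_two(target: int) -> int:
    """Return the power of two closest to target (ties go to the lower power)."""
    if target < 1:
        raise ValueError("target must be a positive integer")
    lower = 1 << (target.bit_length() - 1)  # largest power of two <= target
    upper = lower << 1                      # smallest power of two > target
    return lower if (target - lower) <= (upper - target) else upper


SAMPLING_RATE_HZ = 12_000                    # sampling frequency stated in the text
sample_length = nearest_power_of_two(10_000) # target length stated in the text
record_duration_s = sample_length / SAMPLING_RATE_HZ  # derived, not from the paper

print(sample_length, round(record_duration_s, 3))
```

Under these values each record covers roughly 0.68 s of vibration data.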
3 Feature extraction
Statistical analysis of vibration signals yields different
reported (James Li & Wu, 1989) use these in combinations
to elicit information regarding bearing faults. Such procedures use allied logic often based on physical considerations. A fairly wide set of these parameters is selected as a basis for our study, as detailed below:
(a) Mean, (b) Standard error, (c) Median, (d) Standard deviation, (e) Sample variance, (f) Kurtosis, (g) Skewness, (h) Range, (i) Minimum, (j) Maximum, (k) Sum.
All the above-mentioned statistical features were extracted from the vibration signals obtained for the various conditions and fed as input to the J48 algorithm for selecting the best features, which classify the different fault conditions.
4 Descriptive statistics
The statistical features are explained below.
4.1 Standard deviation
This is a measure of the effective energy or power content of the vibration signal and clearly indicates deterioration in the bearing condition. The following formula was used for computation of standard deviation:
\[ \text{Standard deviation} = \sqrt{\frac{n\sum x^{2} - \left(\sum x\right)^{2}}{n(n-1)}} \]
Fig. 5. (a) Vibration signal for good pinion wheel under different lubrication and loading conditions. (b) Vibration signal for pinion wheel with teeth breakage under different lubrication and loading conditions. (c) Vibration signal for pinion wheel with crack at root under different lubrication and loading conditions. (d) Vibration signals for pinion wheel with teeth face wear under different lubrication and loading conditions. [Each part contains six panels plotted against sample number: Dry-Unload, Dry-FullLoad, HalfLub-Unload, HalfLub-FullLoad, FullLub-Unload and FullLub-FullLoad.]
4.2 Skewness
Skewness characterizes the degree of asymmetry of a distribution around its mean. The expression below was used to calculate the skewness, where ‘n’ is the sample size and ‘s’ is the sample standard deviation:
\[ \text{Skewness} = \frac{n}{(n-1)(n-2)} \sum \left(\frac{x-\bar{x}}{s}\right)^{3} \]
4.3 Kurtosis
Kurtosis indicates the flatness or the spikiness of the signal. Its value is very low for a good bevel gearbox and high for a faulty gearbox due to the spiky nature of the signal:
\[ \text{Kurtosis} = \left\{ \frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum \left(\frac{x-\bar{x}}{s}\right)^{4} \right\} - \frac{3(n-1)^{2}}{(n-2)(n-3)} \]
where ‘s’ is the sample standard deviation.
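As a worked illustration of Sections 4.2 and 4.3, the sketch below computes skewness and kurtosis with the bias-corrected sample estimators whose denominators, (n − 1)(n − 2) and (n − 1)(n − 2)(n − 3), survive in the printed expressions. That these are exactly the estimators used in the study is an assumption, and the function names and the synthetic test signals are hypothetical.

```python
import math


def _mean_and_std(x):
    """Sample mean and sample standard deviation (n - 1 in the denominator)."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    return mean, std


def sample_skewness(x):
    """n / ((n-1)(n-2)) * sum(((x - mean) / s)^3); needs n >= 3."""
    n = len(x)
    mean, s = _mean_and_std(x)
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / s) ** 3 for v in x)


def sample_kurtosis(x):
    """Bias-corrected excess kurtosis; needs n >= 4."""
    n = len(x)
    mean, s = _mean_and_std(x)
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return lead * sum(((v - mean) / s) ** 4 for v in x) - correction


# Synthetic signals (not measured data): a smooth sine and the same sine with impulses.
smooth = [math.sin(2 * math.pi * k / 50) for k in range(500)]
spiky = [v + (5.0 if k % 100 == 0 else 0.0) for k, v in enumerate(smooth)]

print(sample_kurtosis(smooth), sample_kurtosis(spiky))
```

The impulsive signal yields the larger kurtosis, which is the behaviour the text attributes to faulty gearboxes.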
4.4 Standard error
Standard error is a measure of the amount of error in the prediction of y for an individual x in the regression, where \(\bar{x}\) and \(\bar{y}\) are the sample means and ‘n’ is the sample size.
\[ \text{Standard error of the predicted } y = \sqrt{\frac{1}{n-2}\left[\sum (y-\bar{y})^{2} - \frac{\left[\sum (x-\bar{x})(y-\bar{y})\right]^{2}}{\sum (x-\bar{x})^{2}}\right]} \]
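The regression-based standard error of Section 4.4 can be sketched as follows. The text does not say which quantities play the roles of x and y for a vibration record, so the function simply takes two equal-length sequences; the name `standard_error_of_prediction` and the small numeric inputs are hypothetical.

```python
import math


def standard_error_of_prediction(x, y):
    """sqrt( (Syy - Sxy^2 / Sxx) / (n - 2) ) for paired samples x and y."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    residual = syy - sxy ** 2 / sxx
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(residual, 0.0) / (n - 2))


print(standard_error_of_prediction([1, 2, 3], [1, 3, 2]))
```

A perfectly linear relationship between x and y gives a standard error of zero, since the regression line then predicts every y exactly.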
4.5 Sample variance
It is the variance of the signal points, and the following formula was used for computation of sample variance:
\[ \text{Sample variance} = \frac{n\sum x^{2} - \left(\sum x\right)^{2}}{n(n-1)} \]
4.6 Range
It refers to the difference between the maximum and minimum signal point values for a given signal.
4.7 Minimum value
It refers to the minimum signal point value in a given signal. As the gear parts degrade (crack, breakage, face wear), the vibration levels tend to increase. Therefore, it can be used to detect faulty gears.
4.8 Maximum value
It refers to the maximum signal point value in a given signal.
4.9 Sum
It is the sum of all signal point values in a given signal.
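The remaining descriptive features (standard deviation, sample variance, range, minimum, maximum and sum) are simple enough to compute in one pass. The sketch below uses the computational form of the standard deviation given in Section 4.1; the function name `basic_features` and the dictionary keys are illustrative choices, not names from the original work.

```python
import math


def basic_features(signal):
    """Return the simple descriptive features of one vibration record."""
    n = len(signal)
    total = sum(signal)
    sum_of_squares = sum(v * v for v in signal)
    # Computational form: (n * sum(x^2) - (sum x)^2) / (n * (n - 1)).
    variance = max((n * sum_of_squares - total ** 2) / (n * (n - 1)), 0.0)
    lowest, highest = min(signal), max(signal)
    return {
        "sum": total,
        "minimum": lowest,
        "maximum": highest,
        "range": highest - lowest,
        "sample_variance": variance,
        "standard_deviation": math.sqrt(variance),
    }


features = basic_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(features)
```

For signals with a large mean relative to their spread, the computational form can lose precision, so a two-pass calculation may be preferable in practice.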
5 Using J 48 algorithm in the present work
A standard tree induced with C5.0 (or possibly ID3 or C4.5) consists of a number of branches, one root, a number of nodes and a number of leaves. One branch is a chain of nodes from the root to a leaf, and each node involves one attribute. The occurrence of an attribute in a tree provides information about the importance of the associated
representation methodology used to represent classification rules. The J48 algorithm (a WEKA implementation of the C4.5 algorithm) is widely used to construct decision trees.
The decision tree algorithm has been applied to the problem under discussion. The input to the algorithm is the set of statistical features of the vibration signatures. The top node is the best node for classification. The other features in the nodes of the decision tree appear in descending order of importance. It is to be stressed here that only features that contribute to the classification appear in the decision tree and others do not. Features that have less discriminating capability can be consciously discarded by deciding on a threshold. This concept is used for selecting good features. The algorithm identifies the good features for the purpose of classification from the given training data set, and thus reduces the domain knowledge required to select good features for pattern classification.
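The idea that the most discriminating feature ends up at the top of the tree can be illustrated with a simplified ranking by information gain on a single threshold split. This is only a sketch: C4.5/J48 actually uses the gain ratio, handles multi-level trees and applies pruning, none of which is reproduced here, and the feature values and labels below are invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def best_split(values, labels):
    """Return (information_gain, threshold) of the best 'value <= threshold' split."""
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    best_gain, best_threshold = 0.0, None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if base - remainder > best_gain:
            best_gain, best_threshold = base - remainder, threshold
    return best_gain, best_threshold


def rank_features(table, labels):
    """Sort feature names by the gain of their best single split, best first."""
    scored = {name: best_split(column, labels) for name, column in table.items()}
    return sorted(scored.items(), key=lambda item: item[1][0], reverse=True)


# Invented toy data: four records, two classes.
labels = ["GTC", "GTC", "GOOD", "GOOD"]
table = {
    "sum": [1.0, 3.0, 2.0, 4.0],
    "standard_error": [0.00010, 0.00012, 0.00030, 0.00035],
}
ranking = rank_features(table, labels)
print(ranking)
```

In this toy case the standard error separates the two classes perfectly, so it ranks first, while the sum carries little class information.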
conditions of different faults compared with good conditions of the pinion gear wheel.
Based on the decision tree produced by the J48 algorithm, various statistical parameters are selected for the various conditions of the gearbox. The values appearing between the nodes in the decision tree are used for generating the fuzzy rules to classify the various conditions of the gearbox under study.
5.1 Application of decision tree for feature selection
The algorithm has been applied to the problem under discussion for feature selection. The input to the algorithm is the set of statistical features extracted from raw vibration signatures; the output is the decision tree. The top node is the best node for classification. The other features appear in the nodes of the decision tree in descending order of importance. It is to be stressed here that only features that contribute to the classification appear in the decision tree and others do not. The level of contribution is not the same, and all statistical features are not equally important. The level of contribution of an individual feature is given by a statistical measure within the parentheses in the decision tree. The first number in the parentheses indicates the number of data points that can be classified using that feature set. The second number indicates the number of samples that contradict this classification. If the first number is very small compared to the total number of samples, then the corresponding features can be considered as outliers and hence ignored. Features that have less discriminating capability can be consciously discarded by deciding on a threshold. This concept is used in selecting good features. The algorithm identifies the good features for the purpose of classification from the given training data set and thus reduces the domain knowledge required to select good features for the pattern classification problem.
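The two numbers in parentheses can be read programmatically from the text form of a J48 tree, where a leaf is printed as `label (covered)` or `label (covered/errors)`. The sketch below parses such leaves and flags those that cover only a small fraction of the data. The tree text is a made-up example (only the 0.000175 threshold appears in the paper), and the 5% cut-off is an arbitrary assumption.

```python
import re

# A leaf line ends with ": LABEL (covered)" or ": LABEL (covered/errors)".
LEAF_PATTERN = re.compile(
    r":\s*(?P<label>\S+)\s*\((?P<covered>[\d.]+)(?:/(?P<errors>[\d.]+))?\)\s*$"
)

# Hypothetical tree text in J48's printed layout; thresholds and counts are invented.
SAMPLE_TREE = """\
stderr <= 0.000175: GTC (150.0)
stderr > 0.000175
|   kurtosis <= 3.1: GOOD (147.0/1.0)
|   kurtosis > 3.1
|   |   variance <= 0.004: GTB (150.0/2.0)
|   |   variance > 0.004
|   |   |   range <= 0.9: TFW (150.0)
|   |   |   range > 0.9: GOOD (3.0)
"""


def parse_leaves(tree_text):
    """Return (label, covered, errors) for every leaf line in the tree text."""
    leaves = []
    for line in tree_text.splitlines():
        match = LEAF_PATTERN.search(line)
        if match:
            errors = float(match["errors"]) if match["errors"] else 0.0
            leaves.append((match["label"], float(match["covered"]), errors))
    return leaves


def weak_leaves(leaves, min_fraction=0.05):
    """Leaves covering less than min_fraction of all covered samples."""
    total = sum(covered for _, covered, _ in leaves)
    return [leaf for leaf in leaves if leaf[1] / total < min_fraction]


leaves = parse_leaves(SAMPLE_TREE)
print(weak_leaves(leaves))
```

Here the last leaf covers 3 of 600 samples, so the `range` test that produces it would be a candidate for being ignored.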
6 Methodology adopted for fuzzy classification
7 Fuzzy logic (classifier)
Fuzzy logic provides a precise approach for dealing with uncertainty. Fuzzy inference is a method that interprets the values in the input vector and, based on some set of rules, assigns values to the output vector. The point of fuzzy logic is to map an input space to an output space, and the primary mechanism for doing this is a list of ‘if-then’ statements called rules. Rules are the inputs for building a fuzzy inference engine. The methodology adopted for fuzzy classification is shown in Fig. 6. All rules are evaluated in parallel, and the order of the rules is unimportant. Real-world data do not have sharply defined boundaries, and information is often incomplete or sometimes unreliable. In the quest for precision, scientists have generally attempted to manipulate the real world into artificial mathematical models that make no provision for gradation. Because fuzzy logic provides the tools to classify information into broad, coarse categorizations or groupings, it has many possibilities for application, which have often proven to be cheaper, simpler and more effective than other systems.
For the problem at hand, the condition of the gearbox, good or faulty, is basically fuzzy in nature. Faults do not occur in the gearbox instantly; they develop gradually. In that case, there is no threshold value (crisp data) based on which a decision on the condition of the gearbox can be taken (whether the gearbox is now good or faulty). Problems of this kind can be modeled using fuzzy logic.
8 Membership function
A membership function (MF) is a curve that defines how each point in the input space is mapped to a membership value (or degree of membership) between 0 and 1. The membership functions for the corresponding features are defined by observing the feature values on which the branches of the decision tree are created for the different conditions of the gearbox. There are four possible outcomes from a fuzzy classifier, namely: good bevel gear, bevel gear with tooth breakage (GTB), bevel gear with crack at root of the tooth (GTC), and bevel gear with face wear of the teeth (TFW), for various loading and lubrication conditions. Hence, four membership functions are defined with equal range for the output.
9 Rule generation from decision tree
Artificial neural networks and support vector machines are used to generate rules for classification problems
xxxx). In this study, a decision tree is used for that purpose. The decision tree shows the relation between the features and the condition of the gearbox. Tracing a branch from the root node leads to a condition of the gearbox (Refer
information available in a branch in the form of an ‘if-then’ statement gives the rules for classification using fuzzy for
Fig. 6. Methodology of classification using fuzzy. [Labels: Dry/Half/Full; GOOD/GTC/GTB/TFW.]
Fig. 7. Decision tree from J48 algorithm for dry-lub no-load condition.
various conditions of the gearbox. Hence the usefulness of the decision tree in forming the rules for fuzzy classification is established.
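Tracing each branch from the root to a leaf, as described above, can be sketched as a small recursive walk that emits one ‘if-then’ rule per leaf. The nested-tuple tree layout is an illustrative choice; only the 0.000175 threshold on the standard error is taken from the paper, while the kurtosis and variance thresholds and the direction of those two tests are placeholders.

```python
def tree_to_rules(node, conditions=()):
    """Return a list of (conditions, outcome) pairs, one per root-to-leaf path.

    A leaf is a string (the gearbox condition); an internal node is a tuple
    (feature, threshold, low_branch, high_branch).
    """
    if isinstance(node, str):
        return [(list(conditions), node)]
    feature, threshold, low_branch, high_branch = node
    rules = tree_to_rules(low_branch, conditions + (f"{feature} <= {threshold}",))
    rules += tree_to_rules(high_branch, conditions + (f"{feature} > {threshold}",))
    return rules


# Hypothetical tree: 0.000175 is from the text, 3.0 and 0.01 are placeholders.
TREE = (
    "stderr", 0.000175,
    "GTC",
    ("kurtosis", 3.0,
     ("variance", 0.01, "TFW", "GTB"),
     "GOOD"),
)

rules = tree_to_rules(TREE)
for conditions, outcome in rules:
    print("IF " + " AND ".join(conditions) + f" THEN condition is {outcome}")
```

Each printed line corresponds to one branch of the tree, which is the form in which the rules are later entered into the fuzzy inference engine.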
10 Generation of rules for various gearbox conditions and
discussions
The preceding section describes how the classification has been carried out using the fuzzy technique.
10.1 Dry-lubrication and no-load condition
variance play a decisive role in classifying the various gearbox faults under the dry-lubrication and no-load condition. This output of the decision tree is used to design the membership functions for the fuzzy classifier, as shown in Figs. 8–10.
A membership function (MF) is a curve that defines how each point in the input space is mapped to a membership
Fig. 8. Membership function for "standard error".
Fig. 9. Membership function for "kurtosis".
Fig. 10. Membership function for "variance".
value (or degree of membership) between 0 and 1. In the present study, a trapezoidal membership function is used. The selection of this membership function is to some extent arbitrary. However, the following points were considered while selecting the membership function. The decision tree for the selected three features is shown in Fig. 7. By observing the feature values on which the branches of the decision tree are created, the membership functions are defined for standard error, kurtosis and variance, respectively.
10.1.1 Rules designed for the dry-lubrication and no-load condition
1. If (stderr is not stderr) then (Output1 is GTC)
2. If (stderr is stderr) and (kurtosis is Kur) then (Output1 is GOOD)
3. If (stderr is stderr) and (kurtosis is not Kur) and (variance is Var) then (Output1 is GTB)
4. If (stderr is stderr) and (kurtosis is not Kur) and (variance is not Var) then (Output1 is TFW)
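The four rules above can be evaluated with the usual fuzzy operators, taking AND as the minimum and NOT as the complement. The sketch below is a simplification: it reports the rule with the largest firing strength instead of performing the aggregation and defuzzification that a full inference engine carries out. Only the 0.000175 breakpoint comes from the paper; every other breakpoint is a placeholder.

```python
def ramp(value, start, end):
    """Membership that is 0 up to start, rises linearly, and is 1 from end on."""
    if value <= start:
        return 0.0
    if value >= end:
        return 1.0
    return (value - start) / (end - start)


def classify(stderr, kurtosis, variance):
    """Return (best_condition, firing_strengths) for the four rules."""
    m_stderr = ramp(stderr, 0.000175, 0.000300)  # 0.000175 from the text; 0.0003 assumed
    m_kur = ramp(kurtosis, 3.0, 6.0)             # placeholder breakpoints
    m_var = ramp(variance, 0.010, 0.020)         # placeholder breakpoints
    strengths = {
        "GTC": 1.0 - m_stderr,                            # rule 1
        "GOOD": min(m_stderr, m_kur),                     # rule 2
        "GTB": min(m_stderr, 1.0 - m_kur, m_var),         # rule 3
        "TFW": min(m_stderr, 1.0 - m_kur, 1.0 - m_var),   # rule 4
    }
    return max(strengths, key=strengths.get), strengths


condition, strengths = classify(stderr=0.0005, kurtosis=10.0, variance=0.005)
print(condition, strengths)
```

With these placeholder breakpoints, an input of standard error 0.0005, kurtosis 10 and variance 0.005 fires only the second rule, so the sketch reports GOOD.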
The condition is GTC when the standard error value is less than or equal to 0.000175 (from Fig. 7), which is the threshold value. Hence, up to this threshold value the membership function generates the value ‘0’, and afterwards it increases linearly (assumption). The trapezoidal membership function suits this behaviour and hence it was selected to map each point in the input space to a membership value. To review, the threshold values are given by the decision tree and the slope is defined by the user through heuristics. The threshold value (0.000175) is defined based on the representative training dataset. If the standard error value is less than or equal to 0.000175, the membership function, which is defined on a 0–1 scale, gives a value of 0, which means that the input does not belong to the fuzzy set ‘stderr’. If the standard error value is greater than 0.000175, the membership function rises linearly to a value of 1. The membership functions for the other features are designed similarly and are shown in Figs. 9 and 10.
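A trapezoidal membership function of this kind can be written in a few lines. In the sketch below the threshold 0.000175 is the value reported from the decision tree, while `RAMP_WIDTH`, which sets the slope, is a hypothetical stand-in for the heuristic choice mentioned in the text; the upper shoulder is left open by using infinity for the last two corner points.

```python
import math


def trapmf(x, a, b, c, d):
    """Trapezoid: 0 below a, rising on [a, b], 1 on [b, c], falling on [c, d], 0 above d."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


THRESHOLD = 0.000175    # threshold reported from the decision tree
RAMP_WIDTH = 0.000125   # hypothetical slope width chosen for illustration


def stderr_membership(value):
    """Degree to which a standard error value belongs to the fuzzy set 'stderr'."""
    return trapmf(value, THRESHOLD, THRESHOLD + RAMP_WIDTH, math.inf, math.inf)


for value in (0.0001, 0.000175, 0.0002, 0.0005):
    print(value, stderr_membership(value))
```

A narrower ramp makes the classifier behave more like a crisp threshold, whereas a wider ramp lets values just above 0.000175 contribute only partially to the rules that require ‘stderr’.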
There are four possible outcomes from a fuzzy classifier, namely: Good, GTC, GTB and TFW. Hence, four membership functions are defined with equal range and shown in
10.1.2 Fuzzy inference engine
After defining the membership functions and generating the ‘if-then’ rules, the next step is to build the fuzzy inference engine. The fuzzy toolbox available in MATLAB 7 was used for building the fuzzy inference engine. Each rule was taken one at a time and entered using the membership functions and fuzzy operators. The rules were obtained from a training data set (150 trials in each condition). For testing the built model, a portion of the data (100 trials in each condition), called the testing data, was kept aside. Using the testing data, the fuzzy inference engine was evaluated and its performance was presented as a confusion matrix in
table (3) show the number of correctly classified instances. In the first row, the first element shows the number of data points belonging to the ‘good’ class and classified by fuzzy logic as ‘good’. The second element shows the number of data points belonging to the ‘GTC’ class and classified by fuzzy logic as ‘GTC’. The third element shows the number of data points belonging to the ‘GTB’ class and classified by fuzzy logic as ‘GTB’. The fourth element shows the number of data points belonging to the ‘TFW’ class and classified by fuzzy logic as ‘TFW’. Table 3 illustrates the effectiveness of the fuzzy rules designed by the authors with the aid of the decision trees.
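How a confusion matrix of this kind is assembled from test results can be sketched as follows. The class order and the short lists of actual and predicted labels are invented for illustration and do not reproduce the values of Table 3.

```python
CLASSES = ["GOOD", "GTC", "GTB", "TFW"]


def confusion_matrix(actual, predicted, classes=CLASSES):
    """Rows are actual classes, columns are predicted classes."""
    index = {name: position for position, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true_label, predicted_label in zip(actual, predicted):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Invented test outcome, not data from the study.
actual = ["GOOD", "GOOD", "GTC", "GTB", "TFW", "TFW"]
predicted = ["GOOD", "GTC", "GTC", "GTB", "TFW", "GTB"]

matrix = confusion_matrix(actual, predicted)
for name, row in zip(CLASSES, matrix):
    print(f"{name:>4}", row)
print("accuracy:", accuracy(matrix))
```

The diagonal entries count the correctly classified records of each class, and the off-diagonal entries show which conditions are confused with one another.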
Here each row corresponds to each rule as discussed in
Fig 11 Membership functions for condition (output).
Table 3
section 6.4.2. The first three blocks in each row represent the membership functions of standard error, kurtosis and variance, respectively. The fourth block corresponds to the
help of sample inputs for standard error, kurtosis and variance, the rules are tested as follows: a sample input of standard error 0.0005, kurtosis 10 and variance 0.005 satisfies the second rule completely, and the corresponding output condition is GOOD, as shown in the output block of the second row of the rule viewer in Fig. 12.
10.1.3 Confusion matrix
In line with the above discussion, the fuzzy rules, membership functions, confusion matrices and rule viewers are shown in Sections 10.2–10.5 and 10.6.
10.2 Dry-lubrication and full-load condition
10.2.1 Rules designed for the dry-lubrication and full-load condition
1. If (stderr is stderr) then (output1 is GTC)
2. If (stderr is not stderr) and (variance is var3) then (output1 is GTB)
3. If (kurtosis is kur) and (variance is var2) then (output1 is GOOD)
4. If (kurtosis is not kur) and (variance is var1) then (output1 is TFW)
Here there are three membership functions to represent the three threshold values of variance in the decision tree
stderr = 0.0005, kurtosis = 10, variance = 0.005, then the output is 6.25, i.e., the condition is GTB.
10.2.2 Confusion matrix
Fig 12 Rule viewer for one of the test data.
Fig 13 Decision tree from J 48 Algorithm for dry-lub full-load condition.
10.3 Half-lubrication and no-load condition
10.3.1 Rules designed for half-lubrication and no-load condition
1. If (stderr is stderr1) then (output1 is GTC)
2. If (stderr is not stderr2) then (output1 is GOOD)
3. If (stderr is stderr2) and (kurtosis is kur) then (output1 is GTB)
4. If (stderr is stderr2) and (kurtosis is not kur) then (output1 is TFW)
10.3.2 Confusion matrix
10.4 Half-lubrication and full-load condition
10.4.1 Rules designed for half-lubrication and full-load condition
Fig. 14. Membership function for "stderr".
Fig. 15. Membership function for "kurtosis".
Fig. 16. Membership functions for "variance".