A comparative study on classification of features by SVM and PSVM extracted using Morlet wavelet for fault diagnosis of spur bevel gear box
Department of Mechanical Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, Tamil Nadu, India
Abstract
The condition of an inaccessible gear in an operating machine can be monitored using the vibration signal of the machine, measured at some convenient location and further processed to unravel the significance of these signals. This paper deals with the effectiveness of wavelet-based features for fault diagnosis using support vector machines (SVM) and proximal support vector machines (PSVM). The statistical feature vectors from Morlet wavelet coefficients are classified using the J48 algorithm, and the predominant features were fed as input for training and testing SVM and PSVM. Their relative efficiency in classifying the faults in the bevel gear box was compared.
© 2007 Elsevier Ltd. All rights reserved.
Keywords: Support vector machine; Proximal support vector machines; Bevel gear box; Morlet wavelet; Statistical features; Fault detection
1 Introduction
Fault diagnosis is an important process in preventive maintenance of a gear box, as it avoids serious damage if defects occur in one of the gears during operation. Early detection of defects is therefore crucial to prevent the system from malfunctioning in a way that could cause damage or halt the entire system. Diagnosing a gear system by examining vibration signals is the most commonly used method for detecting gear failures. In the recent past, fault diagnosis of critical components using machine learning algorithms like SVM and PSVM has been reported (Sugumaran, Muralidharan, & Ramachandran, 2006). The conventional methods for processing measured data include the frequency domain technique, the time domain technique, and the time-frequency domain technique. These methods have been widely employed to detect gear failures.
The use of vibration analysis for gear fault diagnosis and
monitoring has been widely investigated and its application
Gadd & Mitchell, 1984; Leblanc, Dube, & Devereux,
where the helicopter engine, drive trains and rotor systems are fitted with vibration sensors for component health monitoring
Support vector machine (SVM) is used in many applications of machine learning because of its high accuracy and good generalization capabilities. SVM is based on statistical learning theory. SVM is reported to classify better than an artificial neural network (ANN) because of the risk minimization principle it uses. In ANN, traditional empirical risk minimization (ERM) is used on the training data set to minimize the error, whereas in SVM, structural risk minimization (SRM) is used to minimize an upper bound on the expected risk. SVM is modeled as an optimization problem and involves extensive computation, whereas PSVM is modeled as a system of linear equations, which involves less computation (Burgess, 1998). PSVM gives results very close to SVM. Wavelet transform (WT) has attracted many
the wavelet transform to represent all possible types of transients in vibration signals generated by faults in a
0957-4174/$ - see front matter 2007 Elsevier Ltd All rights reserved.
doi:10.1016/j.eswa.2007.08.026
* Corresponding author. Tel.: +91 4222656422; fax: +91 4222656274.
E-mail address: n_saravanan@ettimadai.amrita.edu (N Saravanan).
www.elsevier.com/locate/eswa Expert Systems with Applications 35 (2008) 1351–1366
gearbox. Petrille, Paya, Esat, and Badi (1995) proposed the neural network to diagnose a simple gear system after the data have been pre-processed by the wavelet transform. Boulahbal, Golnaraghi, and Ismail (1997) used the wavelet transform to analyze the vibration signal from the gear system with pitting on the gear. The raw vibration signal in any mode from a single point on a machine is not a good indicator of the health or condition of a machine. Vibration is a vectorial parameter with three dimensions and requires to be measured at several carefully selected points.
Vibration analysis can be carried out using Fourier
transform techniques like Fourier series expansion (FSE),
Fourier integral transform (FIT) and discrete Fourier
large-scale integration (LSI) and the associated microprocessor technology, fast Fourier transform (FFT) analyzers became cost effective for general applications. The raw signatures acquired through a vibration sensor needed further processing and classification of the data for any meaningful surveillance of the condition of the system being
acquisition and further processing using PSVM
This work deals with extraction of features from the vibration data of a bevel gear box system by Morlet wavelet and classification of gear faults using support vector machine (SVM) and proximal support vector machine (PSVM). The vibration signal from a piezoelectric transducer is captured for the following conditions: good bevel gear, bevel gear with tooth breakage (GTB), bevel gear
Fig 1 Flowchart of fault diagnosis system.
Fig 2 Flow chart for bevel gear box condition diagnosis.
Fig 3 Fault simulator setup.
Fig 4 Inner view of the bevel gear box.
1352 N Saravanan et al / Expert Systems with Applications 35 (2008) 1351–1366
with crack at root of the tooth (GTC), and bevel gear with face wear of the teeth (TFW) for various loading and lubrication conditions.
Wavelet transform is a widely used and well-established time-frequency signal analysis method. It has the local characteristic of the time domain as well as the frequency domain. In the processing of non-stationary signals, it performs better than traditional Fourier analysis. Hence, the wavelet transform has potential application in gear box fault diagnosis, in which features are extracted from the wavelet transform coefficients of the vibration signals. The continuous wavelet transform (CWT) puts the fine partition ability of the wavelet transform to good use and is quite suitable for gear box fault diagnosis. In this work, the coefficients of the Morlet wavelet were used for feature extraction. Although different families of wavelets are available, the Morlet wavelet has been used most commonly in the literature for the analysis of vibration signals from rotating machinery. This is because the Morlet wavelet is able to pick up impulses generated by rotating elements. A group of statistical features widely used in fault diagnostics, such as kurtosis, standard deviation and maximum value, is extracted from the wavelet coefficients of the time domain signals. Selection of good features is an important phase in pattern recognition and requires detailed domain knowledge. The Decision Tree using the J48 algorithm was used for identifying the best features from a given set of samples. The selected features were fed as input to SVM for classification.
1.1 Different phases of present work
The signals obtained are processed further for machine
2 Experimental studies
speed DC motor (0.5 hp) with speed up to 3000 rpm is the basic drive. A short shaft of 30 mm diameter is attached to the shaft of the motor through a flexible coupling; this is to
Table 1
Details of faults under investigation
G3 Gear with crack at root (GTC) 0.8 × 0.5 × 20
Table 2
Gear wheel and pinion details
Chordal tooth thickness 3.930.150mm 3.920.110mm
Fig 5 (a) View of good pinion wheel; (b) view of pinion wheel with face wear (GFW); (c) view of pinion wheel with tooth breakage (GTB).
minimize effects of misalignment and transmission of vibration from motor.
The shaft is supported at its ends through two roller bearings. From this shaft the motion is transmitted to the bevel gear box by means of a belt drive. The gear
full lubrication level is 110 mm and half lubrication level is 60 mm. SAE 40 oil was used as a lubricant. An electromagnetic spring loaded disc brake was used to load the gear wheel. A torque level of 8 N-m was applied at the full load condition. The various defects are created in the pinion wheels and the mating gear wheel is not disturbed. With the sensor mounted on top of the gear box, vibration signals are obtained for various conditions. The selected area is made flat and smooth to ensure effective coupling. A piezoelectric accelerometer (Dytran model)
[Fig 6 panels (a) and (b): vibration signal versus Sample No. (0 to 8000) for the good pinion wheel and the pinion wheel with tooth breakage (GTB), each under Dry, HalfLub and FullLub lubrication at Unload and FullLoad.]
Fig 6 (a) Vibration signal for good pinion wheel under different lubrication and loading conditions; (b) vibration signal for pinion wheels with teeth breakage under different lubrication and loading conditions; (c) vibration signal for pinion wheel with crack at root under different lubrication and loading conditions; (d) vibration signals for pinion wheel with teeth face wear under different lubrication and loading conditions.
is mounted on the flat surface using direct adhesive mounting technique. The accelerometer is connected to the signal-conditioning unit (DACTRAN FFT analyzer), where the signal goes through the charge amplifier and an analogue-to-digital converter (ADC). The vibration signal in digital form is fed to the computer through a USB port. The software RT Pro-series that accompanies the signal conditioning unit is used for recording the signals directly in the computer’s secondary memory. The signal is then read from the memory, replayed and processed to extract different features.
2.1 Experimental procedure
In the present study, four pinion wheels whose details
wheel and was assumed to be free from defects. In the other three pinion wheels, defects were created using EDM in
[Fig 6 panels (c) and (d): vibration signal versus Sample No. (0 to 8000) for the pinion wheel with crack at root (GTC) and the pinion wheel with teeth face wear (TFW), each under Dry, HalfLub and FullLub lubrication at Unload and FullLoad.]
Fig 6 (continued)
order to keep the size of the defect under control. The
The size of the defects is a little bigger than one can encounter in the practical situation; however, it is in-line
The vibration signal from the piezoelectric pickup mounted on the test bearing was taken, after allowing initial running of the bearing for some time.
The sampling frequency was 12,000 Hz and the sample length was 8192 for all speeds and all conditions. The sample length was chosen arbitrarily; however, the following points were considered. Statistical measures are more meaningful when the number of samples is large. On the other hand, as the number of samples increases, the computational time increases. To strike a balance, a sample length of around 10,000 was chosen. In some feature extraction techniques, which will be used with the same data, the
8192 and hence, it was taken as sample length. Many trials were taken at the set speed and the vibration signal was stored in the data. The raw vibration signals acquired for various experimental conditions from the gear box using FFT are
3 Feature extraction
After acquiring the vibration signals in the time domain, they are processed to obtain feature vectors. The continuous wavelet transform (CWT) is used for obtaining the wavelet coefficients of the signals. The statistical parameters of the wavelet coefficients are extracted, which constitute the feature vectors.
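As an illustration of turning one row of wavelet coefficients into a feature vector, the following is a minimal sketch using only the Python standard library. The function name `statistical_features`, the dictionary keys and the kurtosis convention (excess kurtosis from population moments) are assumptions made for this example; the paper does not state which estimator was used.

```python
import math

def statistical_features(coeffs):
    """Return a dict of simple statistical features for one coefficient series.

    Assumption: kurtosis is the excess kurtosis m4 / m2**2 - 3 computed from
    population moments; other tools may apply a bias correction.
    """
    n = len(coeffs)
    mean = sum(coeffs) / n
    dev = [c - mean for c in coeffs]
    sample_var = sum(d * d for d in dev) / (n - 1)   # unbiased sample variance
    std = math.sqrt(sample_var)
    m2 = sum(d * d for d in dev) / n                 # second central moment
    m4 = sum(d ** 4 for d in dev) / n                # fourth central moment
    return {
        "mean": mean,
        "standard_deviation": std,
        "sample_variance": sample_var,
        "standard_error": std / math.sqrt(n),
        "kurtosis": m4 / (m2 * m2) - 3.0,
        "minimum": min(coeffs),
        "maximum": max(coeffs),
    }

# Example with a short, made-up coefficient series
features = statistical_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(features["sample_variance"], features["kurtosis"])
```

Each vibration record would yield one such dictionary per wavelet scale, and the values form the feature vector passed on to the classifier.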
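The term wavelet means a small wave. It is the representation of a signal in terms of a finite length or fast decaying waveform known as the mother wavelet. This waveform is scaled and translated to match the input signal. The continuous wavelet transform is defined as

$$\mathrm{CWT}(s,\tau) = \int x(t)\,\psi^{*}_{s,\tau}(t)\,dt$$

where

$$\psi_{s,\tau}(t) = \frac{1}{\sqrt{|s|}}\,\psi\!\left(\frac{t-\tau}{s}\right)$$

is a window function called the mother wavelet, $s$ is a scale and $\tau$ is a translation.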
The term translation is related to the location of the window, as the window is shifted through the signal. This corresponds to the time information in the transform domain. But instead of a frequency parameter, we have a scale. Scaling, as a mathematical operation, either dilates or compresses a signal. Smaller scale corresponds to high frequency signals and large scale corresponds to low frequency signals.
The wavelet series is simply a sampled version of the CWT, and the information it provides is highly redundant as far as the reconstruction of the signal is concerned. This redundancy, on the other hand, requires a significant amount of computation time and resources.
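The relation between scale and frequency can be checked numerically. The sketch below approximates the CWT by a direct sum of the signal against a scaled and translated real Morlet wavelet, assumed here to be exp(-t²/2)·cos(5t). The function names, the truncation of the wavelet at four scale widths and the test signal are illustrative choices, not the authors' Matlab procedure.

```python
import math

def morlet(t):
    """Real-valued Morlet mother wavelet (assumed form: exp(-t^2/2) * cos(5t))."""
    return math.exp(-0.5 * t * t) * math.cos(5.0 * t)

def cwt_morlet(signal, scales, dt=1.0):
    """Approximate CWT(s, tau) = (1/sqrt(s)) * sum_t x(t) * psi((t - tau)/s) * dt.

    Returns one list of coefficients per scale, one coefficient per sample.
    The wavelet is truncated where |t - tau| > 4*s (envelope below about 4e-4).
    """
    n = len(signal)
    coeffs = []
    for s in scales:
        half = int(math.ceil(4.0 * s / dt))
        row = []
        for k in range(n):                       # k indexes the translation tau
            acc = 0.0
            for j in range(max(0, k - half), min(n, k + half + 1)):
                acc += signal[j] * morlet((j - k) * dt / s)
            row.append(acc * dt / math.sqrt(s))
        coeffs.append(row)
    return coeffs

# A tone at 1.25 rad/sample matches scale 4 (5/4 = 1.25) and not scale 16.
tone = [math.cos(1.25 * k) for k in range(200)]
small_scale, large_scale = cwt_morlet(tone, [4.0, 16.0])
print(sum(v * v for v in small_scale) > sum(v * v for v in large_scale))
```

The high-frequency tone concentrates its energy at the smaller scale, in line with the statement that small scales correspond to high frequencies.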
3.1 Wavelet-based feature extraction
The multilevel 1D wavelet decomposition function available in Matlab is chosen with the Morlet wavelets specified. It returns the wavelet coefficients of signal X at
Morlet wavelet
Sixty-four scales are initially chosen to extract the Morlet wavelet coefficients of the signal data. The efficiency of sixty-four scales of Morlet wavelets was obtained using WEKA data mining software and the coefficients of highest
Fig 7 Morlet wavelet ( Collacott ).
[Chart: efficiency of Morlet coefficients (%, axis range 91 to 97) versus Morlet wavelet scale (0 to 64).]
Fig 8 % Efficiency of Morlet wavelet coefficients.
Fig 9 (a) Good-Dry-No Load Vs GTB, GTC, TFW-Dry-No Load; (b) Good-Dry-Full Load Vs GTB, GTC, TFW-Dry-Full Load; (c) Good-Half Lub-No Load Vs GTB, GTC, TFW-Half Lub-No Load; (d) Good-Half Lub-Full Load Vs GTB, GTC, TFW-Half Lub-Full Load; (e) Good-Full Lub-No Load Vs GTB, GTC, TFW-Full Lub-No Load; (f) Good-Full Lub-Full Load Vs GTB, GTC, TFW-Full Lub-Full Load.
level are considered for classification. Since the eighth level gave maximum efficiency of 96.5%, the statistical features corresponding to it were given as input for J48 algorithm to determine the predominant features to be given as an
the efficiencies of all scales
4 Using J48 algorithm in the present work
A standard tree induced with C5.0 (or possibly ID3 or C4.5) consists of a number of branches, one root, a number of nodes and a number of leaves. One branch is a chain of nodes from root to a leaf, and each node involves one attribute. The occurrence of an attribute in a tree provides the information about the importance of the associated
(2002). A Decision Tree is a tree based knowledge representation methodology used to represent classification rules. The J48 algorithm (a WEKA implementation of the C4.5 algorithm) is widely used to construct Decision Trees.
The Decision Tree algorithm has been applied to the problem under discussion. Input to the algorithm is the set of statistical features of the eighth scale Morlet coefficients.
It is clear that the top node is the best node for classification. The other features in the nodes of the Decision Tree appear in descending order of importance. It is to be stressed here that only features that contribute to the classification appear in the Decision Tree and others do not. Features which have less discriminating capability can be consciously discarded by deciding on the threshold. This concept is used for selecting good features. The algorithm identifies the good features for the purpose of classification from the given training data set, and thus reduces the domain knowledge required to select good features for a pattern classification problem. The decision trees shown in Fig 9 are for various lubrication and loading conditions of different faults compared with good conditions of the pinion gear wheel.
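The idea that the most discriminating feature ends up at the top of the tree can be sketched by ranking features on the information gain of their best binary threshold split. This is a simplified stand-in, not the J48 implementation itself: C4.5 normalizes the gain into a gain ratio and builds the tree recursively. The function names, feature values and labels below are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split_gain(values, labels):
    """Largest information gain over all binary threshold splits of one feature."""
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        best = max(best, gain)
    return best

def rank_features(samples, labels):
    """Sort feature names by best split gain, most discriminating first."""
    gains = {name: best_split_gain([s[name] for s in samples], labels)
             for name in samples[0]}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)

# Made-up feature vectors: kurtosis separates the classes, maximum does not.
samples = [
    {"kurtosis": 0.1, "maximum": 5.0},
    {"kurtosis": 0.2, "maximum": 1.0},
    {"kurtosis": 0.9, "maximum": 4.0},
    {"kurtosis": 1.1, "maximum": 2.0},
]
labels = ["good", "good", "GTB", "GTB"]
print(rank_features(samples, labels))
```

Features whose gain falls below a chosen threshold would be the ones discarded before training the classifier.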
Based on the above trees, it is clear that of all the statistical features, standard error, kurtosis, sample variance and minimum value play a dominant role in feature classification using Morlet coefficients. These four predominant features are fed as an input to SVM for further classification. The scatter plot showing the variation of the statistical
These features were given as input for training and testing
of classifying features using SVM
5 Proximal support vector machine (PSVM)
PSVM is a modified version of the support vector machine (SVM). The SVM is a new generation learning system based on statistical learning theory. SVM belongs to the class of supervised learning algorithms in which the learning machine is given a set of features (or inputs) with the associated labels (or output values). Each of these features can be looked upon as a dimension of a hyper-plane. SVMs construct a hyper-plane that separates the data into two classes (this can be extended to multi-class problems).
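The introduction describes PSVM as a system of linear equations rather than an optimization problem. A minimal sketch of that idea follows, assuming the common linear PSVM formulation that minimizes (ν/2)‖D(Aw − eγ) − e‖² + ½(w'w + γ²), whose solution satisfies (I/ν + E'E)[w; γ] = E'd with E = [A, −e] and d the vector of ±1 labels. This formulation, the parameter `nu`, the function names and the toy data are assumptions for illustration; the section itself does not give the equations.

```python
def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def psvm_train(X, labels, nu=1.0):
    """Fit a linear PSVM; labels must be +1 or -1. Returns (w, gamma)."""
    E = [list(row) + [-1.0] for row in X]          # E = [A, -e]
    k = len(E[0])
    # Left-hand side I/nu + E'E and right-hand side E'd
    M = [[sum(e[i] * e[j] for e in E) + (1.0 / nu if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    rhs = [sum(e[i] * d for e, d in zip(E, labels)) for i in range(k)]
    z = solve_linear(M, rhs)
    return z[:-1], z[-1]

def psvm_predict(w, gamma, x):
    """Classify x by the sign of x'w - gamma."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) - gamma >= 0 else -1

# Toy two-class data (hypothetical feature values)
X = [[2.0, 2.0], [3.0, 3.0], [2.5, 3.0], [-2.0, -2.0], [-3.0, -3.0], [-2.5, -3.0]]
y = [1, 1, 1, -1, -1, -1]
w, gamma = psvm_train(X, y)
print([psvm_predict(w, gamma, x) for x in X])
```

Training reduces to one linear solve whose size is the number of features plus one, which is the reason PSVM is described as computationally lighter than a standard SVM.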
Fig 9 (continued)
While doing so, SVM algorithm tries to achieve maximum
classes with a large margin minimizes a bound on the expected generalization error. By ‘minimum generalization error’, we mean that when a new set of features (that is, data points with unknown class values) arrives for classification, the chance of making an error in the prediction (of the class to which it belongs) based on the learned classifier
[Fig 10 panels (a) and (b): plots versus Sample No. for the Good and GTB pinion wheels, each under Dry, HalfLub and FullLub lubrication at Noload and Fullload.]
Fig 10 (a) Vibration signal for good pinion wheel under different lubrication and loading conditions; (b) vibration signal for pinion wheel with teeth breakage under different lubrication and loading conditions; (c) vibration signal for pinion wheel with crack at root under different lubrication and loading conditions; (d) vibration signals for pinion wheel with teeth face wear under different lubrication and loading conditions.
(hyper-plane) should be minimum. Intuitively, such a classifier is one which achieves maximum separation-margin between the classes. The above process of maximizing separation leads to two hyper-planes parallel to the separating plane, on either side of it. These two can have one or more points on them. The planes are known as ‘bounding planes’ and the distance between them is called the ‘margin’. By SVM ‘learning’, we mean finding a hyper-plane which maximizes the margin. The points lying beyond the bounding planes are called support vectors. As far as data points belonging to A are concerned, P1, P2, P3, P4, and P5 are
vectors. A similar thing holds good for class A+. These points play a crucial role in the theory and hence the name
[Fig 10 panels (c) and (d): plots versus Sample No. for the GTC and TFW pinion wheels, each under Dry, HalfLub and FullLub lubrication at Noload and Fullload.]
Fig 10 (continued)