Fault diagnosis of spur bevel gear box using artificial neural network (ANN), and proximal support vector machine (PSVM)
Department of Mechanical Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, Tamil Nadu, India
1 Introduction
Malfunctions in machinery are often sources of reduced productivity and increased maintenance costs in various industrial applications. For this reason, machine condition monitoring is being pursued to recognize incipient faults. As modern production plants are expected to run continuously for extended hours, unexpected downtime due to rotating machinery failures has become more costly than ever before. The faults arising in rotating machines are often due to damage and failures in the components of the gear box assembly. Fault diagnosis is an important process in preventive maintenance of a gear box, since it avoids serious damage if a defect occurs in one of the gears during operation. Early detection of defects is therefore crucial to prevent malfunctions that could cause damage or halt the entire system. Diagnosing a gear system by examining vibration signals is the most commonly used method for detecting gear failures. The conventional methods for processing measured data include frequency-domain, time-domain, and time-frequency domain techniques. These methods have been widely employed to detect gear failures. The use of vibration analysis for gear fault diagnosis and monitoring has been widely investigated and its application in industry is well established [1–3]. This is particularly reflected in the aviation industry, where helicopter engines, drive trains and rotor systems are fitted with vibration sensors for component health monitoring. These methods have traditionally been applied separately in the time and frequency domains. A time-domain analysis focuses principally on statistical characteristics of the vibration signal such as peak level, standard deviation, skewness, kurtosis, and crest factor. A frequency-domain approach uses Fourier methods to transform the time-domain signal to the frequency domain, where further analysis is carried out, conventionally using vibration amplitude and power spectra. It should be noted that use of either domain implicitly excludes the direct use of information present in the other. A time-frequency based energy distribution method was employed for early detection of gear failure [4]. The frequency domain refers to a display or analysis of the vibration data as a function of frequency. The time-domain vibration signal is typically processed into the frequency domain by applying a Fourier transform, usually in the form of a fast Fourier transform (FFT) algorithm [5].
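The time-domain statistics named above (peak level, standard deviation, skewness, kurtosis and crest factor) can be illustrated with a minimal Python sketch. The function name `time_domain_features`, the use of population (divide-by-n) moments and the synthetic sine record are assumptions made for illustration; they are not taken from the paper.

```python
import math

def time_domain_features(signal):
    """Return common time-domain statistics of a vibration record."""
    n = len(signal)
    mean = sum(signal) / n
    # Central moments in population form (dividing by n): an assumption.
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m3 = sum((x - mean) ** 3 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    return {
        "peak": peak,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,      # non-excess form, 3 for a Gaussian
        "crest_factor": peak / rms,
    }

# Synthetic record (hypothetical): 100 Hz sine sampled at 12 kHz, 10 full periods.
fs = 12000
sine = [math.sin(2 * math.pi * 100 * k / fs) for k in range(1200)]
features = time_domain_features(sine)
```

For a pure sine the crest factor is the square root of two and the kurtosis is 1.5, which gives a quick sanity check before applying such a function to measured signals.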
The works presented in [6–9] found that FFT-based methods are not suitable for non-stationary signal analysis and are not able to reveal the inherent information of non-stationary signals. However, various factors, such as changes in the environment and faults in the machine itself, often make the output signals of a running machine contain non-stationary components. Usually, these non-stationary components contain abundant information about machine faults; therefore, it is important to analyze the non-stationary signals. Most algorithms recently developed for mechanical fault detection are based on the
Applied Soft Computing 10 (2010) 344–360
ARTICLE INFO
Article history:
Received 8 July 2008
Received in revised form 6 April 2009
Accepted 2 August 2009
Available online 8 August 2009
Keywords:
Artificial neural network
Proximal support vector machine
Bevel gear box
Morlet wavelet
Statistical features
Fault detection
ABSTRACT
Vibration signals extracted from the rotating parts of machinery carry a great deal of information about the condition of the operating machine. Further processing of these raw vibration signatures, measured at a convenient location on the machine, reveals the condition of the component or assembly under study. This paper deals with the effectiveness of wavelet-based features for fault diagnosis of a gear box using artificial neural network (ANN) and proximal support vector machines (PSVM). The statistical feature vectors from Morlet wavelet coefficients are classified using the J48 algorithm, and the predominant features are fed as input for training and testing ANN and PSVM. Their relative efficiency in classifying the faults in the bevel gear box is compared.
© 2009 Elsevier B.V. All rights reserved.
* Corresponding author at: Sohar University, Sohar, Oman.
E-mail address: nsaro_2000@yahoo.com (N. Saravanan).
Contents lists available at ScienceDirect
Applied Soft Computing
journal homepage: www.elsevier.com/locate/asoc
1568-4946/$ – see front matter © 2009 Elsevier B.V. All rights reserved.
assumption of stationarity of the vibration signals. Some of these, including cepstrum, time-domain averaging, adaptive noise cancellation, demodulation analysis, etc. [10–12], are well established and have proved to be very effective in machinery diagnostics. However, in many cases these methods are not sufficient to reliably detect different types of faults. There is a need for new techniques which can cope with technological advances in machinery, and which provide satisfactory fault detection sensitivity. A relatively small amount of applied research has been done on the application of time-variant fault detection methods. It is known [13,14] that local faults in gear boxes cause impacts. As a result of this impact excitation, impulses and discontinuities may be observed in the instantaneous characteristics of the envelope and phase functions [14,15]. Due to the nature of these functions, vibration signals can be considered non-stationary [16], and strong non-stationary events can appear in a local time period, e.g. one revolution of a gear in mesh. The analysis of non-stationary signals requires specific techniques which go beyond the classical Fourier approach. There exist many different time-variant methods; some are reviewed in [16–18].
In the recent past, fault diagnosis of critical components using machine learning algorithms like SVM and PSVM has been reported [19]. In ANN, the condition-monitoring problem is treated as a generalization/classification problem based on training patterns from samples of faulty roller bearings [20]. However, traditional ANN approaches have limitations on generalization, giving models that can over-fit the data. Support vector machine (SVM) is used in many applications of machine learning because of its high accuracy and good generalization capabilities. SVM is based on statistical learning theory. SVM classifies better than ANN because of the principle of risk minimization. In artificial neural network (ANN), traditional Empirical Risk Minimization (ERM) is used on the training data set to minimize the error. But in SVM, Structural Risk Minimization (SRM) is used to minimize an upper bound on the expected risk. SVM is modeled as an optimization problem and involves extensive computation, whereas PSVM is modeled as a system of linear equations, which involves less computation [21]. PSVM gives results very close to SVM. One of the more recent mathematical tools adopted for transient signals is the wavelet transform [22,23]. Wavelet transform (WT) has attracted many researchers' attention recently. The wavelet transform was utilized to represent all possible types of transients in vibration signals generated by faults in a gear box [24]. A neural network was used to diagnose a simple gear system after the data had been pre-processed by the wavelet transform [25]. The wavelet transform was used to analyze the vibration signal from a gear system with pitting on the gear [26]. Hence, based on the literature review, there exists wide scope to explore machine learning methods like ANN, SVM and PSVM for fault diagnosis of gear boxes. This paper is one such attempt to apply machine learning methods like ANN and PSVM to wavelet features of the vibration signal of the gear box under investigation.
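The remark that PSVM reduces to a system of linear equations can be made concrete with a small sketch. It assumes the linear PSVM formulation in which, with data matrix A, label vector d (entries +1 or -1) and E = [A, -1], the weights w and offset gamma solve (I/nu + E^T E)[w; gamma] = E^T d, and a point x is classified by the sign of x.w - gamma. The function names, the value of nu and the toy two-feature data are hypothetical and are not taken from the paper.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def train_psvm(points, labels, nu=1.0):
    """Linear PSVM (assumed form): solve (I/nu + E^T E) z = E^T d, E = [A, -1]."""
    rows = [list(p) + [-1.0] for p in points]          # rows of E
    dim = len(rows[0])
    h = [[sum(r[i] * r[j] for r in rows) + (1.0 / nu if i == j else 0.0)
          for j in range(dim)] for i in range(dim)]
    rhs = [sum(r[i] * d for r, d in zip(rows, labels)) for i in range(dim)]
    z = solve_linear_system(h, rhs)
    return z[:-1], z[-1]                               # w, gamma

def predict(w, gamma, point):
    """Classify by the sign of x.w - gamma."""
    score = sum(wi * xi for wi, xi in zip(w, point)) - gamma
    return 1 if score >= 0 else -1

# Toy data (hypothetical): two well-separated clusters of feature pairs.
points = [(3, 3), (4, 3), (3, 4), (4, 4), (0, 0), (1, 0), (0, 1), (1, 1)]
labels = [1, 1, 1, 1, -1, -1, -1, -1]
w, gamma = train_psvm(points, labels, nu=1.0)
predictions = [predict(w, gamma, p) for p in points]
```

Training here costs one solve of a system whose size is the number of features plus one, which is the computational saving the text attributes to PSVM relative to a standard SVM.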
This work deals with extraction of wavelet features from the vibration data of a bevel gear box system and classification of gear faults using artificial neural network (ANN) and proximal support vector machine (PSVM). The vibration signal from a piezoelectric transducer is captured for the following conditions: Good Bevel Gear, Bevel Gear with tooth breakage (GTB), Bevel Gear with crack at root of the tooth (GTC), and Bevel Gear with face wear of the teeth (TFW), for various loading and lubrication conditions of the gear box.
A group of statistical features such as kurtosis, standard deviation and maximum value, which are widely used in fault diagnostics, are extracted from the wavelet coefficients of the time-domain signals. Selection of good features is an important phase in pattern recognition and requires detailed domain knowledge. The Decision Tree using the J48 algorithm was used for identifying the best features from a given set of samples. The selected features were fed as input to ANN and PSVM for classification.
1.1 Different phases of present work
The signals obtained are processed further for machine condition diagnosis as explained in the flow chart in Fig. 1.
2 Experimental studies
The fault simulator with sensor is shown in Fig. 2 and the inner view of the bevel gear box is shown in Fig. 3. A variable speed DC motor (0.5 hp) with speed up to 3000 rpm is the basic drive. A short shaft of 30 mm diameter is attached to the shaft of the motor through a flexible coupling; this is to minimize effects of misalignment and transmission of vibration from the motor.
The shaft is supported at its ends through two roller bearings. From this shaft the motion is transmitted to the bevel gear box by means of a belt drive. The gear box is of dimension
Fig. 1. Flow chart for bevel gear box condition diagnosis.
Fig. 2. Fault simulator setup.
150 mm × 170 mm × 120 mm; the full lubrication level is 110 mm and the half lubrication level is 60 mm. SAE 40 oil was used as the lubricant. An electromagnetic spring-loaded disc brake was used to load the gear wheel. A torque level of 8 N m was applied at the full load condition. The various defects are created in the pinion wheels and the mating gear wheel is not disturbed. With the sensor mounted on top of the gear box, vibration signals are obtained for various conditions. The selected area is made flat and smooth to ensure effective coupling. A piezoelectric accelerometer (Dytran model) is mounted on the flat surface using the direct adhesive mounting technique. The accelerometer is connected to the signal-conditioning unit (DACTRAN FFT analyzer), where the signal goes through the charge amplifier and an Analogue-to-Digital Converter (ADC). The vibration signal in digital form is fed to the computer through a USB port. The software RT Pro-series that accompanies the signal-conditioning unit is used for recording the signals directly in the computer's secondary memory. The signal is then read from the memory, replayed and processed to extract different features.
2.1 Experimental procedure
In the present study, four pinion wheels, whose details are given in Table 1, were used. One was a new wheel and was assumed to be free from defects. In the other three pinion wheels, defects were created using EDM in order to keep the size of the defect under control. The details of the various defects are depicted
The size of the defects is a little bigger than one would encounter in a practical situation; however, it is in line with work reported in the literature [27]. The vibration signal from the piezoelectric pickup mounted on the gear box was taken after allowing initial running of the gear box for some time.
The sampling frequency was 12,000 Hz and the sample length was 8192 for all conditions. The sample length was chosen arbitrarily; however, the following points were considered. Statistical measures are more meaningful when the number of samples is large. On the other hand, as the number of samples increases, the computational time increases. To strike a balance, a sample length of around 10,000 was chosen. In some feature extraction techniques, which will be used with the same data, as per the Nyquist criteria the number of samples is to be $2^n$. The nearest $2^n$ to 10,000 is 8192 and hence it was taken as the sample length. Many trials were taken at the set speed and the vibration signal was stored in the data. The raw
Fig 3 Inner view of the bevel gear box.
Table 1 Details of faults under investigation.
G3 Gear with crack at root (GTC) 0.8 0.5 20
Table 2 Gear wheel and Pinion details.
Chordal tooth thickness 3.93 0.150
mm
N Saravanan et al / Applied Soft Computing 10 (2010) 344–360 346
vibration signals acquired for various experimental conditions from the gear box using FFT are shown in Fig. 5(a)–(d).
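The acquisition settings described in this section (12,000 Hz sampling and a sample length taken as the power of two nearest to 10,000) can be checked with a short sketch. The helper name `nearest_power_of_two` and the derived quantities (record duration and FFT frequency resolution) are illustrative additions, not values reported in the paper.

```python
import math

def nearest_power_of_two(target):
    """Return the power of two closest to target (ties go to the lower one)."""
    lower = 2 ** math.floor(math.log2(target))
    upper = 2 * lower
    return lower if target - lower <= upper - target else upper

fs_hz = 12_000                                   # sampling frequency from the text
sample_length = nearest_power_of_two(10_000)     # power of two nearest to 10,000
record_duration_s = sample_length / fs_hz        # time spanned by one record
frequency_resolution_hz = fs_hz / sample_length  # FFT bin spacing for that record
```

With these settings one record spans roughly 0.68 s and the FFT bin spacing is roughly 1.46 Hz.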
3 Feature extraction
After the vibration signals are acquired in the time domain, they are processed to obtain feature vectors. The Continuous Wavelet Transform (CWT) is used for obtaining the wavelet coefficients of the signals. The statistical parameters of the wavelet coefficients are extracted, which constitute the feature vectors.
The term wavelet means a small wave. It is the representation of a signal in terms of a finite-length or fast-decaying waveform known as the mother wavelet. This waveform is scaled and translated to match the input signal.
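The Continuous Wavelet Transform [28] is defined as
\[ W_s(\tau) = \int_{-\infty}^{+\infty} f(t)\,\psi_{s,\tau}(t)\,dt, \qquad \text{where} \quad \psi_{s,\tau}(t) = \frac{1}{\sqrt{|s|}}\,\psi\!\left(\frac{t-\tau}{s}\right) \]
Here $\psi$ is a window function called the mother wavelet, $s$ is a scale and $\tau$ is a translation.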
The term translation is related to the location of the window, as the window is shifted through the signal. This corresponds to the time information in the transform domain. But instead of a frequency parameter, we have a scale. Scaling, as a mathematical operation, either dilates or compresses a signal. Smaller scales correspond to high-frequency signals and larger scales correspond to low-frequency signals.
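The roles of scale and translation can be illustrated by approximating the CWT integral with a Riemann sum. The sketch below assumes a commonly used real-valued Morlet form, exp(-t^2/2) cos(5t); this section does not give the wavelet expression, so that choice, the function names and the 500 Hz test tone are all hypothetical.

```python
import math

def morlet(t):
    """Real-valued Morlet mother wavelet (assumed form): exp(-t^2/2) * cos(5t)."""
    return math.exp(-t * t / 2.0) * math.cos(5.0 * t)

def cwt_coefficient(signal, dt, scale, shift):
    """Approximate the CWT at one (scale, translation) pair by a Riemann sum."""
    total = 0.0
    for k, value in enumerate(signal):
        total += value * morlet((k * dt - shift) / scale)
    return total * dt / math.sqrt(abs(scale))

# Synthetic test tone (hypothetical): 500 Hz sine sampled at 12 kHz.
fs = 12000.0
dt = 1.0 / fs
tone_hz = 500.0
signal = [math.sin(2 * math.pi * tone_hz * k * dt) for k in range(600)]

# Scale at which the wavelet oscillation (5 rad per unit time) matches the tone.
matched_scale = 5.0 / (2 * math.pi * tone_hz)
mid_shifts = [k * dt for k in range(280, 321)]   # translations away from the edges

matched = max(abs(cwt_coefficient(signal, dt, matched_scale, tau))
              for tau in mid_shifts)
mismatched = max(abs(cwt_coefficient(signal, dt, 4 * matched_scale, tau))
                 for tau in mid_shifts)
```

The coefficient magnitude is large at the small scale that matches the high-frequency tone and nearly vanishes at a scale four times larger, which is the scale-frequency relationship described above.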
Fig 5 (a) Vibration Signal for Good Pinion wheel under different lubrication and loading conditions; (b) Vibration Signal for Pinion wheel with Teeth Breakage under different lubrication and loading conditions; (c) Vibration Signal for Pinion wheel with crack at root under different lubrication and loading conditions; (d) Vibration Signals for Pinion
Fig 7 % Efficiency of Morlet wavelet coefficients.
The wavelet series is simply a sampled version of the CWT, and the information it provides is highly redundant as far as the reconstruction of the signal is concerned. This redundancy, on the other hand, requires a significant amount of computation time and resources.
3.1 Wavelet-based feature extraction
The multilevel 1D wavelet decomposition function available in Matlab is chosen with the Morlet wavelet specified. It returns the wavelet coefficients of signal X at scale N [29]. Fig. 6 shows the Morlet wavelet.
Sixty-four scales are initially chosen to extract the Morlet wavelet coefficients of the signal data. The efficiency of the 64 scales of Morlet wavelets was obtained using the WEKA data mining software, and the coefficients of the highest level are considered for classification. Since the eighth level gave the maximum efficiency of 96.5%, the statistical features corresponding to it were given as input to the J48 algorithm to determine the predominant features to be given as input for training and classification using SVM. Fig. 7 gives the efficiencies of all scales of the Morlet wavelet.
4 Using J48 algorithm in the present work
A standard tree induced with C5.0 (or possibly ID3 or C4.5) consists of a number of branches, one root, a number of nodes and a number of leaves. One branch is a chain of nodes from root to a leaf, and each node involves one attribute. The occurrence of an attribute in a tree provides information about the importance of the associated attribute, as explained in [31]. A Decision Tree is a tree-based knowledge representation methodology used to represent classification rules. The J48 algorithm (a WEKA implementation of the C4.5 algorithm) is widely used to construct Decision Trees, as explained in [19]. The Decision Tree algorithm has been
Fig. 8. (a) Good-Dry-No Load vs GTB, GTC, TFW-Dry-No Load; (b) Good-Dry-Full Load vs GTB, GTC, TFW-Dry-Full Load; (c) Good-Half Lub-No Load vs GTB, GTC, TFW-Half Lub-No Load; (d) Good-Half Lub-Full Load vs GTB, GTC, TFW-Half Lub-Full Load; (e) Good-Full Lub-No Load vs GTB, GTC, TFW-Full Lub-No Load; (f) Good-Full Lub-Full Load vs GTB,
applied to the problem under discussion. The input to the algorithm is the set of statistical features of the eighth-scale Morlet coefficients. It is clear that the top node is the best node for classification. The other features in the nodes of the Decision Tree appear in descending order of importance. It is to be stressed here that only features that contribute to the classification appear in the Decision Tree and others do not. Features which have less discriminating capability can be consciously discarded by deciding on a threshold. This concept is made use of for selecting good features. The algorithm identifies the good features for the purpose of classification from the given training data set, and thus reduces the domain knowledge required to select good features for the pattern classification problem. The decision trees shown in Fig. 8(a)–(f) are for various lubrication and loading conditions of different faults compared with good conditions of the pinion gear wheel.
Based on the above trees, it is clear that, of all the statistical features, standard error, kurtosis, sample variance and minimum value play a dominant role in feature classification using Morlet coefficients. These four predominant features are fed as input to SVM for training and further classification. Scatter plots showing the variation of the statistical parameters of the Morlet coefficients are shown in Fig. 9(a)–(d). These features were given as input for training and testing of classifying features using SVM.
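The idea that a decision tree places the most discriminating feature at the top can be sketched by ranking features on the information gain of their best threshold split. This is a simplification: C4.5 (and hence J48) uses the gain ratio together with pruning, whereas the sketch uses plain information gain. The function names and the feature values below are hypothetical and are not data from the paper.

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def best_split_gain(values, labels):
    """Largest information gain over all binary threshold splits of one feature."""
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    best = 0.0
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * entropy(left)
                    + len(right) * entropy(right)) / len(pairs)
        best = max(best, base - weighted)
    return best

def rank_features(table, labels):
    """Return feature names sorted by decreasing gain, plus the gains."""
    gains = {name: best_split_gain(col, labels) for name, col in table.items()}
    return sorted(gains, key=gains.get, reverse=True), gains

# Hypothetical feature values for four good and four faulty samples.
labels = ["good"] * 4 + ["fault"] * 4
table = {
    "kurtosis": [2.9, 3.0, 3.1, 3.2, 4.5, 4.8, 5.0, 5.2],
    "mean": [0.1, 0.5, 0.2, 0.6, 0.15, 0.55, 0.25, 0.65],
}
order, gains = rank_features(table, labels)
```

In this toy table the kurtosis column separates the two classes perfectly (gain of one bit), so it ranks first, while the mean column carries little class information and would fall below any reasonable threshold.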
5 Artificial neural network
ANN is one of the approaches to forecast and validate using computer models with some of the architecture and processing capabilities of the human brain [22]. The technology that attempts to achieve such results is called neural computing or artificial neural networks. ANN mimics biological neurons by simulating some of the workings of the human brain. An ANN is made up of processing elements called neurons that are interconnected in a network. The artificial neurons receive inputs that are analogous to
the electro-chemical signals that natural neurons receive from other neurons. By changing the weights given to these signals, the network learns in a process that seems similar to that found in nature, i.e., neurons in an ANN receive signals or information from other neurons or external sources, perform transformations on the signals, and then pass those signals on to other neurons. The way information is processed and intelligence is stored depends on the architecture and algorithms of the ANN. Fig. 10 shows the architecture of the ANN.
A main advantage of ANN is its ability to learn patterns in very complex systems. Through a learning or self-organizing process, they translate the inputs into desired outputs by adjusting the weights given to signals between neurodes.
The proposed method diagnoses a gear box condition using ANN. A multi-layered feed-forward neural network trained with error back propagation was used. ANNs are characterized by their topology, weight vector and activation functions. They have three layers, namely an input layer that receives signals from some external source, a hidden layer that does the processing of the signals, and an output layer that sends processed signals back to the external world.
5.1 The back propagation algorithm of ANN
The back propagation of an ANN assumes that there is a supervision of learning of the network. The method of adjusting
Fig 9 (a) Vibration Signal for Good Pinion wheel under different lubrication and loading conditions; (b) Vibration Signal for Pinion wheel with Teeth Breakage under different lubrication and loading conditions; (c) Vibration Signal for Pinion wheel with Crack at root under different lubrication and loading conditions; (d) Vibration Signals for Pinion
weights is designed to minimize the sum of the squared errors for a given training data set:
j – identifies a receiving node,
i – denotes the node that feeds a second node,
I – denotes input to a neuron,
O – denotes output of a neuron,
$W_{ij}$ – denotes the weights associated with the nodes.
Each non-input node has an output level $O_j$, where
\[ O_j = \frac{1}{1 + e^{-I_j}}, \qquad I_j = \sum_i W_{ij}\,O_i \qquad (1) \]
where $O_i$ is each of the signals to node j (i.e., the output of node i).
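The derivation of the back propagation formula involves the use of the chain rule of partial derivatives and equals:
\[ \delta_{ij} = \frac{\partial SSE}{\partial W_{ij}} = \frac{\partial SSE}{\partial O_j}\,\frac{\partial O_j}{\partial I_j}\,\frac{\partial I_j}{\partial W_{ij}} \qquad (2) \]
where by convention the left-hand side is denoted by $\delta_{ij}$, the change in the sum of squared errors (SSE) attributed to $W_{ij}$. Now the error is given by
\[ e_j = (D_j - O_j), \qquad SSE = \sum_j (D_j - O_j)^2 \qquad (3) \]
where $D_j$ denotes the desired output of node j. Therefore,
\[ \frac{\partial SSE}{\partial O_j} = -2\,(D_j - O_j) = -2\,e_j \qquad (4) \]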
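From the output of the output node, we obtain,
\[ \frac{\partial O_j}{\partial I_j} = O_j\,(1 - O_j) \qquad (5) \]
The input to an output node is $I_j = \sum_i W_{ij}\,O_i$.
Therefore the change in the input to the output node resulting from the previous hidden node, i, is
\[ \frac{\partial I_j}{\partial W_{ij}} = O_i \qquad (6) \]
Thus from the above equations, the jth delta is
\[ \delta_{ij} = -2\,e_j\,O_j\,(1 - O_j)\,O_i \qquad (7) \]
Now the old weight is updated by the following equation:
\[ \Delta W_{ij}(\text{new}) = -\eta\,\delta_{ij}\,O_j + \alpha\,\Delta W_{ij}(\text{old}) \qquad (8) \]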
For the hidden layers, the calculations are similar. The only change is how the ANN output error is back propagated to the hidden layer nodes. The output error at the ith hidden node depends on the output errors of all nodes in the output layer. This relationship is given by
\[ e_i = \sum_j \cdots \]
After calculating the output error for the hidden layer, the update rules for the weights in that layer are the same as the previous update.
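The output-layer derivation above can be exercised on a single sigmoid node. The sketch below accumulates the SSE gradient from the chain rule over a small training set and applies a gradient-descent step with a momentum term. The function names, the learning rate `eta`, the momentum `alpha`, the epoch count and the logical-OR training set are hypothetical choices. The step used is the plain form, new step = -eta * gradient + alpha * old step, which is an assumption and may differ in detail from the update printed in Eq. (8).

```python
import math

def sigmoid(x):
    """Logistic activation, O = 1 / (1 + exp(-I))."""
    return 1.0 / (1.0 + math.exp(-x))

def node_output(weights, o_in):
    """Output O_j of one sigmoid node for the incoming signals o_in."""
    return sigmoid(sum(w * o for w, o in zip(weights, o_in)))

def train_output_node(inputs, targets, eta=0.5, alpha=0.5, epochs=2000):
    """Batch gradient descent with momentum on the SSE of one sigmoid node."""
    n_in = len(inputs[0])
    weights = [0.0] * n_in
    previous_step = [0.0] * n_in
    history = []
    for _ in range(epochs):
        gradient = [0.0] * n_in
        sse = 0.0
        for o_in, desired in zip(inputs, targets):
            out = node_output(weights, o_in)       # O_j
            error = desired - out                  # e_j = D_j - O_j
            sse += error ** 2
            for i in range(n_in):
                # Chain rule: dSSE/dW_ij = -2 e_j O_j (1 - O_j) O_i
                gradient[i] += -2.0 * error * out * (1.0 - out) * o_in[i]
        step = [-eta * g + alpha * p for g, p in zip(gradient, previous_step)]
        weights = [w + s for w, s in zip(weights, step)]
        previous_step = step
        history.append(sse)                        # SSE before this update
    return weights, history

# Hypothetical training set: logical OR, with a constant 1 as bias input.
inputs = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
targets = [0, 1, 1, 1]
weights, history = train_output_node(inputs, targets)
outputs = [round(node_output(weights, x)) for x in inputs]
```

The recorded SSE history falls as the weights adapt, and the rounded node outputs reproduce the target pattern. A full network would repeat the same update for the hidden layer once the output errors have been propagated back.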
5.2 Proximal support vector machine (PSVM)
PSVM is a modified version of support vector machine (SVM). The SVM is a new generation learning system based on statistical learning theory. SVM belongs to the class of supervised learning algorithms in which the learning machine is given a set of features (or inputs) with the associated labels (or output values). Each of
Fig 10 ANN architecture.
Fig 11 Flowchart of PSVM.
Fig 12 Standard SVM classifier.