RBF Neural Networks and a new algorithm for training RBF networks
1 Summary
2 Introduction to function regression
3 RBF Neural Networks and a new algorithm for training RBF networks
4 Experiment
5 Conclusion
1 SUMMARY
Gaussian radial basis function (RBF) networks are commonly used for interpolating multivariable functions. However, how to choose the number of neurons in the hidden layer and the appropriate centers of the RBFs so as to obtain a good interpolating network is still open and attracts the interest of researchers. This report proposes using equally spaced nodes as the centers of the hidden layer, then using k-nearest-neighbour regression to estimate the function values at those centers, and using a new algorithm to train the RBF network. Results show that the generalization of networks trained by this new algorithm is sensibly improved and the running time significantly reduced, especially when the number of nodes is large.
2 FUNCTION REGRESSION
2.1.1 Introduction to regression
Let D be a set in R^n and f: D(⊂R^n) → R^m a multivariable function on D. We only know f on a set T ⊂ D of N vectors x^1, x^2, …, x^N, that is, f(x^i) = y^i for i = 1, 2, …, N, and we must estimate f(x) for other points x = (x_1,…,x_n) in D.
We seek a function φ(x) on D such that:
φ(x^i) = y^i, i = 1,…,N, (1)
and use φ(x) instead of f(x). When m > 1, the interpolation problem is equivalent to m problems of interpolating m real-valued multivariable functions. Therefore we only need to work with m = 1.
2.1.2 K-nearest neighbour (k-NN) regression
In this method, one chooses a certain natural number k. For each x ∈ D, x = (x_1,…,x_n), we estimate φ(x) from f at the k nearest nodes of x as follows. Denote by z^1,…,z^k the k vectors in T nearest to x (where d(u,v) is the distance between u and v in D); then φ(x) is defined by
φ(x) = Σ_{i=1}^k λ_i f(z^i), with weights λ_i ∈ [0,1], Σ_{i=1}^k λ_i = 1, (2)
where the parameters λ_i are determined by a system of equations based on the distances d(x, z^i), so that nearer nodes receive larger weight. (3)
The proposed method then proceeds in two steps:
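As a concrete illustration, the following Python sketch implements k-NN regression with inverse-distance weights, one common way to realize the weights λ_i; the function name knn_regress and the weighting scheme are illustrative assumptions, not necessarily the exact system of equations (3).

```python
import numpy as np

def knn_regress(x, X, y, k):
    """Sketch of k-NN regression: estimate f(x) from the k nearest nodes.
    Inverse-distance weights are one common choice for the lambda_i."""
    d = np.linalg.norm(X - x, axis=1)      # distances d(x, x^i)
    idx = np.argsort(d)[:k]                # indices of the k nearest nodes z^1..z^k
    dk = d[idx]
    if dk[0] == 0.0:                       # x coincides with a node: return its value
        return float(y[idx[0]])
    lam = (1.0 / dk) / np.sum(1.0 / dk)    # lambda_i in [0,1], sum(lambda_i) = 1
    return float(lam @ y[idx])             # phi(x) = sum_i lambda_i f(z^i)
```

For k = 1 this reduces to a nearest-neighbour lookup; larger k averages over more nodes and smooths measurement noise.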
Step 1: Based on the unequally spaced nodes and their measured values with white noise, we use the regression method to create a new data set of equally spaced nodes on a grid defined over the range of the original unequally spaced nodes. The value at each new equally spaced node is noise-reduced.
Step 2: Using the one-phase HDH algorithm to train the RBF network on the new data, we obtain a network which not only approximately interpolates the function but also reduces the noise.
Figure 1: The grid nodes based on the original values of the unequally spaced nodes
The figure above describes the two-dimensional case: the grid of new equally spaced nodes (red circles) is based on the range of values of the original nodes (blue triangles). The value of each grid node (circle) is computed by regression from the values of its k nearest original nodes (triangles). The RBF network is then trained by the one-phase HDH algorithm with the new equally spaced nodes (circles) and the noise-reduced value of each node as input data.
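A minimal sketch of Step 1, assuming the knn_regress helper above: build an equally spaced grid over the range of the original nodes and assign each grid node a noise-reduced value by k-NN regression over the noisy measurements. The grid resolution m per axis is an illustrative parameter.

```python
import numpy as np

def build_grid_data(X, y, m=30, k=10):
    """Step 1 sketch: equally spaced grid over the range of the original
    nodes X, with denoised values obtained by k-NN regression."""
    lo, hi = X.min(axis=0), X.max(axis=0)    # range of the original nodes
    axes = [np.linspace(lo[j], hi[j], m) for j in range(X.shape[1])]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    grid = grid.reshape(-1, X.shape[1])
    z = np.array([knn_regress(g, X, y, k) for g in grid])  # denoised values
    return grid, z                           # training set for one-phase HDH
```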
2.3 The multivariable function approximation problem
The problem of approximating a multivariable function is considered a common, general problem, of which interpolation is a special case. In the interpolation problem, the interpolating function must take exactly the given values at the given nodes. When the number of nodes is large, determining the interpolating function becomes more complex, so we accept approximate values at the given nodes and choose a simple function for which the error is smallest.
The given problem:
The function y = f(x) is measured at N nodes {x^k}_{k=1}^N ⊂ D ⊂ R^n:
y^k = f(x^k), k = 1,…,N, with x^k = (x_1^k,…,x_n^k) ∈ D and y^k ∈ R^m.
To approximate f(x), we need a function of a given form such that the error at each node is as small as possible. The chosen function is usually φ(x) = φ(x, c_1, c_2,…,c_k), and the error with respect to the parameters c_1, c_2,…,c_k is usually defined by the least-squares method:
E = Σ_{i=1}^N (φ(x^i, c_1,…,c_k) − y^i)².
In some cases the number of nodes N is huge; to reduce the computation, instead of summing over i = 1,…,N one can sum over i = 1,…,M with M < N, using the set {z^k}_{k=1}^M of the M nodes nearest to x. This is a local method, and the function φ(x, c_1, c_2,…,c_k) is chosen to be linear.
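A sketch of this local method, assuming Euclidean distance and a linear model with an intercept; local_linear_fit is an illustrative name.

```python
import numpy as np

def local_linear_fit(x, X, y, M):
    """Local least-squares sketch: fit a linear model on the M nodes
    nearest to x and evaluate it at x."""
    idx = np.argsort(np.linalg.norm(X - x, axis=1))[:M]  # M nearest nodes
    A = np.hstack([X[idx], np.ones((M, 1))])             # phi(x,c) = c.x + c_0
    c, *_ = np.linalg.lstsq(A, y[idx], rcond=None)       # minimize squared error E
    return float(np.append(x, 1.0) @ c)
```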
3 RBF NETWORKS AND QHDH TRAINING ALGORITHM
An RBF network is a three-layer network (two neuron layers). Each neuron in the hidden layer is a nonlinear function of the distance between the input vector x and the center vector v^j associated with neuron j, with radius σ_j. The combination of the input vector x and the center vectors v^j produces a matrix of distance-based function values with the corresponding radii, which is used to compute the weight parameters of the neurons in the network.
3.1 Radial Basis Function
3.1.1 The multivariable interpolation problem with the RBF approach
Consider the multivariable function f: D(⊂R^n) → R^m given at the nodes {(x^k, y^k)}_{k=1}^N, where {x^k}_{k=1}^N is a set of n-dimensional vectors (the interpolation nodes). The function f is approximated in the form
Φ(x) = Σ_{k=1}^M w_k φ_k(x) + w_0,
where φ_k is called a radial basis function (RBF) with center v^k and radius σ_k, M (≤ N) is the number of radial basis functions used to approximate f, and w_k and σ_k are the parameters to be found.
3.1.2 The Radial Basis Function Technique
Consider the interpolation problem with m = 1 and a number of interpolation nodes that is not too large. We seek the function in the form
Φ(x) = Σ_{k=1}^N w_k φ_k(x) + w_0, (3.2)
where φ_k(x) is a radial basis function. There are many different radial basis functions, and the most widely used is the Gaussian. The following formulas present the technique with the Gaussian RBF:
φ_k(x) = exp(−‖x − v^k‖² / σ_k²), k = 1,…,N, (3.3)
where v^k is the center of the k-th RBF. When the centers are the interpolation nodes, v^k = x^k for all k, then M = N (more detail in chapter 3 of [13]).
The parameters w_k and σ_k must be found such that Φ satisfies the interpolation conditions (3.1):
Φ(x^i) = Σ_{k=1}^N w_k φ_k(x^i) + w_0 = y^i, i = 1,…,N. (3.4)
For each k, the parameter σ_k controls the effective range of the RBF φ_k: as ‖x − v^k‖ grows, φ_k(x) rapidly tends to zero.
For given parameters σ_k, Micchelli [14] proved that the interpolation matrix is invertible and positive definite if the nodes x^k are pairwise distinct. Therefore, for any given w_0, the system of equations (3.4) always has a unique solution w_1,…,w_N.
The sum of squared errors is defined by formula (3.6):
E = Σ_{i=1}^N (y^i − Φ(x^i))². (3.6)
The general approximation and best-approximation properties of radial basis functions are investigated in [22][31][54]. The interpolating function has the advantage that the sum of squared errors E is always at the global minimum (page 98 in [13]). Based on this conclusion, algorithms have been suggested for interpolating and approximating functions by least squares or by solving the system of equations [49].
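The following sketch solves the interpolation system (3.4) directly by a linear solve, assuming Gaussian RBFs with centers at the nodes (v^k = x^k), a single radius σ shared by all neurons, and w_0 fixed to the mean of y; these are simplifying assumptions (the HDH algorithm below instead tunes a σ_k per node).

```python
import numpy as np

def rbf_interpolate(X, y, sigma):
    """Gaussian RBF interpolation sketch: centers at the nodes, one
    shared radius sigma, bias w_0 = mean(y)."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # ||x^i - x^k||^2
    Phi = np.exp(-D2 / sigma ** 2)                       # Phi[i, k] = phi_k(x^i)
    w0 = y.mean()
    w = np.linalg.solve(Phi, y - w0)                     # conditions (3.4)
    def predict(x):
        return w0 + w @ np.exp(-((X - x) ** 2).sum(-1) / sigma ** 2)
    return predict
```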
3.1.3 Some radial basis functions
The nonlinear radial basis function φ can be chosen as one of the following functions:
Gaussian Function:
φ(x) = exp(−(x − c)² / r²), (3.7)
where c ∈ R is the center of the RBF and r is its radius. The value of the Gaussian RBF increases as x gets closer to the center, as shown in Figure 3.1.
Figure 3.1: Gaussian RBF with r = 1 and c = 0
Multiquadric Function:
φ(x) = ((x − c)² + r²)^{1/2}. (3.8)
Figure 3.2: Multiquadric RBF with r = 1 and c = 0
Inverse Multiquadric Function:
φ(x) = ((x − c)² + r²)^{−1/2}. (3.9)
Cauchy Function:
φ(x) = r² ((x − c)² + r²)^{−1}. (3.10)
Figure 3.4: Cauchy RBF with r = 1 and c = 0
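For reference, the four basis functions of formulas (3.7)-(3.10) in Python (one-dimensional form); the r² numerator of the Cauchy function is an assumption matching its common definition.

```python
import numpy as np
# The radial basis functions of Section 3.1.3, with center c and radius r.
gaussian      = lambda x, c=0.0, r=1.0: np.exp(-((x - c) ** 2) / r ** 2)      # (3.7)
multiquadric  = lambda x, c=0.0, r=1.0: np.sqrt((x - c) ** 2 + r ** 2)        # (3.8)
inv_multiquad = lambda x, c=0.0, r=1.0: 1.0 / np.sqrt((x - c) ** 2 + r ** 2)  # (3.9)
cauchy        = lambda x, c=0.0, r=1.0: r ** 2 / ((x - c) ** 2 + r ** 2)      # (3.10)
```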
3.2 RBF Network structure
The RBF neural network structure has three layers (two neuron layers), with radial basis transfer functions. The structure includes:
i) an input layer with n nodes for the input vector x ∈ R^n;
ii) a hidden layer with M neurons, where each neuron k has center v^k and outputs the corresponding RBF value φ_k;
iii) an output layer with m neurons producing the output values.
The transfer function of the hidden layer is a radial basis function, and the transfer function of the output layer is linear.
Figure 3.5: RBF neural network structure
Figure 3.5 describes the general structure of an RBF network. In the training data {(x^k, y^k)}, y^k is the desired output vector corresponding to the input vector x^k, and w_0 is the threshold (bias) of each output neuron. The output of each output neuron is given by formula (3.11):
r_j = w_{1j} φ_1 + … + w_{Mj} φ_M + w_{0j}, (3.11)
with φ_1 = φ_1(x, v^1), …, φ_M = φ_M(x, v^M), that is, y_j = Σ_{k=1}^M w_{kj} φ_k + w_{0j}. The expected (averaged) output value is
z_j = (1/M) Σ_{k=1}^M w_{kj} φ_k. (3.12)
In formulas (3.11) and (3.12), φ_k is the radial basis function, w_{kj} is the connection weight from hidden neuron k to output neuron j, x is the input signal of the network, and v^m is the center of the corresponding radial function. The center vector v^m = (v_1^m,…,v_n^m) corresponds to hidden neuron m and has n components matching the input vector; M, the number of radial functions, is the number of hidden-layer neurons.
Training the network involves the following tasks (a forward-pass sketch follows the list):
- identify the center vectors;
- select the M radius parameters σ accordingly;
- create the connection weights w;
- train the connection weights so that the total squared error E is smallest.
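A sketch of the forward pass (3.11), assuming Gaussian hidden units; the parameter shapes are illustrative.

```python
import numpy as np

def rbf_forward(x, V, sigma, W, w0):
    """Forward pass of the RBF network of Section 3.2.
    V: (M, n) centers v^k; sigma: (M,) radii; W: (M, m) weights w_kj;
    w0: (m,) output thresholds. Implements y_j = sum_k w_kj phi_k + w_0j."""
    phi = np.exp(-((V - x) ** 2).sum(axis=1) / sigma ** 2)  # hidden layer phi_k
    return phi @ W + w0                                     # linear output layer
```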
3.3 Algorithm for training Interpolation RBF Networks: HDH and QHDH
a) The two-phase HDH training algorithm
We denote by I the identity matrix of size N; W = (w_1,…,w_N)^T and Z = (z_1,…,z_N)^T are vectors of the N-dimensional space R^N, in which:
ψ_{k,j} = −φ_k(x^j) if j ≠ k; ψ_{k,j} = 0 if k = j, (12)
z_k = y_k − w_0, k = 1,…,N. (13)
The interpolation conditions (3.4) are then equivalent to the system
W = ΨW + Z. (10)
For each k ≤ N, the function q_k of σ_k is defined as:
q_k = Σ_{j=1}^N |ψ_{k,j}|. (14)
The HDH algorithm proceeds as follows.
Given a tolerance ε and positive constants q, α < 1, the algorithm consists of two phases to determine the parameters σ_k and W*. In the first phase, we determine each σ_k so that q_k ≤ q and q_k is as close to q as possible (meaning that if we set σ_k := σ_k/α then q_k > q). The matrix norm of Ψ corresponding to the vector norm ‖u‖* = Σ_{j=1}^N |u_j| is then smaller than q. In the second phase, the approximate solution is found by the method of simple iteration. The algorithm is shown in Figure 1.
Figure 1: The two-phase HDH training algorithm
To determine the solution W* of the system (10), we run the following iterative procedure:
1. First, initialize W^0 = Z;
2. Then compute W^1 = ΨW^0 + Z;
3. If the stopping condition (based on the tolerance ε) is not satisfied, assign W^0 := W^1 and return to step 2.
For each N-dimensional vector u we use the norm ‖u‖* = Σ_{j=1}^N |u_j|; since the norm of Ψ is smaller than q < 1, the iteration is a contraction and converges.
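A minimal sketch of the two phases, assuming Gaussian RBFs with centers at the nodes; the initial radius sigma0 and the exact stopping rule used here are illustrative choices.

```python
import numpy as np

def hdh_train(X, y, q=0.8, alpha=0.9, eps=1e-6, sigma0=1.0):
    """Minimal sketch of the two-phase HDH algorithm."""
    N = len(X)
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # ||x^i - x^k||^2
    off = ~np.eye(N, dtype=bool)

    # Phase 1: shrink each sigma_k by factor alpha until
    # q_k = sum_{j != k} phi_k(x^j) <= q, as in formula (14).
    sigma = np.full(N, sigma0)
    for k in range(N):
        while np.exp(-D2[k, off[k]] / sigma[k] ** 2).sum() > q:
            sigma[k] *= alpha

    # Psi = I - Phi, where Phi[i, k] = phi_k(x^i); zero diagonal
    # because phi_k(x^k) = 1 for the Gaussian.
    Psi = -np.exp(-D2 / sigma[None, :] ** 2) * off

    # Phase 2: simple iteration W <- Psi W + Z, a contraction in the
    # norm ||u||* = sum_j |u_j| since every q_k <= q < 1.
    w0 = y.mean()                       # bias choice; one common option
    Z = y - w0
    W = Z.copy()
    while True:
        W_new = Psi @ W + Z
        if np.abs(W_new - W).sum() * q / (1.0 - q) <= eps:
            return W_new, sigma, w0
        W = W_new
```

The trained network is then evaluated as Φ(x) = w_0 + Σ_k W_k exp(−‖x − x^k‖²/σ_k²).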
b) The QHDH algorithm
If the interpolation nodes are equidistant, they can be expressed in multi-index form x^{(i_1,…,i_n)}, in which
x_k^{(i_1,…,i_n)} = x_k^0 + i_k h_k, k = 1,…,n,
where the h_k (k = 1,…,n) are given constants (the step sizes of the variables x_k), n is the number of dimensions, and i_k runs from 1 to N_k (N_k is the number of nodes in the k-th dimension).
Instead of the Euclidean norm, we examine the norm ‖x‖_A = (x^T A x)^{1/2}, where A is the diagonal matrix
A = diag(1/h_1², …, 1/h_n²),
so that each grid step has unit length. With this norm, a common radius σ is used for every node, and the interpolation function takes the form
Φ(x) = Σ_{i_1,…,i_n} w_{i_1,…,i_n} φ_{i_1,…,i_n}(x) + w_0, where φ_{i_1,…,i_n}(x) = exp(−‖x − x^{(i_1,…,i_n)}‖_A² / σ²).
The matrix Ψ is defined as before:
ψ_{(i_1,…,i_n),(j_1,…,j_n)} = −φ_{i_1,…,i_n}(x^{(j_1,…,j_n)}) for (j_1,…,j_n) ≠ (i_1,…,i_n), and 0 otherwise,
with
q_{(i_1,…,i_n)} = Σ_{(j_1,…,j_n) ≠ (i_1,…,i_n)} |ψ_{(i_1,…,i_n),(j_1,…,j_n)}|.
Because the nodes are equidistant, ‖x^{(i)} − x^{(j)}‖_A² = Σ_{k=1}^n (i_k − j_k)², so all the q_{(i_1,…,i_n)} are bounded by a common value that can be computed in advance, and the radius σ can be chosen directly so that this bound, and hence the norm of Ψ, is smaller than q.
Subsequently, we apply the method of simple iteration from the second phase of the two-phase algorithm above to determine the output-layer weights. In this way the one-phase QHDH algorithm is formed. With a given positive constant q < 1, this algorithm can be described as follows:
Figure 2: One-phase iterative algorithm for training an RBF network with equidistant nodes
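A sketch of the idea behind QHDH's one-phase shortcut: in the norm ‖·‖_A each grid step has unit length, so ‖x^{(i)} − x^{(j)}‖_A² = Σ_k (i_k − j_k)², and a single radius σ valid for all nodes can be found in advance from an upper bound on the q_{(i)}. This illustrative search shrinks σ iteratively; it stands in for, and is not, the paper's closed formula (18).

```python
import numpy as np
from itertools import product

def qhdh_sigma(N_per_dim, q=0.8, alpha=0.9, sigma0=1.0):
    """Sketch: choose one radius sigma for an equidistant grid so that an
    upper bound (valid for every node) on q_(i) is at most q."""
    offsets = [d for d in product(*(range(-(Nk - 1), Nk) for Nk in N_per_dim))
               if any(d)]                                 # integer offsets j - i != 0
    d2 = np.array([sum(c * c for c in d) for d in offsets], dtype=float)
    sigma = sigma0
    while np.exp(-d2 / sigma ** 2).sum() > q:             # common bound on q_(i)
        sigma *= alpha
    return sigma
```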
3.4 The new function approximation method
We return to the problem of approximating a multivariate function f: D(⊂R^n) → R given the set T = {(x^j, y^j)}_{j=1}^N. The method for creating an RBF network approximating this function is as follows:
Step 1. Choose a natural number k > n and a grid of evenly spaced nodes.
Step 2. Apply the k-NN method mentioned in 2.2 to determine the approximate value of f(z^k) at each grid node z^k.
Step 3. Apply QHDH to train the RBF network.
In this way we obtain an RBF network that approximates f on D. The procedure is demonstrated in Figure 3 (a runnable end-to-end sketch follows the procedure listings):
Procedure One-phase iterative training algorithm
  Determine σ_{i_1,…,i_n} as in formula (18);
  Find W* by the method of simple iteration (mentioned in Section 2);
End
Procedure RBF network construction for function approximation
  Choose k and an equidistant grid of nodes {z^j}_{j=1}^M on B;
  Calculate the approximate values of f at the grid nodes by k-NN regression;
  Train the RBF network by QHDH;
End
Figure 3: One-phase iterative algorithm for training an RBF network with equidistant nodes
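An end-to-end sketch of Steps 1-3, reusing the knn_regress, build_grid_data and hdh_train sketches defined earlier. Using hdh_train in place of QHDH is an assumption (a faithful QHDH would fix σ in advance from the grid), and the test function and all parameters are illustrative.

```python
import numpy as np
# Requires knn_regress, build_grid_data and hdh_train from the sketches above.

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(2000, 2))                # scattered noisy samples
y = np.sin(np.pi * X[:, 0]) * X[:, 1] + rng.normal(0, 0.1, 2000)

grid, z = build_grid_data(X, y, m=15, k=10)               # Steps 1-2: denoised grid
W, sigma, w0 = hdh_train(grid, z, q=0.8)                  # Step 3: train the network

def predict(x):                                           # Phi(x) on the trained net
    phi = np.exp(-((grid - x) ** 2).sum(axis=1) / sigma ** 2)
    return w0 + W @ phi

print(predict(np.array([0.3, -0.5])), np.sin(np.pi * 0.3) * -0.5)
```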
4 Experimental Results
We carried out experiments comparing the approximation error on data taken from the website:
http://www.liaad.up.pt/~ltorgo/Regression/cal_housing.html
The network is trained and the errors are compared. The effectiveness of the algorithm is compared with the GGAP method of Guang-Bin Huang and colleagues, and the experiments demonstrate that it performs better than the other methods.
The test program was written in C++ with Visual Studio 2010 and run on Windows 7 (build 7601) 32-bit, with 2 GB RAM and an Intel Core 2 Duo T7300 2.0 GHz CPU.
4.1 The selection of grid size M
We collected information on the variables using all the block groups in California from the 1990 Census. In this sample, a block group on average includes 1425.5 individuals living in a geographically compact area. Naturally, the geographical area included varies inversely with the population density. We computed distances among the centroids of each block group as measured in latitude and longitude. We excluded all the block groups reporting zero entries for the independent and dependent variables. The final data contained 20,640 observations on 9 variables. The dependent variable is ln(median house value).
The experimental results in Table 1 show that:
1) When the number of grid nodes M is small, the error is significantly larger; as the grid becomes denser (M larger), the error improves. However, increasing the number of grid nodes beyond a certain point does not improve the error much further.
2) In the noise-free case the approximation is better.
Table 1: Approximation error of the network as M changes, with N = 20640 and fixed k, for grid sizes M = 30×30 = 900, M = 40×40 = 1600, and M = 50×50 = 2500.
4.2 The selection of k
We experiment with the real data of the previous section and M = 900 grid nodes. The results are shown in Table 2.
Table 2: Approximation error of the network as k changes (average error of the network for different values of k).
The experiments show that as k increases, the noise-reduction capability increases but the error also grows, since far-distant points used in computing the grid-node values can affect the generality of the function.
4.3 Comparison with Guang-Bin Huang's GGAP networks
Guang-Bin Huang and colleagues (2005) proposed the GGAP method for training RBF networks to approximate multivariate functions from data with mixed noise; their experimental results show it to be more effective than other commonly used methods (MRAN, MAIC). We compared our trained network with a network trained by the GGAP method, testing on the real data introduced above. The sample size is N = 20,640, the number of grid nodes is M = 30×30 = 900, and the value of k is varied. The experimental results are shown in Table 3 below (errors in bold indicate results better than GGAP).
5 Conclusion
The new method combines k-NN regression and the QHDH algorithm for RBF network training to obtain a multivariate function approximation network; its performance is experimentally demonstrated to be very promising.
In the future we will apply it to real data in pattern recognition problems to test its practical effectiveness.