RBF Neural Networks and a new algorithm for training RBF networks
1 Summary
2 Introduction to function regression
3 RBF Neural Networks and a new algorithm for training RBF networks
4 Experiment
5 Conclusion
1 SUMMARY
Gaussian radial basis function (RBF) networks are commonly used for interpolating multivariable functions. However, how to choose the number of neurons in the hidden layer and the appropriate centers of the RBFs so as to obtain a good interpolating network is still open and attracts the interest of researchers. This report proposes using equally spaced nodes as the centers of the hidden layer, then using k-nearest-neighbour regression to estimate the function values at those centers, and using a new algorithm to train the RBF network. Results show that the generalization of networks trained by this new algorithm is sensibly improved and the running time significantly reduced, especially when the number of nodes is large.
2 FUNCTION REGRESSION
2.1.1 Introduction to regression
Let D be a set in R^n and f: D(⊂R^n) → R^m a multivariable function on D. We only know f on a set T ⊂ D of N vectors x^1, x^2, …, x^N, that is, f(x^i) = y^i for i = 1, 2, …, N, and we must estimate f(x) for other points x = (x_1,…,x_n) in D.
We seek a function φ(x) on D such that:
φ(x^i) = y^i, i = 1,…,N, (1)
and use φ(x) instead of f(x). When m > 1, the interpolation problem is equivalent to m problems of interpolating m real-valued multivariable functions. Therefore we only need to work with m = 1.
2.1.2 K-nearest neighbour (k-NN) regression
In this method, one chooses a certain natural number k. For each x ∈ D, x = (x_1,…,x_n), we estimate φ(x) from f at the k nearest nodes of x as follows. Denote by z^1,…,z^k the k vectors in T nearest to x (where d(u,v) is the distance between u and v in D); then φ(x) is defined by
φ(x) = Σ_{i=1}^k λ_i f(z^i), with weights λ_i ∈ [0,1], Σ_{i=1}^k λ_i = 1, (2)
where the parameters λ_i are determined by a system of equations based on the distances d(x, z^i), so that nearer nodes receive larger weight. (3)
The proposed method then proceeds in two steps:
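As a concrete illustration, the following Python sketch implements k-NN regression with inverse-distance weights, one common way to realize the weights λ_i; the function name knn_regress and the weighting scheme are illustrative assumptions, not necessarily the exact system of equations (3).

```python
import numpy as np

def knn_regress(x, X, y, k):
    """Sketch of k-NN regression: estimate f(x) from the k nearest nodes.
    Inverse-distance weights are one common choice for the lambda_i."""
    d = np.linalg.norm(X - x, axis=1)      # distances d(x, x^i)
    idx = np.argsort(d)[:k]                # indices of the k nearest nodes z^1..z^k
    dk = d[idx]
    if dk[0] == 0.0:                       # x coincides with a node: return its value
        return float(y[idx[0]])
    lam = (1.0 / dk) / np.sum(1.0 / dk)    # lambda_i in [0,1], sum(lambda_i) = 1
    return float(lam @ y[idx])             # phi(x) = sum_i lambda_i f(z^i)
```

For k = 1 this reduces to a nearest-neighbour lookup; larger k averages over more nodes and smooths measurement noise.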
Step 1: Based on the unequally spaced nodes and their measured values with white noise, we use the regression method to create a new data set of equally spaced nodes on a grid defined over the range of the original unequally spaced nodes. The value at each new equally spaced node is noise-reduced.
Step 2: Using the one-phase HDH algorithm to train the RBF network on the new data, we obtain a network which not only approximately interpolates the function but also reduces the noise.
Figure 1: The grid nodes based on the original values of the unequally spaced nodes
The figure above describes the two-dimensional case: the grid of new equally spaced nodes (red circles) is based on the range of values of the original nodes (blue triangles). The value of each grid node (circle) is computed by regression from the values of its k nearest original nodes (triangles). The RBF network is then trained by the one-phase HDH algorithm with the new equally spaced nodes (circles) and the noise-reduced value of each node as input data.
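A minimal sketch of Step 1, assuming the knn_regress helper above: build an equally spaced grid over the range of the original nodes and assign each grid node a noise-reduced value by k-NN regression over the noisy measurements. The grid resolution m per axis is an illustrative parameter.

```python
import numpy as np

def build_grid_data(X, y, m=30, k=10):
    """Step 1 sketch: equally spaced grid over the range of the original
    nodes X, with denoised values obtained by k-NN regression."""
    lo, hi = X.min(axis=0), X.max(axis=0)    # range of the original nodes
    axes = [np.linspace(lo[j], hi[j], m) for j in range(X.shape[1])]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    grid = grid.reshape(-1, X.shape[1])
    z = np.array([knn_regress(g, X, y, k) for g in grid])  # denoised values
    return grid, z                           # training set for one-phase HDH
```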
2.3 The multivariable function approximation problem
The problem of approximating a multivariable function is considered a common, general problem, of which interpolation is a special case. In the interpolation problem, the interpolating function must take exactly the given values at the given nodes. When the number of nodes is large, determining the interpolating function becomes more complex, so we accept approximate values at the given nodes and choose a simple function for which the error is smallest.
The given problem:
The function y = f(x) is measured at N nodes {x^k}_{k=1}^N ⊂ D ⊂ R^n:
y^k = f(x^k), k = 1,…,N, with x^k = (x_1^k,…,x_n^k) ∈ D and y^k ∈ R^m.
To approximate f(x), we need a function of a given form such that the error at each node is as small as possible. The chosen function is usually φ(x) = φ(x, c_1, c_2,…,c_k), and the error with respect to the parameters c_1, c_2,…,c_k is usually defined by the least-squares method:
E = Σ_{i=1}^N (φ(x^i, c_1,…,c_k) − y^i)².
In some cases the number of nodes N is huge; to reduce the computation, instead of summing over i = 1,…,N one can sum over i = 1,…,M with M < N, using the set {z^k}_{k=1}^M of the M nodes nearest to x. This is a local method, and the function φ(x, c_1, c_2,…,c_k) is chosen to be linear.
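A sketch of this local method, assuming Euclidean distance and a linear model with an intercept; local_linear_fit is an illustrative name.

```python
import numpy as np

def local_linear_fit(x, X, y, M):
    """Local least-squares sketch: fit a linear model on the M nodes
    nearest to x and evaluate it at x."""
    idx = np.argsort(np.linalg.norm(X - x, axis=1))[:M]  # M nearest nodes
    A = np.hstack([X[idx], np.ones((M, 1))])             # phi(x,c) = c.x + c_0
    c, *_ = np.linalg.lstsq(A, y[idx], rcond=None)       # minimize squared error E
    return float(np.append(x, 1.0) @ c)
```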
3 RBF NETWORKS AND QHDH TRAINING ALGORITHM
An RBF network is a three-layer network (two neuron layers). Each neuron in the hidden layer is a nonlinear function of the distance between the input vector x and the center vector v^j associated with neuron j, with radius σ_j. The combination of the input vector x and the center vectors v^j produces a matrix of distance-based function values with the corresponding radii, which is used to compute the weight parameters of the neurons in the network.
3.1 Radial Basis Function
3.1.1 The multivariable interpolation problem with the RBF approach
Consider the multivariable function f: D(⊂R^n) → R^m given at the nodes {(x^k, y^k)}_{k=1}^N, where {x^k}_{k=1}^N is a set of n-dimensional vectors (the interpolation nodes). The function f is approximated in the form
Φ(x) = Σ_{k=1}^M w_k φ_k(x) + w_0,
where φ_k is called a radial basis function (RBF) with center v^k and radius σ_k, M (≤ N) is the number of radial basis functions used to approximate f, and w_k and σ_k are the parameters to be found.
3.1.2 The Radial Basis Function Technique
Consider the interpolation problem with m = 1 and a number of interpolation nodes that is not too large. We seek the function in the form
Φ(x) = Σ_{k=1}^N w_k φ_k(x) + w_0, (3.2)
where φ_k(x) is a radial basis function. There are many different radial basis functions, and the most widely used is the Gaussian. The following formulas present the technique with the Gaussian RBF:
φ_k(x) = exp(−‖x − v^k‖² / σ_k²), k = 1,…,N, (3.3)
where v^k is the center of the k-th RBF. When the centers are the interpolation nodes, v^k = x^k for all k, then M = N (more detail in chapter 3 of [13]).
The parameters w_k and σ_k must be found such that Φ satisfies the interpolation conditions (3.1):
Φ(x^i) = Σ_{k=1}^N w_k φ_k(x^i) + w_0 = y^i, i = 1,…,N. (3.4)
For each k, the parameter σ_k controls the effective range of the RBF φ_k: as ‖x − v^k‖ grows, φ_k(x) rapidly tends to zero.
For given parameters σ_k, Micchelli [14] proved that the interpolation matrix is invertible and positive definite if the nodes x^k are pairwise distinct. Therefore, for any given w_0, the system of equations (3.4) always has a unique solution w_1,…,w_N.
The sum of squared errors is defined by formula (3.6):
E = Σ_{i=1}^N (y^i − Φ(x^i))². (3.6)
The general approximation and best-approximation properties of radial basis functions are investigated in [22][31][54]. The interpolating function has the advantage that the sum of squared errors E is always at the global minimum (page 98 in [13]). Based on this conclusion, algorithms have been suggested for interpolating and approximating functions by least squares or by solving the system of equations [49].
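The following sketch solves the interpolation system (3.4) directly by a linear solve, assuming Gaussian RBFs with centers at the nodes (v^k = x^k), a single radius σ shared by all neurons, and w_0 fixed to the mean of y; these are simplifying assumptions (the HDH algorithm below instead tunes a σ_k per node).

```python
import numpy as np

def rbf_interpolate(X, y, sigma):
    """Gaussian RBF interpolation sketch: centers at the nodes, one
    shared radius sigma, bias w_0 = mean(y)."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # ||x^i - x^k||^2
    Phi = np.exp(-D2 / sigma ** 2)                       # Phi[i, k] = phi_k(x^i)
    w0 = y.mean()
    w = np.linalg.solve(Phi, y - w0)                     # conditions (3.4)
    def predict(x):
        return w0 + w @ np.exp(-((X - x) ** 2).sum(-1) / sigma ** 2)
    return predict
```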
3.1.3 Some radial basis functions
The nonlinear radial basis function φ can be chosen as one of the following functions:
Gaussian Function:
φ(x) = exp(−(x − c)² / r²), (3.7)
where c ∈ R is the center of the RBF and r is its radius. The value of the Gaussian RBF increases as x gets closer to the center, as shown in Figure 3.1.
Figure 3.1: Gaussian RBF with r = 1 and c = 0
Multiquadric Function:
φ(x) = ((x − c)² + r²)^{1/2}. (3.8)
Figure 3.2: Multiquadric RBF with r = 1 and c = 0
Inverse Multiquadric Function:
φ(x) = ((x − c)² + r²)^{−1/2}. (3.9)
Cauchy Function:
φ(x) = r² ((x − c)² + r²)^{−1}. (3.10)
Figure 3.4: Cauchy RBF with r = 1 and c = 0
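For reference, the four basis functions of formulas (3.7)-(3.10) in Python (one-dimensional form); the r² numerator of the Cauchy function is an assumption matching its common definition.

```python
import numpy as np
# The radial basis functions of Section 3.1.3, with center c and radius r.
gaussian      = lambda x, c=0.0, r=1.0: np.exp(-((x - c) ** 2) / r ** 2)      # (3.7)
multiquadric  = lambda x, c=0.0, r=1.0: np.sqrt((x - c) ** 2 + r ** 2)        # (3.8)
inv_multiquad = lambda x, c=0.0, r=1.0: 1.0 / np.sqrt((x - c) ** 2 + r ** 2)  # (3.9)
cauchy        = lambda x, c=0.0, r=1.0: r ** 2 / ((x - c) ** 2 + r ** 2)      # (3.10)
```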
3.2 RBF Network structure
The RBF neural network structure has three layers (two neuron layers), with radial basis transfer functions. The structure includes:
i) an input layer with n nodes for the input vector x ∈ R^n;
ii) a hidden layer with M neurons, where each neuron k has center v^k and outputs the corresponding RBF value φ_k;
iii) an output layer with m neurons producing the output values.
The transfer function of the hidden layer is a radial basis function, and the transfer function of the output layer is linear.
Figure 3.5: RBF neural network structure
Figure 3.5 describes the general structure of an RBF network. In the training data {(x^k, y^k)}, y^k is the desired output vector corresponding to the input vector x^k, and w_0 is the threshold (bias) of each output neuron. The output of each output neuron is given by formula (3.11):
r_j = w_{1j} φ_1 + … + w_{Mj} φ_M + w_{0j}, (3.11)
with φ_1 = φ_1(x, v^1), …, φ_M = φ_M(x, v^M), that is, y_j = Σ_{k=1}^M w_{kj} φ_k + w_{0j}. The expected (averaged) output value is
z_j = (1/M) Σ_{k=1}^M w_{kj} φ_k. (3.12)
In formulas (3.11) and (3.12), φ_k is the radial basis function, w_{kj} is the connection weight from hidden neuron k to output neuron j, x is the input signal of the network, and v^m is the center of the corresponding radial function. The center vector v^m = (v_1^m,…,v_n^m) corresponds to hidden neuron m and has n components matching the input vector; M, the number of radial functions, is the number of hidden-layer neurons.
Training the network involves the following tasks (a forward-pass sketch follows the list):
- identify the center vectors;
- select the M radius parameters σ accordingly;
- create the connection weights w;
- train the connection weights so that the total squared error E is smallest.
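A sketch of the forward pass (3.11), assuming Gaussian hidden units; the parameter shapes are illustrative.

```python
import numpy as np

def rbf_forward(x, V, sigma, W, w0):
    """Forward pass of the RBF network of Section 3.2.
    V: (M, n) centers v^k; sigma: (M,) radii; W: (M, m) weights w_kj;
    w0: (m,) output thresholds. Implements y_j = sum_k w_kj phi_k + w_0j."""
    phi = np.exp(-((V - x) ** 2).sum(axis=1) / sigma ** 2)  # hidden layer phi_k
    return phi @ W + w0                                     # linear output layer
```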
3.3 Algorithm for training Interpolation RBF Networks: HDH and QHDH
a) The two-phase HDH training algorithm
We denote by I the identity matrix of size N; W = (w_1,…,w_N)^T and Z = (z_1,…,z_N)^T are vectors of the N-dimensional space R^N, in which:
ψ_{k,j} = −φ_k(x^j) if j ≠ k; ψ_{k,j} = 0 if k = j, (12)
z_k = y_k − w_0, k = 1,…,N. (13)
The interpolation conditions (3.4) are then equivalent to the system
W = ΨW + Z. (10)
For each k ≤ N, the function q_k of σ_k is defined as:
q_k = Σ_{j=1}^N |ψ_{k,j}|. (14)
The HDH algorithm proceeds as follows.
Given a tolerance ε and positive constants q, α < 1, the algorithm consists of two phases to determine the parameters σ_k and W*. In the first phase, we determine each σ_k so that q_k ≤ q and q_k is as close to q as possible (meaning that if we set σ_k := σ_k/α then q_k > q). The matrix norm of Ψ corresponding to the vector norm ‖u‖* = Σ_{j=1}^N |u_j| is then smaller than q. In the second phase, the approximate solution is found by the method of simple iteration. The algorithm is shown in Figure 1.
Figure 1: The two-phase HDH training algorithm
To determine the solution W* of the system (10), we run the following iterative procedure:
1. First, initialize W^0 = Z;
2. Then compute W^1 = ΨW^0 + Z;
3. If the stopping condition (based on the tolerance ε) is not satisfied, assign W^0 := W^1 and return to step 2.
For each N-dimensional vector u we use the norm ‖u‖* = Σ_{j=1}^N |u_j|; since the norm of Ψ is smaller than q < 1, the iteration is a contraction and converges.
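A minimal sketch of the two phases, assuming Gaussian RBFs with centers at the nodes; the initial radius sigma0 and the exact stopping rule used here are illustrative choices.

```python
import numpy as np

def hdh_train(X, y, q=0.8, alpha=0.9, eps=1e-6, sigma0=1.0):
    """Minimal sketch of the two-phase HDH algorithm."""
    N = len(X)
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # ||x^i - x^k||^2
    off = ~np.eye(N, dtype=bool)

    # Phase 1: shrink each sigma_k by factor alpha until
    # q_k = sum_{j != k} phi_k(x^j) <= q, as in formula (14).
    sigma = np.full(N, sigma0)
    for k in range(N):
        while np.exp(-D2[k, off[k]] / sigma[k] ** 2).sum() > q:
            sigma[k] *= alpha

    # Psi = I - Phi, where Phi[i, k] = phi_k(x^i); zero diagonal
    # because phi_k(x^k) = 1 for the Gaussian.
    Psi = -np.exp(-D2 / sigma[None, :] ** 2) * off

    # Phase 2: simple iteration W <- Psi W + Z, a contraction in the
    # norm ||u||* = sum_j |u_j| since every q_k <= q < 1.
    w0 = y.mean()                       # bias choice; one common option
    Z = y - w0
    W = Z.copy()
    while True:
        W_new = Psi @ W + Z
        if np.abs(W_new - W).sum() * q / (1.0 - q) <= eps:
            return W_new, sigma, w0
        W = W_new
```

The trained network is then evaluated as Φ(x) = w_0 + Σ_k W_k exp(−‖x − x^k‖²/σ_k²).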
b) The QHDH algorithm
If the interpolation nodes are equidistant, they can be expressed in multi-index form x^{(i_1,…,i_n)}, in which
x_k^{(i_1,…,i_n)} = x_k^0 + i_k h_k, k = 1,…,n,
where the h_k (k = 1,…,n) are given constants (the step sizes of the variables x_k), n is the number of dimensions, and i_k runs from 1 to N_k (N_k is the number of nodes in the k-th dimension).
Instead of the Euclidean norm, we examine the norm ‖x‖_A = (x^T A x)^{1/2}, where A is the diagonal matrix
A = diag(1/h_1², …, 1/h_n²),
so that each grid step has unit length. With this norm, a common radius σ is used for every node, and the interpolation function takes the form
Φ(x) = Σ_{i_1,…,i_n} w_{i_1,…,i_n} φ_{i_1,…,i_n}(x) + w_0, where φ_{i_1,…,i_n}(x) = exp(−‖x − x^{(i_1,…,i_n)}‖_A² / σ²).
The matrix Ψ is defined as before:
ψ_{(i_1,…,i_n),(j_1,…,j_n)} = −φ_{i_1,…,i_n}(x^{(j_1,…,j_n)}) for (j_1,…,j_n) ≠ (i_1,…,i_n), and 0 otherwise,
with
q_{(i_1,…,i_n)} = Σ_{(j_1,…,j_n) ≠ (i_1,…,i_n)} |ψ_{(i_1,…,i_n),(j_1,…,j_n)}|.
Because the nodes are equidistant, ‖x^{(i)} − x^{(j)}‖_A² = Σ_{k=1}^n (i_k − j_k)², so all the q_{(i_1,…,i_n)} are bounded by a common value that can be computed in advance, and the radius σ can be chosen directly so that this bound, and hence the norm of Ψ, is smaller than q.
Subsequently, we apply the method of simple iteration from the second phase of the two-phase algorithm above to determine the output-layer weights. In this way the one-phase QHDH algorithm is formed. With a given positive constant q < 1, this algorithm can be described as follows:
Figure 2: One-phase iterative algorithm for training an RBF network with equidistant nodes
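A sketch of the idea behind QHDH's one-phase shortcut: in the norm ‖·‖_A each grid step has unit length, so ‖x^{(i)} − x^{(j)}‖_A² = Σ_k (i_k − j_k)², and a single radius σ valid for all nodes can be found in advance from an upper bound on the q_{(i)}. This illustrative search shrinks σ iteratively; it stands in for, and is not, the paper's closed formula (18).

```python
import numpy as np
from itertools import product

def qhdh_sigma(N_per_dim, q=0.8, alpha=0.9, sigma0=1.0):
    """Sketch: choose one radius sigma for an equidistant grid so that an
    upper bound (valid for every node) on q_(i) is at most q."""
    offsets = [d for d in product(*(range(-(Nk - 1), Nk) for Nk in N_per_dim))
               if any(d)]                                 # integer offsets j - i != 0
    d2 = np.array([sum(c * c for c in d) for d in offsets], dtype=float)
    sigma = sigma0
    while np.exp(-d2 / sigma ** 2).sum() > q:             # common bound on q_(i)
        sigma *= alpha
    return sigma
```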
3.4 The new function approximation method
We return to the problem of approximating a multivariate function f: D(⊂R^n) → R given the set T = {(x^j, y^j)}_{j=1}^N. The method for creating an RBF network approximating this function is as follows:
Step 1. Choose a natural number k > n and a grid of evenly spaced nodes.
Step 2. Apply the k-NN method mentioned in 2.2 to determine the approximate value of f(z^k) at each grid node z^k.
Step 3. Apply QHDH to train the RBF network.
In this way we obtain an RBF network that approximates f on D. The procedure is demonstrated in Figure 3 (a runnable end-to-end sketch follows the procedure listings):
Procedure One-phase iterative training algorithm
  Determine σ_{i_1,…,i_n} as in formula (18);
  Find W* by the method of simple iteration (mentioned in Section 2);
End
Procedure RBF network construction for function approximation
  Choose k and an equidistant grid of nodes {z^j}_{j=1}^M on B;
  Calculate the approximate values of f at the grid nodes by k-NN regression;
  Train the RBF network by QHDH;
End
Figure 3: One-phase iterative algorithm for training an RBF network with equidistant nodes
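An end-to-end sketch of Steps 1-3, reusing the knn_regress, build_grid_data and hdh_train sketches defined earlier. Using hdh_train in place of QHDH is an assumption (a faithful QHDH would fix σ in advance from the grid), and the test function and all parameters are illustrative.

```python
import numpy as np
# Requires knn_regress, build_grid_data and hdh_train from the sketches above.

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(2000, 2))                # scattered noisy samples
y = np.sin(np.pi * X[:, 0]) * X[:, 1] + rng.normal(0, 0.1, 2000)

grid, z = build_grid_data(X, y, m=15, k=10)               # Steps 1-2: denoised grid
W, sigma, w0 = hdh_train(grid, z, q=0.8)                  # Step 3: train the network

def predict(x):                                           # Phi(x) on the trained net
    phi = np.exp(-((grid - x) ** 2).sum(axis=1) / sigma ** 2)
    return w0 + W @ phi

print(predict(np.array([0.3, -0.5])), np.sin(np.pi * 0.3) * -0.5)
```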
4 Experimental Results
We carried out experiments comparing the approximation error on data taken from the website:
http://www.liaad.up.pt/~ltorgo/Regression/cal_housing.html
The network is trained and the errors are compared. The effectiveness of the algorithm is compared with the GGAP method of Guang-Bin Huang and colleagues, and the experiments demonstrate that it performs better than the other methods.
The test program was written in C++ with Visual Studio 2010 and run on Windows 7 (build 7601) 32-bit, with 2 GB RAM and an Intel Core 2 Duo T7300 2.0 GHz CPU.
4.1 The selection of grid size M
We collected information on the variables using all the block groups in California from the 1990 Census. In this sample, a block group on average includes 1425.5 individuals living in a geographically compact area. Naturally, the geographical area included varies inversely with the population density. We computed distances among the centroids of each block group as measured in latitude and longitude. We excluded all the block groups reporting zero entries for the independent and dependent variables. The final data contained 20,640 observations on 9 variables. The dependent variable is ln(median house value).
The experimental results in Table 1 show that:
1) When the number of grid nodes M is small, the error is significantly larger; as the grid becomes denser (M larger), the error improves. However, increasing the number of grid nodes beyond a certain point does not improve the error much further.
2) In the noise-free case the approximation is better.
Table 1: Approximation error of the network as M changes, with N = 20640 and fixed k, for grid sizes M = 30×30 = 900, M = 40×40 = 1600, and M = 50×50 = 2500.
4.2 The selection of k
We experiment with the real data of the previous section and M = 900 grid nodes. The results are shown in Table 2.
Table 2: Approximation error of the network as k changes (average error of the network for different values of k).
The experiments show that as k increases, the noise-reduction capability increases but the error also grows, since far-distant points used in computing the grid-node values can affect the generality of the function.
4.3 Comparison with Guang-Bin Huang's GGAP networks
Guang-Bin Huang and colleagues (2005) proposed the GGAP method for training RBF networks to approximate multivariate functions from data with mixed noise; their experimental results show it to be more effective than other commonly used methods (MRAN, MAIC). We compared our trained network with a network trained by the GGAP method, testing on the real data introduced above. The sample size is N = 20,640, the number of grid nodes is M = 30×30 = 900, and the value of k is varied. The experimental results are shown in Table 3 below (errors in bold indicate results better than GGAP).
5 Conclusion
The new method combines k-NN regression and the QHDH algorithm for RBF network training to obtain a multivariate function approximation network; its performance is experimentally demonstrated to be very promising.
In the future we will apply it to real data in pattern recognition problems to test its practical effectiveness.