An Incremental Learning Algorithm Based on Support Vector Domain Classifier

Yinggang Zhao, Qinming He
College of Computer Science, Zhejiang University, Hangzhou 310027, China
Email: ygzl29@163.com
Abstract

Incremental learning techniques are usually used to solve large-scale problems. We first give a modified support vector machine (SVM) classification method, the support vector domain classifier (SVDC), and then propose an incremental learning algorithm based on SVDC. The basic idea of this incremental algorithm is to obtain the initial target concepts using SVDC during the training procedure and then update these target concepts with an updating model. Different from existing incremental learning approaches, in our algorithm the model-updating procedure amounts to solving a quadratic programming (QP) problem, and the updated model still owns the property of sparse solutions. Compared with other existing incremental learning algorithms, the inverse procedure of our algorithm (i.e., decreasing learning) is easy to conduct without extra computation. Experimental results show that our algorithm is effective and feasible.

Keywords: Support Vector Machines, Support Vector Domain Classifier, Incremental Learning, Classification

1 Introduction

With large amounts of data available to the machine learning community, the need to design techniques that scale well is more critical than before. As some data may be collected over long periods, there is also a continuous need to incorporate the new data into the previously learned concept. Incremental learning techniques can satisfy the need for both scalability and concept updating.

The support vector machine (SVM) is based on statistical learning theory, which has developed over the last three decades [1,2]. It has been proven very successful in many applications [3,4,5,6]. SVM is a supervised binary classifier: when we train on samples using SVM, the categories of the samples need to be known. However, in many cases it is rare that we can obtain data whose categories are all known; in other words, most of the obtained data's categories are unknown. In this situation the traditional SVM is not appropriate. Tax et al. proposed a method for data domain description called support vector domain description (SVDD) [7], which is used to describe a data domain and delete outliers. The key idea of SVDD is to describe one class of data by finding a sphere with minimum volume that contains this class of data.

The SVDD algorithm gives us an insight: when we classify a binary-class dataset, if we only know part of the samples' categories (for example, the samples with category label y_i = 1) while the categories of the remaining samples are unknown, then we can design a new type of classifier based on SVDD, named the support vector domain classifier (SVDC). This new classifier only needs to describe the data with known category, obtaining the description boundary of this class of data. Finally, we can classify the unknown binary-class data according to the obtained boundary.

In this paper our incremental learning algorithm is based on SVDC, and it is motivated by the way a person learns. When learning a complicated concept, people usually obtain an initial concept using part of the available information and then update the obtained concept by utilizing new information. In terms of our incremental algorithm based on SVDC, it first utilizes part of the data (as memory space permits) and obtains a concept (namely the parameters of the obtained decision hypersurface) with the SVDC learning algorithm; then, in each step of incremental learning, it updates the parameters of the decision hypersurface gained in the last step using a specialized updating model, namely updating the known concept.

Our algorithm owns the following characteristics:
1) It has a mathematical form similar to the standard SVDC algorithm, so any algorithm used to obtain the standard SVDC can also be used to obtain the updating model of our algorithm;
2) The inverse procedure of this algorithm, i.e., the decreasing learning procedure, is easy to conduct: if the generalization performance drops during the incremental process, we can easily return to the last step without extra computation.

The experimental results show that the learning performance of this algorithm approaches that of batch training, and that it performs well on large-scale datasets compared to other SVDC incremental learning algorithms.

The rest of this paper is organized as follows. In Section 2 we give an introduction to SVDC, and in Section 3 we present our incremental algorithm. Experiments and results concerning the proposed algorithm are offered in Section 4. Section 5 collects the main conclusions.
2 Support Vector Domain Classifier

2.1 Support Vector Domain Description [7]

For a data set containing N data objects, \{x_i, i = 1, \ldots, N\}, a description is required. We try to find a closed and compact sphere area \Omega with minimum volume which contains all (or most of) the needed objects, while the outliers lie outside \Omega. Figure 1 shows a sketch of support vector domain description (SVDD).

[Figure 1. Sketch of SVDD: a spherical classification boundary separating the target objects from outliers.]

This description is very sensitive to the most outlying object among the target objects. When one or a few very remote objects are in the training set, a very large sphere is obtained which does not represent the data well. Therefore, we allow some data points to lie outside the sphere and introduce slack variables \xi_i:

    \min R^2 + C \sum_i \xi_i    (1)

where C is a penalty constant giving the trade-off between the volume of the sphere and the number of errors (the number of target objects rejected), subject to the constraints

    (x_i - a)^T (x_i - a) \le R^2 + \xi_i, \quad \forall i, \quad \xi_i \ge 0.    (2)

Incorporating these constraints into (1), we construct the Lagrangian

    L(R, a, \alpha, \xi) = R^2 + C \sum_i \xi_i - \sum_i \alpha_i \{R^2 + \xi_i - (x_i \cdot x_i - 2 a \cdot x_i + a \cdot a)\} - \sum_i \beta_i \xi_i    (3)

with Lagrange multipliers \alpha_i \ge 0, \beta_i \ge 0. Finding the minimum of formula (3) can be transformed into finding the maximum of its dual problem

    L(\alpha) = \sum_i \alpha_i K(x_i, x_i) - \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j)    (4)

with constraints \sum_i \alpha_i = 1 and 0 \le \alpha_i \le C, where the inner product has been replaced by a kernel function K(\cdot, \cdot) satisfying the Mercer condition; a popular choice is the Gaussian kernel K(x, z) = \exp(-\|x - z\|^2 / (2\sigma^2)), \sigma > 0.

To determine whether a test point z is within the sphere, the distance to the center of the sphere has to be calculated. A test object z is accepted when this distance is smaller than the radius, i.e., when (z - a)^T (z - a) \le R^2. Expressing the center of the sphere in terms of the support vectors, this becomes

    \|z - a\|^2 = K(z, z) - 2 \sum_i \alpha_i K(x_i, z) + \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j) \le R^2.    (5)
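To make the SVDD training procedure concrete, the following is a minimal sketch (not the paper's implementation) of solving the dual problem (4) with the Gaussian kernel. It uses the general-purpose solver scipy.optimize.minimize for the QP, whereas a dedicated QP solver would normally be preferred; the names gaussian_kernel and svdd_fit are our own.

    import numpy as np
    from scipy.optimize import minimize

    def gaussian_kernel(X, Z, sigma=1.0):
        # K(x, z) = exp(-||x - z||^2 / (2 sigma^2)), the kernel from Section 2.1
        d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    def svdd_fit(X, C=1.0, sigma=1.0):
        """Solve the SVDD dual (4):
        max_a  sum_i a_i K(x_i, x_i) - sum_ij a_i a_j K(x_i, x_j)
        s.t.   sum_i a_i = 1,  0 <= a_i <= C."""
        N = len(X)
        K = gaussian_kernel(X, X, sigma)
        diagK = np.diag(K)

        def neg_dual(a):                   # minimize the negated dual objective
            return -(a @ diagK - a @ K @ a)

        res = minimize(neg_dual, np.full(N, 1.0 / N), method='SLSQP',
                       bounds=[(0.0, C)] * N,
                       constraints=[{'type': 'eq', 'fun': lambda a: a.sum() - 1.0}])
        alpha = res.x
        # Squared radius evaluated at an unbounded support vector (0 < a_k < C),
        # using the distance expression from (5)
        k = int(np.argmax((alpha > 1e-6) & (alpha < C - 1e-6)))
        R2 = K[k, k] - 2.0 * alpha @ K[:, k] + alpha @ K @ alpha
        return alpha, R2

A test object z would then be accepted exactly when condition (5) holds for the returned alpha and R2.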
2.2 Support Vector Domain Classifier

Consider the SVDC situation: a training set of instance-label pairs (x_i, y_i), i = 1, 2, \ldots, l, l+1, \ldots, N, where x_i \in R^n and y_i \in \{+1, -1\}. We construct a hyper-sphere for the samples with y_i = 1, while the samples with y_i = -1 are not considered. We then get the following quadratic optimization problem:

    \min R^2 + C \sum_i \xi_i    (6)

subject to (x_i - a)^T (x_i - a) \le R^2 + \xi_i and \xi_i \ge 0 for all i with y_i = 1, where C is a constant. Similarly, using multipliers \alpha_i \ge 0 and \beta_i \ge 0, we introduce the Lagrangian

    L(R, a, \alpha, \beta) = R^2 + C \sum_i \xi_i - \sum_i \alpha_i y_i \{R^2 + \xi_i - (x_i - a)^T (x_i - a)\} - \sum_i \beta_i \xi_i.    (7)

In formula (7) we set the derivatives with respect to the primal variables R, a, \xi equal to zero and re-substitute, which yields the dual problem

    W(\alpha) = \sum_i \alpha_i y_i K(x_i, x_i) - \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j)    (8)

subject to \sum_i \alpha_i y_i = 1 and 0 \le \alpha_i \le C.
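Since all samples entering problem (8) carry label y_i = 1, the SVDC dual coincides with the SVDD dual (4) applied to the positive class alone. A hedged sketch, reusing svdd_fit from above (the name svdc_fit is ours, not the paper's):

    def svdc_fit(X, y, C=1.0, sigma=1.0):
        """SVDC training per Section 2.2: describe only the y_i = +1 samples
        with a minimum-volume sphere; y_i = -1 samples are not considered."""
        X_pos = X[y == 1]                  # the sphere encloses the +1 class only
        alpha, R2 = svdd_fit(X_pos, C=C, sigma=sigma)
        return X_pos, alpha, R2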
Then we can design the binary-class sphere-structured SVM classifier

    f(x) = \mathrm{sgn}\big(R^2 - K(x, x) + 2 \sum_i \alpha_i y_i K(x_i, x) - \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j)\big),    (9)

where the squared radius is obtained from any support vector x_k:

    R^2 = K(x_k, x_k) - 2 \sum_i \alpha_i y_i K(x_k, x_i) + \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j).    (10)

In formula (10), x_k represents a support vector. If f(x) > 0, the tested sample is contained in the sphere, and we regard the samples enclosed in the sphere as same-class objects; otherwise the sample is rejected and regarded as belonging to the other class.
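Classification with the learned sphere then reduces to evaluating decision function (9). A minimal sketch under the same assumptions as above (illustrative names; with the Gaussian kernel, K(z, z) = 1):

    def svdc_predict(z, X_pos, alpha, R2, sigma=1.0):
        """Decision function (9): +1 if z falls inside the learned sphere,
        -1 if it is rejected."""
        z = np.atleast_2d(z)
        k_z = gaussian_kernel(X_pos, z, sigma)[:, 0]
        K = gaussian_kernel(X_pos, X_pos, sigma)
        dist2 = 1.0 - 2.0 * alpha @ k_z + alpha @ K @ alpha   # ||z - a||^2 as in (5)
        return 1 if R2 - dist2 > 0 else -1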
3 SVDC Incremental Learning Algorithm
According to formula (6), suppose the initial parameter (the sphere radius) obtained by learning on the initial training set is R_0. It becomes R_k after the k-th incremental learning step, the set of support vectors becomes SV_k, and the new dataset in the k-th step is D_k = \{(x_i^k, y_i^k)\}_{i=1}^{l_k}.

Our incremental algorithm can be described as follows. Assume we have obtained R_{k-1}; we update the current model using SV_{k-1} and the new dataset \{(x_i^k, y_i^k)\}_{i=1}^{l_k} by solving the following optimization problem:

    \min R_k^2 - R_{k-1}^2 + C \sum_i \xi_i    (11)

subject to the same type of constraints as in (6), where R_{k-1} is the radius from the last incremental step; when k = 1, R_0 is the radius of the standard SVDC. Obviously, when R_{k-1} = 0, the incremental SVDC has the same form as the standard SVDC. We will find that the model updated by the incremental SVDC also owns the property of solution sparsity possessed by the standard SVDC.

In order to solve (11), we transform it into its dual problem and introduce the Lagrangian

    L = R_k^2 - R_{k-1}^2 + C \sum_i \xi_i - \sum_i \alpha_i y_i \{R_k^2 + \xi_i - (x_i - a)^T (x_i - a)\} - \sum_i \beta_i \xi_i,    (12)

where \alpha_i \ge 0, \beta_i \ge 0 (i = 1, \ldots, l_k) are the Lagrange multipliers. According to the optimality conditions, for a support vector x_k we can obtain

    R_k^2 = R_{k-1}^2 + K(x_k, x_k) - 2 \sum_i \alpha_i y_i K(x_k, x_i) + \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j).    (13)

Finally we obtain the following decision function:

    f_k(x) = \mathrm{sgn}\big(R_k^2 - K(x, x) + 2 \sum_{x_i \in SV_k} \alpha_i y_i K(x_i, x) - \sum_{x_i, x_j \in SV_k} \alpha_i \alpha_j y_i y_j K(x_i, x_j)\big).    (14)

From equation (14) we can see that it is easy to return to the last step of incremental learning without extra computation, since R_{k-1}^2 enters the updated radius (13) additively. From the above analysis we can see that only a trifling modification of the standard SVDC is needed to solve the updated model in the incremental learning procedure.

Now we summarize our algorithm as follows:

Step 1. Learning the initial concept: train SVDC using the initial dataset TS_0, obtaining the parameter R_0;

Step 2. Updating the current concept: when new data are available, use them together with SV_{k-1} to solve the QP problem (11).
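Putting Steps 1 and 2 together, the incremental procedure can be sketched as a driver loop. This is one plausible reading of the updating model, not the authors' code: problem (11) is treated as a warm re-solve over the retained support vectors plus the new batch, and a history list supports the decreasing-learning (recoverability) property.

    def incremental_svdc(batches, C=1.0, sigma=1.0, sv_tol=1e-6):
        """batches: list of (X, y) arrays arriving over time.
        Step 1: learn the initial concept R_0 from the first batch.
        Step 2: for each new batch, solve the QP again over SV_{k-1}
        plus the new positive samples, as in problem (11)."""
        X0, y0 = batches[0]
        X_pos, alpha, R2 = svdc_fit(X0, y0, C=C, sigma=sigma)
        history = [(X_pos, alpha, R2)]     # enables returning to any earlier step
        for X_new, y_new in batches[1:]:
            keep = alpha > sv_tol          # sparsity: only support vectors matter
            X_pos = np.vstack([X_pos[keep], X_new[y_new == 1]])
            alpha, R2 = svdd_fit(X_pos, C=C, sigma=sigma)
            history.append((X_pos, alpha, R2))
        return X_pos, alpha, R2, history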
4 Experiments and Results

In order to evaluate the learning performance offered by our incremental algorithm, we conducted experiments on six different datasets taken from the UCI Machine Learning Repository: Banana, Diabetes, Flare-Solar, Heart, Breast-Cancer, and German. Note that some of them are not binary-class classification problems, but we have transformed them into binary-class problems by a special technique. The datasets and experiment parameters are shown in Table 1. For notational simplicity, our algorithm is abbreviated as Our ISVM in Figure 2.

In addition to conducting experiments with our algorithm, we also implemented and tested another popular and effective incremental learning algorithm, ISVM [8][9], on the same datasets so that we could compare their learning performance. In our experiments we chose the RBF kernel K(x, y) = \exp(-\|x - y\|^2 / (2\sigma^2)), and the kernel width \sigma was not fixed. The experiments were implemented in MATLAB, and the software and hardware environment was an Intel P4 PC (1.4 GHz CPU, 256 MB RAM).

[Table 1. Datasets and experiment parameters (#TRS, #TES, #ATT, and C for each dataset).]

In Table 1, #TRS represents the number of training samples, #TES the number of testing samples, and #ATT the number of attributes; C is the penalty constant.
Literature [8] points out that an efficient incremental learning algorithm should satisfy the following three requirements:

A. Stability: when each step of incremental learning is over, the prediction accuracy on the test set should not vary drastically;

B. Improvement: as the incremental learning proceeds, the algorithm's prediction accuracy should gradually improve;

C. Recoverability: the incremental learning algorithm should own the ability of performance recovery; if the performance of the algorithm descends after a certain learning step, it should be able to return to an earlier step.

These requirements are used below to compare the different incremental learning algorithms.
[Fig. 2. Performance of the two incremental learning algorithms (ISVM vs. Our ISVM): prediction accuracy against the incremental learning step, with panels including (a) Banana, Breast-Cancer, (c) Flare-Solar, and Heart.]
From Figure 2 we can see that after each step of incremental training, the variation of the prediction accuracy on the test set is small, which satisfies the stability requirement. We can also see that the accuracy improves gradually, and that the algorithm owns the ability of performance recoverability. So the incremental algorithm proposed in this paper meets the demands on an efficient incremental learning algorithm.

The experimental results show that our algorithm has learning performance similar to the popular ISVM algorithm presented in [9]. Another discovery in our experiments is that, as the incremental learning proceeds, the improvement in learning performance becomes smaller and smaller, until finally the learning performance no longer improves. This indicates that we can estimate the number of samples needed for the problem description by exploiting this characteristic, as sketched below.
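As an illustration of that last observation, a stopping rule could feed batches until the accuracy gain per step falls below a threshold; the sample count consumed up to that point estimates how many samples the problem description needs. A sketch under the same assumptions as the earlier code (eval_acc and eps are hypothetical names, not from the paper):

    def estimate_sample_need(batches, eval_acc, eps=0.005, C=1.0, sigma=1.0):
        """Stop incremental learning once test accuracy stops improving by
        at least eps; eval_acc(X_pos, alpha, R2) -> accuracy on a test set."""
        X0, y0 = batches[0]
        X_pos, alpha, R2 = svdc_fit(X0, y0, C=C, sigma=sigma)
        n_used, prev = len(X0), eval_acc(X_pos, alpha, R2)
        for X_new, y_new in batches[1:]:
            keep = alpha > 1e-6
            X_pos = np.vstack([X_pos[keep], X_new[y_new == 1]])
            alpha, R2 = svdd_fit(X_pos, C=C, sigma=sigma)
            n_used += len(X_new)
            acc = eval_acc(X_pos, alpha, R2)
            if acc - prev < eps:           # learning curve has flattened
                break
            prev = acc
        return n_used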
5 Conclusion
In this paper we proposed an incremental learning algorithm based on the support vector domain classifier (SVDC). Its key idea is to obtain the initial concept using the standard SVDC and then to refine it with the updating technique presented in this paper, which in fact amounts to solving a QP problem similar to the one solved by the standard SVDC algorithm. Experiments show that our algorithm is effective and promising. Other characteristics of this algorithm include: the updating model has a mathematical form similar to the standard SVDC, and its solutions retain a sparse expression; the algorithm can return to the last step without extra computation; furthermore, it can be used to estimate the number of samples needed for the problem description.
REFERENCES

[1] C. Cortes, V. N. Vapnik: Support vector networks, Machine Learning 20 (1995), pp. 273-297.

[2] V. N. Vapnik: Statistical Learning Theory, Wiley, New York, 1998.

[3] T. Joachims: Text categorization with support vector machines: learning with many relevant features. In: Proceedings of the 10th European Conference on Machine Learning, Springer, Berlin, 1998, pp. 137-142.

[4] S. Tong, E. Chang: Support vector machine active learning for image retrieval. In: Proceedings of the ACM Conference on Multimedia, 2000, pp. 107-118.

[5] Y. Deng et al.: A new method in data mining support ...

[6] L. Baoqing: Distance-based selection of potential support vectors by kernel matrix. In: International Symposium on Neural Networks ...

[7] D. Tax: One-class classification. Ph.D. thesis, Delft University of Technology, http://www.ph.tn.tudelft.nl/~davidt/thesis.pdf (2001).

[8] N. A. Syed, H. Liu, K. Sung: From incremental learning to model independent instance selection - a support vector machine approach. Technical Report TRA9/99, National University of Singapore, 1999.

[9] L. Yangguang, C. Qi, T. Yongchuan et al.: Incremental updating method for support vector machine. In: APWeb 2004, LNCS 3007.

[10] S. R. Gunn: Support vector machines for classification and regression. Technical Report, Image Speech and Intelligent Systems Research Group, University of Southampton, 1997.