An Incremental Learning Algorithm Based on Support Vector Domain Classifier

Yinggang Zhao, Qinming He
College of Computer Science, Zhejiang University, Hangzhou 310027, China
Email: ygzl29@163.com
Abstract

Incremental learning techniques are usually used to solve large-scale problems. We first give a modified support vector machine (SVM) classification method, the support vector domain classifier (SVDC), and then propose an incremental learning algorithm based on SVDC. The basic idea of this incremental algorithm is to obtain the initial target concepts using SVDC during the training procedure and then update these target concepts with an updating model. Different from existing incremental learning approaches, in our algorithm the model-updating procedure amounts to solving a quadratic programming (QP) problem, and the updated model still owns the property of sparse solutions. Compared with other existing incremental learning algorithms, the inverse procedure of our algorithm (i.e., decreasing learning) is easy to conduct without extra computation. Experimental results show that our algorithm is effective and feasible.

Keywords: Support Vector Machines, Support Vector Domain Classifier, Incremental Learning, Classification

1 Introduction

With large amounts of data available to the machine learning community, the need to design techniques that scale well is more critical than before. As some data may be collected over long periods, there is also a continuous need to incorporate the new data into the previously learned concept. Incremental learning techniques can satisfy the need for both scalability and concept updating.

The support vector machine (SVM) is based on statistical learning theory, which has developed over the last three decades [1,2]. It has been proven very successful in many applications [3,4,5,6]. SVM is a supervised binary classifier: when we train on samples using SVM, the categories of the samples need to be known. However, in many cases it is rare that we can obtain data whose categories are all known; in other words, most of the obtained data's categories are unknown. In this situation the traditional SVM is not appropriate. Tax et al. proposed a method for data domain description called support vector domain description (SVDD) [7], which is used to describe a data domain and delete outliers. The key idea of SVDD is to describe one class of data by finding a sphere with minimum volume that contains this class of data.

The SVDD algorithm gives us an insight: when we classify a binary-class dataset, if we only know part of the samples' categories (for example, the samples with category label y_i = 1) while the categories of the remaining samples are unknown, then we can design a new type of classifier based on SVDD, named the support vector domain classifier (SVDC). This new classifier only needs to describe the data with known category, obtaining the description boundary of this class of data. Finally, we can classify the unknown binary-class data according to the obtained boundary.

In this paper our incremental learning algorithm is based on SVDC, and it is motivated by the way a person learns. When learning a complicated concept, people usually obtain an initial concept using part of the available information and then update the obtained concept by utilizing new information. In terms of our incremental algorithm based on SVDC, it first utilizes part of the data (as memory space permits) and obtains a concept (namely the parameters of the obtained decision hypersurface) with the SVDC learning algorithm; then, in each step of incremental learning, it updates the parameters of the decision hypersurface gained in the last step using a specialized updating model, namely updating the known concept.

Our algorithm owns the following characteristics:
1) It has a mathematical form similar to the standard SVDC algorithm, so any algorithm used to obtain the standard SVDC can also be used to obtain the updating model of our algorithm;
2) The inverse procedure of this algorithm, i.e., the decreasing learning procedure, is easy to conduct: if the generalization performance drops during the incremental process, we can easily return to the last step without extra computation.

The experimental results show that the learning performance of this algorithm approaches that of batch training, and that it performs well on large-scale datasets compared to other SVDC incremental learning algorithms.

The rest of this paper is organized as follows. In Section 2 we give an introduction to SVDC, and in Section 3 we present our incremental algorithm. Experiments and results concerning the proposed algorithm are offered in Section 4. Section 5 collects the main conclusions.
2 Support Vector Domain Classifier

2.1 Support Vector Domain Description [7]

For a data set containing N data objects, \{x_i, i = 1, \ldots, N\}, a description is required. We try to find a closed and compact sphere area \Omega with minimum volume which contains all (or most of) the needed objects, while the outliers lie outside \Omega. Figure 1 shows a sketch of support vector domain description (SVDD).

[Figure 1. Sketch of SVDD: a spherical classification boundary separating the target objects from outliers.]

This description is very sensitive to the most outlying object among the target objects. When one or a few very remote objects are in the training set, a very large sphere is obtained which does not represent the data well. Therefore, we allow some data points to lie outside the sphere and introduce slack variables \xi_i:

    \min R^2 + C \sum_i \xi_i    (1)

where C is a penalty constant giving the trade-off between the volume of the sphere and the number of errors (the number of target objects rejected), subject to the constraints

    (x_i - a)^T (x_i - a) \le R^2 + \xi_i, \quad \forall i, \quad \xi_i \ge 0.    (2)

Incorporating these constraints into (1), we construct the Lagrangian

    L(R, a, \alpha, \xi) = R^2 + C \sum_i \xi_i - \sum_i \alpha_i \{R^2 + \xi_i - (x_i \cdot x_i - 2 a \cdot x_i + a \cdot a)\} - \sum_i \beta_i \xi_i    (3)

with Lagrange multipliers \alpha_i \ge 0, \beta_i \ge 0. Finding the minimum of formula (3) can be transformed into finding the maximum of its dual problem

    L(\alpha) = \sum_i \alpha_i K(x_i, x_i) - \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j)    (4)

with constraints \sum_i \alpha_i = 1 and 0 \le \alpha_i \le C, where the inner product has been replaced by a kernel function K(\cdot, \cdot) satisfying the Mercer condition; a popular choice is the Gaussian kernel K(x, z) = \exp(-\|x - z\|^2 / (2\sigma^2)), \sigma > 0.

To determine whether a test point z is within the sphere, the distance to the center of the sphere has to be calculated. A test object z is accepted when this distance is smaller than the radius, i.e., when (z - a)^T (z - a) \le R^2. Expressing the center of the sphere in terms of the support vectors, this becomes

    \|z - a\|^2 = K(z, z) - 2 \sum_i \alpha_i K(x_i, z) + \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j) \le R^2.    (5)
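To make the SVDD training procedure concrete, the following is a minimal sketch (not the paper's implementation) of solving the dual problem (4) with the Gaussian kernel. It uses the general-purpose solver scipy.optimize.minimize for the QP, whereas a dedicated QP solver would normally be preferred; the names gaussian_kernel and svdd_fit are our own.

    import numpy as np
    from scipy.optimize import minimize

    def gaussian_kernel(X, Z, sigma=1.0):
        # K(x, z) = exp(-||x - z||^2 / (2 sigma^2)), the kernel from Section 2.1
        d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    def svdd_fit(X, C=1.0, sigma=1.0):
        """Solve the SVDD dual (4):
        max_a  sum_i a_i K(x_i, x_i) - sum_ij a_i a_j K(x_i, x_j)
        s.t.   sum_i a_i = 1,  0 <= a_i <= C."""
        N = len(X)
        K = gaussian_kernel(X, X, sigma)
        diagK = np.diag(K)

        def neg_dual(a):                   # minimize the negated dual objective
            return -(a @ diagK - a @ K @ a)

        res = minimize(neg_dual, np.full(N, 1.0 / N), method='SLSQP',
                       bounds=[(0.0, C)] * N,
                       constraints=[{'type': 'eq', 'fun': lambda a: a.sum() - 1.0}])
        alpha = res.x
        # Squared radius evaluated at an unbounded support vector (0 < a_k < C),
        # using the distance expression from (5)
        k = int(np.argmax((alpha > 1e-6) & (alpha < C - 1e-6)))
        R2 = K[k, k] - 2.0 * alpha @ K[:, k] + alpha @ K @ alpha
        return alpha, R2

A test object z would then be accepted exactly when condition (5) holds for the returned alpha and R2.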
2.2 Support Vector Domain Classifier

Consider the SVDC situation: a training set of instance-label pairs (x_i, y_i), i = 1, 2, \ldots, l, l+1, \ldots, N, where x_i \in R^n and y_i \in \{+1, -1\}. We construct a hyper-sphere for the samples with y_i = 1, while the samples with y_i = -1 are not considered. We then get the following quadratic optimization problem:

    \min R^2 + C \sum_i \xi_i    (6)

subject to (x_i - a)^T (x_i - a) \le R^2 + \xi_i and \xi_i \ge 0 for all i with y_i = 1, where C is a constant. Similarly, using multipliers \alpha_i \ge 0 and \beta_i \ge 0, we introduce the Lagrangian

    L(R, a, \alpha, \beta) = R^2 + C \sum_i \xi_i - \sum_i \alpha_i y_i \{R^2 + \xi_i - (x_i - a)^T (x_i - a)\} - \sum_i \beta_i \xi_i.    (7)

In formula (7) we set the derivatives with respect to the primal variables R, a, \xi equal to zero and re-substitute, which yields the dual problem

    W(\alpha) = \sum_i \alpha_i y_i K(x_i, x_i) - \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j)    (8)

subject to \sum_i \alpha_i y_i = 1 and 0 \le \alpha_i \le C.
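Since all samples entering problem (8) carry label y_i = 1, the SVDC dual coincides with the SVDD dual (4) applied to the positive class alone. A hedged sketch, reusing svdd_fit from above (the name svdc_fit is ours, not the paper's):

    def svdc_fit(X, y, C=1.0, sigma=1.0):
        """SVDC training per Section 2.2: describe only the y_i = +1 samples
        with a minimum-volume sphere; y_i = -1 samples are not considered."""
        X_pos = X[y == 1]                  # the sphere encloses the +1 class only
        alpha, R2 = svdd_fit(X_pos, C=C, sigma=sigma)
        return X_pos, alpha, R2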
Then we can design the binary-class sphere-structured SVM classifier

    f(x) = \mathrm{sgn}\big(R^2 - K(x, x) + 2 \sum_i \alpha_i y_i K(x_i, x) - \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j)\big),    (9)

where the squared radius is obtained from any support vector x_k:

    R^2 = K(x_k, x_k) - 2 \sum_i \alpha_i y_i K(x_k, x_i) + \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j).    (10)

In formula (10), x_k represents a support vector. If f(x) > 0, the tested sample is contained in the sphere, and we regard the samples enclosed in the sphere as same-class objects; otherwise the sample is rejected and regarded as belonging to the other class.
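Classification with the learned sphere then reduces to evaluating decision function (9). A minimal sketch under the same assumptions as above (illustrative names; with the Gaussian kernel, K(z, z) = 1):

    def svdc_predict(z, X_pos, alpha, R2, sigma=1.0):
        """Decision function (9): +1 if z falls inside the learned sphere,
        -1 if it is rejected."""
        z = np.atleast_2d(z)
        k_z = gaussian_kernel(X_pos, z, sigma)[:, 0]
        K = gaussian_kernel(X_pos, X_pos, sigma)
        dist2 = 1.0 - 2.0 * alpha @ k_z + alpha @ K @ alpha   # ||z - a||^2 as in (5)
        return 1 if R2 - dist2 > 0 else -1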
3 SVDC Incremental Learning Algorithm
According to formula (6), suppose the initial parameter (the sphere radius) obtained by learning on the initial training set is R_0. It becomes R_k after the k-th incremental learning step, the set of support vectors becomes SV_k, and the new dataset in the k-th step is D_k = \{(x_i^k, y_i^k)\}_{i=1}^{l_k}.

Our incremental algorithm can be described as follows. Assume we have obtained R_{k-1}; we update the current model using SV_{k-1} and the new dataset \{(x_i^k, y_i^k)\}_{i=1}^{l_k} by solving the following optimization problem:

    \min R_k^2 - R_{k-1}^2 + C \sum_i \xi_i    (11)

subject to the same type of constraints as in (6), where R_{k-1} is the radius from the last incremental step; when k = 1, R_0 is the radius of the standard SVDC. Obviously, when R_{k-1} = 0, the incremental SVDC has the same form as the standard SVDC. We will find that the model updated by the incremental SVDC also owns the property of solution sparsity possessed by the standard SVDC.

In order to solve (11), we transform it into its dual problem and introduce the Lagrangian

    L = R_k^2 - R_{k-1}^2 + C \sum_i \xi_i - \sum_i \alpha_i y_i \{R_k^2 + \xi_i - (x_i - a)^T (x_i - a)\} - \sum_i \beta_i \xi_i,    (12)

where \alpha_i \ge 0, \beta_i \ge 0 (i = 1, \ldots, l_k) are the Lagrange multipliers. According to the optimality conditions, for a support vector x_k we can obtain

    R_k^2 = R_{k-1}^2 + K(x_k, x_k) - 2 \sum_i \alpha_i y_i K(x_k, x_i) + \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j).    (13)

Finally we obtain the following decision function:

    f_k(x) = \mathrm{sgn}\big(R_k^2 - K(x, x) + 2 \sum_{x_i \in SV_k} \alpha_i y_i K(x_i, x) - \sum_{x_i, x_j \in SV_k} \alpha_i \alpha_j y_i y_j K(x_i, x_j)\big).    (14)

From equation (14) we can see that it is easy to return to the last step of incremental learning without extra computation, since R_{k-1}^2 enters the updated radius (13) additively. From the above analysis we can see that only a trifling modification of the standard SVDC is needed to solve the updated model in the incremental learning procedure.

Now we summarize our algorithm as follows:

Step 1. Learning the initial concept: train SVDC using the initial dataset TS_0, obtaining the parameter R_0;

Step 2. Updating the current concept: when new data are available, use them together with SV_{k-1} to solve the QP problem (11).
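Putting Steps 1 and 2 together, the incremental procedure can be sketched as a driver loop. This is one plausible reading of the updating model, not the authors' code: problem (11) is treated as a warm re-solve over the retained support vectors plus the new batch, and a history list supports the decreasing-learning (recoverability) property.

    def incremental_svdc(batches, C=1.0, sigma=1.0, sv_tol=1e-6):
        """batches: list of (X, y) arrays arriving over time.
        Step 1: learn the initial concept R_0 from the first batch.
        Step 2: for each new batch, solve the QP again over SV_{k-1}
        plus the new positive samples, as in problem (11)."""
        X0, y0 = batches[0]
        X_pos, alpha, R2 = svdc_fit(X0, y0, C=C, sigma=sigma)
        history = [(X_pos, alpha, R2)]     # enables returning to any earlier step
        for X_new, y_new in batches[1:]:
            keep = alpha > sv_tol          # sparsity: only support vectors matter
            X_pos = np.vstack([X_pos[keep], X_new[y_new == 1]])
            alpha, R2 = svdd_fit(X_pos, C=C, sigma=sigma)
            history.append((X_pos, alpha, R2))
        return X_pos, alpha, R2, history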
4 Experiments and Results

In order to evaluate the learning performance offered by our incremental algorithm, we conducted experiments on six different datasets taken from the UCI Machine Learning Repository: Banana, Diabetes, Flare-Solar, Heart, Breast-Cancer, and German. Note that some of them are not binary-class classification problems, but we have transformed them into binary-class problems by a special technique. The datasets and experiment parameters are shown in Table 1. For notational simplicity, our algorithm is abbreviated as Our ISVM in Figure 2.

In addition to conducting experiments with our algorithm, we also implemented and tested another popular and effective incremental learning algorithm, ISVM [8][9], on the same datasets so that we could compare their learning performance. In our experiments we chose the RBF kernel K(x, y) = \exp(-\|x - y\|^2 / (2\sigma^2)), and the kernel width \sigma was not fixed. The experiments were implemented in MATLAB, and the software and hardware environment was an Intel P4 PC (1.4 GHz CPU, 256 MB RAM).

[Table 1. Datasets and experiment parameters (#TRS, #TES, #ATT, and C for each dataset).]

In Table 1, #TRS represents the number of training samples, #TES the number of testing samples, and #ATT the number of attributes; C is the penalty constant.
Literature [8] points out that an efficient incremental learning algorithm should satisfy the following three requirements:

A. Stability: when each step of incremental learning is over, the prediction accuracy on the test set should not vary drastically;

B. Improvement: as the incremental learning proceeds, the algorithm's prediction accuracy should gradually improve;

C. Recoverability: the incremental learning algorithm should own the ability of performance recovery; if the performance of the algorithm descends after a certain learning step, it should be able to return to an earlier step.

These requirements are used below to compare the different incremental learning algorithms.
[Fig. 2. Performance of the two incremental learning algorithms (ISVM vs. Our ISVM): prediction accuracy against the incremental learning step, with panels including (a) Banana, Breast-Cancer, (c) Flare-Solar, and Heart.]
From Figure 2 we can see that after each step of incremental training, the variation of the prediction accuracy on the test set is small, which satisfies the stability requirement. We can also see that the accuracy improves gradually, and that the algorithm owns the ability of performance recoverability. So the incremental algorithm proposed in this paper meets the demands on an efficient incremental learning algorithm.

The experimental results show that our algorithm has learning performance similar to the popular ISVM algorithm presented in [9]. Another discovery in our experiments is that, as the incremental learning proceeds, the improvement in learning performance becomes smaller and smaller, until finally the learning performance no longer improves. This indicates that we can estimate the number of samples needed for the problem description by exploiting this characteristic, as sketched below.
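As an illustration of that last observation, a stopping rule could feed batches until the accuracy gain per step falls below a threshold; the sample count consumed up to that point estimates how many samples the problem description needs. A sketch under the same assumptions as the earlier code (eval_acc and eps are hypothetical names, not from the paper):

    def estimate_sample_need(batches, eval_acc, eps=0.005, C=1.0, sigma=1.0):
        """Stop incremental learning once test accuracy stops improving by
        at least eps; eval_acc(X_pos, alpha, R2) -> accuracy on a test set."""
        X0, y0 = batches[0]
        X_pos, alpha, R2 = svdc_fit(X0, y0, C=C, sigma=sigma)
        n_used, prev = len(X0), eval_acc(X_pos, alpha, R2)
        for X_new, y_new in batches[1:]:
            keep = alpha > 1e-6
            X_pos = np.vstack([X_pos[keep], X_new[y_new == 1]])
            alpha, R2 = svdd_fit(X_pos, C=C, sigma=sigma)
            n_used += len(X_new)
            acc = eval_acc(X_pos, alpha, R2)
            if acc - prev < eps:           # learning curve has flattened
                break
            prev = acc
        return n_used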
5 Conclusion
In this paper we proposed an incremental learning algorithm based on the support vector domain classifier (SVDC). Its key idea is to obtain the initial concept using the standard SVDC and then to refine it with the updating technique presented in this paper, which in fact amounts to solving a QP problem similar to the one solved by the standard SVDC algorithm. Experiments show that our algorithm is effective and promising. Other characteristics of this algorithm include: the updating model has a mathematical form similar to the standard SVDC, and its solutions retain a sparse expression; the algorithm can return to the last step without extra computation; furthermore, it can be used to estimate the number of samples needed for the problem description.
REFERENCES

[1] C. Cortes, V. N. Vapnik: Support vector networks, Machine Learning 20 (1995), pp. 273-297.

[2] V. N. Vapnik: Statistical Learning Theory, Wiley, New York, 1998.

[3] T. Joachims: Text categorization with support vector machines: learning with many relevant features. In: Proceedings of the 10th European Conference on Machine Learning, Springer, Berlin, 1998, pp. 137-142.

[4] S. Tong, E. Chang: Support vector machine active learning for image retrieval. In: Proceedings of the ACM Conference on Multimedia, 2000, pp. 107-118.

[5] Y. Deng et al.: A new method in data mining support ...

[6] L. Baoqing: Distance-based selection of potential support vectors by kernel matrix. In: International Symposium on Neural Networks ...

[7] D. Tax: One-class classification. Ph.D. thesis, Delft University of Technology, http://www.ph.tn.tudelft.nl/~davidt/thesis.pdf (2001).

[8] N. A. Syed, H. Liu, K. Sung: From incremental learning to model independent instance selection - a support vector machine approach. Technical Report TRA9/99, National University of Singapore, 1999.

[9] L. Yangguang, C. Qi, T. Yongchuan et al.: Incremental updating method for support vector machine. In: APWeb 2004, LNCS 3007.

[10] S. R. Gunn: Support vector machines for classification and regression. Technical Report, Image Speech and Intelligent Systems Research Group, University of Southampton, 1997.