
CAR-Miner: An efficient algorithm for mining class-association rules

Loan T.T. Nguyen (a), Bay Vo (b,*), Tzung-Pei Hong (c,d), Hoang Chi Thanh (e)

(a) Faculty of Information Technology, VOV College, Ho Chi Minh, Viet Nam
(b) Information Technology College, Ho Chi Minh, Viet Nam
(c) Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan, ROC
(d) Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan, ROC
(e) Department of Informatics, Ha Noi University of Science, Ha Noi, Viet Nam

Article info

Keywords:

Accuracy

Classification

Class-association rules

Data mining

Tree structure

Abstract

Building a high-accuracy classifier is an important problem in real applications. One approach to high-accuracy classification is based on association rules. In the past, some studies showed that classification based on association rules (or class-association rules, CARs) achieves higher accuracy than other rule-based methods such as ILA and C4.5. However, mining CARs consumes more time because it mines a complete rule set. Therefore, improving the execution time of mining CARs is one of the main problems with this method that needs to be solved. In this paper, we propose a new method for mining class-association rules. Firstly, we design a tree structure for storing the frequent itemsets of the dataset. Some theorems for pruning nodes and computing information in the tree are then developed, and based on these theorems, we propose an efficient algorithm for mining CARs. Experimental results show that our approach is more efficient than previous ones.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

1.1. Motivation

Classification plays an important role in decision support systems. Many methods for mining classification rules have been developed, including C4.5 (Quinlan, 1992) and ILA (Tolun & Abu-Soud, 1998; Tolun, Sever, Uludag, & Abu-Soud, 1999). Recently, a new method for classification from data mining, called classification based on associations (CBA), has been proposed for mining class-association rules (CARs). This method has more advantages than the heuristic and greedy methods in that it can easily remove noise, and its accuracy is thus higher. It generates a more complete rule set than C4.5 and ILA. In association rule mining, the target attribute (or class attribute) is not pre-determined; however, the target attribute must be pre-determined in classification problems. Thus, some algorithms for mining classification rules based on association rule mining have been proposed. Examples include classification based on predictive association rules (Yin & Han, 2003), classification based on multiple association rules (Li, Han, & Pei, 2001), classification based on associations (CBA; Liu, Hsu, & Ma, 1998), multi-class, multi-label associative classification (Thabtah, Cowling, & Peng, 2004), multi-class classification based on association rules (Thabtah, Cowling, & Peng, 2005), an associative classifier based on maximum entropy (Thonangi & Pudi, 2005), Noah (Giuffrida, Chu, & Hanssens, 2000), and the use of the equivalence class rule tree (Vo & Le, 2008). Some studies have also reported that classifiers based on class-association rules are more accurate than traditional methods such as C4.5 and ILA, both theoretically (Veloso, Meira, & Zaki, 2006) and with regard to experimental results (Liu et al., 1998). Veloso et al. proposed lazy associative classification (Veloso et al., 2006; Veloso, Meira, Goncalves, & Zaki, 2007; Veloso, Meira, Goncalves, Almeida, & Zaki, 2011), which differs from CARs in that it uses rules mined from the projected dataset of an unknown object to predict its class, instead of rules mined from the whole dataset. Genetic algorithms have also been applied recently for mining CARs, and several approaches have been proposed. For example, Chien and Chen (2010) proposed a GA-based approach to build a classifier for numeric datasets and applied it to stock trading data. Kaya (2010) proposed a Pareto-optimal genetic approach for building autonomous classifiers. Qodmanan, Nasiri, and Minaei-Bidgoli (2011) proposed a GA-based method that requires neither a minimum support nor a minimum confidence threshold. Yang, Mabu, Shimada, and Hirasawa (2011) proposed an evolutionary approach to rank rules. These algorithms were mainly based on heuristics for building classifiers.

All the above methods focused on the design of algorithms for mining CARs or for building classifiers, but did not discuss their mining time in much depth. Therefore, in this paper, we aim to propose an efficient algorithm for mining CARs based on a tree structure. Section 1.2 presents our contributions.


* Corresponding author.
E-mail addresses: nguyenthithuyloan@vov.org.vn (L.T.T. Nguyen), vdbay@itc.edu.vn (B. Vo), tphong@nuk.edu.tw (T.-P. Hong), thanhhc@vnu.vn (H.C. Thanh).



1.2. Our contributions

In the past, Vo and Le (2008) proposed a method for mining CARs using the equivalence class rule tree (ECR-tree), together with an efficient mining algorithm named ECR-CARM. ECR-CARM scans the dataset only once and uses object identifiers to quickly determine the supports of itemsets. However, it is quite time consuming in generating and testing candidates, because all itemsets with the same attributes are grouped into one node in the tree. Therefore, when joining two nodes li and lj to create a new node, ECR-CARM has to consider each element of li with each element of lj to check whether they have the same prefix. In this paper, we design a MECR-tree as follows: each node in the tree contains a single itemset, instead of all itemsets with the same attributes. With this tree, some theorems are also derived and, based on them, an algorithm is proposed for mining CARs.

1.3. Organization of our paper

The rest of this paper is organized as follows. In Section 2, we introduce some works related to mining CARs. Section 3 presents preliminary concepts. The main contributions are presented in Section 4, in which a tree structure named MECR-tree is developed and some theorems for fast pruning of candidates are derived. Based on the tree and these theorems, we propose an algorithm for mining CARs efficiently. In Section 5, we show and discuss the experimental results. The conclusions and future work are presented in Section 6.

2. Related work

Mining CARs is the discovery of all classification rules that satisfy the minimum support (minSup) and minimum confidence (minConf) thresholds. The first method for mining CARs was proposed by Liu et al. (1998). It generates all candidate 1-ruleitems and then calculates their supports to find the ruleitems that satisfy minSup. It then generates all candidate 2-ruleitems from the 1-ruleitems and checks them in the same way. The authors also proposed a heuristic for building the classifier. The weak point of this method is that it generates a lot of candidates and scans the dataset many times, so it is time-consuming. Therefore, the proposed algorithm uses a threshold K and only generates k-ruleitems with k ≤ K. In 2000, an improved algorithm for solving the problem of imbalanced datasets was proposed (Liu, Ma, & Wong, 2000). The latter has higher accuracy than the former because it uses a hybrid approach for prediction.

Li et al. proposed a method based on the FP-tree (Li et al., 2001). The advantage of this method is that it scans the dataset only twice and uses an FP-tree to compress the dataset. It also uses the tree-projection technique to find frequent itemsets. To predict unseen data, this method finds all rules that match the data and uses a weighted χ² measure to determine the class.

Vo and Le proposed another approach based on the ECR-tree (Vo & Le, 2008). This approach develops a tree structure called the equivalence class rule tree (ECR-tree) and proposes an algorithm called ECR-CARM for mining CARs. The algorithm scans the dataset only once and uses the intersection of object identifiers to quickly compute the supports of itemsets. Nguyen, Vo, Hong, and Thanh (2012) then proposed a new method for pruning redundant rules based on a lattice.

Thabtah et al. (2004) proposed a multi-class, multi-label associative classification approach for mining CARs. This method uses rules of the form {(Ai1, ai1), (Ai2, ai2), ..., (Aim, aim)} → ci1 ∨ ci2 ∨ ... ∨ cil, where each aij is a value of attribute Aij and each cij is a class label.

Some other class-association rule mining approaches have been presented in the work of Coenen, Leng, and Zhang (2007), Giuffrida et al. (2000), Lim and Lee (2010), Liu, Jiang, Liu, and Yang (2008), Priss (2002), Sun, Wang, and Wong (2006), Thabtah et al. (2005), Thabtah, Cowling, and Hammoud (2006), Thonangi and Pudi (2005), Yin and Han (2003), Zhang, Chen, and Wei (2011), and Zhao, Tsang, Chen, and Wang (2010).

3. Preliminary concepts

Let D be a set of training data with n attributes A1, A2, ..., An and |D| objects (cases). Let C = {c1, c2, ..., ck} be the list of class labels. A specific value of an attribute Ai and a class in C are denoted by the lower-case letters a and c, respectively.

Definition 1. An itemset is a set of pairs, each consisting of an attribute and a specific value, denoted {(Ai1, ai1), (Ai2, ai2), ..., (Aim, aim)}.

Definition 2. A class-association rule r has the form {(Ai1, ai1), ..., (Aim, aim)} → c, where {(Ai1, ai1), ..., (Aim, aim)} is an itemset and c ∈ C is a class label.

Definition 3. The actual occurrence ActOcc(r) of a rule r in D is the number of rows of D that match r's condition.

Definition 4. The support of a rule r, denoted Sup(r), is the number of rows of D that match r's condition and belong to r's class.

For example, consider r: {(A, a1)} → y from the dataset in Table 1. We have ActOcc(r) = 3 and Sup(r) = 2, because there are three objects with A = a1, of which two have class y.
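To make Definitions 3 and 4 concrete, the following minimal Python sketch evaluates ActOcc and Sup on the Table 1 data. The helper names (DATASET, matches, act_occ, sup) are illustrative, not from the paper.

# A minimal sketch of Definitions 3 and 4, evaluated on the Table 1 data.
# Each row is (OID, {attribute: value}, class label).
DATASET = [
    (1, {"A": "a1", "B": "b1", "C": "c1"}, "y"),
    (2, {"A": "a1", "B": "b2", "C": "c1"}, "n"),
    (3, {"A": "a2", "B": "b2", "C": "c1"}, "n"),
    (4, {"A": "a3", "B": "b3", "C": "c1"}, "y"),
    (5, {"A": "a3", "B": "b1", "C": "c2"}, "n"),
    (6, {"A": "a3", "B": "b3", "C": "c1"}, "y"),
    (7, {"A": "a1", "B": "b3", "C": "c2"}, "y"),
    (8, {"A": "a2", "B": "b2", "C": "c2"}, "n"),
]

def matches(itemset, attrs):
    """True if a row's attribute values satisfy every pair in the itemset."""
    return all(attrs.get(a) == v for a, v in itemset)

def act_occ(itemset):
    """ActOcc(r): the number of rows matching r's condition (Definition 3)."""
    return sum(1 for _, attrs, _ in DATASET if matches(itemset, attrs))

def sup(itemset, label):
    """Sup(r): matching rows that also belong to r's class (Definition 4)."""
    return sum(1 for _, attrs, cls in DATASET
               if cls == label and matches(itemset, attrs))

r = [("A", "a1")]                 # the rule {(A, a1)} -> y
print(act_occ(r), sup(r, "y"))    # 3 2, as in the example above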

4. Mining class-association rules

4.1. Tree structure

In this paper, we modify the ECR-tree structure (Vo & Le, 2008) into the MECR-tree structure (M stands for Modification) as follows. In the ECR-tree, all itemsets with the same attributes are arranged into one group and placed in one node. Itemsets in different groups are then joined together to form itemsets with more items. This leads to much time being consumed in generating and testing itemsets. In our work, each node in the tree contains only one itemset, along with the following information:

(a) Obidset: the set of object identifiers that contain the itemset;
(b) (#c1, #c2, ..., #ck): a list of integers, where #ci is the number of records in Obidset that belong to class ci; and
(c) pos: a positive integer storing the position of the class with the maximum count, i.e., pos = argmax_{i∈[1,k]}{#ci}.

In the ECR-tree, the authors did not store the #ci and pos, and thus needed to compute them for all nodes. In the proposed approach, with the MECR-tree, some of these values need not be calculated, by using the theorems presented in Section 4.2.

Table 1
An example of a training dataset.

OID   A    B    C    Class
1     a1   b1   c1   y
2     a1   b2   c1   n
3     a2   b2   c1   n
4     a3   b3   c1   y
5     a3   b1   c2   n
6     a3   b3   c1   y
7     a1   b3   c2   y
8     a2   b2   c2   n


For example, consider a node containing the itemset X = {(A, a3), (B, b3)} from Table 1. X is contained in objects 4 and 6, and both of them belong to class y. Therefore, the node ({(A, a3), (B, b3)}, 46, (2, 0)), written more simply as (3 × a3b3, 46, (2, 0)), is generated in the tree, where 46 is shorthand for the Obidset {4, 6}. Its pos is 1 (underlined at position 1 of this node) because the count of class y is the maximum (2 as compared to 0). The latter form is another representation of the former, used to save memory when the tree structure stores itemsets. We use a bit representation to store the itemset's attributes: for example, AB can be represented as 11 in bits, so the value of these attributes is 3. With this representation, we can use bitwise operations to join itemsets faster.
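As a rough illustration of this bit representation, the following sketch uses the A = 1, B = 2, C = 4 encoding of the running example; the variable names are ours.

# A small sketch of the bit representation of itemset attributes described
# above, with A, B, C mapped to bits 1, 2, 4 as in the running example.
ATTR_BITS = {"A": 1, "B": 2, "C": 4}

att_a2 = ATTR_BITS["A"]            # node 1 x a2 -> attribute mask 001
att_b1 = ATTR_BITS["B"]            # node 2 x b1 -> attribute mask 010
joined = att_a2 | att_b1           # joined itemset a2b1 -> 011, i.e. 3
print(joined, bin(joined))         # 3 0b11: AB represented as 11 in bits

# Theorem 1 (below) as a constant-time test: nodes whose attribute masks
# are equal (same attributes, necessarily different values) are never
# joined, since the support of the combined itemset would be 0.
att_a3 = ATTR_BITS["A"]
print(att_a2 == att_a3)            # True -> skip this pair entirely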

4.2. Proposed algorithm

In this section, some theorems for fast mining of CARs are derived. Based on these theorems, we propose an efficient algorithm for mining CARs.

Theorem 1. Given two nodes (att1 × values1, Obidset1, (c11, ..., c1k)) and (att2 × values2, Obidset2, (c21, ..., c2k)), if att1 = att2 and values1 ≠ values2, then Obidset1 ∩ Obidset2 = ∅.

Proof. Since att1 = att2 and values1 ≠ values2, there exist val1 ∈ values1 and val2 ∈ values2 such that val1 and val2 have the same attribute but different values. Thus, if a record with identifier OID contains val1, it cannot contain val2. Therefore, for all OID ∈ Obidset1, it can be inferred that OID ∉ Obidset2. Thus, Obidset1 ∩ Obidset2 = ∅. □

In this theorem, we write each itemset in the form att × values for ease of use. Theorem 1 implies that if two itemsets X and Y have the same attributes, they need not be combined into the itemset XY, because Sup(XY) = 0. For example, consider the two nodes (1 × a1, 127, (2, 1)) and (1 × a2, 38, (0, 2)), in which Obidset({(A, a1)}) = 127 and Obidset({(A, a2)}) = 38. Then Obidset({(A, a1), (A, a2)}) = Obidset({(A, a1)}) ∩ Obidset({(A, a2)}) = ∅. Similarly, Obidset({(A, a1), (B, b1)}) = 1 and Obidset({(A, a1), (B, b2)}) = 2, and it can be inferred that Obidset({(A, a1), (B, b1)}) ∩ Obidset({(A, a1), (B, b2)}) = ∅, because these two itemsets have the same attributes AB but different values.

Theorem 2. Given two nodes (itemset1, Obidset1, (c11, ..., c1k)) and (itemset2, Obidset2, (c21, ..., c2k)), if itemset1 ⊂ itemset2 and |Obidset1| = |Obidset2|, then c1i = c2i for all i ∈ [1, k].

Proof. Since itemset1 ⊂ itemset2, all records containing itemset2 also contain itemset1, and therefore Obidset2 ⊆ Obidset1. Additionally, since |Obidset1| = |Obidset2| by assumption, we have Obidset2 = Obidset1, and therefore c1i = c2i for all i ∈ [1, k]. □

By Theorem 2, when we join two parent nodes into a child node, the itemset of the child node is always a superset of the itemset of each parent node. Therefore, we check their cardinalities, and if they are the same, we need not compute the count for each class or the pos of the child node, because they are the same as those of the parent node.

Using these theorems, we develop an algorithm for mining CARs efficiently. By Theorem 1, we need not join two nodes with the same attributes, and by Theorem 2, we need not compute the information of some child nodes.

First of all, the root node of the tree (Lr) contains child nodes such that each node contains a single frequent itemset. After that, the procedure CAR-Miner is called with the parameter Lr to mine all CARs from the dataset D.

The CAR-Miner procedure (Fig. 1) considers each node li together with every node lj in Lr with j > i (Lines 2 and 5) to generate a candidate child node O. For each pair (li, lj), the algorithm checks whether li.att ≠ lj.att (Line 6, using Theorem 1). If they are different, it computes the three elements att, values, and Obidset for the new node O (Lines 7–9). Line 10 checks whether the number of object identifiers of li equals that of O (by Theorem 2). If so, the algorithm can copy all information from node li to node O (Lines 11–12). Otherwise, the algorithm compares lj with O, and if the numbers of their object identifiers are the same (Line 13), the algorithm can copy all information from node lj to node O (Lines 14–15). Otherwise, the algorithm computes O.count using O.Obidset, and then O.pos (Lines 17–18). After computing all the information for node O, the algorithm adds it to Pi (Pi is initialized empty in Line 4) if O.count[O.pos] ≥ minSup (Lines 19–20). Finally, CAR-Miner is called recursively with the new set Pi as its input parameter (Line 21).

The procedure ENUMERATE-CAR(l, minConf) generates a rule from node l. It first computes the confidence of the rule (Line 22); if the confidence satisfies minConf (Line 23), the rule is added to the set of CARs (Line 24).
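The following is a condensed, runnable Python sketch of the procedure as summarized above, not the authors' implementation: the Node layout, helper names, and the treatment of minSup as an absolute count are our assumptions, and the line-number comments refer to the pseudocode of Fig. 1 as described in the text.

# A condensed sketch of CAR-Miner (Fig. 1) under the assumptions above.
from dataclasses import dataclass

CLASSES = ("y", "n")  # class labels of the running example (Table 1)

@dataclass
class Node:
    att: int            # attributes of the itemset, as a bit mask
    values: tuple       # the itemset itself: (attribute, value) pairs
    obidset: frozenset  # identifiers of the objects containing the itemset
    count: tuple        # (#c1, ..., #ck): per-class counts over Obidset
    pos: int            # index of the class with the maximum count

def class_info(obidset, labels):
    """Compute (count, pos) for an Obidset from the object labels."""
    count = tuple(sum(1 for o in obidset if labels[o] == c) for c in CLASSES)
    return count, max(range(len(CLASSES)), key=lambda i: count[i])

def make_node(att, values, obidset, labels):
    count, pos = class_info(obidset, labels)
    return Node(att, tuple(values), frozenset(obidset), count, pos)

def car_miner(nodes, min_sup, min_conf, labels, rules):
    """Recursively join sibling nodes and collect rules."""
    for i, li in enumerate(nodes):
        enumerate_car(li, min_conf, rules)          # Line 3
        p_i = []                                    # Line 4
        for lj in nodes[i + 1:]:                    # Line 5
            if li.att == lj.att:                    # Line 6, Theorem 1
                continue                            # the join has support 0
            obidset = li.obidset & lj.obidset       # Lines 7-9
            values = tuple(sorted(set(li.values) | set(lj.values)))
            if len(obidset) == len(li.obidset):     # Line 10, Theorem 2
                count, pos = li.count, li.pos       # copy from li
            elif len(obidset) == len(lj.obidset):   # Line 13, Theorem 2
                count, pos = lj.count, lj.pos       # copy from lj
            else:                                   # Lines 17-18
                count, pos = class_info(obidset, labels)
            o = Node(li.att | lj.att, values, obidset, count, pos)
            if o.count[o.pos] >= min_sup:           # Lines 19-20
                p_i.append(o)
        car_miner(p_i, min_sup, min_conf, labels, rules)  # Line 21

def enumerate_car(l, min_conf, rules):
    """ENUMERATE-CAR: emit the rule of node l if it satisfies minConf."""
    conf = l.count[l.pos] / len(l.obidset)          # Line 22
    if conf >= min_conf:                            # Lines 23-24
        rules.append((l.values, CLASSES[l.pos], l.count[l.pos], conf))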

4.3. An example

In this section, we use the example in Table 1 to describe the CAR-Miner process with minSup = 10% and minConf = 60%. Fig. 2 shows the results of this process.

The MECR-tree is built from the dataset in Table 1 as follows. First, the root node Lr contains all frequent 1-itemsets:

(1 × a1, 127, (2, 1)), (1 × a2, 38, (0, 2)), (1 × a3, 456, (2, 1)), (2 × b1, 15, (1, 1)), (2 × b2, 238, (0, 3)), (2 × b3, 467, (3, 0)), (4 × c1, 12346, (3, 2)), (4 × c2, 578, (1, 2)).

After that, the procedure CAR-Miner is called with the parameter Lr. We use the node li = (1 × a2, 38, (0, 2)) as an example to illustrate the CAR-Miner process. li joins with all nodes following it in Lr:

- With node lj = (1 × a3, 456, (2, 1)): li and lj have the same attribute and different values, so by Theorem 1 nothing is generated from them.

- With node lj = (2 × b1, 15, (1, 1)): Because their attributes are different, three elements are computed: O.att = li.att | lj.att = 1 | 2 = 3, or 11 in the bit representation; O.values = li.values ∪ lj.values = a2 ∪ b1 = a2b1; and O.Obidset = li.Obidset ∩ lj.Obidset = {3, 8} ∩ {1, 5} = ∅. Because O.count[O.pos] = 0 < minSup, O is not added to Pi.

- With node lj = (2 × b2, 238, (0, 3)): Because their attributes are different, three elements are computed: O.att = li.att | lj.att = 1 | 2 = 3, or 11 in the bit representation; O.values = li.values ∪ lj.values = a2 ∪ b2 = a2b2; and O.Obidset = li.Obidset ∩ lj.Obidset = {3, 8} ∩ {2, 3, 8} = {3, 8}. Because |li.Obidset| = |O.Obidset|, the algorithm copies all information from li to O; that is, O.count = li.count = (0, 2) and O.pos = 2. Because O.count[O.pos] = 2 ≥ minSup, O is added to Pi, so Pi = {(3 × a2b2, 38, (0, 2))}.

- With node lj = (2 × b3, 467, (3, 0)): Because their attributes are different, three elements are computed: O.att = li.att | lj.att = 1 | 2 = 3, or 11 in the bit representation; O.values = a2 ∪ b3 = a2b3; and O.Obidset = li.Obidset ∩ lj.Obidset = {3, 8} ∩ {4, 6, 7} = ∅. Because O.count[O.pos] = 0 < minSup, O is not added to Pi.

- With node lj = (4 × c1, 12346, (3, 2)): Because their attributes are different, three elements are computed: O.att = li.att | lj.att = 1 | 4 = 5, or 101 in the bit representation; O.values = a2 ∪ c1 = a2c1; and O.Obidset = li.Obidset ∩ lj.Obidset = {3, 8} ∩ {1, 2, 3, 4, 6} = {3}. The algorithm computes the additional information O.count = (0, 1) and O.pos = 2. Because O.count[O.pos] = 1 ≥ minSup, O is added to Pi, so Pi = {(3 × a2b2, 38, (0, 2)), (5 × a2c1, 3, (0, 1))}.

- With node lj = (4 × c2, 578, (1, 2)): Because their attributes are different, three elements are computed: O.att = li.att | lj.att = 1 | 4 = 5, or 101 in the bit representation; O.values = a2 ∪ c2 = a2c2; and O.Obidset = li.Obidset ∩ lj.Obidset = {3, 8} ∩ {5, 7, 8} = {8}. The algorithm computes the additional information O.count = (0, 1) and O.pos = 2. Because O.count[O.pos] = 1 ≥ minSup, O is added to Pi, so Pi = {(3 × a2b2, 38, (0, 2)), (5 × a2c1, 3, (0, 1)), (5 × a2c2, 8, (0, 1))}.

Fig. 1. The proposed algorithm for mining CARs.

Fig. 2. The MECR-tree for the dataset in Table 1.


After Pi is created, CAR-Miner is called recursively with the parameters Pi, minSup, and minConf to create all child nodes of Pi. Consider the process of making the child nodes of node li = (3 × a2b2, 38, (0, 2)):

- With node lj = (5 × a2c1, 3, (0, 1)): Because their attributes are different, three elements are computed: O.att = li.att | lj.att = 3 | 5 = 7, or 111 in the bit representation; O.values = li.values ∪ lj.values = a2b2 ∪ a2c1 = a2b2c1; and O.Obidset = li.Obidset ∩ lj.Obidset = {3, 8} ∩ {3} = {3} = lj.Obidset. The algorithm therefore copies all information from lj to O; that is, O.count = lj.count = (0, 1) and O.pos = 2. Because O.count[O.pos] = 1 ≥ minSup, O is added to Pi, so Pi = {(7 × a2b2c1, 3, (0, 1))}.

- Using the same process for node lj = (5 × a2c2, 8, (0, 1)), we obtain Pi = {(7 × a2b2c1, 3, (0, 1)), (7 × a2b2c2, 8, (0, 1))}.

Rules are easily generated in the same traversal of li (Line 3) by calling the procedure ENUMERATE-CAR(li, minConf). For example, when traversing the node li = (1 × a2, 38, (0, 2)), the procedure computes the confidence of the candidate rule: conf = li.count[li.pos] / |li.Obidset| = 2/2 = 1. Because conf ≥ minConf (60%), the rule {(A, a2)} → n is added to the rule set CARs. The meaning of this rule is "If A = a2 then class = n" (support = 2 and confidence = 100%).

To show the effect of Theorem 2, observe that the algorithm need not compute the information of some itemsets, namely {3 × a2b2, 7 × a1b1c1, 7 × a1b2c1, 7 × a1b3c2, 7 × a2b2c1, 7 × a2b2c2, 7 × a3b1c2, 7 × a3b3c1}.
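Readers who wish to trace this walk-through mechanically can drive the Section 4.2 sketch with the Table 1 data, as below. LABELS, OBIDSETS, and ATTR_BITS are our own illustrative names, and the driver assumes the make_node and car_miner helpers defined earlier.

# Driver for the CAR-Miner sketch in Section 4.2, using the Table 1 data.
# minSup = 10% of 8 records (an absolute count of 0.8); minConf = 60%.
LABELS = {1: "y", 2: "n", 3: "n", 4: "y", 5: "n", 6: "y", 7: "y", 8: "n"}
OBIDSETS = {  # Obidset of each frequent 1-itemset, as in the root node Lr
    ("A", "a1"): {1, 2, 7}, ("A", "a2"): {3, 8}, ("A", "a3"): {4, 5, 6},
    ("B", "b1"): {1, 5}, ("B", "b2"): {2, 3, 8}, ("B", "b3"): {4, 6, 7},
    ("C", "c1"): {1, 2, 3, 4, 6}, ("C", "c2"): {5, 7, 8},
}
ATTR_BITS = {"A": 1, "B": 2, "C": 4}

root = [make_node(ATTR_BITS[attr], [(attr, val)], obids, LABELS)
        for (attr, val), obids in sorted(OBIDSETS.items())]
rules = []
car_miner(root, min_sup=0.1 * len(LABELS), min_conf=0.6,
          labels=LABELS, rules=rules)
for itemset, cls, support, conf in rules:
    print(itemset, "->", cls, f"(sup={support}, conf={conf:.0%})")
# Among the output: (('A', 'a2'),) -> n (sup=2, conf=100%), the rule
# derived in the text above.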

5. Experimental results

5.1. Characteristics of the experimental datasets

The algorithms used in the experiments were coded in C# 2008 on a personal computer with Windows 7, a Centrino 2 × 2.53 GHz CPU, and 4 GB of RAM. The experiments were conducted on datasets obtained from the UCI Machine Learning Repository (http://mlearn.ics.uci.edu). Table 4 shows the characteristics of the experimental datasets.

The experimental datasets had different features. The Breast, German, and Vehicle datasets had many attributes and distinct values but relatively few objects (records). The Led7 dataset had only a few attributes, distinct values, and objects.

5.2. Numbers of rules in the experimental datasets

Figs. 3–7 show the numbers of rules mined from the datasets in Table 4 for different minimum support thresholds. We used minConf = 50% for all experiments.

The results in Figs. 3–7 show that some datasets had a lot of rules. For example, the Lymph dataset had 4,039,186 rules with minSup = 1%, and the German dataset had 752,643 rules with minSup = 1%.

Fig. 3. Numbers of CARs in the Breast dataset for various minSup values.
Fig. 4. Numbers of CARs in the German dataset for various minSup values.
Fig. 5. Numbers of CARs in the Lymph dataset for various minSup values.
Fig. 6. Numbers of CARs in the Led7 dataset for various minSup values.

Table 4
The characteristics of the experimental datasets.

Dataset | #attrs | #classes | #distinct values | #Objs

5.3. Execution time

Experiments were then conducted to compare the execution times of CAR-Miner and ECR-CARM (Vo & Le, 2008). The results are shown in Figs. 8–12.

The results in Figs. 8–12 show that CAR-Miner is more efficient than ECR-CARM in all of the experiments. For example, consider the Breast dataset with minSup = 0.1%: the mining time for CAR-Miner was 1.517 s, while that for ECR-CARM was 17.136 s. The ratio is 1.517 / 17.136 × 100% = 8.85%.

Fig. 8. The execution time for CAR-Miner and ECR-CARM in the Breast dataset.
Fig. 9. The execution time for CAR-Miner and ECR-CARM in the German dataset.
Fig. 10. The execution time for CAR-Miner and ECR-CARM in the Lymph dataset.
Fig. 11. The execution time for CAR-Miner and ECR-CARM in the Led7 dataset.
Fig. 12. The execution time for CAR-Miner and ECR-CARM in the Vehicle dataset.

6. Conclusions and future work

This paper proposed a new algorithm for mining CARs using a tree structure. Each node in the tree contains information that enables fast computation of the support of the candidate rule. In addition, using Obidsets, we can compute the supports of itemsets quickly. Some theorems were also developed; based on these theorems, we need not compute the information of many nodes in the tree. With these improvements, the proposed algorithm performs better than the previous algorithm in all results.

Mining itemsets from incremental databases has been developed in recent years (Gharib, Nassar, Taha, & Abraham, 2010; Hong & Wang, 2010; Hong, Lin, & Wu, 2009; Hong, Wang, & Tseng, 2011; Lin, Hong, & Lu, 2009). It saves a lot of time and memory when compared with mining the integrated database from scratch. Therefore, in the future, we will study how to use this approach for mining CARs.

Acknowledgements

This work was supported by Vietnam's National Foundation for Science and Technology Development (NAFOSTED) under Grant No. 102.01-2012.47.

This paper was completed while the second author was visiting the Vietnam Institute for Advanced Study in Mathematics (VIASM), Ha Noi, Viet Nam.

References

Chien, Y. W. C., & Chen, Y. L. (2010). Mining associative classification rules with stock trading data – A GA-based method. Knowledge-Based Systems, 23(6), 605–614.

Coenen, F., Leng, P., & Zhang, L. (2007). The effect of threshold values on association rule based classification accuracy. Data and Knowledge Engineering, 60(2), 345–360.

Gharib, T. F., Nassar, H., Taha, M., & Abraham, A. (2010). An efficient algorithm for incremental mining of temporal association rules. Data and Knowledge Engineering, 69(8), 800–815.

Giuffrida, G., Chu, W. W., & Hanssens, D. M. (2000). Mining classification rules from datasets with large number of many-valued attributes. In 7th international conference on extending database technology: Advances in database technology (EDBT'00) (pp. 335–349). Munich, Germany.

Hong, T. P., & Wang, C. J. (2010). An efficient and effective association-rule maintenance algorithm for record modification. Expert Systems with Applications, 37(1), 618–626.

Hong, T. P., Lin, C. W., & Wu, Y. L. (2009). Maintenance of fast updated frequent pattern trees for record deletion. Computational Statistics and Data Analysis, 53(7), 2485–2499.

Hong, T. P., Wang, C. Y., & Tseng, S. S. (2011). An incremental mining algorithm for maintaining sequential patterns using pre-large sequences. Expert Systems with Applications.


Kaya, M. (2010). Autonomous classifiers with understandable rule using multi-objective genetic algorithms. Expert Systems with Applications, 37(4), 3489–3494.

Li, W., Han, J., & Pei, J. (2001). CMAR: Accurate and efficient classification based on multiple class-association rules. In 1st IEEE international conference on data mining (pp. 369–376). San Jose, CA, USA.

Lim, A. H. L., & Lee, C. S. (2010). Processing online analytics with classification and association rule mining. Knowledge-Based Systems, 23(3), 248–255.

Lin, C. W., Hong, T. P., & Lu, W. H. (2009). The pre-FUFP algorithm for incremental mining. Expert Systems with Applications, 36(5), 9498–9505.

Liu, B., Hsu, W., & Ma, Y. (1998). Integrating classification and association rule mining. In 4th international conference on knowledge discovery and data mining (pp. 80–86). New York, USA.

Liu, B., Ma, Y., & Wong, C. K. (2000). Improving an association rule based classifier. In 4th European conference on principles of data mining and knowledge discovery (pp. 80–86). Lyon, France.

Liu, Y. Z., Jiang, Y. C., Liu, X., & Yang, S. L. (2008). CSMC: A combination strategy for multiclass classification based on multiple association rules. Knowledge-Based Systems, 21(8), 786–793.

Nguyen, T. T. L., Vo, B., Hong, T. P., & Thanh, H. C. (2012). Classification based on association rules: A lattice-based approach. Expert Systems with Applications, 39(13), 11357–11366.

Priss, U. (2002). A classification of associative and formal concepts. In The Chicago Linguistic Society's 38th annual meeting (pp. 273–284). Chicago, USA.

Qodmanan, H. R., Nasiri, M., & Minaei-Bidgoli, B. (2011). Multi objective association rule mining with genetic algorithm without specifying minimum support and minimum confidence. Expert Systems with Applications, 38(1), 288–298.

Quinlan, J. R. (1992). C4.5: Programs for machine learning. Morgan Kaufmann.

Sun, Y., Wang, Y., & Wong, A. K. C. (2006). Boosting an associative classifier. IEEE Transactions on Knowledge and Data Engineering, 18(7), 988–992.

Thabtah, F., Cowling, P., & Hammoud, S. (2006). Improving rule sorting, predictive accuracy and training time in associative classification. Expert Systems with Applications, 31(2), 414–426.

Thabtah, F., Cowling, P., & Peng, Y. (2004). MMAC: A new multi-class, multi-label associative classification approach. In 4th IEEE international conference on data mining (pp. 217–224). Brighton, UK.

Thabtah, F., Cowling, P., & Peng, Y. (2005). MCAR: Multi-class classification based on association rule. In 3rd ACS/IEEE international conference on computer systems and applications (pp. 33–39). Tunis, Tunisia.

Thonangi, R., & Pudi, V. (2005). ACME: An associative classifier based on maximum entropy principle. In 16th international conference on algorithmic learning theory (pp. 122–134). LNAI 3734, Singapore.

Tolun, M. R., & Abu-Soud, S. M. (1998). ILA: An inductive learning algorithm for production rule discovery. Expert Systems with Applications, 14(3), 361–370.

Tolun, M. R., Sever, H., Uludag, M., & Abu-Soud, S. M. (1999). ILA-2: An inductive learning algorithm for knowledge discovery. Cybernetics and Systems, 30(7), 609–628.

Veloso, A., Meira, W., Jr., & Zaki, M. J. (2006). Lazy associative classification. In 2006 IEEE international conference on data mining (ICDM'06) (pp. 645–654). Hong Kong, China.

Veloso, A., Meira, W., Jr., Goncalves, M., & Zaki, M. J. (2007). Multi-label lazy associative classification. In 11th European conference on principles of data mining and knowledge discovery (pp. 605–612). Warsaw, Poland.

Veloso, A., Meira, W., Jr., Goncalves, M., Almeida, H. M., & Zaki, M. J. (2011). Calibrated lazy associative classification. Information Sciences, 181(13), 2656–2670.

Vo, B., & Le, B. (2008). A novel classification algorithm based on association rule mining. In The 2008 Pacific Rim knowledge acquisition workshop (held with PRICAI'08) (pp. 61–75). LNAI 5465, Ha Noi, Viet Nam.

Yang, G., Mabu, S., Shimada, K., & Hirasawa, K. (2011). An evolutionary approach to rank class association rules with feedback mechanism. Expert Systems with Applications, 38(12), 15040–15048.

Yin, X., & Han, J. (2003). CPAR: Classification based on predictive association rules. In SIAM international conference on data mining (SDM'03) (pp. 331–335). San Francisco, CA, USA.

Zhang, X., Chen, G., & Wei, Q. (2011). Building a highly-compact and accurate associative classifier. Applied Intelligence, 34(1), 74–86.

Zhao, S., Tsang, E. C. C., Chen, D., & Wang, X. Z. (2010). Building a rule-based classifier – A fuzzy-rough set approach. IEEE Transactions on Knowledge and Data Engineering, 22(5), 624–638.
