CAR-Miner: An efficient algorithm for mining class-association rules
Loan T.T. Nguyen a, Bay Vo b, Tzung-Pei Hong c,d, Hoang Chi Thanh e

a Faculty of Information Technology, VOV College, Ho Chi Minh, Viet Nam
b Information Technology College, Ho Chi Minh, Viet Nam
c Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan, ROC
d Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan, ROC
e Department of Informatics, Ha Noi University of Science, Ha Noi, Viet Nam
Keywords:
Accuracy
Classification
Class-association rules
Data mining
Tree structure
Abstract
Building a high-accuracy classifier for classification is a problem in real applications. One high-accuracy classifier used for this purpose is based on association rules. In the past, some studies showed that classification based on association rules (or class-association rules, CARs) has higher accuracy than that of other rule-based methods such as ILA and C4.5. However, mining CARs consumes more time because it mines a complete rule set. Therefore, improving the execution time for mining CARs is one of the main problems with this method that needs to be solved. In this paper, we propose a new method for mining class-association rules. Firstly, we design a tree structure for storing the frequent itemsets of the dataset. Some theorems for pruning nodes and computing information in the tree are developed after that, and then, based on the theorems, we propose an efficient algorithm for mining CARs. Experimental results show that our approach is more efficient than those used previously.
1. Introduction
1.1. Motivation
Classification plays an important role in decision support systems. A lot of methods for mining classification rules have been developed, including C4.5 (Quinlan, 1992) and ILA (Tolun & Abu-Soud, 1998; Tolun, Sever, Uludag, & Abu-Soud, 1999). Recently, a new method for classification from data mining, called classification based on associations (CBA), has been proposed for mining class-association rules (CARs). This method has more advantages than the heuristic and greedy methods in that the former can easily remove noise, and the accuracy is thus higher. It generates a more complete rule set than C4.5 and ILA. For association rule mining, the target attribute (or class attribute) is not pre-determined. However, the target attribute must be pre-determined in classification problems. Thus, some algorithms for mining classification rules based on association rule mining have been proposed. Examples include classification based on predictive association rules (Yin & Han, 2003), classification based on multiple association rules (Li, Han, & Pei, 2001), classification based on associations (CBA; Liu, Hsu, & Ma, 1998), multi-class, multi-label associative classification (Thabtah, Cowling, & Peng, 2004), multi-class classification based on association rules (Thabtah, Cowling, & Peng, 2005), an associative classifier based on maximum entropy (Thonangi & Pudi, 2005), Noah (Giuffrida, Chu, & Hanssens, 2000), and the use of the equivalence class rule tree (Vo & Le, 2008). Some researches have also reported that classifiers based on class-association rules are more accurate than those of traditional methods such as C4.5 and ILA, both theoretically (Veloso, Meira, & Zaki, 2006) and with regard to experimental results (Liu et al., 1998). Veloso et al. proposed lazy associative classification (Veloso et al., 2006; Veloso, Meira, Goncalves, & Zaki, 2007; Veloso, Meira, Goncalves, Almeida, & Zaki, 2011), which differed from CARs in that it used rules mined from the projected dataset of an unknown object for predicting the class, instead of using the ones mined from the whole dataset. Genetic algorithms have also been applied recently for mining CARs, and several approaches have been proposed. For example, Chien and Chen (2010) proposed a GA-based approach to build the classifier for numeric datasets and applied it to stock trading data. Kaya (2010) proposed a Pareto-optimal genetic approach for building autonomous classifiers. Qodmanan, Nasiri, and Minaei-Bidgoli (2011) proposed a GA-based method without requiring minimum support or minimum confidence thresholds. Yang, Mabu, Shimada, and Hirasawa (2011) proposed an evolutionary approach to rank rules. These algorithms were mainly based on heuristics in order to build classifiers.

All the above methods focused on the design of the algorithms for mining CARs or for building classifiers, but did not discuss much with regard to their mining time. Therefore, in this paper, we aim to propose an efficient algorithm for mining CARs based on a tree structure. Section 1.2 will present our contributions in this paper.
1.2. Our contributions
In the past, Vo and Le (2008) proposed a method for mining CARs using the equivalence class rule-tree (ECR-tree). An efficient mining algorithm, named ECR-CARM, was proposed in their study. ECR-CARM scanned the dataset only once and was based on object identifiers to quickly determine the supports of itemsets. However, it was quite time-consuming to generate and test candidates, because all itemsets with the same attributes are grouped into one node in the tree. Therefore, when joining two nodes li and lj to create a new node, ECR-CARM had to consider each element of li with each element of lj to check whether they had the same prefix or not. In this paper, we design a MECR-tree as follows: each node in the tree contains a single itemset, instead of all the itemsets with the same attributes. With this tree, some theorems are also derived, and based on them, an algorithm is proposed for mining CARs.
1.3. Organization of our paper
The rest of this paper is organized as follows. In Section 2, we introduce some works related to mining CARs. Section 3 presents preliminary concepts. The main contributions are presented in Section 4, in which a tree structure, named the MECR-tree, is developed, and some theorems for quickly pruning candidates are derived. Based on the tree and these theorems, we propose an algorithm for mining CARs efficiently. In Section 5, we show and discuss the experimental results. The conclusions and future work are presented in Section 6.
2. Related work
Mining CARs is the discovery of all classification rules that satisfy the minimum support (minSup) and minimum confidence (minConf) thresholds. The first method for mining CARs was proposed by Liu et al. (1998). It generates all candidate 1-ruleitems and then calculates their supports to find the ruleitems that satisfy minSup. It then generates all the candidate 2-ruleitems from the 1-ruleitems and checks them in the same way. The authors also proposed a heuristic for building the classifier. The weak point of this method is that it generates a lot of candidates and scans the dataset many times, so it is time-consuming. Therefore, the proposed algorithm uses a threshold K and only generates k-ruleitems with k ≤ K. In 2000, an improved algorithm for solving the problem of imbalanced datasets was proposed (Liu, Ma, & Wong, 2000). The latter has higher accuracy than the former because it uses a hybrid approach for prediction.
Li et al. proposed a method based on the FP-tree (Li et al., 2001). The advantage of this method is that it scans the dataset only twice and uses an FP-tree to compress the dataset. It also uses the tree-projection technique to find frequent itemsets. To predict unseen data, this method finds all rules that match the data and uses a weighted χ² measure to determine the class.
Vo and Le proposed another approach based on the ECR-tree (Vo & Le, 2008). This approach develops a tree structure called the equivalence class rule tree (ECR-tree) and proposes an algorithm called ECR-CARM for mining CARs. The algorithm scans the dataset only once. It is based on the intersection of object identifiers to quickly compute the supports of itemsets. Nguyen, Vo, Hong, and Thanh (2012) then proposed a new method for pruning redundant rules based on a lattice.
Thabtah et al. (2004) proposed a multi-class, multi-label associative classification approach for mining CARs. This method used the rule form $\{(A_{i1}, a_{i1}), (A_{i2}, a_{i2}), \ldots, (A_{im}, a_{im})\} \rightarrow c_{i1} \vee c_{i2} \vee \ldots \vee c_{il}$, where $a_{ij}$ is a value of attribute $A_{ij}$ and $c_{ij}$ is a class label.
Some other class-association rule mining approaches have been presented in the work of Coenen, Leng, and Zhang (2007), Giuffrida et al. (2000), Lim and Lee (2010), Liu, Jiang, Liu, and Yang (2008), Priss (2002), Sun, Wang, and Wong (2006), Thabtah et al. (2005), Thabtah, Cowling, and Hammoud (2006), Thonangi and Pudi (2005), Yin and Han (2003), Zhang, Chen, and Wei (2011), and Zhao, Tsang, Chen, and Wang (2010).
3. Preliminary concepts

Let D be a set of training data with n attributes $A_1, A_2, \ldots, A_n$ and $|D|$ objects (cases). Let $C = \{c_1, c_2, \ldots, c_k\}$ be a list of class labels. A specific value of an attribute $A_i$ and a class are denoted by the lowercase letters a and c, respectively.

Definition 1. An itemset is a set of pairs, each of an attribute and a specific value, denoted $\{(A_{i1}, a_{i1}), (A_{i2}, a_{i2}), \ldots, (A_{im}, a_{im})\}$.

Definition 2. A class-association rule r is of the form $\{(A_{i1}, a_{i1}), \ldots, (A_{im}, a_{im})\} \rightarrow c$, where $\{(A_{i1}, a_{i1}), \ldots, (A_{im}, a_{im})\}$ is an itemset and $c \in C$ is a class label.
Definition 3. The actual occurrence ActOcc(r) of a rule r in D is the number of rows of D that match r's condition.

Definition 4. The support of a rule r, denoted Sup(r), is the number of rows of D that match r's condition and belong to r's class.

For example, consider the rule r: {(A, a1)} → y for the dataset in Table 1. We have ActOcc(r) = 3 and Sup(r) = 2, because there are three objects with A = a1, two of which belong to class y.
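To make Definitions 3 and 4 concrete, the following is a minimal Python sketch (ours, not the authors' implementation; the encoding of Table 1 as a list of dictionaries is an assumption) that computes ActOcc(r) and Sup(r) for the rule above:

```python
# Table 1 encoded as a list of dictionaries (one per object, OIDs 1-8).
dataset = [
    {"A": "a1", "B": "b1", "C": "c1", "class": "y"},  # OID 1
    {"A": "a1", "B": "b2", "C": "c1", "class": "n"},  # OID 2
    {"A": "a2", "B": "b2", "C": "c1", "class": "n"},  # OID 3
    {"A": "a3", "B": "b3", "C": "c1", "class": "y"},  # OID 4
    {"A": "a3", "B": "b1", "C": "c2", "class": "n"},  # OID 5
    {"A": "a3", "B": "b3", "C": "c1", "class": "y"},  # OID 6
    {"A": "a1", "B": "b3", "C": "c2", "class": "y"},  # OID 7
    {"A": "a2", "B": "b2", "C": "c2", "class": "n"},  # OID 8
]

def act_occ(condition, data):
    # ActOcc(r): rows matching every (attribute, value) pair of the condition.
    return sum(all(row[a] == v for a, v in condition) for row in data)

def sup(condition, label, data):
    # Sup(r): rows matching the condition that also belong to the rule's class.
    return sum(all(row[a] == v for a, v in condition) and row["class"] == label
               for row in data)

condition, label = [("A", "a1")], "y"
print(act_occ(condition, dataset), sup(condition, label, dataset))  # -> 3 2
```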
4. Mining class-association rules

4.1. Tree structure

In this paper, we modify the ECR-tree structure (Vo & Le, 2008) into the MECR-tree structure (M stands for Modification) as follows. In the ECR-tree, all itemsets with the same attributes are arranged into one group and put into one node. Itemsets in different groups are then joined together to form itemsets with more items. This leads to the consumption of much time to generate and test itemsets. In our work, each node in the tree contains only one itemset, along with the following information:

(a) Obidset: a set of identifiers of the objects that contain the itemset;
(b) (#c1, #c2, ..., #ck): a list of integers, where #ci is the number of records in Obidset that belong to class ci; and
(c) pos: a positive integer storing the position of the class with the maximum count, i.e., pos = argmax_{i∈[1,k]}{#ci}.

In the ECR-tree, the authors did not store #ci and pos, thus needing to compute them for all nodes. However, with the MECR-tree, some of these values need not be calculated in the proposed approach, by using the theorems presented in Section 4.2.
Table 1
An example of a training dataset.

OID  A   B   C   class
1    a1  b1  c1  y
2    a1  b2  c1  n
3    a2  b2  c1  n
4    a3  b3  c1  y
5    a3  b1  c2  n
6    a3  b3  c1  y
7    a1  b3  c2  y
8    a2  b2  c2  n
For example, consider a node containing the itemset X = {(A, a3), (B, b3)} from Table 1. Because X is contained in objects 4 and 6, both of which belong to class y, the node {(A, a3), (B, b3)} / 46(2,0), or more simply 3×a3b3 / 46(2,0), is generated in the tree. Its pos is 1 (the count at position 1 is underlined in this notation) because the count of class y is the maximum (2 as compared to 0). The latter form is another representation of the former, used to save memory when the tree structure stores itemsets. We use a bit representation to store the itemset's attributes. For example, AB can be represented as 11 in bits, and therefore the value of these attributes is 3. With this representation, we can use bitwise operations to join itemsets faster.
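As an illustration (our own Python rendering, not the authors' C# code), the node just described might be represented as follows; the attribute bit assignments A = 1, B = 2, C = 4 follow the text:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MECRNode:
    att: int            # bitmask of attributes, e.g. A=1, B=2, C=4, so AB -> 3
    values: frozenset   # the itemset as (attribute, value) pairs, e.g. a3b3
    obidset: frozenset  # identifiers of the objects containing the itemset
    count: tuple        # count[i] = number of records in obidset with class c_i
    pos: int            # index of a class with the maximum count (0-based)

def argmax(counts):
    # Position of the maximum count; the paper's pos is the same index, 1-based.
    return max(range(len(counts)), key=counts.__getitem__)

# The node 3 x a3b3 / 46(2,0): att = 1 | 2 = 3 ("11" in bits); objects 4 and 6
# both belong to class y, so count = (2, 0) and pos = 0 (class y).
node = MECRNode(att=0b01 | 0b10,
                values=frozenset({("A", "a3"), ("B", "b3")}),
                obidset=frozenset({4, 6}),
                count=(2, 0),
                pos=argmax((2, 0)))
```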
4.2. Proposed algorithm

In this section, some theorems for fast mining of CARs are designed. Based on these theorems, we propose an efficient algorithm for mining CARs.
Theorem 1. Given two nodes $\frac{att_1 \times values_1}{Obidset_1(c_{11}, \ldots, c_{1k})}$ and $\frac{att_2 \times values_2}{Obidset_2(c_{21}, \ldots, c_{2k})}$, if $att_1 = att_2$ and $values_1 \neq values_2$, then $Obidset_1 \cap Obidset_2 = \emptyset$.

Proof. Since $att_1 = att_2$ and $values_1 \neq values_2$, there exist $val_1 \in values_1$ and $val_2 \in values_2$ such that $val_1$ and $val_2$ have the same attribute but different values. Thus, if a record with identifier $OID_i$ contains $val_1$, it cannot contain $val_2$. Therefore, $\forall OID \in Obidset_1$, it can be inferred that $OID \notin Obidset_2$. Thus, $Obidset_1 \cap Obidset_2 = \emptyset$. □

In this theorem, we divide the itemset into the form att × values for ease of use. Theorem 1 implies that if two itemsets X and Y have the same attributes (but different values), they need not be combined into the itemset XY, because Sup(XY) = 0. For example, consider the two nodes 1×a1 / 127(2,1) and 1×a2 / 38(0,2), in which Obidset({(A, a1)}) = {1,2,7} and Obidset({(A, a2)}) = {3,8}. Obidset({(A, a1), (A, a2)}) = Obidset({(A, a1)}) ∩ Obidset({(A, a2)}) = ∅. Similarly, Obidset({(A, a1), (B, b1)}) = {1} and Obidset({(A, a1), (B, b2)}) = {2}. It can be inferred that Obidset({(A, a1), (B, b1)}) ∩ Obidset({(A, a1), (B, b2)}) = ∅, because these two itemsets have the same attributes AB but different values.
Theorem 2. Given two nodes $\frac{itemset_1}{Obidset_1(c_{11}, \ldots, c_{1k})}$ and $\frac{itemset_2}{Obidset_2(c_{21}, \ldots, c_{2k})}$, if $itemset_1 \subseteq itemset_2$ and $|Obidset_1| = |Obidset_2|$, then $\forall i \in [1, k]: c_{1i} = c_{2i}$.

Proof. Since $itemset_1 \subseteq itemset_2$, all records containing $itemset_2$ also contain $itemset_1$, and therefore $Obidset_2 \subseteq Obidset_1$. Additionally, by hypothesis, $|Obidset_1| = |Obidset_2|$, which means that $Obidset_2 = Obidset_1$, and hence $\forall i \in [1, k]: c_{1i} = c_{2i}$. □
From Theorem 2, when we join two parent nodes into a child node, the itemset of the child node is always a superset of the itemset of each parent node. Therefore, we check their cardinalities, and if they are the same, we need not compute the count for each class or the pos of the child node, because they are the same as those of the parent node.
Using these theorems, we develop an algorithm for mining CARs efficiently. By Theorem 1, we need not join two nodes with the same attributes, and by Theorem 2, we need not compute the information for some child nodes.
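To make the use of the two theorems concrete, the following is a minimal sketch of the join step, building on the MECRNode class above; it is our own reconstruction rather than the authors' code, and class_of (a map from object identifier to class index) is an assumed helper:

```python
def join(li, lj, class_of):
    # Join two MECR-tree nodes into a candidate child node O.
    # class_of maps an object identifier to its (0-based) class index.
    if li.att == lj.att:                 # Theorem 1: equal attribute masks but
        return None                      # different values -> empty intersection
    obidset = li.obidset & lj.obidset    # O.Obidset = intersection of Obidsets
    if len(obidset) == len(li.obidset):  # Theorem 2: same cardinality means
        count, pos = li.count, li.pos    # O inherits li's counts and pos
    elif len(obidset) == len(lj.obidset):
        count, pos = lj.count, lj.pos    # symmetric case: inherit from lj
    else:                                # otherwise count the classes anew
        c = [0] * len(li.count)
        for oid in obidset:
            c[class_of[oid]] += 1
        count, pos = tuple(c), argmax(c)
    return MECRNode(li.att | lj.att,     # bitwise union of attribute masks
                    li.values | lj.values, obidset, count, pos)
```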
First of all, the root node of the tree (Lr) contains child nodes such that each node contains a single frequent 1-itemset. After that, procedure CAR-Miner is called with the parameter Lr to mine all CARs from the dataset D.

The CAR-Miner procedure (Fig. 1) considers each node li with all the other nodes lj in Lr, with j > i (Lines 2 and 5), to generate a candidate child node O. With each pair (li, lj), the algorithm checks whether li.att ≠ lj.att or not (Line 6, using Theorem 1). If they are different, it computes the three elements att, values, and Obidset for the new node O (Lines 7–9). Line 10 checks whether the number of object identifiers of li is equal to that of O (by Theorem 2). If this is true, the algorithm can copy all information from node li to node O (Lines 11–12). If the result of Line 10 is false, the algorithm similarly checks lj against O, and if the numbers of their object identifiers are the same (Line 13), it can copy all information from node lj to node O (Lines 14–15). Otherwise, the algorithm computes O.count by using O.Obidset, and then O.pos (Lines 17–18). After computing all of the information for node O, the algorithm adds it to Pi (Pi is initialized empty in Line 4) if O.count[O.pos] ≥ minSup (Lines 19–20). Finally, CAR-Miner is recursively called with the new set Pi as its input parameter (Line 21).

The procedure ENUMERATE-CAR(l, minConf) generates a rule from node l. It first computes the confidence of the rule (Line 22); if the confidence satisfies minConf (Line 23), it adds this rule into the set of CARs (Line 24).
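Since the pseudocode of Fig. 1 is not reproduced here, the following is a hedged sketch of the recursion just described (the function names car_miner and enumerate_car, and the representation of minSup as an absolute count, are our assumptions); the line numbers in the comments refer to Fig. 1 as cited in the text:

```python
def car_miner(nodes, min_sup, min_conf, class_of, cars):
    # Depth-first expansion of the MECR-tree (a sketch of Fig. 1).
    for i, li in enumerate(nodes):
        enumerate_car(li, min_conf, cars)            # Line 3
        Pi = []                                      # Line 4
        for lj in nodes[i + 1:]:                     # Line 5
            o = join(li, lj, class_of)               # Lines 6-18 (Theorems 1, 2)
            if o is not None and o.count[o.pos] >= min_sup:   # Line 19
                Pi.append(o)                         # Line 20
        car_miner(Pi, min_sup, min_conf, class_of, cars)      # Line 21

def enumerate_car(l, min_conf, cars):
    # Generate the rule "itemset -> class at pos" if it is confident enough.
    conf = l.count[l.pos] / len(l.obidset)           # Line 22
    if conf >= min_conf:                             # Line 23
        cars.append((l.values, l.pos, l.count[l.pos], conf))  # Line 24
```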
4.3. An example

In this section, we use the example in Table 1 to describe the CAR-Miner process with minSup = 10% and minConf = 60%. Fig. 2 shows the results of this process.

The MECR-tree is built from the dataset in Table 1 as follows. First, the root node Lr contains all frequent 1-itemsets: 1×a1 / 127(2,1); 1×a2 / 38(0,2); 1×a3 / 456(2,1); 2×b1 / 15(1,1); 2×b2 / 238(0,3); 2×b3 / 467(3,0); 4×c1 / 12346(3,2); 4×c2 / 578(1,2).
After that, procedure CAR-Miner is called with the parameter Lr. We use node li = 1×a2 / 38(0,2) as an example to illustrate the CAR-Miner process. li joins with all nodes following it in Lr:

With node lj = 1×a3 / 456(2,1): li and lj have the same attribute but different values, so nothing is generated from them (Theorem 1).

With node lj = 2×b1 / 15(1,1): because their attributes are different, three elements are computed: O.att = li.att | lj.att = 1 | 2 = 3, or 11 in bit representation; O.values = li.values ∪ lj.values = a2 ∪ b1 = a2b1; and O.Obidset = li.Obidset ∩ lj.Obidset = {3,8} ∩ {1,5} = ∅. Because O.count[O.pos] = 0 < minSup, O is not added to Pi.

With node lj = 2×b2 / 238(0,3): because their attributes are different, three elements are computed: O.att = 1 | 2 = 3, or 11 in bit representation; O.values = a2 ∪ b2 = a2b2; and O.Obidset = {3,8} ∩ {2,3,8} = {3,8}. Because |li.Obidset| = |O.Obidset|, the algorithm copies all information from li to O: O.count = li.count = (0,2) and O.pos = 2. Because O.count[O.pos] = 2 ≥ minSup, O is added to Pi ⇒ Pi = {3×a2b2 / 38(0,2)}.

With node lj = 2×b3 / 467(3,0): because their attributes are different, three elements are computed: O.att = 1 | 2 = 3, or 11 in bit representation; O.values = a2 ∪ b3 = a2b3; and O.Obidset = {3,8} ∩ {4,6,7} = ∅. Because O.count[O.pos] = 0 < minSup, O is not added to Pi.

With node lj = 4×c1 / 12346(3,2): because their attributes are different, three elements are computed: O.att = li.att | lj.att = 1 | 4 = 5, or 101 in bit representation; O.values = a2 ∪ c1 = a2c1; and O.Obidset = {3,8} ∩ {1,2,3,4,6} = {3}. The algorithm computes the additional information O.count = (0,1) and O.pos = 2. Because O.count[O.pos] = 1 ≥ minSup, O is added to Pi ⇒ Pi = {3×a2b2 / 38(0,2); 5×a2c1 / 3(0,1)}.

With node lj = 4×c2 / 578(1,2): because their attributes are different, three elements are computed: O.att = 1 | 4 = 5, or 101 in bit representation; O.values = a2 ∪ c2 = a2c2; and O.Obidset = {3,8} ∩ {5,7,8} = {8}. The algorithm computes the additional information O.count = (0,1) and O.pos = 2. Because O.count[O.pos] = 1 ≥ minSup, O is added to Pi ⇒ Pi = {3×a2b2 / 38(0,2); 5×a2c1 / 3(0,1); 5×a2c2 / 8(0,1)}.
Fig. 1. The proposed algorithm for mining CARs.
Fig. 2. MECR-tree for the dataset in Table 1.
After Pi is created, CAR-Miner is called recursively with parameters Pi, minSup, and minConf to create all child nodes of Pi. Consider the process of making the child nodes of node li = 3×a2b2 / 38(0,2):

With node lj = 5×a2c1 / 3(0,1): because their attributes are different, three elements are computed: O.att = li.att | lj.att = 3 | 5 = 7, or 111 in bit representation; O.values = li.values ∪ lj.values = a2b2 ∪ a2c1 = a2b2c1; and O.Obidset = li.Obidset ∩ lj.Obidset = {3,8} ∩ {3} = {3} = lj.Obidset. The algorithm therefore copies all information of lj to O, i.e., O.count = lj.count = (0,1) and O.pos = 2. Because O.count[O.pos] = 1 ≥ minSup, O is added to Pi ⇒ Pi = {7×a2b2c1 / 3(0,1)}.

Using the same process for node lj = 5×a2c2 / 8(0,1), we have the result Pi = {7×a2b2c1 / 3(0,1); 7×a2b2c2 / 8(0,1)}.
Rules are easily generated in the same step while traversing li (Line 3) by calling procedure ENUMERATE-CAR(li, minConf). For example, when traversing node li = 1×a2 / 38(0,2), the procedure computes the confidence of the candidate rule: conf = li.count[li.pos] / |li.Obidset| = 2/2 = 1. Because conf ≥ minConf (60%), the rule {(A, a2)} → n (2, 1) is added into the rule set CARs. The meaning of this rule is "If A = a2 then class = n" (support = 2 and confidence = 100%).
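Putting the earlier sketches together (dataset, MECRNode, argmax, join, and car_miner, all of which are our assumed reconstructions rather than the authors' code), the whole example can be reproduced as follows; minSup = 10% of 8 objects is taken as an absolute count of 1:

```python
# Build the root level Lr: one node per 1-itemset (attribute bits A=1, B=2, C=4).
class_of = {oid: {"y": 0, "n": 1}[row["class"]]
            for oid, row in enumerate(dataset, start=1)}
Lr = []
for bit, attr in ((1, "A"), (2, "B"), (4, "C")):
    for val in sorted({row[attr] for row in dataset}):
        obids = frozenset(oid for oid, row in enumerate(dataset, start=1)
                          if row[attr] == val)
        c = [0, 0]
        for oid in obids:
            c[class_of[oid]] += 1
        Lr.append(MECRNode(bit, frozenset({(attr, val)}), obids,
                           tuple(c), argmax(c)))

cars = []
car_miner(Lr, min_sup=1, min_conf=0.6, class_of=class_of, cars=cars)
# Among the mined rules: ({("A", "a2")}, class index 1 (n), support 2, conf 1.0)
```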
To show the efficiency of Theorem 2, observe that the algorithm need not compute the information of several itemsets, namely {3×a2b2, 7×a1b1c1, 7×a1b2c1, 7×a1b3c2, 7×a2b2c1, 7×a2b2c2, 7×a3b1c2, 7×a3b3c1}.
5. Experimental results

5.1. Characteristics of experimental datasets
The algorithms used in the experiments were coded in C# 2008 and run on a personal computer with Windows 7, a Centrino 2 2.53 GHz processor, and 4 GB of RAM. The experiments were conducted on datasets obtained from the UCI Machine Learning Repository (http://mlearn.ics.uci.edu). Table 4 shows the characteristics of the experimental datasets.

The experimental datasets had different features. The Breast, German, and Vehicle datasets had many attributes and distinct values but few objects (records). The Led7 dataset had only a few attributes, distinct values, and objects.
5.2. Numbers of rules of the experimental datasets

Figs. 3–7 show the numbers of rules of the datasets in Table 4 for different minimum support thresholds. We used minConf = 50% for all experiments.
The results from Figs. 3–7 show that some datasets had a lot of rules. For example, the Lymph dataset had 4,039,186 rules with minSup = 1%, the German dataset had 752,643 rules with minSup = 1%, etc.

Fig. 3. Numbers of CARs in the Breast dataset for various minSup values.
Fig. 4. Numbers of CARs in the German dataset for various minSup values.
Fig. 5. Numbers of CARs in the Lymph dataset for various minSup values.
Fig. 6. Numbers of CARs in the Led7 dataset for various minSup values.
Fig. 7. Numbers of CARs in the Vehicle dataset for various minSup values.

Table 4
The characteristics of the experimental datasets (columns: Dataset, #attrs, #classes, #distinct values, #Objs).
5.3. Execution time

Experiments were then performed to compare the execution times of CAR-Miner and ECR-CARM (Vo & Le, 2008). The results are shown in Figs. 8–12.

The results in Figs. 8–12 show CAR-Miner to be more efficient than ECR-CARM in all of the experiments. For example, consider the Breast dataset with minSup = 0.1%: the mining time for CAR-Miner was 1.517 s, while that for ECR-CARM was 17.136 s. The ratio was $\frac{1.517}{17.136} \times 100\% = 8.85\%$.

Fig. 8. The execution time for CAR-Miner and ECR-CARM in the Breast dataset.
Fig. 9. The execution time for CAR-Miner and ECR-CARM in the German dataset.
Fig. 10. The execution time for CAR-Miner and ECR-CARM in the Lymph dataset.
Fig. 11. The execution time for CAR-Miner and ECR-CARM in the Led7 dataset.
Fig. 12. The execution time for CAR-Miner and ECR-CARM in the Vehicle dataset.
6. Conclusions and future work

This paper proposed a new algorithm for mining CARs using a tree structure. Each node in the tree contains some information for fast computation of the support of the candidate rule. In addition, using Obidsets, we were able to compute the supports of itemsets quickly. Some theorems were also developed; based on these theorems, we did not need to compute the information for a lot of nodes in the tree. With these improvements, the proposed algorithm had better performance than the previous algorithm in all results.

Mining itemsets from incremental databases has been developed in recent years (Gharib, Nassar, Taha, & Abraham, 2010; Hong & Wang, 2010; Hong, Lin, & Wu, 2009; Hong, Wang, & Tseng, 2011; Lin, Hong, & Lu, 2009). It can be seen that this saves a lot of time and memory compared with mining the whole integrated database from scratch. Therefore, in the future, we will study how to use this approach for mining CARs.
Acknowledgements

This work was supported by Vietnam's National Foundation for Science and Technology Development (NAFOSTED) under Grant No. 102.01-2012.47.

This paper was completed while the second author was visiting the Vietnam Institute for Advanced Study in Mathematics (VIASM), Ha Noi, Viet Nam.
References
Chien, Y. W. C., & Chen, Y. L. (2010). Mining associative classification rules with stock trading data – A GA-based method. Knowledge-Based Systems, 23(6), 605–614.
Coenen, F., Leng, P., & Zhang, L. (2007). The effect of threshold values on association rule based classification accuracy. Data and Knowledge Engineering, 60(2), 345–360.
Gharib, T. F., Nassar, H., Taha, M., & Abraham, A. (2010). An efficient algorithm for incremental mining of temporal association rules. Data and Knowledge Engineering, 69(8), 800–815.
Giuffrida, G., Chu, W. W., & Hanssens, D. M. (2000). Mining classification rules from datasets with large number of many-valued attributes. In 7th international conference on extending database technology: Advances in database technology (EDBT'00) (pp. 335–349). Munich, Germany.
Hong, T. P., & Wang, C. J. (2010). An efficient and effective association-rule maintenance algorithm for record modification. Expert Systems with Applications, 37(1), 618–626.
Hong, T. P., Lin, C. W., & Wu, Y. L. (2009). Maintenance of fast updated frequent pattern trees for record deletion. Computational Statistics and Data Analysis, 53(7), 2485–2499.
Hong, T. P., Wang, C. Y., & Tseng, S. S. (2011). An incremental mining algorithm for maintaining sequential patterns using pre-large sequences. Expert Systems with Applications.
Kaya, M. (2010). Autonomous classifiers with understandable rule using multi-objective genetic algorithms. Expert Systems with Applications, 37(4), 3489–3494.
Li, W., Han, J., & Pei, J. (2001). CMAR: Accurate and efficient classification based on multiple class-association rules. In 1st IEEE international conference on data mining (pp. 369–376). San Jose, CA, USA.
Lim, A. H. L., & Lee, C. S. (2010). Processing online analytics with classification and association rule mining. Knowledge-Based Systems, 23(3), 248–255.
Lin, C. W., Hong, T. P., & Lu, W. H. (2009). The pre-FUFP algorithm for incremental mining. Expert Systems with Applications, 36(5), 9498–9505.
Liu, B., Hsu, W., & Ma, Y. (1998). Integrating classification and association rule mining. In 4th international conference on knowledge discovery and data mining (pp. 80–86). New York, USA.
Liu, B., Ma, Y., & Wong, C. K. (2000). Improving an association rule based classifier. In 4th European conference on principles of data mining and knowledge discovery (pp. 80–86). Lyon, France.
Liu, Y. Z., Jiang, Y. C., Liu, X., & Yang, S. L. (2008). CSMC: A combination strategy for multiclass classification based on multiple association rules. Knowledge-Based Systems, 21(8), 786–793.
Nguyen, T. T. L., Vo, B., Hong, T. P., & Thanh, H. C. (2012). Classification based on association rules: A lattice-based approach. Expert Systems with Applications, 39(13), 11357–11366.
Priss, U. (2002). A classification of associative and formal concepts. In The Chicago Linguistic Society's 38th annual meeting (pp. 273–284). Chicago, USA.
Qodmanan, H. R., Nasiri, M., & Minaei-Bidgoli, B. (2011). Multi objective association rule mining with genetic algorithm without specifying minimum support and minimum confidence. Expert Systems with Applications, 38(1), 288–298.
Quinlan, J. R. (1992). C4.5: Program for machine learning. Morgan Kaufmann.
Sun, Y., Wang, Y., & Wong, A. K. C. (2006). Boosting an associative classifier. IEEE Transactions on Knowledge and Data Engineering, 18(7), 988–992.
Thabtah, F., Cowling, P., & Hammoud, S. (2006). Improving rule sorting, predictive accuracy and training time in associative classification. Expert Systems with Applications, 31(2), 414–426.
Thabtah, F., Cowling, P., & Peng, Y. (2004). MMAC: A new multi-class, multi-label associative classification approach. In 4th IEEE international conference on data mining (pp. 217–224). Brighton, UK.
Thabtah, F., Cowling, P., & Peng, Y. (2005). MCAR: Multi-class classification based on association rule. In 3rd ACS/IEEE international conference on computer systems and applications (pp. 33–39). Tunis, Tunisia.
Thonangi, R., & Pudi, V. (2005). ACME: An associative classifier based on maximum entropy principle. In 16th international conference on algorithmic learning theory (pp. 122–134). LNAI 3734, Singapore.
Tolun, M. R., & Abu-Soud, S. M. (1998). ILA: An inductive learning algorithm for production rule discovery. Expert Systems with Applications, 14(3), 361–370.
Tolun, M. R., Sever, H., Uludag, M., & Abu-Soud, S. M. (1999). ILA-2: An inductive learning algorithm for knowledge discovery. Cybernetics and Systems, 30(7), 609–628.
Veloso, A., Meira, W., Jr., & Zaki, M. J. (2006). Lazy associative classification. In 2006 IEEE international conference on data mining (ICDM'06) (pp. 645–654). Hong Kong, China.
Veloso, A., Meira, W., Jr., Goncalves, M., & Zaki, M. J. (2007). Multi-label lazy associative classification. In 11th European conference on principles of data mining and knowledge discovery (pp. 605–612). Warsaw, Poland.
Veloso, A., Meira, W., Jr., Goncalves, M., Almeida, H. M., & Zaki, M. J. (2011). Calibrated lazy associative classification. Information Sciences, 181(13), 2656–2670.
Vo, B., & Le, B. (2008). A novel classification algorithm based on association rule mining. In The 2008 Pacific Rim knowledge acquisition workshop (held with PRICAI'08) (pp. 61–75). LNAI 5465, Ha Noi, Viet Nam.
Yang, G., Mabu, S., Shimada, K., & Hirasawa, K. (2011). An evolutionary approach to rank class association rules with feedback mechanism. Expert Systems with Applications, 38(12), 15040–15048.
Yin, X., & Han, J. (2003). CPAR: Classification based on predictive association rules. In SIAM international conference on data mining (SDM'03) (pp. 331–335). San Francisco, CA, USA.
Zhang, X., Chen, G., & Wei, Q. (2011). Building a highly-compact and accurate associative classifier. Applied Intelligence, 34(1), 74–86.
Zhao, S., Tsang, E. C. C., Chen, D., & Wang, X. Z. (2010). Building a rule-based classifier – A fuzzy-rough set approach. IEEE Transactions on Knowledge and Data Engineering, 22(5), 624–638.