one), the results are similar. Some parameterizations of rank swapping (Rank with parameter p in the Table) and microaggregation (Micmul with parameter k in the Table) are ranked among the best algorithms both in (Domingo-Ferrer and Torra, 2001b) and here.
The comparison can be extended by evaluating new masking methods and comparing them with the existing scores. For example, the results in (Jimenez and Torra, 2009) would make it possible to include in this table (with a score lower than 40) some parameterizations of lossy compression using JPEG 2000.
35.6.2 R-U Maps
(Duncan et al., 2001, Duncan et al., 2004) propose R-U maps, short for Risk-Utility maps. An R-U map is a graphical representation of the two measures, R for risk and U for utility.
Figure 35.2 represents an R-U map for the methods listed in the previous section, each with several parameterizations. Namely, RankXXX corresponds to Rank Swapping, MicXXX are variations of Microaggregation, JPEGXXX corresponds to Lossy Compression using JPEG, and RemuestX is resampling (not described in this chapter). In the figure, DR corresponds to the Disclosure Risk (R following the standard jargon of R-U maps), and IL to information loss (in our case computed as aPIL). Formally, IL and utility U are related as follows: 1 − U = IL.
Note that in addition to the protection procedures represented in Table 35.1, the figure includes all the other methods analyzed in (Domingo-Ferrer and Torra, 2001b), but with the new measures DR and aPIL described above. In this figure, the lines represent scores of 50, 40, 30, and 20. Naturally, the nearer a method is to (0,0), the better.
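The relation between the plotted points and the score lines can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the overall score is taken to be the plain average of IL and DR (this section does not state the formula), both measures are assumed to lie on a 0-100 scale (so the relation 1 − U = IL becomes U = 100 − IL), and the function names and the two example points are hypothetical.

```python
# Sketch only: the averaging formula and the 0-100 scale are assumptions,
# not something this section states.

def utility(il: float) -> float:
    """Utility from information loss, i.e. 1 - U = IL rescaled to 0-100."""
    return 100.0 - il

def score(il: float, dr: float) -> float:
    """Assumed overall score: the plain average of IL and DR."""
    return 0.5 * il + 0.5 * dr

def best_band(il: float, dr: float, levels=(20, 30, 40, 50)):
    """Smallest score line the point lies on or below, or None if above all."""
    s = score(il, dr)
    for level in sorted(levels):
        if s <= level:
            return level
    return None

# Hypothetical (IL, DR) pairs, invented for illustration only.
methods = {"method_a": (30.0, 20.0), "method_b": (10.0, 70.0)}

# Rank methods from best (closest to the origin in score terms) to worst.
ranked = sorted(methods, key=lambda name: score(*methods[name]))
```

Under the averaging assumption each score line is the straight line IL + DR = 2 × score, so a method closer to (0,0) falls under a lower line.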
35.7 Conclusions
In this chapter we have reviewed the major topics concerning privacy in data mining. We have reviewed major protection methods, and discussed how to measure disclosure risk and information loss. Finally, some tools for visualizing such measures and for comparing the methods have been described.
Acknowledgements
Part of the research described in this chapter is supported by the Spanish MEC (projects ARES – CONSOLIDER INGENIO 2010 CSD2007-00004 – and eAEGIS – TSI2007-65406-C03-02).
References
Adam, N R., Wortmann, J C (1989) Security-control for statistical databases: a comparative study, ACM Computing Surveys, Volume: 21, 515-556
Aggarwal, C (2005) On k-anonymity and the curse of dimensionality, Proceedings of the
31st International Conference on Very Large Databases, pages 901-909
Aggarwal, C C., Yu, P S (2008) Privacy-Preserving Data Mining: Models and Algorithms, Springer
Fig. 35.2. R-U Maps for some protection methods. IL computed with PIL. [Figure: Risk/Utility Map with DR as an axis label and tick marks from 0 to 100; each plotted point is one parameterization of a protection method.]
Agrawal, R., Srikant, R (2000) Privacy Preserving Data Mining, Proc of the ACM SIGMOD Conference on Management of Data, 439-450
Atallah, M., Bertino, E., Elmagarmid, A., Ibrahim, M., Verykios, V (1999) Disclosure limitation of sensitive rules, Proc of IEEE Knowledge and Data Engineering Exchange Workshop (KDEX)
Atzori, M., Bonchi, F., Giannotti, F., Pedreschi, D (2008) Anonymity preserving pattern discovery, The VLDB Journal 17 703-727
Bacher, J., Brand, R., Bender, S (2002) Re-identifying register data by survey data using cluster analysis: an empirical study, Int J of Unc., Fuzz and Knowledge Based Systems 10:5 589-607
Bertino, E., Lin, D., Jiang, W (2008) A survey of quantification of privacy preserving data mining algorithms, in C C Aggarwal, P S Yu (eds.) Privacy-Preserving Data Mining: Models and Algorithms, Springer, 183-205
Brand, R (2002) Microdata protection through noise addition, in J Domingo-Ferrer (ed.) Inference Control in Statistical Databases, Lecture Notes in Computer Science 2316 97-116
Bunn, P., Ostrovsky, R (2007) Secure two-party k-means clustering, Proc of CCS’07, ACM Press, 486-497
Burridge, J (2003) Information preserving statistical obfuscation, Statistics and Computing, 13:321–327
Carlson, M., Salabasis, M (2002) A data swapping technique using ranks: a method for disclosure control, Research on Official Statistics 5:2 35-64
Dalenius, T (1977) Towards a methodology for statistical disclosure control, Statistisk Tidskrift 5 429-444
Dalenius, T (1986) Finding a needle in a haystack - or identifying anonymous census records, Journal of Official Statistics 2:3 329-336
Defays, D., Nanopoulos, P (1993) Panels of enterprises and confidentiality: the small aggregates method, Proc of 92 Symposium on Design and Analysis of Longitudinal Surveys, Statistics Canada, 195-204
Dempster, A P., Laird, N M., Rubin, D B (1977) Maximum Likelihood From Incomplete Data Via the EM Algorithm, Journal of the Royal Statistical Society 39 1-38
Domingo-Ferrer, J., Mateo-Sanz, J M (2002) Practical data-oriented microaggregation for statistical disclosure control, IEEE Trans on Knowledge and Data Engineering 14:1 189-201
Domingo-Ferrer, J., Mateo-Sanz, J M., Torra, V (2001) Comparing SDC methods for microdata on the basis of information loss and disclosure risk, Pre-proceedings of ETK-NTTS’2001, (Eurostat, ISBN 92-894-1176-5), Vol 2, 807-826, Crete, Greece
Domingo-Ferrer, J., Sebe, F., Castella-Roca, J (2004) On the security of noise addition for privacy in statistical databases, PSD 2004, Lecture Notes in Computer Science 3050 149-161
Domingo-Ferrer, J., Torra, V (2001) Disclosure Control Methods and Information Loss for Microdata, in P Doyle, J I Lane, J J M Theeuwes, L Zayatz (eds.) Confidentiality, Disclosure, and Data Access: Theory and Practical Applications for Statistical Agencies, Elsevier Science, 91-110
Domingo-Ferrer, J., Torra, V (2001) A quantitative comparison of disclosure control methods for microdata, in P Doyle, J I Lane, J J M Theeuwes, L Zayatz (eds.) Confidentiality, Disclosure and Data Access: Theory and Practical Applications for Statistical Agencies, North-Holland, 111-134
Domingo-Ferrer, J., Torra, V (2003) Disclosure Risk Assessment in Statistical Microdata Protection via advanced record linkage, Statistics and Computing, 13 343-354
Domingo-Ferrer, J., Torra, V (2005) Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation, Data Mining and Knowledge Discovery 11:2 195-212
Duncan, G T., Keller-McNulty, S A., Stokes, S L (2001) Disclosure risk vs data utility: The R-U confidentiality map, Technical Report 121, National Institute of Statistical Sciences
Duncan, G T., Keller-McNulty, S A., Stokes, S L (2001) Database security and confidentiality: examining disclosure risk vs data utility through the R-U confidentiality map, Technical Report 142, National Institute of Statistical Sciences
Duncan, G T., Lambert, D (1986) Disclosure-limited data dissemination, Journal of the American Statistical Association, 81 10-18
Duncan, G T., Lambert, D (1989) The risk of disclosure for microdata, Journal of Business and Economic Statistics 7 207-217
Elamir, E A H (2004) Analysis of re-identification risk based on log-linear models, PSD
2004, Lecture Notes in Computer Science 3050 273-281
Elliot, M (2002) Integrating file and record level disclosure risk assessment, in J Domingo-Ferrer, Inference Control in Statistical Databases, Lecture Notes in Computer Science
2316 126-134
Elliot, M J., Skinner, C J., Dale, A (1998) Special Uniqueness, Random Uniques and Sticky Populations: Some Counterintuitive Effects of Geographical Detail on Disclosure Risk, Research in Official Statistics 1:2 53-67
Fellegi, I P., Sunter, A B (1969) A theory for record linkage, Journal of the American Statistical Association 64:328 1183-1210
Felsö, F., Theeuwes, J., Wagner, G (2001) Disclosure Limitation in Use: Results of a Survey,
in P Doyle, J I Lane, J J M Theeuwes, L Zayatz (eds.) Confidentiality, Disclosure, and Data Access: Theory and Practical Applications for Statistical Agencies, Elsevier Science, 17-42
Franconi, L., Polettini, S (2004) Individual risk estimation in μ-Argus: a review, PSD 2004, Lecture Notes in Computer Science 3050 262-272
Gouweleeuw, J M., Kooiman, P., Willenborg, L C R J., De Wolf, P.-P (1998) Post Randomisation for Statistical Disclosure Control: Theory and Implementation, Journal of Official Statistics 14:4 463-478. Also as Research Paper No 9731, Voorburg: Statistics Netherlands (1997)
Gross, B., Guiblin, P., Merrett, K (2004) Implementing the Post Randomisation method
to the individual sample of anonymised records (SAR) from the 2001 Census, paper presented at “The Samples of Anonymised Records, An Open Meeting on the Samples of Anonymised Records from the 2001 Census” http://www.ccsr.ac.uk/sars/events/2004-09-30/gross.pdf
Hansen, S., Mukherjee, S (2003) A Polynomial Algorithm for Optimal Univariate Microaggregation, IEEE Trans on Knowledge and Data Engineering 15:4 1043-1044
Haritsa, J R (2008) Mining association rules under privacy constraints, in C C Aggarwal,
P S Yu (eds.) Privacy-Preserving Data Mining: Models and Algorithms, Springer, 239-266
Hundepool, A., van de Wetering, A., Ramaswamy, R., Franconi, L., Capobianchi, C., de Wolf, P.-P., Domingo-Ferrer, J., Torra, V., Brand, R., Giessing, S (2003) μ-ARGUS version 3.2 Software and User’s Manual, Voorburg NL, Statistics Netherlands, February 2003; version 4.0 published in May 2005 http://neon.vb.cbs.nl/casc
Jaro, M A (1989) Advances in record-linkage methodology as applied to matching the 1985 Census of Tampa, Florida, Journal of the American Statistical Association 84:406 414-420
Jiménez, J., Torra, V (2009) Utility and risk of JPEG-based continuous microdata protection methods, Proc Int Conf on Availability, Reliability and Security (ARES 2009), 929-934
Kantarcioglu, M (2008) A survey of privacy-preserving methods across horizontally partitioned data, in C C Aggarwal, P S Yu (eds.) Privacy-Preserving Data Mining: Models and Algorithms, Springer, 313-335
Kim, J., Winkler, W (2003) Multiplicative noise for masking continuous data, Research Report Series (Statistics 2003-01), U S Bureau of the Census
Kisilevich S., Rokach L., Elovici Y., Shapira B., Efficient Multidimensional Suppression for K-Anonymity, IEEE Transactions on Knowledge and Data Engineering, vol 22, no 3, pp 334-347, Mar 2010
Ladra, S., Torra, V (2008) On the comparison of generic information loss measures and cluster-specific ones, Intl J of Unc., Fuzz and Knowledge-Based Systems, 16:1 107-120
Lambert, D (1993) Measures of Disclosure Risk and Harm, Journal of Official Statistics 9 313-331
LeFevre, K., DeWitt, D J., Ramakrishnan, R (2005) Multidimensional k-anonymity, Technical Report 1521, University of Wisconsin
LeFevre, K., DeWitt, D J., Ramakrishnan, R (2005) Incognito: Efficient Full-Domain K-Anonymity, SIGMOD 2005
Li, N., Li, T., Venkatasubramanian, S (2007) T-closeness: privacy beyond k-anonymity and l-diversity, Proc of the IEEE ICDE 2007
Liew, C K., Choi, U J., Liew, C J (1985) A data distortion by probability distribution, ACM Transactions on Database Systems 10 395-411
Lindell, Y., Pinkas, B (2002) Privacy Preserving Data Mining, Journal of Cryptology, 15:3
Lindell, Y., Pinkas, B (2000) Privacy Preserving Data Mining, Crypto’00, Lecture Notes in Computer Science 1880 20-24
Liu, K., Kargupta, H., Ryan, J (2006) Random projection based multiplicative data perturbation for privacy preserving data mining, IEEE Trans on Knowledge and Data Engineering 18:1 92-106
Machanavajjhala, A., Gehrke, J., Kiefer, D., Venkitasubramanian, M (2006) L-diversity: privacy beyond k-anonymity, Proc of the IEEE ICDE
Mateo-Sanz, J M., Domingo-Ferrer, J., Sebé, F (2005) Probabilistic information loss measures in confidentiality protection of continuous microdata, Data Mining and Knowledge Discovery, 11:2 181-193
Moore, R (1996) Controlled data swapping techniques for masking public use microdata sets, U S Bureau of the Census (unpublished manuscript)
Muralidhar, K., Sarathy, R (2008) Generating Sufficiency-based Non-Synthetic Perturbed Data, Transactions on Data Privacy 1:1 17 - 33
Nin, J., Herranz, J., Torra, V (2007) Rethinking Rank Swapping to Decrease Disclosure Risk, Data and Knowledge Engineering, 64:1 346-364
Nin, J., Herranz, J., Torra, V (2008) How to Group Attributes in Multivariate Microaggregation, Intl J of Unc., Fuzz and Knowledge-Based Systems, 16:1 121-138
Nin, J., Herranz, J., Torra, V (2008) On the Disclosure Risk of Multivariate Microaggregation, Data and Knowledge Engineering, 67:3 399-412
Nin, J., Herranz, J., Torra, V (2008) Towards a More Realistic Disclosure Risk Assessment, Lecture Notes in Computer Science, 5262 152-165
Nin, J., Torra, V (2006) Extending microaggregation procedures for time series protection, Lecture Notes in Artificial Intelligence, 4259 899-908
Nin, J., Torra, V (2009) Analysis of the Univariate Microaggregation Disclosure Risk, New Generation Computing, 27 177-194
Oganian, A., Domingo-Ferrer, J (2000) On the Complexity of Optimal Microaggregation for Statistical Disclosure Control, Statistical J United Nations Economic Commission for Europe, 18, 4, 345-354
Paass, G (1985) Disclosure risk and disclosure avoidance for microdata, Journal of Business and Economic Statistics 6 487-500
Paass, G., Wauschkuhn, U (1985) Datenzugang, Datenschutz und Anonymisierung - Analysepotential und Identifizierbarkeit von Anonymisierten Individualdaten, Oldenbourg Verlag
Pagliuca, D., Seri, G (1999) Some results of individual ranking method on the system of enterprise accounts annual survey, Esprit SDC Project, Deliverable MI-3/D2
Pinkas, B (2002) Cryptographic techniques for privacy-preserving data mining, ACM SIGKDD Explorations 4:2
Ravikumar, P., Cohen, W W (2004) A hierarchical graphical model for record linkage, Proc
of UAI 2004
Rokach L., Genetic algorithm-based feature set partitioning for classification problems, Pattern Recognition, 41(5):1676–1700, 2008
Rokach L., Maimon O and Lavi I., Space Decomposition In Data Mining: A Clustering Approach, Proceedings of the 14th International Symposium On Methodologies For Intelligent Systems, Maebashi, Japan, Lecture Notes in Computer Science, Springer-Verlag, 2003, pp 24–31
Samarati, P (2001) Protecting Respondents’ Identities in Microdata Release, IEEE Trans on Knowledge and Data Engineering, 13:6 1010-1027
Samarati, P., Sweeney, L (1998) Protecting privacy when disclosing information:
k-anonymity and its enforcement through generalization and suppression, SRI Intl Tech Rep
Spruill, N L (1983) The confidentiality and analytic usefulness of masked business microdata, Proc of the Section on Survey Research Methods 1983, American Statistical Association, 602-610
Sweeney, L (2002) Achieving k-anonymity privacy protection using generalization and suppression, Int J of Unc., Fuzz and Knowledge Based Systems 10:5 571-588
Sweeney, L (2002) k-anonymity: a model for protecting privacy, Int J of Unc., Fuzz and
Knowledge Based Systems 10:5 557-570
Takemura, A (2002) Local recoding and record swapping by maximum weight matching for disclosure control of microdata sets, Journal of Official Statistics 18 275-289 Preprint (1999) Local recoding by maximum weight matching for disclosure control of microdata sets
Templ, M (2008) Statistical Disclosure Control for Microdata Using the R-Package sdcMicro, Transactions on Data Privacy 1 67-85
Torra, V (2004) Microaggregation for categorical variables: a median based approach, Proc Privacy in Statistical Databases (PSD 2004), Lecture Notes in Computer Science 3050 162-174
Torra, V (2004) OWA operators in data modeling and reidentification, IEEE Trans on Fuzzy Systems 12:5 652-660
Torra, V (2008) Constrained Microaggregation: Adding Constraints for Data Editing, Transactions on Data Privacy 1:2 86-104
Torra, V., Abowd, J M., Domingo-Ferrer, J (2006) Using Mahalanobis Distance-Based Record Linkage for Disclosure Risk Assessment, Lecture Notes in Computer Science
4302 233-242
Torra, V., Domingo-Ferrer, J (2003) Record linkage methods for multidatabase data mining,
in V Torra (ed.) Information Fusion in Data Mining, Springer, 101-132
Torra, V., Miyamoto, S (2004) Evaluating fuzzy clustering algorithms for microdata protection, PSD 2004, Lecture Notes in Computer Science 3050 175-186
Trottini, M (2003) Decision models for data disclosure limitation, PhD Dissertation, Carnegie Mellon University http://www.niss.org/dgii/TR/Thesis-Trottini-final.pdf
Truta, T M., Vinay, B (2006) Privacy protection: p-sensitive k-anonymity property, Proc 2nd Int Workshop on Privacy Data management (PDM 2006) p 94
Willenborg, L., de Waal, T (2001) Elements of Statistical Disclosure Control, Lecture Notes
in Statistics, Springer-Verlag
Winkler, W E (1993) Matching and record linkage, Statistical Research Division, U S Bureau of the Census (USA), RR93/08
Winkler, W E (2004) Re-identification methods for masked microdata, PSD 2004, Lecture Notes in Computer Science 3050 216-230
Yancey, W E., Winkler, W E., Creecy, R H (2002) Disclosure risk assessment in perturbative microdata protection, in J Domingo-Ferrer (ed.) Inference Control in Statistical Databases, Lecture Notes in Computer Science 2316 135-152
Yao, A C (1982) Protocols for Secure Computations, Proc of 23rd IEEE Symposium on Foundations of Computer Science, Chicago, Illinois, 160-164
http://www.census.gov
Meta-Learning - Concepts and Techniques
Ricardo Vilalta1, Christophe Giraud-Carrier2, and Pavel Brazdil3
1 University of Houston
2 Brigham Young University
3 University of Porto
Summary. The field of meta-learning has as one of its primary goals the understanding of the interaction between the mechanism of learning and the concrete contexts in which that mechanism is applicable. The field has seen continuous growth in the past years, with interesting new developments in the construction of practical model-selection assistants, task-adaptive learners, and a solid conceptual framework. In this chapter we give an overview of different techniques necessary to build meta-learning systems. We begin by describing an idealized meta-learning architecture comprising a variety of relevant component techniques. We then look at how each technique has been studied and implemented by previous research. In addition, we show how meta-learning has already been identified as an important component in real-world applications.
Key words: Meta-learning
36.1 Introduction
We are used to thinking of a learning system as a rational agent capable of adapting to a specific environment by exploiting knowledge gained through experience; encountering multiple and diverse scenarios sharpens the ability of the learning system to predict the effect produced by selecting a particular course of action. In this case, learning is made manifest because the quality of the predictions normally improves with an increasing number of scenarios or examples. Nevertheless, if the predictive mechanism were to start afresh on different tasks, the learning system would find itself at a considerable disadvantage; learning systems capable of modifying their own predictive mechanism would soon outperform our base learner by being able to change their learning strategy according to the characteristics of the task under analysis.
Meta-learning differs from base-learning in the scope of the level of adaptation; whereas learning at the base-level is based on accumulating experience on a specific learning task (e.g., credit rating, medical diagnosis, mine-rock discrimination, fraud detection, etc.), learning at the meta-level is based on accumulating experience on the performance of multiple applications of a learning system. If a base-learner fails to perform efficiently, one would expect the learning mechanism itself to adapt in case the same task is presented again. Meta-learning is then important in understanding the interaction between the mechanism of learning and the concrete contexts in which that mechanism is applicable. Briefly stated, the field of meta-learning is focused on the relation between tasks or domains and meta-learning strategies. In that sense, by learning or explaining what causes a learning system to be successful or not on a particular task or domain, we go beyond the goal of producing more accurate learners to the additional goal of understanding the conditions (e.g., types of example distributions) under which a learning strategy is most appropriate.
DOI 10.1007/978-0-387-09823-4_36, © Springer Science+Business Media, LLC 2010
From a practical stance, meta-learning can solve important problems in the application of machine learning (ML) and Data Mining (DM) tools, particularly in the area of classification and regression. First, the successful use of these tools outside the boundaries of research (e.g., industry, commerce, government) is conditioned on the appropriate selection of a suitable predictive model (or combination of models) according to the domain of application. Without any kind of assistance, model selection and combination can turn into stumbling blocks for the end-user who wishes to access the technology more directly and cost-effectively. End-users often lack not only the expertise necessary to select a suitable model, but also the availability of many models to proceed on a trial-and-error basis (e.g., by measuring accuracy via some re-sampling technique such as n-fold cross-validation). A solution to this problem is attainable through the construction of meta-learning systems. These systems can provide automatic and systematic user guidance by mapping a particular task to a suitable model (or combination of models). Second, a problem commonly observed in the practical use of ML and DM tools is how to profit from the repetitive use of a predictive model over similar tasks. The successful application of models in real-world scenarios requires a continuous adaptation to new needs. Rather than starting afresh on new tasks, we expect the learning mechanism itself to re-learn, taking into account previous experience (Thrun, 1998, Pratt et al., 1991, Caruana, 1997, Vilalta and Drissi, 2002). Again, meta-learning systems can help control the process of exploiting cumulative expertise by searching for patterns across tasks.
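The trial-and-error procedure mentioned above (measuring accuracy by n-fold cross-validation and keeping the best model) can be sketched as follows. This is a minimal illustration, not the chapter's method: the two candidate learners, the function names and the toy data are all hypothetical, and only the Python standard library is used.

```python
import random
from collections import Counter

def majority_learner(train):
    """Baseline: always predict the most frequent training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label

def nearest_neighbor_learner(train):
    """1-nearest-neighbour using squared Euclidean distance."""
    def predict(x):
        _, y = min(train, key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)))
        return y
    return predict

def cross_val_accuracy(learner, data, n_folds=5, seed=0):
    """Mean held-out accuracy over n_folds disjoint folds."""
    data = data[:]                          # do not reorder the caller's list
    random.Random(seed).shuffle(data)
    folds = [data[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for i, test in enumerate(folds):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        model = learner(train)
        accuracies.append(sum(model(x) == y for x, y in test) / len(test))
    return sum(accuracies) / n_folds

def select_model(learners, data, n_folds=5):
    """Trial and error: evaluate every candidate, keep the most accurate."""
    scores = {name: cross_val_accuracy(fn, data, n_folds) for name, fn in learners.items()}
    return max(scores, key=scores.get), scores

# Toy data (invented): two well-separated clusters on the first feature.
data = [((float(i), 0.0), "a") for i in range(10)] + \
       [((float(i) + 100.0, 0.0), "b") for i in range(10)]
best, scores = select_model(
    {"majority": majority_learner, "1-nn": nearest_neighbor_learner}, data)
```

Every candidate is retrained once per fold, so the cost grows with the number of candidate models; this is the expense a meta-learning system tries to avoid by recommending a model directly.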
Our goal in this chapter is to give an overview of different techniques necessary to build meta-learning systems. To impose some structure, we begin by describing an idealized meta-learning architecture comprising a variety of relevant component techniques. We then look at how each technique has been studied and implemented by previous research. We hope that by proceeding in this way the reader can not only learn from past work, but also gain some insight into how to construct meta-learning systems.
We also hope to show how recent advances in meta-learning are increasingly filling the gaps in the construction of practical model-selection assistants and task-adaptive learners, as well as in the development of a solid conceptual framework (Baxter, 1998, Baxter, 2000, Giraud-Carrier et al., 2004).
This chapter is organized as follows. In the next section we illustrate an idealized meta-learning architecture and detail its constituent parts. In Section 36.3 we describe previous research in meta-learning and its relation to our architecture. Section 36.4 describes a meta-learning tool that has been instrumental as a decision support tool in real applications. Lastly, Section 36.5 discusses future directions and provides our conclusions.
36.2 A Meta-Learning Architecture
In this section we provide a general view of a software architecture that will be used as a reference to describe many of the principles and current techniques in meta-learning. Though not every technique in meta-learning fits into this architecture, such a general view helps us understand the challenges we need to overcome before we can turn the technology into a set of useful and practical tools.
36.2.1 Knowledge-Acquisition Mode
To begin, we propose a meta-learning system that divides into two modes of operation. During the first mode, also known as the knowledge-acquisition mode, the main goal is to learn about the learning process itself. Figure 36.1 illustrates this mode of operation. We assume the input to the system is made of more than one dataset of examples (e.g., more than one set of pairs of feature vectors and classes; Figure 36.1A). Upon arrival of each dataset, the meta-learning system invokes a component responsible for extracting dataset characteristics or meta-features (Figure 36.1B). The goal of this component is to gather information that transcends the particular domain of application. We look for information that can be used to generalize to other example distributions. Section 36.3.1 details current research pointing in this direction.
During the knowledge-acquisition mode, the learning technique (Figure 36.1C) does not exploit knowledge across different datasets or tasks. Each dataset is considered independently of the rest; the output of the system is a learning strategy (e.g., a classifier or combination of classifiers, Figure 36.1D). Statistics derived from the output model or its performance (Figure 36.1E) may also serve as a form of characterizing the task under analysis (Sections 36.3.1 and 36.3.1).
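As a rough illustration of the meta-feature generator (Figure 36.1B), the sketch below computes a few simple dataset characteristics. The particular meta-features (dataset size, dimensionality, number of classes, class entropy) and the function name are assumptions chosen for illustration; the text above does not prescribe which characteristics to extract.

```python
import math
from collections import Counter

def extract_meta_features(examples):
    """Compute simple, domain-independent characteristics of a dataset.

    `examples` is a non-empty list of (feature_vector, class_label) pairs.
    The choice of meta-features here is illustrative only.
    """
    n = len(examples)
    labels = Counter(y for _, y in examples)
    # Shannon entropy of the class distribution, in bits.
    entropy = -sum((c / n) * math.log2(c / n) for c in labels.values())
    return {
        "n_examples": n,
        "n_features": len(examples[0][0]),
        "n_classes": len(labels),
        "class_entropy": entropy,
    }

# Hypothetical dataset: four examples, two features, two balanced classes.
dataset = [((1.0, 2.0), "pos"), ((0.5, 1.5), "neg"),
           ((3.0, 0.1), "pos"), ((2.2, 0.7), "neg")]
meta = extract_meta_features(dataset)
```

None of these quantities mentions what the features mean in the application domain, which is what allows them to be compared across datasets.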
Information derived from the meta-feature generator and the performance evaluation module can be combined into a meta-knowledge base (Figure 36.1F). This knowledge base is the main result of the knowledge-acquisition phase; it reflects experience accumulated across different tasks. Meta-learning is tightly linked to the process of acquiring and exploiting meta-knowledge. One can even say that advances in the field of meta-learning hinge on one specific question: how can we acquire and exploit knowledge about learning systems (i.e., meta-knowledge) to understand and improve their performance? As we describe current research in meta-learning, we will point out different forms of meta-knowledge.
36.2.2 Advisory Mode
The efficiency of the meta-learner increases as it accumulates meta-knowledge. We assume the lack of experience at the beginning of the learner’s life compels the meta-learner to use one or more learning strategies without a clear preference for one of them; experimenting with many different strategies becomes time consuming. However, as more training sets are examined, we expect the expertise of the meta-learner to dominate in deciding which learning strategy best suits the characteristics of the training set.
In the advisory mode, meta-knowledge acquired in the exploratory (knowledge-acquisition) mode is used to configure the learning system in a manner that exploits the characteristics of the new data distribution. Meta-features extracted from the dataset (Figure 36.2B) are matched with the meta-knowledge base (Figure 36.2F) to produce a recommendation regarding the best available learning strategy. At this point we move away from the use of static base learners to the ability to do model selection or to combine base learners (Figure 36.2C).
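One simple way to picture the matching step is a nearest-neighbour lookup in the meta-knowledge base, sketched below. This is only one possible interpretation and not the architecture's prescribed mechanism: the knowledge-base entries, the strategy names, the assumption that meta-features are already scaled to [0, 1], and the function name are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical meta-knowledge base: each entry pairs the meta-feature vector
# of a previously seen task with the strategy that performed best on it.
# Values are assumed to be scaled to [0, 1] beforehand.
META_KNOWLEDGE = [
    ((0.1, 0.2, 1.0), "strategy_a"),
    ((0.9, 0.8, 0.4), "strategy_b"),
    ((0.2, 0.1, 0.9), "strategy_a"),
]

def recommend(meta_features, knowledge_base, k=1):
    """Return the strategy most common among the k most similar past tasks."""
    ranked = sorted(knowledge_base,
                    key=lambda entry: math.dist(entry[0], meta_features))
    votes = Counter(strategy for _, strategy in ranked[:k])
    return votes.most_common(1)[0][0]

# Meta-features of two new (invented) datasets.
first = recommend((0.15, 0.15, 0.95), META_KNOWLEDGE)
second = recommend((0.8, 0.9, 0.5), META_KNOWLEDGE)
```

The lookup costs one distance computation per stored task, which is far cheaper than training and evaluating every candidate strategy on the new dataset.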
Two observations are worth considering at this point. First, the nature of the match between the set of meta-features and the meta-knowledge base can have several interpretations. The traditional view poses this problem as a learning problem itself, where a meta-learner is invoked to output an approximating function mapping meta-features to learning strategies