
Data Mining and Knowledge Discovery Handbook, 2nd Edition (part 74)


one), the results are similar. Some parameterizations of rank swapping (Rank with parameter p in the Table) and microaggregation (Micmul with parameter k in the Table) are ranked among the best algorithms both in (Domingo-Ferrer and Torra, 2001b) and here.

The comparison can be extended by evaluating new masking methods and comparing them with the existing scores. For example, results from (Jimenez and Torra, 2009) would permit including in this table (with a score lower than 40) some parameterizations of lossy compression using JPEG 2000.

35.6.2 R-U Maps

(Duncan et al., 2001, Duncan et al., 2004) propose R-U maps, short for Risk-Utility maps. An R-U map is a graphical representation of two measures: R for risk and U for utility.

Figure 35.2 represents an R-U map for the methods listed in the previous section, each with several parameterizations. Namely, RankXXX corresponds to Rank Swapping, MicXXX are variations of Microaggregation, JPEGXXX corresponds to Lossy Compression using JPEG, and RemuestX is resampling (not described in this chapter). In the figure, DR corresponds to the Disclosure Risk (R, following the standard jargon of R-U maps), and IL to information loss (in our case computed as aPIL). Formally, IL and utility U are related as follows: 1 − U = IL.

Note that in addition to the protection procedures represented in Table 35.1, the figure includes all the other methods analyzed in (Domingo-Ferrer and Torra, 2001b), but with the new measures DR and aPIL described above. In this figure, the lines represent scores of 50, 40, 30, and 20. Naturally, the nearer a method is to (0,0), the better.
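As a rough illustration of how a point on such a map relates to the score lines, the sketch below assumes that the score is the plain average of IL and DR, both expressed on a 0-100 scale, as in the comparison of (Domingo-Ferrer and Torra, 2001b). This section does not restate that formula, so it should be read as an assumption. The function names and the example values are hypothetical and are not taken from Figure 35.2.

```python
def utility(il, scale=100.0):
    """Utility U from information loss IL, using 1 - U = IL on a 0..1 scale."""
    return 1.0 - il / scale


def score(il, dr):
    """Assumed overall score: the plain average of IL and DR (lower is better)."""
    return 0.5 * il + 0.5 * dr


def rank_methods(points):
    """Sort (name, IL, DR) triples from best (lowest score) to worst."""
    return sorted(points, key=lambda p: score(p[1], p[2]))


# Hypothetical (name, IL, DR) values, made up for illustration only.
points = [
    ("MethodA", 30.0, 20.0),
    ("MethodB", 10.0, 60.0),
    ("MethodC", 50.0, 50.0),
]
ranking = [name for name, _, _ in rank_methods(points)]
```

Under this assumption, all methods with the same score s lie on the straight line IL + DR = 2s, which is one way to read the score lines drawn in the map.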

35.7 Conclusions

In this chapter we have reviewed the major topics concerning privacy in data mining. We have reviewed major protection methods, and discussed how to measure disclosure risk and information loss. Finally, some tools for visualizing such measures and for comparing the methods have been described.

Acknowledgements

Part of the research described in this chapter is supported by the Spanish MEC (projects ARES – CONSOLIDER INGENIO 2010 CSD2007-00004 – and eAEGIS – TSI2007-65406-C03-02).

References

Adam, N R., Wortmann, J C (1989) Security-control for statistical databases: a comparative study, ACM Computing Surveys, Volume: 21, 515-556

Aggarwal, C (2005) On k-anonymity and the curse of dimensionality, Proceedings of the 31st International Conference on Very Large Databases, pages 901-909

Aggarwal, C C., Yu, P S (2008) Privacy-Preserving Data Mining: Models and Algorithms, Springer


[Figure 35.2: Risk/Utility Map plotting DR against IL (axis ticks from 0 to 100). Each point is one parameterization of a protection method: Rank, Mic, JPEG, Adit, Remuest, Distr.]

Fig. 35.2. R-U Maps for some protection methods. IL computed with PIL.

Agrawal, R., Srikant, R (2000) Privacy Preserving Data Mining, Proc of the ACM SIGMOD Conference on Management of Data, 439-450

Atallah, M., Bertino, E., Elmagarmid, A., Ibrahim, M., Verykios, V (1999) Disclosure limitation of sensitive rules, Proc of IEEE Knowledge and Data Engineering Exchange Workshop (KDEX)

Atzori, M., Bonchi, F., Giannotti, F., Pedreschi, D (2008) Anonymity preserving pattern discovery, The VLDB Journal 17 703-727

Bacher, J., Brand, R., Bender, S (2002) Re-identifying register data by survey data using cluster analysis: an empirical study, Int J of Unc., Fuzz and Knowledge Based Systems 10:5 589-607

Bertino, E., Lin, D., Jiang, W (2008) A survey of quantification of privacy preserving data mining algorithms, in C C Aggarwal, P S Yu (eds.) Privacy-Preserving Data Mining: Models and Algorithms, Springer, 183-205

Brand, R (2002) Microdata protection through noise addition, in J Domingo-Ferrer (ed.) Inference Control in Statistical Databases, Lecture Notes in Computer Science 2316 97-116

Bunn, P., Ostrovsky, R (2007) Secure two-party k-means clustering, Proc of CCS’07, ACM Press, 486-497

Burridge, J (2003) Information preserving statistical obfuscation, Statistics and Computing, 13:321–327

Carlson, M., Salabasis, M (2002) A data swapping technique using ranks: a method for disclosure control, Research on Official Statistics 5:2 35-64

Dalenius, T (1977) Towards a methodology for statistical disclosure control, Statistisk Tidskrift 5 429-444

Dalenius, T (1986) Finding a needle in a haystack - or identifying anonymous census records, Journal of Official Statistics 2:3 329-336

Defays, D., Nanopoulos, P (1993) Panels of enterprises and confidentiality: the small aggregates method, Proc of 92 Symposium on Design and Analysis of Longitudinal Surveys, Statistics Canada, 195-204

Dempster, A P., Laird, N M., Rubin, D B (1977) Maximum Likelihood From Incomplete Data Via the EM Algorithm, Journal of the Royal Statistical Society 39 1-38

Domingo-Ferrer, J., Mateo-Sanz, J M (2002) Practical data-oriented microaggregation for statistical disclosure control, IEEE Trans on Knowledge and Data Engineering 14:1 189-201

Domingo-Ferrer, J., Mateo-Sanz, J M., Torra, V (2001) Comparing SDC methods for microdata on the basis of information loss and disclosure risk, Pre-proceedings of ETK-NTTS’2001, (Eurostat, ISBN 92-894-1176-5), Vol 2, 807-826, Creta, Greece

Domingo-Ferrer, J., Sebe, F., Castella-Roca, J (2004) On the security of noise addition for privacy in statistical databases, PSD 2004, Lecture Notes in Computer Science 3050 149-161

Domingo-Ferrer, J., Torra, V (2001) Disclosure Control Methods and Information Loss for Microdata, in P Doyle, J I Lane, J J M Theeuwes, L Zayatz (eds.) Confidentiality, Disclosure, and Data Access: Theory and Practical Applications for Statistical Agencies, Elsevier Science, 91-110

Domingo-Ferrer, J., Torra, V (2001) A quantitative comparison of disclosure control methods for microdata, in P Doyle, J I Lane, J J M Theeuwes, L Zayatz (eds.) Confidentiality, Disclosure and Data Access: Theory and Practical Applications for Statistical Agencies, North-Holland, 111-134

Domingo-Ferrer, J., Torra, V (2003) Disclosure Risk Assessment in Statistical Microdata Protection via advanced record linkage, Statistics and Computing, 13 343-354

Domingo-Ferrer, J., Torra, V (2005) Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation, Data Mining and Knowledge Discovery 11:2 195-212

Duncan, G T., Keller-McNulty, S A., Stokes, S L (2001) Disclosure risk vs data utility: The R-U confidentiality map, Technical Report 121, National Institute of Statistical Sciences

Duncan, G T., Keller-McNulty, S A., Stokes, S L (2001) Database security and confidentiality: examining disclosure risk vs data utility through the R-U confidentiality map, Technical Report 142, National Institute of Statistical Sciences

Duncan, G T., Lambert, D (1986) Disclosure-limited data dissemination, Journal of the American Statistical Association, 81 10-18


Duncan, G T., Lambert, D (1989) The risk disclosure for microdata, Journal of Business and Economic Statistics 7 207-217

Elamir, E A H (2004) Analysis of re-identification risk based on log-linear models, PSD 2004, Lecture Notes in Computer Science 3050 273-281

Elliot, M (2002) Integrating file and record level disclosure risk assessment, in J Domingo-Ferrer (ed.) Inference Control in Statistical Databases, Lecture Notes in Computer Science 2316 126-134

Elliot, M J., Skinner, C J., Dale, A (1998) Special Uniqueness, Random Uniques and Sticky Populations: Some Counterintuitive Effects of Geographical Detail on Disclosure Risk, Research in Official Statistics 1:2 53-67

Fellegi, I P., Sunter, A B (1969) A theory for record linkage, Journal of the American Statistical Association 64:328 1183-1210

Felsö, F., Theeuwes, J., Wagner, G (2001) Disclosure Limitation in Use: Results of a Survey, in P Doyle, J I Lane, J J M Theeuwes, L Zayatz (eds.) Confidentiality, Disclosure, and Data Access: Theory and Practical Applications for Statistical Agencies, Elsevier Science, 17-42

Franconi, L., Polettini, S (2004) Individual risk estimation in μ-Argus: a review, PSD 2004, Lecture Notes in Computer Science 3050 262-272

Gouweleeuw, J M., Kooiman, P., Willenborg, L C R J., De Wolf, P.-P (1998) Post Randomisation for Statistical Disclosure Control: Theory and Implementation, Journal of Official Statistics 14:4 463-478. Also as Research Paper No 9731, Voorburg: Statistics Netherlands (1997)

Gross, B., Guiblin, P., Merrett, K (2004) Implementing the Post Randomisation method to the individual sample of anonymised records (SAR) from the 2001 Census, paper presented at “The Samples of Anonymised Records, An Open Meeting on the Samples of Anonymised Records from the 2001 Census” http://www.ccsr.ac.uk/sars/events/2004-09-30/gross.pdf

Hansen, S., Mukherjee, S (2003) A Polynomial Algorithm for Optimal Univariate Microaggregation, IEEE Trans on Knowledge and Data Engineering 15:4 1043-1044

Haritsa, J R (2008) Mining association rules under privacy constraints, in C C Aggarwal, P S Yu (eds.) Privacy-Preserving Data Mining: Models and Algorithms, Springer, 239-266

Hundepool, A., van de Wetering, A., Ramaswamy, R., Franconi, L., Capobianchi, C., de Wolf, P.-P., Domingo-Ferrer, J., Torra, V., Brand, R., Giessing, S (2003) μ-ARGUS version 3.2 Software and User’s Manual, Voorburg NL, Statistics Netherlands, February, 2003; version 4.0 published in May 2005 http://neon.vb.cbs.nl/casc

Jaro, M A (1989) Advances in record-linkage methodology as applied to matching the 1985 Census of Tampa, Florida, Journal of the American Statistical Association 84:406 414-420

Jiménez, J., Torra, V (2009) Utility and risk of JPEG-based continuous microdata protection methods, Proc Int Conf on Availability, Reliability and Security (ARES 2009), 929-934

Kantarcioglu, M (2008) A survey of privacy-preserving methods across horizontally partitioned data, in C C Aggarwal, P S Yu (eds.) Privacy-Preserving Data Mining: Models and Algorithms, Springer, 313-335

Kim, J., Winkler, W (2003) Multiplicative noise for masking continuous data, Research Report Series (Statistics 2003-01), U S Bureau of the Census

Kisilevich S., Rokach L., Elovici Y., Shapira B., Efficient Multidimensional Suppression for K-Anonymity, IEEE Transactions on Knowledge and Data Engineering, vol 22, no 3, pp 334-347, Mar 2010

Ladra, S., Torra, V (2008) On the comparison of generic information loss measures and cluster-specific ones, Intl J of Unc., Fuzz and Knowledge-Based Systems, 16:1 107-120

Lambert, D (1993) Measures of Disclosure Risk and Harm, Journal of Official Statistics 9 313-331

LeFevre, K., DeWitt, D J., Ramakrishnan, R (2005) Multidimensional k-anonymity, Technical Report 1521, University of Wisconsin

LeFevre, K., DeWitt, D J., Ramakrishnan, R (2005) Incognito: Efficient Full-Domain K-Anonymity, SIGMOD 2005

Li, N., Li, T., Venkatasubramanian, S (2007) T-closeness: privacy beyond k-anonymity and l-diversity, Proc of the IEEE ICDE 2007

Liew, C K., Choi, U J., Liew, C J (1985) A data distortion by probability distribution, ACM Transactions on Database Systems 10 395-411

Lindell, Y., Pinkas, B (2002) Privacy Preserving Data Mining, Journal of Cryptology, 15:3

Lindell, Y., Pinkas, B (2000) Privacy Preserving Data Mining, Crypto’00, Lecture Notes in Computer Science 1880 20-24

Liu, K., Kargupta, H., Ryan, J (2006) Random projection based multiplicative data perturbation for privacy preserving data mining, IEEE Trans on Knowledge and Data Engineering 18:1 92-106

Machanavajjhala, A., Gehrke, J., Kiefer, D., Venkitasubramanian, M (2006) L-diversity: privacy beyond k-anonymity, Proc of the IEEE ICDE

Mateo-Sanz, J M., Domingo-Ferrer, J., Sebé, F (2005) Probabilistic information loss measures in confidentiality protection of continuous microdata, Data Mining and Knowledge Discovery, 11:2 181-193

Moore, R (1996) Controlled data swapping techniques for masking public use microdata sets, U S Bureau of the Census (unpublished manuscript)

Muralidhar, K., Sarathy, R (2008) Generating Sufficiency-based Non-Synthetic Perturbed Data, Transactions on Data Privacy 1:1 17-33

Nin, J., Herranz, J., Torra, V (2007) Rethinking Rank Swapping to Decrease Disclosure Risk, Data and Knowledge Engineering, 64:1 346-364

Nin, J., Herranz, J., Torra, V (2008) How to Group Attributes in Multivariate Microaggregation, Intl J of Unc., Fuzz and Knowledge-Based Systems, 16:1 121-138

Nin, J., Herranz, J., Torra, V (2008) On the Disclosure Risk of Multivariate Microaggregation, Data and Knowledge Engineering, 67:3 399-412

Nin, J., Herranz, J., Torra, V (2008) Towards a More Realistic Disclosure Risk Assessment, Lecture Notes in Computer Science, 5262 152-165

Nin, J., Torra, V (2006) Extending microaggregation procedures for time series protection, Lecture Notes in Artificial Intelligence, 4259 899-908

Nin, J., Torra, V (2009) Analysis of the Univariate Microaggregation Disclosure Risk, New Generation Computing, 27 177-194

Oganian, A., Domingo-Ferrer, J (2000) On the Complexity of Optimal Microaggregation for Statistical Disclosure Control, Statistical J United Nations Economic Commission for Europe, 18, 4, 345-354

Paass, G (1985) Disclosure risk and disclosure avoidance for microdata, Journal of Business and Economic Statistics 6 487-500

Paass, G., Wauschkuhn, U (1985) Datenzugang, Datenschutz und Anonymisierung - Analysepotential und Identifizierbarkeit von Anonymisierten Individualdaten, Oldenbourg Verlag


Pagliuca, D., Seri, G (1999) Some results of individual ranking method on the system of enterprise accounts annual survey, Esprit SDC Project, Deliverable MI-3/D2

Pinkas, B (2002) Cryptographic techniques for privacy-preserving data mining, ACM SIGKDD Explorations 4:2

Ravikumar, P., Cohen, W W (2004) A hierarchical graphical model for record linkage, Proc of UAI 2004

Rokach L., Genetic algorithm-based feature set partitioning for classification problems, Pattern Recognition, 41(5):1676–1700, 2008

Rokach L., Maimon O and Lavi I., Space Decomposition In Data Mining: A Clustering Approach, Proceedings of the 14th International Symposium On Methodologies For Intelligent Systems, Maebashi, Japan, Lecture Notes in Computer Science, Springer-Verlag, 2003, pp 24–31

Samarati, P (2001) Protecting Respondents’ Identities in Microdata Release, IEEE Trans on Knowledge and Data Engineering, 13:6 1010-1027

Samarati, P., Sweeney, L (1998) Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression, SRI Intl Tech Rep

Spruill, N L (1983) The confidentiality and analytic usefulness of masked business microdata, Proc of the Section on Survey Research Methods 1983, American Statistical Association, 602-610

Sweeney, L (2002) Achieving k-anonymity privacy protection using generalization and suppression, Int J of Unc., Fuzz and Knowledge Based Systems 10:5 571-588

Sweeney, L (2002) k-anonymity: a model for protecting privacy, Int J of Unc., Fuzz and Knowledge Based Systems 10:5 557-570

Takemura, A (2002) Local recoding and record swapping by maximum weight matching for disclosure control of microdata sets, Journal of Official Statistics 18 275-289 Preprint (1999) Local recoding by maximum weight matching for disclosure control of microdata sets

Templ, M (2008) Statistical Disclosure Control for Microdata Using the R-Package sdcMicro, Transactions on Data Privacy 1 67-85

Torra, V (2004) Microaggregation for categorical variables: a median based approach, Proc Privacy in Statistical Databases (PSD 2004), Lecture Notes in Computer Science 3050 162-174

Torra, V (2004) OWA operators in data modeling and reidentification, IEEE Trans on Fuzzy Systems 12:5 652-660

Torra, V (2008) Constrained Microaggregation: Adding Constraints for Data Editing, Transactions on Data Privacy 1:2 86-104

Torra, V., Abowd, J M., Domingo-Ferrer, J (2006) Using Mahalanobis Distance-Based Record Linkage for Disclosure Risk Assessment, Lecture Notes in Computer Science 4302 233-242

Torra, V., Domingo-Ferrer, J (2003) Record linkage methods for multidatabase data mining, in V Torra (ed.) Information Fusion in Data Mining, Springer, 101-132

Torra, V., Miyamoto, S (2004) Evaluating fuzzy clustering algorithms for microdata protection, PSD 2004, Lecture Notes in Computer Science 3050 175-186

Trottini, M (2003) Decision models for data disclosure limitation, PhD Dissertation, Carnegie Mellon University http://www.niss.org/dgii/TR/Thesis-Trottini-final.pdf

Truta, T M., Vinay, B (2006) Privacy protection: p-sensitive k-anonymity property, Proc 2nd Int Workshop on Privacy Data management (PDM 2006) p 94


Willenborg, L., de Waal, T (2001) Elements of Statistical Disclosure Control, Lecture Notes in Statistics, Springer-Verlag

Winkler, W E (1993) Matching and record linkage, Statistical Research Division, U S Bureau of the Census (USA), RR93/08

Winkler, W E (2004) Re-identification methods for masked microdata, PSD 2004, Lecture Notes in Computer Science 3050 216-230

Yancey, W E., Winkler, W E., Creecy, R H (2002) Disclosure risk assessment in perturbative microdata protection, in J Domingo-Ferrer (ed.) Inference Control in Statistical Databases, Lecture Notes in Computer Science 2316 135-152

Yao, A C (1982) Protocols for Secure Computations, Proc of 23rd IEEE Symposium on Foundations of Computer Science, Chicago, Illinois, 160-164

http://www.census.gov


Meta-Learning - Concepts and Techniques

Ricardo Vilalta (1), Christophe Giraud-Carrier (2), and Pavel Brazdil (3)

(1) University of Houston

(2) Brigham Young University

(3) University of Porto

Summary. The field of meta-learning has as one of its primary goals the understanding of the interaction between the mechanism of learning and the concrete contexts in which that mechanism is applicable. The field has seen continuous growth in the past years, with interesting new developments in the construction of practical model-selection assistants, task-adaptive learners, and a solid conceptual framework. In this chapter we give an overview of different techniques necessary to build meta-learning systems. We begin by describing an idealized meta-learning architecture comprising a variety of relevant component techniques. We then look at how each technique has been studied and implemented by previous research. In addition, we show how meta-learning has already been identified as an important component in real-world applications.

Key words: Meta-learning

36.1 Introduction

We are used to thinking of a learning system as a rational agent capable of adapting to a specific environment by exploiting knowledge gained through experience; encountering multiple and diverse scenarios sharpens the ability of the learning system to predict the effect produced from selecting a particular course of action. In this case, learning is made manifest because the quality of the predictions normally improves with an increasing number of scenarios or examples. Nevertheless, if the predictive mechanism were to start afresh on different tasks, the learning system would find itself at a considerable disadvantage; learning systems capable of modifying their own predictive mechanism would soon outperform our base learner by being able to change their learning strategy according to the characteristics of the task under analysis.

Meta-learning differs from base-learning in the scope of the level of adaptation: whereas learning at the base-level is based on accumulating experience on a specific learning task (e.g., credit rating, medical diagnosis, mine-rock discrimination, fraud detection, etc.), learning at the meta-level is based on accumulating experience on the performance of multiple applications of a learning system. If a base-learner fails to perform efficiently, one would expect the learning mechanism itself to adapt in case the same task is presented again. Meta-learning is then important in understanding the interaction between the mechanism of learning and the concrete contexts in which that mechanism is applicable. Briefly stated, the field of meta-learning is focused on the relation between tasks or domains and learning strategies. In that sense, by learning or explaining what causes a learning system to be successful or not on a particular task or domain, we go beyond the goal of producing more accurate learners to the additional goal of understanding the conditions (e.g., types of example distributions) under which a learning strategy is most appropriate.

DOI 10.1007/978-0-387-09823-4_36, © Springer Science+Business Media, LLC 2010

From a practical stance, meta-learning can solve important problems in the application of machine learning (ML) and Data Mining (DM) tools, particularly in the area of classification and regression. First, the successful use of these tools outside the boundaries of research (e.g., industry, commerce, government) is conditioned on the appropriate selection of a suitable predictive model (or combination of models) according to the domain of application. Without any kind of assistance, model selection and combination can turn into stumbling blocks to the end-user who wishes to access the technology more directly and cost-effectively. End-users often lack not only the expertise necessary to select a suitable model, but also the availability of many models to proceed on a trial-and-error basis (e.g., by measuring accuracy via some re-sampling technique such as n-fold cross-validation). A solution to this problem is attainable through the construction of meta-learning systems. These systems can provide automatic and systematic user guidance by mapping a particular task to a suitable model (or combination of models).

Second, a problem commonly observed in the practical use of ML and DM tools is how to profit from the repetitive use of a predictive model over similar tasks. The successful application of models in real-world scenarios requires a continuous adaptation to new needs. Rather than starting afresh on new tasks, we expect the learning mechanism itself to re-learn, taking into account previous experience (Thrun, 1998, Pratt et al., 1991, Caruana, 1997, Vilalta and Drissi, 2002). Again, meta-learning systems can help control the process of exploiting cumulative expertise by searching for patterns across tasks.
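The trial-and-error route mentioned above, in which candidate models are compared by n-fold cross-validation, can be sketched as follows. This is a minimal illustration using only the Python standard library. The two toy learners, the function names and the dataset are hypothetical and are not taken from the chapter.

```python
import random


def majority_class(train):
    """Baseline learner: always predict the most frequent training label."""
    labels = [y for _, y in train]
    winner = max(sorted(set(labels)), key=labels.count)
    return lambda x: winner


def one_nearest_neighbour(train):
    """1-NN learner using squared Euclidean distance."""
    def predict(x):
        nearest = min(train, key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)))
        return nearest[1]
    return predict


def cross_val_accuracy(learner, data, n_folds=5, seed=0):
    """Estimate accuracy by n-fold cross-validation."""
    data = data[:]
    random.Random(seed).shuffle(data)
    folds = [data[i::n_folds] for i in range(n_folds)]
    correct = 0
    for i, test in enumerate(folds):
        # Train on every fold except the i-th, then test on the i-th.
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        model = learner(train)
        correct += sum(model(x) == y for x, y in test)
    return correct / len(data)


def select_model(learners, data, n_folds=5):
    """Return the name of the learner with the best cross-validated accuracy."""
    return max(learners, key=lambda name: cross_val_accuracy(learners[name], data, n_folds))


# Toy two-class data: two well-separated clusters (made up for illustration).
data = [((float(i), 0.0), "a") for i in range(10)]
data += [((float(i) + 100.0, 0.0), "b") for i in range(10)]
learners = {"majority": majority_class, "1-NN": one_nearest_neighbour}
best = select_model(learners, data)
```

Every candidate has to be trained n times on every new task, which is the cost that a meta-learning system tries to avoid by reusing experience from earlier tasks.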

Our goal in this chapter is to give an overview of different techniques necessary to build meta-learning systems. To impose some structure, we begin by describing an idealized meta-learning architecture comprising a variety of relevant component techniques. We then look at how each technique has been studied and implemented by previous research. We hope that by proceeding in this way the reader can not only learn from past work, but in addition gain some insight on how to construct meta-learning systems.

We also hope to show how recent advances in meta-learning are increasingly filling the gaps in the construction of practical model-selection assistants and task-adaptive learners, as well as in the development of a solid conceptual framework (Baxter, 1998, Baxter, 2000, Giraud-Carrier et al., 2004).

This chapter is organized as follows. In the next section we illustrate an idealized meta-learning architecture and detail its constituent parts. In Section 36.3 we describe previous research in meta-learning and its relation to our architecture. Section 36.4 describes a meta-learning tool that has been instrumental as a decision support tool in real applications. Lastly, Section 36.5 discusses future directions and provides our conclusions.

36.2 A Meta-Learning Architecture

In this section we provide a general view of a software architecture that will be used as a reference to describe many of the principles and current techniques in meta-learning. Though not every technique in meta-learning fits into this architecture, such a general view helps us understand the challenges we need to overcome before we can turn the technology into a set of useful and practical tools.

36.2.1 Knowledge-Acquisition Mode

To begin, we propose a meta-learning system that divides into two modes of operation. During the first mode, also known as the knowledge-acquisition mode, the main goal is to learn about the learning process itself. Figure 36.1 illustrates this mode of operation. We assume the input to the system is made of more than one dataset of examples (e.g., more than one set of pairs of feature vectors and classes; Figure 36.1A). Upon arrival of each dataset, the meta-learning system invokes a component responsible for extracting dataset characteristics or meta-features (Figure 36.1B). The goal of this component is to gather information that transcends the particular domain of application. We look for information that can be used to generalize to other example distributions. Section 36.3.1 details current research pointing in this direction.

During the knowledge-acquisition mode, the learning technique (Figure 36.1C) does not exploit knowledge across different datasets or tasks. Each dataset is considered independently of the rest; the output of the system is a learning strategy (e.g., a classifier or combination of classifiers, Figure 36.1D). Statistics derived from the output model or its performance (Figure 36.1E) may also serve as a form of characterizing the task under analysis (Sections 36.3.1 and 36.3.1).
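The flow described in this mode (datasets in, meta-features and performance statistics out) might be sketched as below. This is only an illustrative outline. The particular meta-features (size, dimensionality, class entropy), the two toy learners, the even/odd hold-out evaluation and all names are assumptions made for this example, not components prescribed by the chapter.

```python
import math
from collections import Counter


def extract_meta_features(dataset):
    """Simple dataset characteristics: size, dimensionality, class entropy."""
    counts = Counter(y for _, y in dataset)
    n = len(dataset)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"n_examples": n, "n_features": len(dataset[0][0]),
            "n_classes": len(counts), "class_entropy": entropy}


def majority_learner(train):
    """Always predict the most frequent training label."""
    winner = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: winner


def nearest_neighbour_learner(train):
    """Predict the label of the closest training example."""
    return lambda x: min(train, key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)))[1]


def holdout_accuracy(learner, dataset):
    """Train on even-indexed examples, evaluate on odd-indexed ones."""
    train, test = dataset[0::2], dataset[1::2]
    model = learner(train)
    return sum(model(x) == y for x, y in test) / len(test)


def acquire_knowledge(datasets, learners):
    """Process each dataset independently and collect one record per dataset."""
    records = []
    for name, dataset in datasets.items():
        performance = {lname: holdout_accuracy(l, dataset) for lname, l in learners.items()}
        records.append({"dataset": name,
                        "meta_features": extract_meta_features(dataset),
                        "performance": performance})
    return records


# Two tiny made-up datasets of (feature vector, class) pairs.
datasets = {
    "balanced": [((0.0,), "a"), ((1.0,), "a"), ((10.0,), "b"), ((11.0,), "b")],
    "skewed": [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"), ((9.0,), "b")],
}
learners = {"majority": majority_learner, "1-NN": nearest_neighbour_learner}
records = acquire_knowledge(datasets, learners)
```

Each record pairs a description of a dataset with how well each strategy did on it, and no information is shared between datasets while the records are being produced.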

Information derived from the meta-feature generator and the performance evaluation module can be combined into a meta-knowledge base (Figure 36.1F). This knowledge base is the main result of the knowledge-acquisition phase; it reflects experience accumulated across different tasks. Meta-learning is tightly linked to the process of acquiring and exploiting meta-knowledge. One can even say that advances in the field of meta-learning hinge on one specific question: how can we acquire and exploit knowledge about learning systems (i.e., meta-knowledge) to understand and improve their performance? As we describe current research in meta-learning we will be pointing out different forms of meta-knowledge.

36.2.2 Advisory Mode

The efficiency of the meta-learner increases as it accumulates meta-knowledge. We assume the lack of experience at the beginning of the learner’s life compels the meta-learner to use one or more learning strategies without a clear preference for one of them; experimenting with many different strategies becomes time consuming. However, as more training sets are examined, we expect the expertise of the meta-learner to dominate in deciding which learning strategy best suits the characteristics of the training set.

In the advisory mode, meta-knowledge acquired in the exploratory mode is used to configure the learning system in a manner that exploits the characteristics of the new data distribution. Meta-features extracted from the dataset (Figure 36.2B) are matched with the meta-knowledge base (Figure 36.2F) to produce a recommendation regarding the best available learning strategy. At this point we move away from the use of static base learners to the ability to do model selection or to combine base learners (Figure 36.2C).
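One simple way to picture the matching step is a nearest-neighbour lookup: find the past dataset whose meta-features are closest to those of the new dataset and recommend the strategy that performed best there. The sketch below assumes this particular matching rule, which is only one of several possible interpretations. The meta-feature names, strategy names and all numbers are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two meta-feature dicts with the same keys."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))


def recommend(meta_knowledge, new_meta_features):
    """Recommend the best-performing strategy of the most similar past dataset."""
    closest = min(meta_knowledge,
                  key=lambda rec: distance(rec["meta_features"], new_meta_features))
    performance = closest["performance"]
    return max(performance, key=performance.get)


# Hypothetical meta-knowledge base; every value here is made up.
meta_knowledge = [
    {"meta_features": {"log_n_examples": 2.0, "class_entropy": 1.0},
     "performance": {"decision_tree": 0.91, "naive_bayes": 0.84}},
    {"meta_features": {"log_n_examples": 5.0, "class_entropy": 0.3},
     "performance": {"decision_tree": 0.78, "naive_bayes": 0.88}},
]
new_task = {"log_n_examples": 4.6, "class_entropy": 0.4}
advice = recommend(meta_knowledge, new_task)
```

In practice the meta-features would need to be put on comparable scales before distances are computed, and the lookup could be replaced by any learner trained on the meta-knowledge base.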

Two observations are worth considering at this point. First, the nature of the match between the set of meta-features and the meta-knowledge base can have several interpretations. The traditional view poses this problem as a learning problem itself, where a meta-learner is invoked to output an approximating function mapping meta-features to learning strategies
