Data Mining benefits from well defined and strong processes. Such processes include, for instance, clear model evaluation procedures (Blockeel and Moyle, 2002).
Different perspectives exist on what collaborative Data Mining is (this is discussed further in Section 54.5). Three interpretations are: 1) multiple software agents applying Data Mining algorithms to solve the same problem; 2) humans using modern collaboration techniques to apply Data Mining to a single, defined problem; 3) Data Mining the artifacts of human collaboration. This chapter focuses solely on the second item, that of humans using collaboration techniques to apply Data Mining to a single task. With sufficient definition of a particular Data Mining problem, this is similar to a multiple software agent Data Mining framework (the first item), although that is not the aim of the chapter. Many of the difficulties encountered in human collaboration will also be encountered in designing a system for software agent collaboration.

Collaborative Data Mining aims to combine the results generated by isolated experts, by enabling the collaboration of geographically dispersed laboratories and companies. For each Data Mining problem, a virtual team of experts is selected on the basis of adequacy and availability. Experts apply their methods to solving the problem, but also communicate with each other to share their growing understanding of the problem. It is here that collaboration is key.
The process of analyzing data through models has many similarities to experimental research. Like the process of scientific discovery, Data Mining can benefit from different techniques used by multiple researchers who collaborate, compete, and compare results to improve their combined understanding.

The rest of this chapter is organized as follows. The potential difficulties in (remote) collaboration and a framework for analyzing such difficulties are outlined. A standard Data Mining process is reviewed and studied for the contributions that can be achieved collaboratively. A collaboration process for Data Mining is presented, with clear guidelines to help practitioners avoid the potential pitfalls of collaborative Data Mining. Real examples of the application of collaborative Data Mining are briefly summarized. The chapter concludes with a discussion.
54.2 Remote Collaboration
This section considers the motivations behind (remote) collaboration1 and the types of collaboration it enables. It then reviews the framework proposed by McKenzie and van Winkelen (McKenzie and van Winkelen, 2001) for working within e-Collaboration Space. The term e-Collaboration will be used as shorthand for remote collaboration, but many of the principles can also be applied to local collaboration.
54.2.1 E-Collaboration: Motivations and Forms
The main motivation for collaboration (Moyle et al., 2003) is to harness dispersed expertise and to enable knowledge sharing and learning in a manner that builds intellectual capital (Edvinsson and Malone, 1997). This offers tantalizing potential rewards, including boosted innovation, flexible resource management, and reduced risk (Amara, 1990, Mowshowitz, 1997, Nohria and Eccles, 1993, Snow et al., 1996), but these rewards are offset by numerous difficulties, mainly due to the increased complexity of a virtual environment.
In (McKenzie and van Winkelen, 2001), seven distinct forms of e-collaborating organizations that can be distinguished either by their structure or by the intent behind their formation are identified. These are: 1) virtual/smart organizations; 2) a community of interest and practice; 3) a virtual enterprise; 4) virtual teams; 5) a community of creation; 6) collaborative product commerce or customer communities; and 7) virtual sourcing and resource coalitions. For collaborative Data Mining, forms 4 and 5 are the most relevant. These forms are summarized below.

1 The term "remote" is omitted in the sequel.
• Virtual Teams are temporary, culturally diverse, geographically dispersed work groups that communicate electronically. These can be smaller entities within virtual enterprises, or within a transnational organization. They are characterized by changing membership and multiple organizational contexts.
• A Community of Creation revolves around a central firm and shares its knowledge for the purpose of innovation. This structure consists of individuals and organizations with ever-changing boundaries.
Recognizing the collaboration form makes it possible to analyze the difficulties that might be encountered. Such an analysis can be performed with respect to the e-collaboration space model described in the next section.
54.2.2 E-Collaboration Space
Each type of e-collaboration form can be usefully analyzed with respect to McKenzie and van Winkelen's e-Collaboration Space model (McKenzie and van Winkelen, 2001). This model casts each form into the space by studying its location on three dimensions: number of boundaries crossed, task, and relationships.
• Boundaries crossed: The more boundaries that are crossed in e-collaboration, the more barriers to a successful outcome are present. All communication takes place across some boundary (Wilson, 2002). Fewer boundaries between agents lead to a lower risk of misunderstanding; in e-collaboration the number of boundaries is automatically increased. Influential boundaries to successful e-collaboration are: technological, temporal, organizational, and cultural.
• Task: The nature of the tasks involved in the collaborative project is influenced by the complexity of the processes, the uncertainty of the available information and outcomes, and the interdependence of the various stages of the task. Complexity can be broadly classified into linear (step-by-step) processes and non-linear ones. The interdependence of a task relates to whether it can be decomposed into subtasks that can be worked on independently by different participants.
• Relationships: Relationships are key to any successful collaboration. When electronic communication is the only mode of interaction, it is harder for relationships to form, because the instinctive reading of the signals that establish trust and mutual understanding is less accessible to participants.
For the remainder of the chapter, only the task dimension of the e-collaboration space model will be highlighted. As described in the next subsection, task complexity makes collaborative Data Mining risk prone.
54.2.3 Collaborative Data Mining in E-Collaboration Space
Different forms of e-collaboration, as measured relative to the dimensions of task, boundaries, and relationships, can be viewed as locations in a three-dimensional e-collaboration space. The location of a collaborative Data Mining project depends on the actual setting of such a project. The most well defined dimension with respect to the Data Mining process (see Section 54.3) is that of task.
The task complexity of Data Mining is high. Not only is a high level of expertise involved in a Data Mining project, but there is also the risk that, in reaching the final solution(s), much effort will appear in hindsight to have been wasted. Data miners have long understood the need for a methodology to support the Data Mining process (Adriaans and Zantinge, 1996, Fayyad et al., 1996, Chapman et al., 2000). All these methodologies are explicit that the Data Mining process is non-linear, and warn that information uncovered in later phases can invalidate assumptions made in earlier phases; as a result, previous phases may need to be revisited. To exacerbate the situation, Data Mining is by its very nature a speculative process: there may be no valuable information contained in the data sources at all, or the techniques being used may not have sufficient power to uncover it. A typical Data Mining project at the start of the collaboration is summarized with respect to the e-collaboration model in Table 54.1.
Table 54.1 The position of a dispersed collaborative Data Mining project in e-collaboration space († potential boundary, depending on situation)

Task:
- Complex, non-linear interdependencies
- Large uncertainty

Boundaries crossed:
- Medium technological
- Temporal†
- Geographical
- Organizational†
- Cultural†

Relationships:
- Medium commonality of view
- Medium duration of existing relationship
- Medium duration of collaboration
54.3 The Data Mining Process
Data Mining processes broadly consist of a number of phases. These phases, however, are interrelated and are not necessarily executed in a linear manner. For example, the results of one phase may uncover more detail relating to an earlier phase and may force more effort to be expended on a phase previously thought complete. The CRISP-DM methodology (CRoss Industry Standard Process for Data Mining; Chapman et al., 2000) is an attempt to standardise the process of Data Mining. In CRISP-DM, six interrelated phases are used to describe the Data Mining process: business understanding, data understanding, data preparation, modelling, evaluation, and deployment (Figure 54.1). The main outputs of the business understanding phase are the definitions of the business and Data Mining objectives, as well as the business and Data Mining evaluation criteria; in this phase an assessment of resource requirements and an estimation of risk are also performed. In the data understanding phase, data are collected and characterized, and data quality is assessed.
Fig. 54.1 The CRISP-DM cycle

During data preparation, tables, records, and attributes are selected and transformed for modelling. Modelling is the process of extracting input/output patterns from given data and deriving models, typically mathematical or logical models. In the modelling phase, various techniques (e.g. association rules, decision trees, logistic regression, k-means clustering) are selected and applied, and their parameters are calibrated, or tuned, to optimal values. Different models are compared, and possibly combined.
In the evaluation phase, models are selected and reviewed according to the business criteria, the whole Data Mining process is reviewed, and a list of possible actions is elaborated. In the last phase, deployment is planned, implemented, and monitored. The entire project is typically documented and summarized in a report.
The CRISP-DM handbook (Chapman et al., 2000) describes in detail how each of the main phases is subdivided into specific tasks, with clearly defined predecessors/successors and inputs/outputs.
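The interrelated, non-linear character of the phases can be illustrated with a toy sketch. The phase names come from CRISP-DM, but the revisit table below is illustrative only, not part of the standard:

```python
# Illustrative sketch of CRISP-DM's six phases and the back-transitions
# that make the process non-linear: findings in a later phase can force
# a return to an earlier one. The REVISITS table is a made-up example.

PHASES = [
    "business understanding", "data understanding", "data preparation",
    "modelling", "evaluation", "deployment",
]

# Example fall-backs: e.g. evaluation results may invalidate assumptions
# made during business understanding, forcing that phase to be revisited.
REVISITS = {
    "data understanding": "business understanding",
    "modelling": "data preparation",
    "evaluation": "business understanding",
}

def next_phase(current: str, problem_found: bool = False) -> str:
    """Advance linearly, or fall back to an earlier phase on trouble."""
    if problem_found and current in REVISITS:
        return REVISITS[current]
    i = PHASES.index(current)
    return PHASES[min(i + 1, len(PHASES) - 1)]
```

In this sketch, a smooth project walks the list left to right, while a problem uncovered during evaluation sends the team back to business understanding, exactly the kind of rework the methodology warns about.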
54.4 Collaborative Data Mining Guidelines
The CRISP-DM Data Mining process described in the preceding section can be adopted by Data Mining agents collaborating remotely on a particular Data Mining project (SolEuNet, 2002, Flach et al., 2003). Not all of the CRISP-DM methodology can be performed entirely in a collaborative setting: business understanding, for instance, requires intense, close contact with the business environment for which the Data Mining is being performed. The phases that can most easily be performed in a remote-collaborative fashion are data preparation and modelling, although the other phases can nevertheless benefit from a collaborative approach. Although many of the specific tasks can be carried out independently, care must be taken by the participants to ensure that efforts are not wasted. Principles to guide the process of collaboration should be established in advance of a collaborative Data Mining project. For instance, individual agents must communicate or share any intermediate results, or improvements in the current best understanding of the Data Mining problem, so that all agents have the new knowledge. Providing a catalogue of up-to-date knowledge about the problem assists new agents entering the Data Mining project. Furthermore, procedures are required for how results from different agents are compared, and ultimately combined, so that the value of the efforts is greater than the sum of the individual components.
54.4.1 Collaboration Principles
Moyle et al. (2003) present a framework for collaborative Data Mining, involving both principles and technological support. Collaborative groupware technology, with specific functionality to support Data Mining, is described in (Voss et al., 2001). Principles for collaborative Data Mining are outlined as follows (Moyle et al., 2003).
1. Requisite management. Sufficient management processes should be established. In particular, the definition and objectives of the Data Mining problem should be clear from the start of the project to all participants. An infrastructure ensuring information flows within the network of agents should be provided.

2. Problem solving freedom. Agents should use their expertise and tools to execute Data Mining tasks to solve the problem in the manner they find best.

3. Start any time. All the necessary information about the Data Mining problem should be captured and made available to participants at all times. This includes the problem definition, data, evaluation criteria, and any knowledge produced.

4. Stop any time. Participants should work on their solutions so that a working solution, however crude, is available whenever a stop signal is issued. These solutions will typically be Data Mining models. One approach is to try simpler modeling techniques first (Holte, 1993).

5. Online knowledge sharing. The knowledge about the Data Mining problem gained by each participant at each phase should be shared with all participants in a timely manner.

6. Security. Data and information about the Data Mining problem may contain sensitive information and must not be revealed outside the project. Access to information must be controlled.
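A minimal sketch of the shared problem catalogue implied by the "start any time" and "online knowledge sharing" principles might look as follows. The class and method names are hypothetical, not taken from any of the cited systems:

```python
import datetime

class ProblemCatalogue:
    """Hypothetical shared catalogue of a collaborative Data Mining
    project: it captures the problem definition, evaluation criteria,
    and any knowledge produced, so that a newly joining agent can be
    briefed at any time and all agents see each other's findings."""

    def __init__(self, problem_definition, evaluation_criteria):
        self.problem_definition = problem_definition
        self.evaluation_criteria = evaluation_criteria
        self.knowledge = []  # timestamped shared findings

    def share(self, agent, finding):
        """'Online knowledge sharing': record who found what, and when."""
        self.knowledge.append(
            (datetime.datetime.now(datetime.timezone.utc), agent, finding))

    def briefing(self):
        """Everything a newly joining agent needs, in arrival order."""
        return {
            "problem": self.problem_definition,
            "criteria": self.evaluation_criteria,
            "findings": [(a, f) for _, a, f in self.knowledge],
        }
```

A real system would add access control (principle 6) and persistent, networked storage; the sketch only shows the information flow the principles require.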
Having established a collaborative Data Mining project with appropriate principles and support, how can the results of the Data Mining efforts be compared and combined so that their value is maximized? This is the question the next section addresses.
54.4.2 Data Mining Model Evaluation and Combination
One of the main outputs of the Data Mining process (Chapman et al., 2000) is the set of Data Mining models. These may take many forms, including decision trees, rules, artificial neural networks, and regression equations (see (Mitchell, 1997) for an introduction to machine learning, and (Hair et al., 1998) for an introduction to statistics). Different agents may produce models in different forms, which requires methods for both evaluating and combining them.
When multiple agents produce multiple models as the result of a Data Mining effort, a process for evaluating their relative merits must be established. Such processes are well defined in Data Mining challenge problems (e.g. (Srinivasan et al., 1999, Page and Hatzis, 2001)); for example, a challenge recipe for the production of classificatory models can be found in (Moyle and Srinivasan, 2001). To ensure accurate comparisons, models built by different agents must be evaluated in exactly the same way, on the same data. This sounds like an obvious statement, but agents can easily make adjustments to their copy of the data to suit their particular approaches without making the changes available to the other agents, which makes model evaluation and comparison extremely difficult.
Furthermore, the evaluation criterion or criteria (there may be several) deemed most appropriate may change during the knowledge discovery process. For instance, at some point one may wish to redefine the data set on which models are evaluated (e.g. because it is found to contain outliers that make the evaluation procedure inaccurate) and re-evaluate previously built models. Blockeel and Moyle (2002) discuss how this evaluation and re-evaluation leads to significant extra effort for the different agents, and consequently is a barrier to the knowledge discovery process unless adequate software support is provided.
One approach to controlling model evaluation is to centralize the process. Consider an abstracted Data Mining process where agents first tune their modeling algorithm (which outputs the algorithm and its parameter settings, I) before building a final model (which is output as M). The agent then uses the model to predict the labels on a test set (producing predictions, P), from which an overall evaluation of the model (resulting in a score, S) is determined. The point at which these outputs are published for all agents to access depends on the architecture of the evaluation system, as shown in Figure 54.2. A single evaluation agent provides the evaluation procedures; different agents submit information on their models to this agent, which stores the information and automatically evaluates it according to all relevant criteria. If the criteria change, the evaluation agent automatically re-evaluates previously submitted models.

In such a framework, information about produced models can be submitted at several levels, as illustrated in Figure 54.2. Agents can run their own models on a test set and send only predictions to the evaluation agent (assuming evaluation is based on predictions only); they can submit descriptions of the models themselves; or they can send a complete description of the model-producing algorithm and the parameters used to an evaluation agent that has been augmented with modeling algorithms. These respective options offer increasing centralization and increasingly flexible evaluation possibilities, but also involve increasingly sophisticated software support (Blockeel and Moyle, 2002).
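At the prediction ("P") level, the centralised architecture might be sketched as follows. The class and function names are my own, not from the cited systems, and a single accuracy criterion stands in for the full set of evaluation procedures:

```python
class EvaluationAgent:
    """Sketch of a centralised evaluation agent: Data Mining agents
    submit test-set predictions (the 'P' level); this agent scores them
    against the held-out true labels (producing 'S'), and automatically
    re-scores every stored submission when the criterion changes."""

    def __init__(self, true_labels, criterion):
        self.true_labels = list(true_labels)
        self.criterion = criterion   # maps (truth, predictions) -> score
        self.submissions = {}        # agent name -> predictions (P)
        self.scores = {}             # agent name -> score (S)

    def submit(self, agent, predictions):
        self.submissions[agent] = list(predictions)
        self.scores[agent] = self.criterion(self.true_labels,
                                            self.submissions[agent])

    def change_criterion(self, criterion):
        """Automatic re-evaluation of all previously submitted models."""
        self.criterion = criterion
        for agent, preds in self.submissions.items():
            self.scores[agent] = criterion(self.true_labels, preds)

def accuracy(truth, preds):
    """Example criterion: fraction of correct predictions."""
    return sum(t == p for t, p in zip(truth, preds)) / len(truth)
```

Because all predictions are stored centrally, a change of criterion costs the submitting agents nothing: the evaluation agent simply replays the stored predictions against the new criterion.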
Communicating Data Mining models to the evaluation agent can be performed using a standard format. For instance, in (Flach et al., 2003) models from multiple agents were submitted in a standard XML-style format, using the Predictive Model Markup Language (PMML) (The Data Mining Group, 2003). Such a procedure has been adopted for a real-world collaborative Data Mining project (Flach et al., 2003).
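To give a feel for the format, the following sketch builds a deliberately simplified PMML-style document. The element names (PMML, Header, DataDictionary, TreeModel, Node, SimplePredicate) follow the PMML standard, but a conforming document requires more detail (e.g. data types on fields and a mining schema), so treat this only as a shape illustration:

```python
import xml.etree.ElementTree as ET

# Simplified PMML-style document for a one-node decision tree.
# Real PMML requires additional attributes and elements omitted here.
pmml = ET.Element("PMML", version="2.1")
ET.SubElement(pmml, "Header", description="model submitted by agent A")

dd = ET.SubElement(pmml, "DataDictionary", numberOfFields="2")
ET.SubElement(dd, "DataField", name="age", optype="continuous")
ET.SubElement(dd, "DataField", name="class", optype="categorical")

tree = ET.SubElement(pmml, "TreeModel", functionName="classification")
root = ET.SubElement(tree, "Node", score="positive")
ET.SubElement(root, "SimplePredicate", field="age",
              operator="greaterThan", value="30")

document = ET.tostring(pmml, encoding="unicode")
```

The point of the exchange format is that the evaluation agent can parse `document` without knowing which tool or algorithm produced the model.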
Model combination is not always possible. However, when restricted to binary classification models, it is possible to utilize Receiver Operating Characteristic (ROC) curves (Provost and Fawcett, 2001) to assist both model comparison and model combination. ROC analysis plots different binary classification models in a two-dimensional space with respect to the types of errors the models make: false positive errors and false negative errors2. The actual performance of a model at run-time depends on the costs of errors at run-time and the distribution of the classes at run-time. The values of these run-time parameters, or operating characteristics, determine the optimal model(s) for use in prediction. ROC analysis enables models to be compared; some models will never be optimal under any operating conditions and can be discarded. The remaining models are those located on the ROC convex hull (ROCCH).
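A small sketch of this pruning step, assuming each model is summarized by its (false positive rate, true positive rate) point; the model names and the standard upper convex hull construction are illustrative:

```python
def roc_convex_hull(points):
    """points: dict of model name -> (fpr, tpr).
    Returns the names of the models on the ROC convex hull; all other
    models are never optimal under any operating condition."""
    # Add the trivial "always negative" and "always positive" classifiers,
    # which anchor the hull at (0, 0) and (1, 1).
    pts = dict(points, always_neg=(0.0, 0.0), always_pos=(1.0, 1.0))
    ordered = sorted(pts.items(), key=lambda kv: kv[1])
    hull = []  # upper convex hull over (fpr, tpr), built left to right
    for name, (x, y) in ordered:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2][1], hull[-1][1]
            # Pop the middle point if it lies on or below the new chord.
            if (x2 - x1) * (y - y1) >= (y2 - y1) * (x - x1):
                hull.pop()
            else:
                break
        hull.append((name, (x, y)))
    return [name for name, _ in hull]
```

For example, a model at (0.5, 0.6) is discarded when another model at (0.2, 0.7) exists: every point on the chord from (0.2, 0.7) to (1, 1) dominates it, so it is never the optimal choice.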
As well as identifying non-optimal models, ROC analysis can be used to combine models. One method is to use two adjacent models on the ROCCH, located on either side of the operating condition, in combination to make run-time predictions. Another approach is to modify a single model into multiple models that can then be plotted in ROC space (Flach et al., 2001), resulting in models that fit a broader range of operating conditions. Wettschereck et al. (2003) describe a support system that performs model evaluation, model visualization, and model comparison, which has been applied in a collaborative Data Mining setting (Flach et al., 2003).
2 The axes of an ROC curve are actually the true positive rate versus the false positive rate.
Fig. 54.2 Two different architectures for model evaluation. The path finishing in dashed arrows depicts agents in charge of building and evaluating their own models before publishing their results centrally. The path of solid arrows depicts Data Mining agents submitting their models to a centralized evaluation agent, which provides the services of executing submitted models on a test set, evaluating the predictions to produce scores, and then publishing the results. The information submitted to the central evaluation agent is: I = algorithm and parameter settings used to produce models; M = models; P = predictions made by the models on a test set; S = scores of the value of the models.
54.5 Discussion
References containing the keywords collaborative Data Mining and collaboration partition naturally into the following categories:

• Multiple software agents applying Data Mining algorithms to solve the same problem (e.g. (Ramakrishnan, 2001)): this presupposes that the Data Mining task and its associated data are well defined a priori.

• Humans using modern collaboration techniques to apply Data Mining to a single, defined problem (e.g. (Mladenic et al., 2003)).

• Data Mining the artifacts of human collaboration (e.g. (Biuk-Aghai and Simoff, 2001)): these artifacts are typically the conversations and associated documents collected via some electronic discussion forum.

• The collaboration process itself resulting in increased knowledge: a form of knowledge growth by collection within a context.

• Grid-style computing facilities collaborating to provide resources for Data Mining (e.g. (Singh et al., 2003)): these resources typically provide either federated data or distributed computing power.

• Integrating Data Mining techniques into business process software (e.g. (McDougall, 2001)), for example Enterprise Resource Planning systems and groupware. Note that this, too, implies a priori knowledge of what Data Mining problems are to be solved.
This chapter focused mainly on the second item, that of humans using collaboration techniques to apply Data Mining to a single task. With sufficient definition of a particular Data Mining problem, this can lead to a multiple software agent Data Mining framework (the first item), although that was not the aim of this chapter.
Many Data Mining challenges have been issued, which by their nature always result in "winners" and "losers". In collaborative approaches, however, much can be learned from the losers as the Data Mining project proceeds. Much initial effort is required to establish a Data Mining challenge (e.g. problem specification, data collection and preprocessing, and specification of the evaluation criteria), even before the participants register. This effort also needs to be expended in a collaborative setting, so that the objectives of the Data Mining project are clearly articulated in advance.
The collaborative methodology and techniques described here have been applied with mixed success in several Data Mining projects (Flach et al., 2003, Stepankova et al., 2003, Jorge et al., 2003). Further development of collaborative Data Mining processes, and of supporting tools and communication environments, is likely to improve the results of harnessing dispersed Data Mining expertise.
54.6 Conclusions
Collaborative Data Mining is more difficult than the single-team setting. Data Mining benefits from adhering to established processes. One key notion in Data Mining methodologies is that of understanding (e.g. CRISP-DM contains the phases business understanding and data understanding). How are such understandings produced, articulated, maintained, and communicated to all collaborating agents? What happens when understandings change: how much of the Data Mining process will need re-work? How does one agent's understanding differ from another's, simply due to communication, language, and cultural differences?
Practitioners embarking on collaborative Data Mining might wish to heed some of the lessons learned from other collaborative Data Mining projects:

• Analyze the form of collaboration proposed and understand how difficult it is likely to be.
• Establish a methodology that all participants can utilize, along with support tools and technologies.
• Ensure that all results, intermediate or otherwise, are recorded and shared in a timely manner.
• Encourage competition among participants.
• Define metrics for success at all stages.
• Define model evaluation and combination procedures.
References
Adriaans, P., and Zantinge, D. Data Mining. Addison-Wesley, New York, 1996.
Amara, R. New directions for innovations. Futures 22(2): 142-152, 1990.
Bacon, F. Novum Organum, eds. P. Urbach and J. Gibson. Open Court Publishing Company, 1994.
Biuk-Aghai, R.P. and S.J. Simoff. An integrative framework for knowledge extraction in collaborative virtual environments. In The 2001 International ACM SIGGROUP Conference on Supporting Group Work. Boulder, Colorado, USA, 2001.
Blockeel, H. and S.A. Moyle. Collaborative Data Mining needs centralised model evaluation. In Proceedings of the ICML-2002 Workshop on Data Mining Lessons Learned. The University of New South Wales, Sydney, 2002.
Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., and Wirth, R. CRISP-DM 1.0: Step-by-step data mining guide. The CRISP-DM consortium, 2000.
Edvinsson, L. and Malone, M.S. Intellectual Capital: Realizing Your Company's True Value by Finding Its Hidden Brainpower. HarperBusiness, New York, USA, 1997.
Fayyad, U., et al., eds. Advances in Knowledge Discovery and Data Mining. MIT Press, 1996.
Flach, P.A., et al. Decision support for Data Mining: introduction to ROC analysis and its application. In Data Mining and Decision Support: Integration and Collaboration, D. Mladenic, et al., editors. Kluwer Academic Publishers, 2003.
Flach, P., Blockeel, H., Gaertner, T., Grobelnik, M., Kavsek, B., Kejkula, M., Krzywania, D., Lavrac, N., Mladenic, D., Moyle, S., Raeymaekers, S., Rauch, J., Ribeiro, R., Sclep, G., Struyf, J., Todorovski, L., Torgo, L., Wettschereck, D., and Wu, S. On the road to knowledge: mining 21 years of UK traffic accident reports. In Data Mining and Decision Support: Integration and Collaboration, D. Mladenic, et al., editors. Kluwer Academic Publishers, 2003.
Hair, J.F., Anderson, R.E., Tatham, R.L., and Black, W.C. Multivariate Data Analysis. Prentice Hall, 1998.
Holte, R.C. Very simple classification rules perform well on most commonly used datasets. Machine Learning 11: 63-90, 1993.
Jorge, A., Alves, M.A., Grobelnik, M., Mladenic, D., and Petrak, J. Web site access analysis for a national statistical agency. In Data Mining and Decision Support: Integration and Collaboration, D. Mladenic, et al., editors, p. 157-166. Kluwer Academic Publishers, 2003.
Kuhn, T.S. The Structure of Scientific Revolutions, 2nd, enlarged ed. University of Chicago Press, Chicago, 1970.
McDougall, P. Companies that dare to share information are cashing in on new opportunities. InformationWeek, May 7, 2001.
McKenzie, J. and C. van Winkelen. Exploring e-collaboration space. In Proceedings of the First Annual Knowledge Management Forum Conference. Henley Management College, 2001.
Mitchell, T.M. Machine Learning. McGraw-Hill, 1997.
Mladenic, D., Lavrac, N., Bohanec, M., and Moyle, S., editors. Data Mining and Decision Support: Integration and Collaboration. Kluwer Academic Publishers, 2003.
Mowshowitz, A. Virtual organization. Communications of the ACM 40(9): 30-37, 1997.
Moyle, S.A. and Srinivasan, A. Classificatory challenge-Data Mining: a recipe. Informatica 25(3): 343-347, 2001.
Moyle, S., J. McKenzie, and A. Jorge. Collaboration in a Data Mining virtual organization. In Data Mining and Decision Support: Integration and Collaboration, D. Mladenic, et al., editors. Kluwer Academic Publishers, 2003.
Nohria, N. and R.G. Eccles, eds. Networks and Organizations: Structure, Form, and Action. Harvard Business School Press, Boston, 1993.
Page, C.D. and C. Hatzis. KDD Cup 2001. University of Wisconsin, http://www.cs.wisc.edu/~dpage/kddcup2001/, 2001.
Popper, K. The Logic of Scientific Discovery. Routledge, 1977.
Provost, F. and T. Fawcett. Robust classification for imprecise environments. Machine Learning 42: 203-231, 2001.
Ramakrishnan, R. Mass collaboration and Data Mining (keynote address). In The Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2001). San Francisco, California, 2001.
Singh, R., Leigh, J., DeFanti, T.A., and Karayannis, F. TeraVision: a high resolution graphics streaming device for amplified collaboration environments. Journal of Future Generation Computer Systems (FGCS) 19(6): 957-972, 2003.
Snow, C.C., S.A. Snell, and S.C. Davison. Using transnational teams to globalize your company. Organizational Dynamics 24(4): 50-67, 1996.
SolEuNet. The Solomon European Network – Data Mining and Decision Support for Business Competitiveness: A European Virtual Enterprise. http://soleunet.ijs.si/, 2002.
Soukhanov, A., ed. Microsoft Encarta College Dictionary: The First Dictionary for the Internet Age. St. Martin's Press, 2001.
Srinivasan, A., R.D. King, and D.W. Bristol. An assessment of submissions made to the Predictive Toxicology Evaluation Challenge. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI-99). Morgan Kaufmann, 1999.
Stepankova, O., Klema, J., and Miksovsky, P. Collaborative Data Mining with RAMSYS and Sumatra TT: prediction of resources for a health farm. In Data Mining and Decision Support: Integration and Collaboration, D. Mladenic, et al., editors, p. 215-227. Kluwer Academic Publishers, 2003.
The Data Mining Group. The Predictive Model Markup Language (PMML). http://www.dmg.org/, 2003.
Voss, A., Richter, G., Moyle, S., and Jorge, A. Collaboration support for virtual data mining enterprises. In 3rd International Workshop on Learning Software Organizations (LSO'01). Springer-Verlag, 2001.
Wettschereck, D., A. Jorge, and S. Moyle. Visualization and evaluation support of knowledge discovery through the Predictive Model Markup Language. In 7th International Conference on Knowledge-Based Intelligent Information and Engineering Systems (KES 2003), Oxford. Springer-Verlag, 2003.
Wilson, T.D. The nonsense of knowledge management. Information Research 8(1), 2002.
Witten, I.H. and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco, 2000.