Ecological Informatics
Scope, Techniques and Applications
2nd Edition
With 174 Figures and a CD-ROM
EDITOR
SCHOOL OF EARTH AND ENVIRONMENTAL SCIENCES
THE UNIVERSITY OF ADELAIDE
5005 AUSTRALIA
E-mail: Friedrich.Recknagel@adelaide.edu.au
ISBN 3-540-43455-0 Springer Berlin Heidelberg New York 1st edition 2003
ISBN 10 3-540-28383-8 Springer Berlin Heidelberg New York
ISBN 13 978-3540-28383-6 Springer Berlin Heidelberg New York
Library of Congress Control Number: 2005930717
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable to prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
Cover design: E Kirchner, Heidelberg
Production: A Oelschläger
Typesetting: Camera-ready by the Editor
Printing: Stürtz AG, Germany
Binding: Stürtz AG, Germany
Printed on acid-free paper 30/2132/AO 5 4 3 2 1 0
Ecological informatics (ecoinformatics) is an interdisciplinary framework for the processing, archival, analysis and synthesis of ecological data by advanced computational technology (Recknagel 2003). Processing and archival of ecological data aim at facilitating data standardization, retrieval and sharing by means of metadata and object-oriented programming (e.g. Michener et al. 1997; Dolk 2000; Sen 2003; Eleveld, Schrimpf and Siegert 2003). Analysis and synthesis of ecological data aim at elucidating principles of information processing, structuring and functioning of ecosystems, and at forecasting ecosystem behaviours by means of bio-inspired computation (e.g. Fielding 1999; Lek and Guegan 2000; Recknagel 2003).
Ecological informatics is currently consolidating as a discipline. It corresponds to and partially overlaps with the well-established disciplines of bioinformatics and ecological modeling, but is taking its own distinct shape and scope. Fig. 1 compares ecological informatics with bioinformatics. Even though both are based on the same computational technology, their focus is different. Bioinformatics focuses on determining gene function and interaction (e.g. Overbeck et al. 1999; Wolf et al. 2001), protein structure and function (e.g. Henikoff et al. 1999; Lupas, Van Dyke and Stock 1991) as well as the phenotype of organisms by utilizing DNA microarray, genomic, physiological and metabolic data (e.g. Lockhardt and Winzeler 2000) (Fig. 1a). By contrast, ecological informatics focuses on determining population function and interactions as well as ecosystem structure and functioning by utilizing genomic, phenotypic, community, environmental and climate data (e.g. D’Angelo et al. 1995; Chon et al. 2003; Park et al. 2003; Jeong, Recknagel and Joo 2003) (Fig. 1b).
Fig. 2 compares ecological modeling with ecological informatics. Even though both rely on similar ecological data, they adopt different approaches to utilizing the data. Whilst ecological modeling processes ecological data top down by ad hoc designed statistical or mathematical models (e.g. Straskraba and Gnauck 1985; Jorgensen 1994), ecological informatics infers ecological processes from ecological data patterns bottom up by computational techniques. The area of overlap between ecological modeling and ecological informatics reflects a new generation of hybrid models that make it possible to predict emergent ecosystem structures and behaviours, and ecosystem evolution (e.g. Booth 1997; Downing 1997; Hraber and Milne 1997; Huse, Strand and Giske 1999). Typically those models embed biologically-inspired computation in deterministic ecological models.
Figure 1. Ecological informatics versus bioinformatics: a) scope of bioinformatics (modified from Oltvai and Barabasi (2002)), b) scope of ecoinformatics
Figure 2. Ecological informatics versus ecological modeling
The term ecological informatics was suggested at the International Conference on Applications of Machine Learning to Ecological Modelling in 2000 (see Ecological Modelling 2001, 195), when the International Society for Ecological Informatics ISEI (www.waite.Adelaide.edu.au/ISEI) was founded. Since then an increasing number of researchers and research groups have identified with this area, and biennial international conferences are organized by the ISEI. The new journal Ecological Informatics will also be issued by Elsevier in October 2005 (www.elsevier.com/locate/ecolinf).
The contents of the 2nd edition of the book Ecological Informatics have been revised and extended. Two new chapters have been added to Part I: Introduction. Chapter 2 by Bredeweg et al. provides an introduction to the novel concept of qualitative reasoning, which emerges as an alternative approach to fuzzy logic for the automated processing and utilization of heuristic ecological knowledge. Exemplary applications to population and community dynamics illustrate the potential of the approach. Chapter 7 by Tempesti et al. addresses the novel concept of self-replicating cellular automata inspired by the nature of the genome as the hereditary information of an organism. The authors demonstrate how self-replicating cellular automata can be explored for the design of nano-scale circuits for computer hardware. The paper contributes to the fast growing research on bio-inspired design of both computer software and hardware.
Three new chapters have been added to Part IV: Prediction and Elucidation of Lake and Marine Ecosystems. Chapter 16 by Recknagel et al. presents an integrated approach of supervised and non-supervised artificial neural networks (ANN) for understanding and forecasting phytoplankton population dynamics in limnological time series data. The authors complement qualitative ordination and clustering by non-supervised ANN with sensitivity curves from supervised ANN to reveal complex ecological relationships. They apply recurrent supervised ANN for 7-days-ahead forecasting of algal species abundances and succession.
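To make the idea of 7-days-ahead forecasting from time series more concrete, the following minimal sketch shows one common way of arranging a daily series into input and target pairs for a supervised learner. It is an illustration only and not the procedure used in Chapter 16: the function name `make_lagged_pairs`, the number of lags and the placeholder series are assumptions introduced here.

```python
def make_lagged_pairs(series, n_lags, horizon):
    """Pair each window of n_lags consecutive values with the value
    observed `horizon` steps after the end of that window."""
    inputs, targets = [], []
    # t is the index of the last value inside the input window
    for t in range(n_lags - 1, len(series) - horizon):
        inputs.append(series[t - n_lags + 1 : t + 1])
        targets.append(series[t + horizon])
    return inputs, targets


# Made-up placeholder series (NOT measured data): day index as "abundance"
daily_abundance = [float(day) for day in range(20)]

# Hypothetical settings: 3 lagged days as input, target 7 days ahead
X, y = make_lagged_pairs(daily_abundance, n_lags=3, horizon=7)
```

Each element of `X` could then serve as the input vector of a neural network and the matching element of `y` as the value to be forecast; a recurrent network would normally receive the series step by step instead of fixed windows.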
Chapter 17 by Cao et al. introduces hybrid evolutionary algorithms (HEA) as powerful tools for the discovery of predictive rule sets. The underlying algorithms optimize both the rule structures and multiple parameters. The authors demonstrate that the rule sets discovered in complex limnological time series data not only achieve highly accurate 7-days-ahead forecasting of algal species abundances and succession but also provide a high degree of explanation by means of THEN- and ELSE-branch specific sensitivity analysis. A CD with a demo version of HEA is attached, and instructions for HEA can be found in the Appendix.
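The general shape of a predictive IF-THEN-ELSE rule and of a one-variable sensitivity scan can be sketched as follows. This is a hedged illustration, not a rule discovered by HEA and not the HEA software itself: the input names, thresholds and coefficients are hypothetical values chosen only to make the example run.

```python
def rule_forecast(water_temp_c, phosphate_ug_l):
    """Hypothetical rule: the condition selects which equation is applied.
    Thresholds and coefficients are invented for illustration."""
    if water_temp_c > 20.0 and phosphate_ug_l > 30.0:
        # THEN branch: equation used when the condition holds
        return 0.8 * phosphate_ug_l + 1.5 * water_temp_c
    # ELSE branch: equation used otherwise
    return 0.1 * phosphate_ug_l


def sensitivity(rule, base_inputs, name, values):
    """Vary one input over `values`, keep the others at their base level,
    and record the rule output for each value."""
    curve = []
    for value in values:
        inputs = dict(base_inputs, **{name: value})
        curve.append((value, rule(**inputs)))
    return curve


base = {"water_temp_c": 25.0, "phosphate_ug_l": 40.0}
curve = sensitivity(rule_forecast, base, "water_temp_c", [15.0, 25.0])
```

Scanning an input across the rule's threshold shows which branch is active for which range of values, which is the kind of branch-specific explanation the paragraph above refers to.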
Chapter 20 by Atanasova et al. demonstrates the computational assembly of ordinary differential equations (ODE) based on an ecological process function library and measured ecological data. The authors document automatically assembled ODE for chlorophyll a in a lake and related validation results that indicate possibilities and limitations of the approach.
I want to thank all of the authors who contributed to the book with great enthusiasm and delivered on time. Finally, I express my thanks to Dr Christian Witschel and Agata Oelschlaeger of the Geosciences Editorial Team of the Springer-Verlag for their close collaboration in producing the book.
Informatics. Understanding Ecology by Biologically-Inspired Computation. Springer-Verlag, Berlin, Heidelberg, New York, 127-178.
D’Angelo, D.J., Howard, L.M., Meyer, J.L., Gregory, S.V. and L.R. Ashkenas, 1995. Ecological uses of genetic algorithms: predicting fish distributions in complex physical habitats. Can. J. Fish. Aquat. Sci. 52, 1893-1908.
Dolk, D.R., 2000. Integrated model management in the data warehouse era. European Journal of Operational Research 122, 2, 199-218.
Downing, K., 1997. EUZONE: Simulating the evolution of aquatic ecosystems. Artificial Life 3, 307-333.
Eleveld, M.A., Schrimpf, W.B.H. and A.G. Siegert, 2003. User requirements and information definition for the virtual coastal and marine data warehouse. Ocean & Coastal Management 46, 487-505.
Fielding, A., 1999. Machine Learning Methods for Ecological Applications. Kluwer, 1-262.
Henikoff, S., Henikoff, J.G. and S. Pietrovski, 1999. Blocks+: a non-redundant database of protein alignment blocks derived from multiple compilations. Bioinformatics 15, 471-
Jeong, K.-S., Recknagel, F. and G.-J. Joo, 2003. Prediction and elucidation of population dynamics of the blue-green algae Microcystis aeruginosa and the diatom Stephanodiscus hantzschii in the Nakdong River-Reservoir System (South Korea) by a recurrent artificial neural network. In: Recknagel, F. (ed.), 2003. Ecological Informatics. Understanding Ecology by Biologically-Inspired Computation. Springer-Verlag, Berlin, Heidelberg, New York, 195-213.
Jorgensen, S.E., 1995. Fundamentals of Ecological Modelling. Elsevier, Amsterdam, 1-628.
Lek, S. and J.-F. Guegan (eds.), 2000. Artificial Neuronal Networks: Application to Ecology and Evolution. Springer, Berlin, Heidelberg, New York, 1-262.
Lockhardt, D. and E. Winzeler, 2000. Genomics, gene expression and DNA arrays. Nature 405, 827-836.
Lupas, A., Van Dyke, M. and J. Stock, 1991. Predicting coiled coils from protein sequences. Science 252, 1162-1164.
Michener, W.K., Brunt, J.W., Helly, J.J., Kirchner, T.B. and S.G. Stanford, 1997. Nongeospatial metadata for the ecological sciences. Ecological Applications 7, 1, 330-342.
Oltvai, Z.N. and A.-L. Barabasi, 2002. Life’s complexity pyramid. Science 298, 763-764.
Overbeck, R., Fonstein, M., D’Souza, M., Pusch, G.D. and N. Maltsev, 1999. The use of gene clusters to infer functional coupling. Proc. Natl. Acad. Sci. USA 96, 2896-2901.
Park, Y.-S., Verdonschot, P.F.M., Chon, T.-S. and S. Lek, 2003. Patterning and predicting aquatic macroinvertebrate diversities using artificial neural networks. Water Research 37, 1749-1758.
Recknagel, F. (ed.), 2003. Ecological Informatics. Understanding Ecology by Biologically-Inspired Computation. Springer-Verlag, Berlin, Heidelberg, New York.
Sen, A., 2003. Metadata management: past, present and future. Decision Support Systems
Friedrich Recknagel
Adelaide, 15 May 2005
Preface 1st Edition
In the 50s and 60s cross-sectional data of lake surveys were utilized for steady state assessments of the eutrophication status of lakes by univariate nonlinear regression. This statistical approach (see Table 1) became exemplary for river, grassland and forest models and, because of its simplicity, widespread for the classification of ecosystems.
In the 70s and 80s multivariate time series data were collected from ecosystems such as lakes, rivers, forests and grasslands in order to improve understanding of ecosystem dynamics. Process-based differential equations were used for the computer simulation of food web dynamics and functional group succession. This differential equation approach (see Table 1) is still widely used for scenario
Table 1
Feature | Statistical Regression Approach | Differential Equations Approach | Computational Approach
Ecosystem Representation | Steady States | Transitional States | Evolving States
Ecosystem Approximation | Univariate Nonlinear / Multivariate Linear | Multivariate Nonlinear | Multivariate Nonlinear
Ecosystem Complexity | Cross-Sectional Nutrient and Abundance Means | Nutrient Cycles and Food Web Dynamics | Species Succession and Ecosystem Evolution
Aquatic Examples | Phosphorus-Chlorophyll Relationship 1,2; External P-Loading Concept 3 | AQUAMOD 4; MS-CLEANER 5; Bierman 6; Jorgensen 7; SALMO 8 | Nonlinear Regression 9; Nonlinear PCA 10; DELAQUA 11; ANNA 12; Evolved Rules 13; Evolved Equations 14,15; ECHO 16; GECKO 17
Potential Applications | Ecosystem Classification | Scenario Analysis | Ecosystem Forecasting
Vollenweider RA (1968) Scientific fundamentals of eutrophication of lakes and flowing waters with special reference to phosphorus and nitrogen. OECD, Paris. OECD/DAS/SCI/68.27
6 Bierman VJ (1976) Mathematical model of the selective enhancement of blue-green algae by nutrient enrichment. In: Canale RP (ed) Modelling Biochemical Processes in Aquatic Ecosystems. Ann Arbor Science Publishers Inc., Ann Arbor, 1-32
7 Jorgensen SE (1976) A eutrophication model for a lake. Ecol Modelling 2, 147-162
8 Recknagel F, Benndorf J (1982) Validation of the ecological simulation model SALMO. Int Revue Ges. Hydrobiol. 67, 1, 113-125
9 Lek S, Delacoste M, Baran P, Dimonopoulos I, Lauga J, Aulagnier J (1996) Application of neural networks to modelling nonlinear relationships in ecology. Ecol Modelling
Recknagel F, Petzoldt T, Jaeke O, Krusche F (1995) Hybrid expert system DELAQUA - a toolkit for water quality control of lakes and reservoirs. Ecol Modelling 71, 1-3, 36
computational approach (see Table 1) allows the discovery of knowledge in complex multivariate databases for improving both ecosystem theory and decision support.
The present book focuses on the computational approach to ecosystem analysis, synthesis and forecasting called ecological informatics. It provides the scope and case studies of ecological informatics, exemplary for applications of biologically-inspired computation to a variety of areas in ecology.
Ecological Informatics is defined as an interdisciplinary framework promoting the use of advanced computational technology for the elucidation of principles of information processing at and between all levels of complexity of ecosystems, from genes to ecological networks, and the provision of transparent decisions targeting ecological sustainability, biodiversity and global warming.
Distinct features of ecological informatics are: data integration across ecosystem categories and levels of complexity, inference from data patterns to ecological processes, and adaptive simulation and prediction of ecosystems. Biologically-inspired computation techniques such as fuzzy logic, artificial neural networks, evolutionary algorithms and adaptive agents are considered core concepts of ecological informatics.
Fig. 1 represents the current scope of ecological informatics, indicating that ecological data are consecutively refined to ecological information, ecosystem theory and ecosystem decision support by two basic computational operations: data archival, retrieval and visualization, and ecosystem analysis, synthesis and forecasting.
Figure 1 Scope of Ecological Informatics
Computational technologies currently considered crucial for data archival, retrieval and visualization are:
- High performance computing to provide high-speed data access and processing, and large internal storage (RAM);
- Object-oriented data representation to facilitate data standardization and data integration by the embodiment of metadata and data operations into data structures;
- Internet to facilitate sharing of dynamic, multi-authored data sets, and parallel posting and retrieval of data;
- Remote sensing and GIS to facilitate spatial data visualization and acquisition;
- Animation to facilitate pictorial visualization and simulation.
The following computational technologies are currently considered crucial for ecosystem analysis, synthesis and forecasting:
- High performance computing to provide high-speed data access and processing and large internal storage (RAM), and to facilitate high speed simulations;
- Internet and www to facilitate interactive and online simulation as well as software and model sharing;
- Cellular automata to facilitate spatio-temporal and individual-based simulation;
- Fuzzy logic to represent and process uncertain data;
- Artificial neural networks to facilitate multivariate nonlinear regression, ordination and clustering, multivariate time series analysis, and image analysis at micro and macro scale;
- Genetic and evolutionary algorithms for the discovery and evolution of multivariate nonlinear rules, functions, differential equations and artificial neural networks;
- Hybrid and AI models by the embodiment of evolutionary algorithms in process-based differential equations, the embodiment of fuzzy logic in artificial neural networks or knowledge processing;
- Adaptive agents to facilitate adaptive simulation and prediction of ecosystem composition and evolution.
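As a small illustration of how fuzzy logic can represent uncertain data, as named in the list above, the sketch below computes the degree of membership of a measurement in a fuzzy set with a triangular membership function. It is a generic textbook construction, not taken from any chapter of this book; the function name `triangular` and the bounds of the fuzzy set "warm water" are assumptions made for the example.

```python
def triangular(x, left, peak, right):
    """Degree of membership (0.0 to 1.0) of x in a triangular fuzzy set
    that rises from `left` to `peak` and falls from `peak` to `right`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy set "warm water" in degrees Celsius (bounds are invented)
WARM = (10.0, 20.0, 30.0)

membership_15 = triangular(15.0, *WARM)  # partly warm
membership_20 = triangular(20.0, *WARM)  # fully warm
membership_5 = triangular(5.0, *WARM)    # not warm at all
```

Instead of forcing a measurement into a crisp class, the membership degree expresses how well it fits a linguistic category, which is what allows imprecise observations and expert knowledge to be processed together.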
The present book is an outcome of the International Conference on Applications of Machine Learning to Ecological Modelling, 27 November to 1 December 2000, Adelaide, Australia, which concluded with the foundation of the International Society for Ecological Informatics (ISEI). It is based on selected papers of the conference, which are exemplary for current research trends in ecological informatics.
Chapters 1 to 5 address principles and ecological applications of fuzzy logic, artificial neural networks, genetic algorithms, evolutionary computation and adaptive agents. Salski summarizes concepts of fuzzy logic and discusses applications for knowledge-based modeling, clustering and kriging related to ecotoxicological, geological and population dynamics data. Giraudel and Lek discuss the design and application of unsupervised artificial neural networks for the classification and visualization of multivariate ecological data. They demonstrate the potential of Kohonen-type algorithms by clustering data of forest communities in Wisconsin (USA). Morrall discusses the origins and nature of genetic algorithms, and their suitability to induce numerical or rule-based models for ecological applications. Whigham and Fogel provide a scope of evolutionary algorithms and their potential for evolving rules, algebraic and differential equations relevant to ecology. They also address developments on individual and cooperative behaviour, prey-predator algorithms and hierarchical ecosystems based on evolutionary algorithms. Recknagel reflects on Holland’s adaptive agents concept and its potential to more realistically simulate emergent ecosystem structures and behaviours. He distinguishes between individual-based and state variable-based agents, and emphasizes the embodiment of evolutionary computation in state variable-based agents.
Chapters 6 to 9 provide case studies for the prediction and elucidation of stream ecosystems by means of machine learning techniques. Goethals, Dedecker, Gabriels and de Pauw demonstrate applications of classification trees and artificial neural networks for the bioassessment of the Zwalm river system in Belgium. Schleiter, Obach, Wagner, Werner, Schmidt and Borchardt carried out a comprehensive study of the Breitenbach stream (Germany) based on a variety of unsupervised and supervised learning algorithms for artificial neural networks. They draw interesting conclusions regarding the suitability of different algorithms for bioindication of stream habitats and input sensitivity of streams. Chon, Park, Kwak and Cha provide a summary of achievements in the structural classification and dynamic prediction of macroinvertebrate communities in Korean streams by artificial neural networks. They also discuss patterning of organizational aspects of macroinvertebrate communities. Huong, Recknagel, Marshall and Choy study relationships between environmental factors, stream habitat characteristics and the occurrence of macroinvertebrate taxa in the Queensland stream system (Australia) by means of a neural network based sensitivity analysis.
Chapters 10 to 12 contain examples of time series analysis of river water quality by artificial neural networks. Jeong, Recknagel and Joo apply recurrent neural networks to explain and predict the seasonal abundance and succession of different algae species in the River Nakdong (Korea). Validation results reveal a reasonable correspondence between seven-days-ahead forecasts and observations of algal abundance. Information on favourable conditions and processes for certain algal species discovered by a comprehensive sensitivity analysis complies well with domain knowledge. Bowden, Maier and Dandy combine supervised and unsupervised artificial neural networks as well as genetic algorithms for automated input determination of neural networks in order to forecast the abundance of an algae species in the River Murray (Australia). Gevrey, Lek and Oberdorff apply two approaches of sensitivity analysis for the study of riverine fish species by means of artificial neural networks.
Chapters 14 to 17 provide case studies for the application of fuzzy logic, artificial neural networks and evolutionary algorithms to freshwater lakes and marine fishery systems. Karul and Soyupak compare results for the chlorophyll-a estimation in three Turkish lakes achieved by multiple regression and artificial neural networks. Wilson and Recknagel design a generic neural network model for forecasting algal blooms that is validated by means of six lake databases. They consider bootstrapping, bagging and time-lagged training as crucial techniques for minimising prediction errors. Bobbin and Recknagel apply evolutionary algorithms to discover rules for the abundance and succession of blue-green algae species in the hypereutrophic Lake Kasumigaura (Japan). The resulting rules correspond with literature findings, reveal hypothetical relationships and are able to predict timing and magnitudes of algal dynamics. Reick, Gruenewald and Page address the issue of data quality in the context of ecological time-series analysis and prediction. They describe cross-validation and automated training termination of neural networks applied for multivariate time-series predictions of marine zooplankton in the German North Sea. Chen combines fuzzy logic and artificial neural networks in order to classify fish stock-recruitment relationships in different environmental regimes near the west coast of Vancouver Island (Canada) and southeast Alaska (USA).
Chapters 18 to 20 provide examples for the classification of ecological images at micro and macro scale by artificial neural networks. Wilkins, Boddy and Dubelaar demonstrate possibilities for the identification of marine microalgae by the analysis of flow cytometric pulse shapes with the help of neural networks. Robertson and Morison applied a probabilistic neural network for the automation of age estimation in three fish species. Thin-sections of sagittal otoliths viewed with transmitted light were used for all species, and the number of opaque increments was used to estimate the age. The neural network correctly classified a larger range of age classes. Foody gives a representative summary of neural network algorithms currently used for the pattern recognition and classification of remotely sensed landscape images.
At this point I want to thank all of the authors who responded with great enthusiasm to my request for chapters on the theme of the book and delivered on time. I am also grateful to 24 colleagues and friends in Australia and overseas who significantly improved the quality of chapters by their critical reviews.
Finally I express my thanks to Dr Christian Witschel and Agata Oelschlaeger of the Geosciences Editorial Team of the Springer Verlag for their close collaboration in producing the book.
Friedrich Recknagel
Adelaide, 15 April 2002
Part I Introduction
1 Ecological Applications of Fuzzy Logic
1.1 Fuzzy Sets and Fuzzy Logic
1.2 Fuzzy Approach to Ecological Modelling and Data Analysis
1.3 Fuzzy Classification: A Fuzzy Clustering Approach
1.4 Fuzzy Regionalisation: A Fuzzy Kriging Approach
1.5 Fuzzy Knowledge-Based Modelling
1.6 Conclusions
References
2 Ecological Applications of Qualitative Reasoning
2.1 Introduction
2.2 Why Use QR for Ecology?
2.3 What is Qualitative Reasoning?
2.3.1 A Working Example
2.3.2 World-view: Ontological Distinctions
2.3.2.1 Component-based Approach
2.3.2.2 Process-based Approach
2.3.2.3 Constraint-based Approach
2.3.2.4 Suitability of Approaches
2.3.3 Inferring Behaviour from Structure
2.3.4 Qualitativeness and Representing Time
2.3.5 Causality
2.3.6 Model-fragments and Compositional Modelling
2.4 Tools and Software
2.4.1 Workspaces in Homer
2.4.2 Building a Population Model
2.4.3 Running and Inspecting Models with VisiGarp
2.4.4 Adding Migration to the Population Model
2.5 Examples of QR-based Ecological Modelling
2.5.1 Population and Community Dynamics
2.5.2 Water Related Models
2.5.3 Management and Sustainability
2.5.4 Details in Qualitative Algebra
2.5.5 Details in Automated Model Building
2.5.6 Diagnosis
2.6 Conclusion
References
3 Ecological Applications of Non-Supervised Artificial Neural Networks
3.1 Introduction
3.2 How to Compute a Self-Organizing Map (SOM) with an Abundance Dataset?
3.2.1 A Dataset for Demonstrations
3.2.2 The Self-Organizing Map (SOM) Algorithm
3.3 How to Use a Self-Organizing Map with an Abundance Dataset?
3.3.1 Mapping the Stations
3.3.2 Displaying a Variable
3.3.3 Displaying an Abiotic Variable
3.3.4 Clustering with a SOM
3.4 Discussion
3.5 Conclusion
References
4 Ecological Applications of Genetic Algorithms
4.1 Introduction
4.2 Ecology and Ecological Modelling
4.3 Genetic Algorithm Design Details
4.4 Applications of Genetic Algorithms to Ecological Modelling
4.5 Predicting the Future with Genetic Algorithms
4.6 The Next Generation: Hybrid Genetic Algorithms
References
5 Ecological Applications of Evolutionary Computation
5.1 Introduction
5.2 Ecological Modelling
5.2.1 The Challenges of Ecological Modelling
5.2.2 Summary
5.3 Evolutionary Computation
5.3.1 The Basic Evolutionary Algorithm
5.3.2 Summary
5.4 Ecological Modelling and Evolutionary Algorithms
5.4.1 Equation Discovery
5.4.2 Optimisation of Difference Equations
5.4.3 Evolving Differential Equations
5.4.4 Rule Discovery
5.4.5 Modelling Individual and Cooperative Behaviour
5.4.6 Predator-Prey Algorithms
5.4.7 Modelling Hierarchical Ecosystems
5.5 Conclusion
References
6 Ecological Applications of Adaptive Agents
6.1 Introduction
6.2 Adaptive Agents Framework
6.3 Individual-Based Adaptive Agents
6.4 State Variable-Based Adaptive Agents
6.4.1 Algal Species Simulation by Adaptive Agents
6.4.1.1 Embodiment of Evolutionary Computation in Agents
6.4.1.2 Adaptive Agents Bank
6.4.2 Pelagic Food Web Simulation by Adaptive Agents
6.5 Conclusions
Acknowledgements
References
7 Bio-Inspired Design of Computer Hardware by Self-Replicating Cellular Automata
7.1 Introduction
7.2 Cellular Automata
7.3 Von Neumann’s Universal Constructor
7.4 Self-Replicating Loops
7.5 Self-Replication in the Embryonics Project
7.5.1 Embryonics
7.5.2 The Tom Thumb Algorithm
7.5.2.1 Construction of the Minimal Cell
7.5.2.2 Growth and Self-Replication
7.5.2.3 The LSL Acronym Design Example
7.5.2.4 Universal Construction
7.6 Conclusions
Acknowledgements
References
Part II Prediction and Elucidation of Stream Ecosystems
8 Development and Application of Predictive River Ecosystem Models Based On Classification Trees and Artificial Neural Networks
8.1 Introduction
8.2 Study Sites, Data Sources and Modelling Techniques
8.2.1 The Zwalm River Basin
8.2.2 Data Collection
8.2.3 Classification Trees
8.2.4 Artificial Neural Networks
8.2.5 Model Assessment
8.3 Results
8.3.1 Classification Trees
8.3.1.1 Model Development and Validation
8.3.1.2 Application of Predictive Classification Trees for River Management
8.3.2 Artificial Neural Networks
8.3.2.1 Model Development and Validation
8.3.2.2 Application of Predictive Artificial Neural Networks for River Management
8.3.2.2.1 Prediction of Environmental Standards
8.3.2.2.2 Feasibility Analysis of River Restoration Options
8.4 Discussion
Acknowledgements
References
9 Modelling Ecological Interrelations in Running Water Ecosystems with Artificial Neural Networks
9.1 Introduction
9.2 Materials and Methods
9.2.1 Data Base
9.2.2 Data Pre-Processing
9.2.3 Artificial Neural Network Types
9.2.4 Dimension Reduction
9.2.5 Quality Measures
9.3 Data Exploration with Unsupervised Learning Systems
9.4 Correlations and Predictions with Supervised Learning Systems
9.4.1 Correlations and Predictions of Environmental Variables
9.4.2 Dependencies of Colonisation Patterns of Macro-Invertebrates on Water Quality and Habitat Characteristics
9.4.2.1 Aquatic Insects in a Natural Stream, the Breitenbach
9.4.2.2 Anthropogenically Altered Streams
9.4.3 Bioindication
9.5 Assessment of Model Quality and Visualisation Possibilities: Hybrid Networks
9.6 Conclusions
Acknowledgements
References
10 Non-linear Approach to Grouping, Dynamics and Organizational Informatics of Benthic Macroinvertebrate Communities in Streams by Artificial Neural Networks
10.1 Introduction
10.2 Grouping Through Self-Organization
10.2.1 Static Grouping
10.2.2 Grouping Community Changes
10.3 Prediction of Community Changes
10.3.1 Multilayer Perceptron with Time Delay
10.3.2 Elman Network
10.3.3 Fully Connected Recurrent Network
10.3.4 Impact of Environmental Factors Trained with the Recurrent Network
10.4 Patterning Organizational Aspects of Community
10.4.1 Relationships among Hierarchical Levels in Communities
10.4.2 Patterning of Exergy
10.5 Summary and Conclusions
Acknowledgements
References
11 Elucidation of Hypothetical Relationships between Habitat Conditions and Macroinvertebrate Assemblages in Freshwater Streams by Artificial Neural Networks
11.1 Introduction
11.2 Study Site
11.3 Materials and Methods
11.3.1 Data
11.3.2 Neural Network Modelling
11.3.3 Sensitivity Analysis
11.4 Results and Discussion
11.4.1 Elucidation of Hypothetical Relationships
11.4.2 Discovery of Contradictory Relationships
11.4.3 Limitations of the Method
11.5 Conclusions
References
Part III Prediction and Elucidation of River Ecosystems 253
12 Prediction and Elucidation of Population Dynamics of the Blue-green Algae Microcystis aeruginosa and the Diatom Stephanodiscus hantzschii in the Nakdong River-Reservoir System (South Korea) by a Recurrent Artificial Neural Network 255
12.1 Introduction 255
12.2 Description of the Study Site 256
12.3 Materials and Methods 257
12.3.1 Data Collection and Analysis 257
12.3.2 Modelling the Phytoplankton Dynamics 259
12.3.3 Neural Network Validation and Knowledge Discovery on Algal Succession 261
12.4 Results and Discussion 261
12.4.1 Limnological Aspects and Plankton Dynamics in the Lower Nakdong River 261
12.4.2 Configuring the Neural Network Architecture for Predictability 263
12.4.3 Elucidation of Ecological Hypothesis 265
12.4.3.1 Microcystis aeruginosa 267
12.4.3.2 Stephanodiscus hantzschii 267
12.5 Implications of Ecological Informatics for Limnology 268
12.6 Conclusions 269
Acknowledgements 270
References 270
13 An Evaluation of Methods for the Selection of Inputs for an Artificial Neural Network Based River Model 275
13.1 Introduction 275
13.2 Methods 277
13.2.1 Unsupervised Input Preprocessing 277
13.2.2 Supervised Input Determination 280
13.3 Case Study 282
13.4 Model Development 282
13.4.1 Performance Measures and Model Validation 283
13.4.2 Data Division 283
13.4.3 Determination of Model Inputs 284
13.5 Results and Discussion 284
13.6 Conclusions 290
Acknowledgements 291
References 291
14 Utility of Sensitivity Analysis by Artificial Neural Network Models to Study Patterns of Endemic Fish Species 293
14.1 Introduction 293
14.2 Contribution of Environmental Variables 294
14.3 Application to Ecological Data 295
14.4 Results 296
14.4.1 Predictive Power 296
14.4.2 Sensitivity Analysis 298
14.5 Discussion 302
14.6 Conclusions 304
References 304
Part IV Prediction and Elucidation of Lake and Marine Ecosystems 307
15 A Comparison between Neural Network Based and Multiple Regression Models in Chlorophyll-a Estimation 309
15.1 Introduction 309
15.1.1 Eutrophication in Water Bodies and Relevant Models 309
15.1.2 Artificial Neural Networks 310
15.1.3 The Use of Artificial Neural Networks in Environmental Modelling 311
15.2 Data and Lakes 311
15.3 Methodology 313
15.3.1 Artificial Neural Network Approach 314
15.3.1.1 Training Method 314
15.3.1.2 Data Pre-Processing 316
15.3.1.3 Improving Generalisation 316
15.3.2 Multiple Regression Modelling Approach 317
15.4 Results 317
15.5 Conclusions and Recommendations 320
15.5.1 Conclusions 320
15.5.2 Recommendations 321
Acknowledgments 322
References 322
16 Artificial Neural Network Approach to Unravel and Forecast Algal Population Dynamics of Two Lakes Different in Morphometry and Eutrophication 325
16.1 Introduction 325
16.2 Materials and Methods 326
16.2.1 Study Sites and Data 326
16.2.2 Methods 327
16.3 Results 330
16.3.1 Forecasting Seasonal Algal Abundances and Succession 330
16.3.2 Relationships between Algal Abundances and Water Quality Conditions 331
16.3.3 Relationships between Algal Abundances, Seasons and Water Quality Changes 336
16.4 Discussion 340
16.4.1 Forecasting Seasonal Algal Abundances and Succession 340
16.4.2 Relationships between Algal Abundances, Seasons and Water Quality Changes 341
16.5 Conclusions 344
Acknowledgements 344
References 344
17 Hybrid Evolutionary Algorithm* for Rule Set Discovery in Time-Series Data to Forecast and Explain Algal Population Dynamics in Two Lakes Different in Morphometry and Eutrophication 347
17.1 Introduction 347
17.2 Materials and Methods 348
17.2.1 Study Sites and Data 348
17.2.2 Hybrid Evolutionary Algorithms 349
17.2.2.1 Structure Optimisation of Rule Sets Using GP 351
17.2.2.2 Parameter Optimization of Rule Sets Using a General Genetic Algorithm 356
17.2.2.3 Forecasting by Rule Sets 357
17.3 Case Studies Lake Kasumigaura and Lake Soyang 358
17.3.1 Parameter Settings and Measures 358
17.3.2 Results and Discussion 359
17.4 Conclusions 366
References 366
18 Multivariate Time-Series Prediction of Marine Zooplankton by Artificial Neural Networks 369
18.1 Introduction 369
18.2 Generalisation 371
18.3 Automatic Termination of Training 374
18.4 Case Study: Zooplankton Prediction 378
18.5 Conclusions 381
Acknowledgement 382
References 382
19 Classification of Fish Stock-Recruitment Relationships in Different Environmental Regimes by Fuzzy Logic Combined with a Bootstrap Re-sampling Approach 385
19.1 Introduction 385
19.2 Fuzzy Stock-Recruitment Model 386
19.2.1 Traditional Stock-Recruitment Model 386
19.2.2 Fuzzy Stock-Recruitment Model 388
19.2.2.1 Fuzzy Membership Function (FMF) 389
19.2.2.2 Fuzzy Rules 390
19.2.2.3 Fuzzy Reasoning 391
19.3 Hybrid Optimal Learning and Bootstrap Re-sampling Algorithms 393
19.3.1 Hybrid Optimal Learning Algorithms 394
19.3.2 Bootstrap Re-sampling Procedure 396
19.4 Two Real Data Analyses 397
19.4.1 West Coast Vancouver Island Herring Stock 397
19.4.1.1 Data Prescription and Preliminary Analyses 397
19.4.1.2 Fuzzy-SR Model Analysis 398
19.4.1.3 Bootstrap Re-sampling Analysis 400
19.4.2 Southeast Alaska Pink Salmon 402
19.4.2.1 Data Prescription and Preliminary Analysis 402
19.4.2.2 Fuzzy-SR Model Analysis 403
19.4.2.3 Bootstrap Re-sampling Analysis 404
19.5 Summary and Discussion 404
Acknowledgements 406
References 406
20 Computational Assemblage of Ordinary Differential Equations for Chlorophyll-a Using a Lake Process Equation Library and Measured Data of Lake Kasumigaura 409
20.1 Introduction 409
20.2 Methods and Materials 410
20.2.1 LAGRAMGE: Computational Assemblage of ODE 410
20.2.2 Domain Knowledge Library for Lake Ecosystems 411
20.2.3 Task Specification 412
20.2.4 Data of Lake Kasumigaura 415
20.2.5 Experimental Framework 416
20.3 Results and Discussion 418
20.3.1 Experiment 1 418
20.3.2 Experiment 2 422
20.3.3 Experiment 3 424
20.4 Conclusions
21.1 Introduction 431
21.2 Materials and Methods 435
21.2.1 Pulse Shape Extraction 435
21.2.2 Data Filtering 435
21.2.3 Data Transformation 435
21.2.4 Principal Component Analysis 436
21.2.5 Neural Network Analysis 438
21.2.6 Hardware and Software 439
21.3 Results 439
21.4 Discussion 441
21.5 Conclusions 441
Acknowledgement 441
References 442
22 Age Estimation of Fish Using a Probabilistic Neural Network 445
22.1 Introduction 445
22.2 Traditional Methods of Age Estimation 445
22.3 Approaches to Automation in Fish Age Estimation 447
22.4 The Application of a Probabilistic Neural Network to Fish Age Estimation 448
22.5 Results 452
22.6 Discussion 454
Acknowledgements 456
References 456
23 Pattern Recognition and Classification of Remotely Sensed Images by Artificial Neural Networks 459
23.1 Introduction 459
23.2 Neural Networks in Remote Sensing 460
23.2.1 Classification Applications 460
23.2.2 Regression Applications 461
23.3 The Neural Networks Used in Remote Sensing 461
23.3.1 Feedforward Neural Networks 462
23.3.1.1 Multi-Layer Perceptron (MLP) 463
23.3.1.2 Radial Basis Function (RBF) 464
23.3.1.3 Probabilistic Neural Networks (PNN) 465
23.3.1.4 Generalised Regression Neural Networks (GRNN) 466
23.3.1.5 Other Network Types 467
23.4 Current Status 468
23.4.1 An Example of Neural Networks for Classification 469
23.4.2 Concerns with Neural Networks 471
23.5 Conclusions 472
Acknowledgments 473
References 473
Index 479
Appendix 483
Department of Civil and Environmental Engineering,
University of Adelaide, Adelaide 5005
E-mail: Hongqing.Cao@adelaide.edu.au
Eui Young Cha
Division of Electronics and Computer Sciences,
Pusan National University, Pusan, Korea
Din Chen
International Pacific Halibut Commission
University of Washington, Seattle WA 98195
USA
E-mail: din@iphc.washington.edu
Tae-Soo Chon
Division of Biological Sciences,
Pusan National University, Pusan
Department of Civil and Environmental Engineering
Adelaide University, Adelaide 5005
Department of Geography, University of Southampton
Highfield, Southampton, SO17 1BJ
CESAC (Center d'Ecologie des Systemes Aquatiques et Continentaux)
University of Paul Sabatier, Toulouse
FRANCE
E-mail: gevrey@cict.fr
Jean-Luc Giraudel
CESAC (Center d'Ecologie des Systemes Aquatiques et Continentaux)
University of Paul Sabatier, Toulouse
Pusan National University
Jang-Jeon Dong, Gum-Jeong Gu, Busan, 609-735
Korea
E-mail: pow5150@hotmail.com, pow0606@hananet.net
Gea-Jae Joo
Department of Biology
Pusan National University
Jang-Jeon Dong, Gum-Jeong Gu, Busan, 609-735
Korea
E-mail: gjjoo@pusan.ac.kr
Cueneyt Karul
Department of Environmental Engineering
Middle East Technical University, 06531, Ankara
Division of Biological Sciences
Pusan National University, Pusan
Enrico Petraglia
Ecole Polytechnique Federale de Lausanne
School of Computing and Communication Sciences
CH-1015 Lausanne
Switzerland
Holger Maier
Department of Civil and Environmental Engineering
Adelaide University, Adelaide 5005
Australia
E-mail: hmaier@civeng.adelaide.edu.au
Daniel Mange
Ecole Polytechnique Federale de Lausanne
School of Computing and Communication Sciences
The Procter & Gamble Co
Environmental Science Department, Miami Valley Laboratories
P.O Box 538707, Cincinnati, OH 45253-8707, U.S.A
E-mail: morrall.dd@pg.com
Alexander K Morison
Marine and Freshwater Resources Institute
PO Box 114, Queenscliff, VIC 3225
Institut d'Ecologie et de Gestion de la Biodiversite, MNHN,
Lab Ichtyologie appliquee, 43 rue Cuvier, 75005 Paris
Young Seuk Park
CESAC (Center d'Ecologie des Systemes Aquatiques et Continentaux) University of Paul Sabatier, Toulouse
Marine and Freshwater Resources Institute
PO Box 114, Queenscliff, VIC 3225
AUSTRALIA
E-mail: simon.robertson@nre.vic.gov.au
Paulo Salles
Universidade de Brasilia
Instituto de Ciências Biológicas
Campus Darcy Ribeiro
Brasilia - DF, 70.910-900, Brasil
E-mail: psalles@unb.br
Department of Environmental Engineering
Middle East Technical University
06531 Ankara
Turkey
E-mail: soyupak@metu.edu.tr
Andre Stauffer
Ecole Polytechnique Federale de Lausanne
School of Computing and Communication Sciences
Ecole Polytechnique Federale de Lausanne
School of Computing and Communication Sciences
Limnologische Fluss-Station Schlitz der Max-Planck-Gesellschaft
P.O.Box 260, D-36105 Schlitz, Germany
E-mail: RWAGNER@MPIL-SCHLITZ.MPG.DE
Heinrich Werner
University of Kassel,
Dept of Mathematics and Computer Sciences
Research Group Neural Networks
Introduction
Ecological Applications of Fuzzy Logic
A Salski
1.1 Fuzzy Sets and Fuzzy Logic
The Fuzzy Set Theory, developed by L Zadeh (Zadeh 1965) as a possible way to handle uncertainty, is particularly useful for the representation of vague expert knowledge and the processing of uncertain or imprecise information. The Fuzzy Set Theory is based on an extension of the classical meaning of the term "set" and formulates specific logical and arithmetical operations for processing information defined in the form of fuzzy sets and fuzzy rules.
The theory of fuzzy sets deals with subsets of a given universe in which the transition between full membership and no membership is gradual. The boundaries of fuzzy sets are therefore not sharp. An example of a fuzzy set is the set A of all large carp as a subset of all carp in Lake Belau (Salski and Kandzia 1996). Traditionally, the grade of membership 1 is assigned to those objects of the universe that fully belong to a set, while 0 is assigned to objects that do not belong to the set. In traditional set theory, the sets considered are defined as collections of objects having some property, for example the property "carp in Lake Belau". The property "large carp in Lake Belau" does not constitute a set in the usual sense, because it does not offer a precisely defined criterion of membership. Intuitively, a fuzzy set is a collection of objects that admits the possibility of partial membership in it. Thus a fuzzy set A in a given universe is characterized by a function $\mu_A(x)$ termed "the grade of membership of x in A". We shall assume that the values of $\mu_A(x)$ are elements of the interval [0,1], with the grades 1 and 0 representing full membership and non-membership, respectively. $\mu_A(x)$ is called the membership function of A.
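As a minimal sketch of such a membership function, the following Python snippet assigns a grade of membership in the fuzzy set "large carp" to a fish of a given length. The function name, the use of body length as the criterion, the linear shape and the thresholds of 30 cm and 60 cm are all assumptions made for illustration; they are not taken from the Lake Belau study.

```python
def large_carp_membership(length_cm, lower=30.0, upper=60.0):
    """Grade of membership of a carp in the fuzzy set 'large carp'.

    Hypothetical thresholds: at or below `lower` the grade is 0,
    at or above `upper` it is 1, and in between it rises linearly.
    """
    if length_cm <= lower:
        return 0.0
    if length_cm >= upper:
        return 1.0
    return (length_cm - lower) / (upper - lower)


# A 45 cm carp belongs to 'large carp' only partially.
for length in (25.0, 45.0, 70.0):
    print(length, large_carp_membership(length))
```

A crisp set would correspond to the special case in which the function jumps directly from 0 to 1 at a single threshold.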
Fuzzy logic is based on the extension of the rules of conventional logic. This extension enables us to process fuzzy rules in the “IF – THEN” form with fuzzy sets in the premise and conclusion parts of these rules. These fuzzy sets represent imprecise expressions used by experts to describe their knowledge. Fuzzy inference methods are therefore particularly useful for working with such a vague knowledge representation. The main difference from conventional methods is that the Fuzzy Set Theory offers inference methods for calculating the conclusion values of rules when the premises of these rules are not completely fulfilled.
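The idea of a rule whose premise is only partly fulfilled can be sketched in a few lines of Python. The two rules, the variable names, the membership ranges and the conclusion values below are hypothetical; the choice of the minimum for "AND" and of a weighted average to combine the rule conclusions is one common option among several and is assumed here rather than taken from the text.

```python
def ramp_up(x, a, b):
    """Membership rising linearly from 0 at a to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def infer_growth(temperature, nutrient):
    """Evaluate two hypothetical fuzzy rules and combine their conclusions."""
    # Degree to which each fuzzy premise term is fulfilled.
    warm = ramp_up(temperature, 10.0, 25.0)
    cold = 1.0 - warm
    rich = ramp_up(nutrient, 0.0, 1.0)

    # Rule 1: IF water is warm AND nutrients are rich THEN growth is high (0.9).
    # Rule 2: IF water is cold THEN growth is low (0.1).
    # "AND" is modelled by the minimum of the two degrees.
    rules = [(min(warm, rich), 0.9), (cold, 0.1)]

    total = sum(weight for weight, _ in rules)
    if total == 0.0:
        return None  # no rule applies to this input at all
    # Each conclusion contributes in proportion to its premise fulfilment.
    return sum(weight * value for weight, value in rules) / total


print(infer_growth(17.5, 0.5))
```

With a fully fulfilled premise only one rule contributes; for intermediate inputs both rules fire partially and the result lies between their conclusion values.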
There are many good books containing details about fuzzy sets and fuzzy logic, such as Zimmermann (1993), Kruse et al (1995), Bárdossy and Duckstein (1995) and Pedrycz (1996).
The problem of uncertainty often appears in ecological modelling; in particular it concerns the uncertainty of data and vaguely defined expert knowledge. A large inherent uncertainty of ecological data results from the presence of random variables, incomplete or inaccurate data, approximate estimations instead of measurements (due to technical or financial problems) or incomparability of data (resulting from varying measurement or observation conditions). There are a number of ways to deal with uncertainty problems, e.g. probabilistic inference networks (Pearl 1988) or belief intervals (Shafer et al 1990). One of the most successful methods of dealing with uncertainty is the fuzzy approach. The fuzzy approach does not denote a particular method but the integration of fuzzy concepts into conventional methods of knowledge processing and data analysis. It thus extends conventional methods so that they are capable of utilising imprecise, heterogeneous and uncertain data. Compared to conventional methods, the fuzzy approach enables us to make better use of imprecise ecological data and vague expert knowledge in two ways:
- the representation and handling of imprecise data defined as fuzzy sets,
- the representation and processing of vague knowledge in the form of linguistic rules with imprecise terms defined as fuzzy sets.
Ecological data or classes of ecological objects can be defined as fuzzy sets with no sharply defined boundaries, which better reflects the continuous character of nature. Fuzzy sets can be used to handle uncertainty of data and fuzzy logic to handle inexact reasoning. Fuzzy logic allows working with uncertain knowledge about relations between ecosystem components and building models based on this type of information.
Ecological modelling and data analysis are the main application areas of fuzzy set theory in ecological research. The integration of fuzzy inference mechanisms and the expert system technique provides development tools for fuzzy expert systems and fuzzy knowledge-based models of ecological processes (Salski 1999). The evolution of conventional knowledge-based systems into fuzzy systems (adding imprecision or uncertainty handling to conventional systems) makes it possible to extend their application area to complex ecological problems (Kampichler et al 2000; Freyer 2000; Zhu et al 1996; Bock and Salski 1996). There are also other fuzzy approaches to ecological modelling, e.g. the fuzzy statistical approach to ecological assessments (Li 2001), fuzzy differential equations for fuzzy modelling in population dynamics (Barros et al 2000) or ecological impact analysis using fuzzy logic (Enea et al 2001; Silvert 1997). Fuzzy memberships can also be used as environmental indices (Silvert 2000) or as a fuzzy association degree in ecosystem modelling (Liu 2001). There is also an increasing number of other combined approaches, which result from linking the fuzzy approach with other techniques, e.g.:
- fuzzy approach with neural networks for assessment in spatial decision making (Zheng 2001) or for habitat modelling in agricultural landscapes (Wieland et al 1996),
- fuzzy modelling with conventional dynamic programming for optimal biological control of a greenhouse mite (Cheng et al 1996),
- fuzzy approach with linear programming for the optimization of land use scenarios (Salski et al 2001),
- fuzzy approach with probabilistic uncertainty to model climate-plant-herbivore interactions in grassland ecosystems (Wu et al 1996),
- fuzzy approach with three-dimensional modelling technique (Ameskamp 1997).
The next important research field is handling uncertainty in geographic information systems, that is, dealing with fuzziness in reasoning with spatial data (Dragicevic 2000; Guesgen 2000), in the assignment of locations to classes (Burrough 2000; MacMillan 2000) or in the definitions of object boundaries (Cross 2000).
Some application examples of a fuzzy approach to ecological modelling and data analysis are presented in this paper, namely fuzzy clustering as a tool for fuzzy classification of ecological data, fuzzy kriging as a method of fuzzy interpolation of spatial data, and fuzzy knowledge-based modelling.
Fuzzy classification and fuzzy geostatistics belong to the main problems of the analysis of ecological data. Conventional classification methods based on Boolean logic ignore the continuous nature of ecological parameters and the uncertainty of data, which can result in misclassification. Fuzzy classification, which means the division of objects into classes that do not have sharply defined boundaries, can be carried out in various ways, for example:
- application of fuzzy arithmetical and logical operations, e.g. to determine land suitability (Burrough et al 1992),
- fuzzy clustering, e.g. to classify some crop growth parameters (Marsili-Libelli 1994) or to classify existing chemicals according to their ecotoxicological properties (Friederichs et al 1996).
Compared to conventional classification methods, fuzzy clustering methods enable a better interpretation of the data structure.
Spatial data is an essential part of ecological data. The fuzzy extension of the interpolation procedure for spatial data, the so-called fuzzy kriging, is one example of a fuzzy approach to spatial data analysis (Bárdossy 1989; Diamond 1989; Piotrowski et al 1996). Fuzzy kriging is a modification of the conventional kriging procedure; it utilizes exact (crisp) measurement data as well as imprecise estimates obtained from an expert and defined as fuzzy numbers. Regionalization of ecological parameters based on fuzzy kriging better reflects the imprecision of the input data.
Fuzzy knowledge-based modelling can be particularly useful where there is no analytical model of the relations to be examined, where there is an insufficient amount of data for statistical analysis, or where the degree of uncertainty of these data is very high (Salski 1992; Salski et al 1996; Li 1996; Daunicht et al 1996; Bárdossy and Duckstein 1995; Pedrycz 1996; Bock and Salski 1998). In these cases the only basis for modelling is the expert knowledge, which is often uncertain and imprecise.
1.3 Fuzzy Classification: A Fuzzy Clustering Approach
Conventional clustering methods place an object within exactly one cluster. With fuzzy clustering this is no longer required, since the membership value of an object can be split up between different clusters. In comparison to conventional clustering methods, the distribution of the membership values provides additional information: the membership values of a particular object can be interpreted as the degree of similarity between this object and the respective clusters (Salski and Kandzia 1996).
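How membership values are split between clusters can be illustrated with the fuzzy c-means algorithm, a widely used fuzzy clustering method. The text does not state which algorithm was used in the cited studies, so the sketch below is an assumption; it is restricted to one-dimensional data, and the function name, the sample data and the starting centres are hypothetical.

```python
def fuzzy_c_means(points, centers, m=1.3, iterations=50):
    """Fuzzy c-means for 1-D data; m > 1 is the fuzzifier.

    Returns the final cluster centres and, for each point,
    a list of membership values that sums to 1.
    """
    centers = list(centers)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in points:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The point coincides with a centre: full membership there.
                hits = dists.count(0.0)
                row = [1.0 / hits if d == 0.0 else 0.0 for d in dists]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                row = [v / total for v in inv]
            memberships.append(row)
        # Each centre becomes the membership-weighted mean of all points.
        centers = [
            sum(row[k] ** m * x for row, x in zip(memberships, points))
            / sum(row[k] ** m for row in memberships)
            for k in range(len(centers))
        ]
    return centers, memberships


data = [0.8, 1.0, 1.2, 3.0, 4.8, 5.0, 5.2]
centers, memberships = fuzzy_c_means(data, [0.0, 6.0])
for x, row in zip(data, memberships):
    print(x, [round(v, 3) for v in row])
```

In this toy data set the point 3.0 lies between the two groups, so its membership is shared between both clusters, whereas a crisp method would have to assign it to one of them.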
The classification of existing chemicals according to their ecotoxicological properties (Friederichs et al 1996) can be taken as an application example of fuzzy cluster analysis. The large number of existing chemicals makes it necessary to select representative chemicals which reflect the relevant properties of a possibly large group of compounds. The main tasks of this application are therefore:
- to find distinguishable clusters with characteristic properties,
- to find chemicals representative for each cluster,
- to examine the role of different parameters for clustering.
Compared to conventional clustering methods, the fuzzy clustering technique is more appropriate for handling the uncertainty of ecotoxicological data, which results, for example, from the difficult comparability of these data. Partition efficiency indicators were analysed to choose the fuzzifier value and to determine the optimal number of clusters, e.g.:
- partition entropy (should be minimal),
- partition coefficient, where values closer to 1 indicate the "better" partition,
- non-fuzziness index, indicating the "best" partition by the highest value, independently of the number of clusters.
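The text does not give formulas for these indicators. The sketch below uses definitions that are common in the fuzzy clustering literature and are assumed here: the partition coefficient as the mean of the squared membership values, the partition entropy as the mean of -u ln u, and the non-fuzziness index as the partition coefficient rescaled by the number of clusters c to the range [0,1]. The function names are hypothetical, and each row of the input holds the membership values of one object.

```python
import math


def partition_coefficient(memberships):
    """Mean of squared memberships: 1 for a crisp partition, 1/c at worst."""
    n = len(memberships)
    return sum(u * u for row in memberships for u in row) / n


def partition_entropy(memberships):
    """Mean of -u*ln(u): 0 for a crisp partition, ln(c) at worst."""
    n = len(memberships)
    return -sum(u * math.log(u) for row in memberships for u in row if u > 0.0) / n


def non_fuzziness_index(memberships):
    """Partition coefficient rescaled so that it does not depend on c."""
    c = len(memberships[0])
    return (c * partition_coefficient(memberships) - 1.0) / (c - 1.0)


crisp = [[1.0, 0.0], [0.0, 1.0]]
vague = [[0.5, 0.5], [0.5, 0.5]]
for name, u in (("crisp", crisp), ("vague", vague)):
    print(name, partition_coefficient(u), partition_entropy(u), non_fuzziness_index(u))
```

Computing these values for several cluster numbers and fuzzifier values, and comparing them, is one way to carry out the kind of analysis described above.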
The normalized values of these indicators for cluster numbers between 4 and 8 and fuzzifier values of 1.3 and 1.6 are presented in Figure 1.1. Five clusters can be taken as the "optimal" number of clusters for a fuzzifier of 1.3, whereas a fuzzifier of 1.6 does not lead to a clear conclusion.
Fig 1.1 Partition efficiency indicators for fuzzifier values of 1.3 (left) and 1.6 (right) (Friederichs et al 1996)