Morphometric and Artificial Neural Network Approaches to the Automated Species Recognition
Problem in Systematics
Norman MacLeod, M. O’Neill and Steven A. Walsh
Contents
Abstract
5.1 Introduction
5.1.1 The Need for Automated Species Recognition in Systematics
5.1.2 Approaches
5.1.3 Objectives
5.1.4 Materials and Methods
5.1.5 Results
5.1.6 Discussion
5.1.6.1 Which Approach?
5.1.6.2 Scope for Synthesis?
5.1.6.3 Further Research Directions?
5.1.6.4 Status within the Systematics Community?
5.2 Summary and Conclusions
Acknowledgements
References
Abstract
One approach to addressing long-standing concerns associated with the taxonomic
impediment and occasional low reproducibility of taxonomic data is through development of
automated species identification systems. Such systems can, in principle, be combined
with image-based or image- and text-based taxonomic databases to add elements of expert
system functionality. Two generalized approaches are considered relevant in this context:
morphometric systems based on some form of linear discriminant analysis (LDA) and
economic imperatives are rapidly drawing to a close. In order to attract personnel and
resources, morphology-based taxonomy must transform itself into a ‘large, coordinated,
international scientific enterprise’ (Wheeler, 2003, p. 4). Many have recently touted use of
they are used in a wide variety of scientific contexts (e.g., oceanography, biogeog-logical diversity; and
of within-group ordinations within the shape spaces defined by CVA axes. The
digital automated image-analysis system (DAISY; Weeks et al. 1997, 1999a, b). This
implementation accepts training sets in the form of standard format images (e.g., jpeg, tiff)
of authoritatively identified specimens. These image-based training sets were processed
comparison between brightness values between pixel locations. The result allows each
Bookstein 1990; Marcus et al. 1993, 1996; MacLeod and Forey 2002). These variables
Once again, using least-squares superposition to normalize the coordinate data for
generalized size differences (thereby achieving an entirely shape-based discrimination)
and employing CVA to construct a discriminant space, an unprecedented correct
cross-validation identification ratio of 0.99 was obtained (Table 5.1 and Figure 5.8). Of the two
misidentified specimens, a Globigerinelloides conglobatus was mistaken completely for
Globigerinelloides ruber (posterior probability = 1.00) while a Globigerinella
inaequilateralis was ambiguously mistaken for Globorotalia tumida (posterior probability = 0.67).
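The contrast between a "complete" misidentification (posterior probability = 1.00) and an "ambiguous" one (0.67) can be illustrated with a minimal sketch of how a distance-based discriminant analysis typically turns distances to group centroids into posterior probabilities. This is not the authors' code. It assumes multivariate-normal groups with a shared covariance matrix and equal priors, and the function name, group labels and distance values are hypothetical.

```python
import math


def posterior_probabilities(sq_distances, priors=None):
    """Convert squared Mahalanobis distances to group posteriors.

    Assumption (not from the chapter): groups are multivariate normal
    with one pooled covariance matrix, so p(g | x) is proportional to
    prior_g * exp(-D_g^2 / 2).
    """
    groups = list(sq_distances)
    if priors is None:
        # Equal prior probability for every group.
        priors = {g: 1.0 / len(groups) for g in groups}
    # Subtract the smallest distance before exponentiating; this leaves
    # the ratios unchanged and avoids underflow for large distances.
    d_min = min(sq_distances.values())
    weights = {
        g: priors[g] * math.exp(-0.5 * (sq_distances[g] - d_min))
        for g in groups
    }
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


# Hypothetical squared distances from one specimen to three centroids.
# Two nearly equidistant centroids give an ambiguous assignment.
near_tie = posterior_probabilities({"A": 4.0, "B": 5.4, "C": 30.0})
# One centroid far closer than the rest gives a posterior close to 1.
clear_cut = posterior_probabilities({"A": 2.0, "B": 40.0, "C": 55.0})
```

Under these assumptions, a specimen lying almost midway between two group centroids receives a split posterior, whereas one that falls deep inside the wrong group's cloud is assigned to it with near certainty, even though both count equally as errors in the cross-validation ratio.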
Cross-validation results for the DAISY-based ANN analysis differ from those of the
CVA analysis in terms of the manner in which the posterior probabilities are calculated.
Instead of using a distance-based approach for assigning unknowns to groups, DAISY
uses a combined eightfold distance-coordination approach with the minimum
frame. This approach relaxes the morphometric requirement for landmarks to represent
a comparatively small but biologically well-known set of close topological
correspondences between objects in favour of more inclusive information drawn from the
both approaches are limited by complementary deficiencies: morphometric methods are
rich in biological meaning, but deficient in overall geometric information content, while
ANN methods are rich in overall information content, but deficient in biological meaning.
A synthesis between the two is not only possible, but highly desirable.
Until such a synthesis is achieved, however, it makes sense to match the available
strengths of each approach to the diversity of morphological problems at hand.
Morphometrics would appear to be best utilized for the investigation of precise distinctions
... Further Research Directions?
For morphometric and ANN approaches, one of the most important needs is for better
specification of adequate training set attributes. In the technical literature produced on
these methods over the years, scarcely any but the most general statements about the
composition and nature of reliable training sets have been made. To be sure, a large body of
concerns. Systems that can authoritatively achieve consistent, semi-automated and fully
automated identifications of planktonic microfossil species — and, by extension, many
A major factor hold-
to perform to necessary taxonomic identifications to a high degree of accuracy
systematic research guiding the discovery and testing of new characters and refin-
• Reinvigoration of the discipline of morphological systematics. In the face of
challenges such as DNA bar coding and GenBank, morphological systematics must
become more automated and efficient or it will cease to exist outside a few
irreducibly morphology-based refugia (e.g., palaeontology). Because of their generality,
automated species (= image) recognition systems can be used in a wide variety
sis. This ability extends across the spectrum of systematic data (e.g., morphology, ecology, geography, stratigraphic, chemical, molecular, audio, olfactory, DNA
MacLeod, N. (1998). Impacts and marine invertebrate extinctions. In Meteorites: Flux with time and
impact effects, ed. N.M. Grady, R. Hutchinson, G.J.H. McCall, and D.A. Rotherby. Geological
Society of London, London, 217–246.
MacLeod, N. (1999). Generalizing and extending the eigenshape method of shape visualization and
analysis. Paleobiology 25(1): 107–138.
Ripley, B.D. (1996). Pattern recognition and neural networks. Cambridge University Press, Cambridge.
Rohlf, F.J. and Bookstein, F.L. (1990). Proceedings of the Michigan morphometrics workshop. The