HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

NGUYEN THI THANH NHAN

INTERACTIVE AND MULTI-ORGAN BASED PLANT SPECIES IDENTIFICATION

Major: Computer Science
Code: 9480101

SUPERVISORS:
1. Assoc. Prof. Dr. Le Thi Lan
2. Prof. Dr. Hoang Van Sam

Hanoi − 2020
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

Nguyen Thi Thanh Nhan

INTERACTIVE AND MULTI-ORGAN BASED PLANT SPECIES IDENTIFICATION

Major: Computer Science
Code: 9480101

DOCTORAL DISSERTATION OF COMPUTER SCIENCE

SUPERVISORS:
1. Assoc. Prof. Dr. Le Thi Lan
2. Prof. Dr. Hoang Van Sam

Hanoi − 2020
This work was done wholly or mainly while in candidature for a Ph.D. research degree at Hanoi University of Science and Technology.

Where any part of this dissertation has previously been submitted for a degree or any other qualification at Hanoi University of Science and Technology or any other institution, this has been clearly stated.

Where I have consulted the published work of others, this is always clearly attributed.

Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this dissertation is entirely my own work.

I have acknowledged all main sources of help.

Where the dissertation is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself.

Hanoi, May 2020
PhD Student
Nguyen Thi Thanh Nhan

SUPERVISORS
First of all, I would like to thank my supervisors, Assoc. Prof. Dr. Le Thi Lan at the International Research Institute MICA, Hanoi University of Science and Technology, and Assoc. Prof. Dr. Hoang Van Sam at Vietnam National University of Forestry, for their inspiration, guidance, and advice. Their guidance helped me throughout the research and writing of this dissertation.

Besides my advisors, I would like to thank Assoc. Prof. Dr. Vu Hai and Assoc. Prof. Dr. Tran Thi Thanh Hai for their great discussions. Special thanks to my friends and colleagues in MICA, Hanoi University of Science and Technology: Hoang Van Nam, Nguyen Hong Quan, Nguyen Van Toi, Duong Nam Duong, Le Van Tuan, Nguyen Huy Hoang, and Do Thanh Binh for their technical support. They have assisted me a lot in my research process, and they are co-authors of the published papers.

Moreover, I would like to thank the reviewers of scientific conferences and journals, as well as the protection council, for their many useful comments.

I would like to express sincere gratitude to the Management Board of MICA Institute. I would like to thank the Thai Nguyen University of Information and Communication Technology for their support over the years, both in my career and outside of work.

As a Ph.D. student of the 911 program, I would like to thank this program for its financial support. I also gratefully acknowledge the financial support for attending conferences from the Collaborative Research Program for Common Regional Issues (CRC) funded by the ASEAN University Network (AUN-SEED/Net), under grant reference HUST/CRC/1501, and NAFOSTED (grant number 106.06-2018.23).

Special thanks to my family, and to my parents-in-law who took care of my family and created favorable conditions for me to study. I also would like to thank my beloved husband and children for their support and encouragement throughout my long period of study.

Hanoi, May 2020
Ph.D. Student
Nguyen Thi Thanh Nhan
1.1 Plant identification 10
1.1.1 Manual plant identification 10
1.1.2 Plant identification based on semi-automatic graphic tool 11
1.1.3 Automated plant identification 12
1.2 Automatic plant identification from images of single organ 13
1.2.1 Introducing the plant organs 13
1.2.2 General model of image-based plant identification 16
1.2.3 Preprocessing techniques for images of plant 17
1.2.4 Feature extraction 19
1.2.4.1 Hand-designed features 20
1.2.4.2 Deeply-learned features 22
1.2.5 Training methods 25
1.3 Plant identification from images of multiple organs 28
1.3.1 Early fusion techniques for plant identification from images of multiple organs 30
1.3.2 Late fusion techniques for plant identification from images of multiple organs 31
1.4 Plant identification studies in Vietnam 33
1.5 Plant data collection and identification systems 35
1.6 Conclusions 43
2 LEAF-BASED PLANT IDENTIFICATION METHOD BASED ON
2.1 The framework of leaf-based plant identification method 45
2.2 Interactive segmentation 46
2.3 Feature extraction 50
2.3.1 Pixel-level features extraction 50
2.3.2 Patch-level features extraction 51
2.3.2.1 Generate a set of patches from an image with adaptive size 51
2.3.2.2 Compute patch-level feature 52
2.3.3 Image-level features extraction 55
2.3.4 Time complexity analysis 56
2.4 Classification 57
2.5 Experimental results 57
2.5.1 Datasets 57
2.5.1.1 ImageCLEF 2013 dataset 57
2.5.1.2 Flavia dataset 57
2.5.1.3 LifeCLEF 2015 dataset 58
2.5.2 Experimental results 58
2.5.2.1 Results on ImageCLEF 2013 dataset 58
2.5.2.2 Results on Flavia dataset 61
2.5.2.3 Results on LifeCLEF 2015 dataset 61
2.6 Conclusions 68
3 FUSION SCHEMES FOR MULTI-ORGAN BASED PLANT IDENTIFICATION 69
3.1 Introduction 69
3.2 The proposed fusion scheme RHF 71
3.3 The choice of classification model for single organ plant identification 77
3.4 Experimental results 79
3.4.1 Dataset 80
3.4.2 Single organ plant identification results 81
3.4.3 Evaluation of the proposed fusion scheme in multi-organ plant identification 81
3.5 Conclusion 89
4 TOWARDS BUILDING AN AUTOMATIC PLANT RETRIEVAL BASED ON PLANT IDENTIFICATION 90
4.1 Introduction 90
4.2 Challenges of building automatic plant identification systems 90
4.3 The framework for building automatic plant identification system 94
4.4 Plant organ detection 96
4.5 Case study: Development of image-based plant retrieval in VnMed application 101
4 CBF Classification Base Fusion
5 CNN Convolutional Neural Network
6 CNNs Convolutional Neural Networks
8 CMC Cumulative Match Characteristic Curve
15 GPU Graphics Processing Unit
16 GUI Graphic-User Interface
17 HOG Histogram of Oriented Gradients
18 ILSVRC ImageNet Large Scale Visual Recognition Competition
19 KDES Kernel DEScriptors
22 L-SVM Linear Support Vector Machine
23 MCDCNN Multi-Column Deep Convolutional Neural Networks
26 OPENCV Open Source Computer Vision Library
28 PCA Principal Component Analysis
29 PNN Probabilistic Neural Network
30 QDA Quadratic Discriminant Analysis
31 RAM Random Access Memory
32 ReLU Rectified Linear Unit
33 RHF Robust Hybrid Fusion
35 ROI Region Of Interest
36 SIFT Scale-Invariant Feature Transform
38 SURF Speeded Up Robust Features
39 SVM Support Vector Machine
40 SVM-RBF Support Vector Machine with Radial Basis Function kernel
7 R^d The set of d-dimensional real vectors
13 ‖w‖ The L2 norm of vector w
14 x_i The i-th element of vector x
15 sign(x) The sign function: equals 1 if x ≥ 0, −1 if x < 0
17 max The function that takes the largest number from a list
20 I(x, y) The intensity value at (x, y) of an image
23 arctan(x) Returns the angle whose tangent is the given number
24 cos(θ) The cosine of angle θ
25 sin(θ) The sine of angle θ
26 m(z) The magnitude of the gradient vector at pixel z
27 θ(z) The orientation of the gradient vector at pixel z
28 θ̃(z) The normalized gradient vector
30 argmax(x) The element at which the maximum value is attained
32 x^T The transpose of vector x
33 ∏ The product of all values in the range of a series
35 s_i(I_k) The confidence score of the i-th plant species when image I_k of a single organ is used as a query
36 c The predicted species class for the query q
37 C The number of species in the dataset
38 k_m̃ The gradient magnitude kernel
39 k_o The orientation kernel
40 k_p The position kernel
41 m̃(z) The normalized gradient magnitude
LIST OF TABLES
Table 1.1 Example dichotomous key for leaves [14] 11
Table 1.2 Methods of plant identification based on hand-designed features 21
Table 1.3 A summary of available crowdsourcing systems for plant information collection 36
Table 1.4 The highest results of the contest obtained with the same recognition approach using hand-crafted features 41
Table 2.1 Leaf/leafscan dataset of LifeCLEF 2015 58
Table 2.2 Accuracy obtained in six experiments with ImageCLEF 2013 dataset 60
Table 2.3 Precision, Recall and F-measure in improved KDES with interactive segmentation for ImageCLEF 2013 dataset 62
Table 2.4 Comparison of the improved KDES + SVM with the state-of-the-art hand-designed features-based methods on Flavia dataset 63
Table 2.5 Precision, Recall and F-measure of the proposed method for Flavia dataset 64
Table 3.1 An example of test phase results and the retrieved plant list determination using the proposed approach 74
Table 3.2 The collected dataset of 50 species with four organs 81
Table 3.3 Single organ plant identification accuracies with two schemes: (1) A CNN for each organ; (2) A CNN for all organs. The best result for each organ is in bold 82
Table 3.4 Obtained accuracy at rank-1, rank-5 when combining each pair of organs with different fusion schemes in case of using AlexNet. The best result for each pair of organs is in bold 83
Table 3.5 Obtained accuracy at rank-1, rank-5 when combining each pair of organs with different fusion schemes in case of using ResNet. The best result for each pair of organs is in bold 84
Table 3.6 Obtained accuracy at rank-1, rank-5 when combining each pair of organs with different fusion schemes in case of using GoogLeNet. The best result for each pair of organs is in bold 84
Table 3.7 Comparison of the proposed fusion schemes with the state-of-the-art method named MCDCNN [79]. The best result for each pair of organs is in bold
Table 4.1 Plant images dataset using conventional approaches 91
Table 4.2 Plant image datasets built by crowdsourcing data collection tools 92
Table 4.3 Dataset used for evaluating organ detection method 97
Table 4.4 The organ detection performance of the GoogLeNet with different weights initialization 97
Table 4.5 Confusion matrix for plant organ detection obtained (%) 98
Table 4.6 Precision, Recall and F-measure for organ detection with LifeCLEF 2015 dataset 99
Table 4.7 Confusion matrix for detection of 6 organs of 100 Vietnam species on VnDataset2 (%) 103
Table 4.8 Four Vietnamese medicinal species datasets 104
Table 4.9 Results for Vietnamese medicinal plant identification 104
LIST OF FIGURES
Figure 1 Automatic plant identification 2
Figure 2 Examples of these terminologies used in the thesis [12] 3
Figure 3 One observation of a plant [12] 4
Figure 4 (a) Example of large inter-class similarity: leaves of two distinct species are very similar; (b) example of large intra-class variation: leaves of the same species vary significantly due to the growth stage 5
Figure 5 Challenges of plant identification. (a) Viewpoint variation; (b) Occlusion; (c) Clutter; (d) Lighting variation; (e) color variation of same species 5
Figure 6 Confusion matrix for two-class classification 6
Figure 7 A general framework of plant identification 8
Figure 1.1 Botany students identifying plants using manual approach [13] 11
Figure 1.2 (a) Main graphical interface of IDAO; (b), (c), (d) Graphical icons for describing characteristics of leaf, fruit and flower respectively [16] 12
Figure 1.3 Snapshots of Leafsnap (left) and Pl@ntNet (right) applications 13
Figure 1.4 Some types of leaves: a,b) leaves on simple and complex background of the Acer pseudoplatanus L., c) a single leaf of the Cercis siliquastrum L., d) a compound leaf of the Sorbus aucuparia L. 14
Figure 1.5 Illustration of flower inflorescence types (structure of the flower(s) on the plant, how they are connected between them and within the plant) [11] 15
Figure 1.6 The visual diversity of the stem of the Crataegus monogyna Jacq 15
Figure 1.7 Some examples of branch images 16
Figure 1.8 The entire views for Acer pseudoplatanus L. 16
Figure 1.9 Fundamental steps for image-based plant species identification 17
Figure 1.10 Accuracy of plant identification based on leaf images on complex background in the ImageCLEF 2012 [21] 19
Figure 1.11 Feature visualization of convolutional net trained on ImageNet from [61] 23
Figure 1.12 Architecture of a Convolutional Neural Network 23
Figure 1.13 Hyperplane separates data samples into 2 classes 27
Figure 1.14 Two fusion approaches, (a) early fusion, (b) late fusion 29
Figure 1.15 Early fusion method in [77] 30
Figure 1.16 Different types of fusion strategies [78] 31
Figure 1.17 Some snapshot images of Pl@ntNet 37
Figure 1.18 Obtained results on three flower datasets. Identification rate reduces when the number of species increases 42
Figure 1.19 Comparing the performances of datasets consisting of 50 species. Blue bar: The performances on original dataset collected from LifeCLEF; Red bar: Performances with enriched datasets. The species on two datasets are identical 43
Figure 2.1 The complex background leaf image plant identification framework 46
Figure 2.2 The interactive segmentation scheme 47
Figure 2.3 Standardize the direction of leaf. (a): leaf image after segmentation; (b): Convert to binary image; (c): Define leaf boundary using Canny filter; (d): Standardized image direction 49
Figure 2.4 Examples of leafscan and leaf, the first row are raw images, the second row are images after applying corresponding pre-processing techniques 50
Figure 2.5 An example of the uniform patch in the original KDES and the adaptive patch in our method. (a,b) two images of the same leaf with different sizes are divided using uniform patch; (b,c): two images of the same leaf with different sizes are divided using adaptive patch 52
Figure 2.6 An example of patches and cells in an image and how to convert adaptive cells 53
Figure 2.7 Construction of image-level feature concatenating feature vectors of cells in layers of hand pyramid structure 56
Figure 2.8 32 images of 32 species of Flavia dataset 58
Figure 2.9 Interactive segmentation developed for mobile devices. Top left: original image, top right: markers, bottom right: boundary with Watershed, bottom left: segmented leaf 59
Figure 2.10 Some imprecise results of image segmentation 60
Figure 2.11 Detailed accuracies obtained on ImageCLEF 2013 dataset in our experiments. For some classes such as Mespilus germanica, the obtained accuracy in the 4 experiments is 0% 65
Figure 2.12 Some images of class 14, class 27 of Flavia dataset 66
Figure 2.13 Images of class 32 and 16 were misidentified to class 13, 31 of Flavia dataset 66
Figure 2.14 Detailed scores obtained for Leafscan [1], our team's name is Mica 66
Figure 2.15 Detailed scores obtained for all organs [1], our team's name is Mica 67
Figure 3.1 An example of two plant species that are similar in leaf but different in flower (left), and two that are similar in leaf but different in fruit (right) 70
Figure 3.2 The framework for multi-organ plant identification 70
Figure 3.3 Explanation for positive and negative samples 72
Figure 3.4 Illustration of positive and negative samples definition With a pair of images from leaf (a) and flower (c) of the species #326, the corresponding confidence score of all species in the dataset (e.g., 50) when using leaf and flower image are shown in (b) 73
Figure 3.5 In RHF method, each species has an SVM model based on its positive and negative samples 75
Figure 3.6 The process of computing the corresponding positive probabilities for a query using the RHF method 75
Figure 3.7 AlexNet architecture taken from [49] 77
Figure 3.8 ResNet50 architecture taken from [143] 78
Figure 3.9 A schematic view of GoogLeNet architecture [63] 79
Figure 3.10 Single organ plant identification 79
Figure 3.11 Comparison of identification results using leaf, flower, and both leaf and flower images. The first column is query images. The second column shows top 5 species returned by the classifier. The third column is the corresponding confidence score for each species. The name of species in the ground truth is Robinia pseudoacacia L. 85
Figure 3.12 Cumulative Match Characteristic curve obtained by the proposed method with AlexNet (Scheme 1 for single organ identification) 86
Figure 3.13 Cumulative Match Characteristic curve obtained by the proposed method with ResNet (Scheme 1 for single organ identification) 87
Figure 3.14 Cumulative Match Characteristic curve obtained by the proposed method with AlexNet (Scheme 2 for single organ identification) 87
Figure 3.15 Cumulative Match Characteristic curve obtained by the proposed method with ResNet (Scheme 2 for single organ identification) 88
Figure 4.1 Some challenges in plant and non-plant classification 93
Figure 4.2 Illustration of difficult cases for plant organ detection 93
Figure 4.3 The proposed framework for building automatic plant identification 94
Figure 4.4 Some images of data collection for two species: (a) Camellia sinensis, (b) Terminalia catappa. First row shows images collected by manual image acquisitions, second row shows images collected by crowdsourcing 95
Figure 4.5 Some examples for wrong identification 98
Figure 4.6 Visualization of the prediction of GoogLeNet used for plant organ detection. Red pixels are evidence for a class, and blue ones against it 99
Figure 4.7 Detection results of the GoogLeNet with different classification methods at the first rank (k=1) 100
Figure 4.8 Results obtained by the proposed GoogLeNet and the method in [7] for six organs 101
Figure 4.9 Architecture of Vietnamese medicinal plant search system [127] 102
Figure 4.10 Snapshots of VnMed; a) list of species for a group of diseases; b) detailed information for one species; c) a query image for plant identification; d) top five returned results 102
Figure 4.11 Data distribution of 596 Vietnamese medicinal plants 105
Figure 4.12 Illustration of image-based plant retrieval in VnMed 106
Motivation
Plants play an important part in the ecosystem. They provide oxygen, food, fuel, medicine, and wood, and help to reduce air pollution and prevent soil erosion. Good knowledge of flora allows improving agricultural productivity, protecting biodiversity, balancing the ecosystem, and minimizing the effects of climate change. The purpose of plant identification is matching a given specimen plant to a known taxon. This is considered an important step to assess flora knowledge. The traditional identification of plants is usually done by botanists with specific botanical terms. However, this process is complex, time-consuming, and even impossible for people in general who are interested in acquiring knowledge of species. Nowadays, the availability of relevant technologies (e.g., digital cameras and mobile devices), image datasets, and advanced techniques in image processing and pattern recognition makes the idea of automated species identification come true. Automatic plant identification can be defined as the process of determining the name of species based on their observed images (see Figure 1).
As each species has certain organs and each organ has its own distinguishing power, current automatic plant identification follows two main approaches: using only images of one sole organ type, or combining images of different organs.
In recent years, we have witnessed a significant performance improvement of automatic plant identification in terms of both accuracy and the number of species classes [1–4]. According to [4, 5], automatic plant identification results are lower than those of the best experts but approximate those of experienced experts and far exceed those of beginners or amateurs in plant taxonomy. Based on these impressive results, some applications have been deployed and widely used, such as Pl@ntNet [6], Leafsnap [7], and MOSIR [8]. However, the use of plant identification in reality still has to overcome some limitations. First, the number of covered plant species (e.g., 10,000 in LifeCLEF [3]) is relatively small in comparison with the number of plant species on the earth (e.g., 400,000 [9]). Second, the accuracy of automatic plant identification still needs to be improved. In our experiments (section 1.5), we have shown that when the number of species increases, the rate of identification decreases dramatically due to the inter-class similarity.
Figure 1 Automatic plant identification.
Objective
The main aim of this thesis is to overcome the second limitation of automatic plant identification (low recognition accuracy) by proposing novel and robust methods for plant recognition. For this, we first focus on improving the recognition accuracy of plant identification based on images of one sole organ. Among the different organs of the plant, we select the leaf, as this organ is the most widely used in the literature [10]. However, according to [10], most analyzed images in previous studies were taken under simplified conditions (e.g., one mature leaf per image on a plain background). Towards real-life application, plant identification methods should be experimented with more realistic images (e.g., having a complex background and taken under different lighting conditions). Second, we take into consideration that using one sole organ for plant identification is not always sufficient, because one organ cannot fully reflect all information of a plant due to the large inter-class similarity and the large intra-class variation. Therefore, multi-organ plant identification is also studied in this thesis. Multi-organ plant identification will be formulated as a late fusion problem: the multi-organ results will be determined based on those obtained from single organs. Therefore, the thesis will focus on fusion schemes.
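To make the late-fusion formulation concrete, the sketch below combines per-organ confidence score vectors with a simple product rule. This is only an illustration under assumed inputs (the function name and the toy scores are hypothetical); the fusion scheme actually proposed in this thesis, RHF, additionally integrates a classification-based step.

```python
def product_rule_fusion(scores_per_organ):
    """Fuse per-organ confidence scores with the product rule.

    scores_per_organ: list of score lists, one per organ image; each list
    holds the classifier's confidence for every species, in the same order.
    Returns a normalized fused score vector over species.
    """
    n_species = len(scores_per_organ[0])
    fused = [1.0] * n_species
    for scores in scores_per_organ:
        for i, s in enumerate(scores):
            fused[i] *= s          # species must score well on every organ
    total = sum(fused)
    return [f / total for f in fused] if total > 0 else fused

# Toy example with 3 species: leaf scores and flower scores for one query.
leaf = [0.6, 0.3, 0.1]
flower = [0.2, 0.7, 0.1]
fused = product_rule_fusion([leaf, flower])
predicted = max(range(len(fused)), key=fused.__getitem__)
```

Note how the product rule rewards species that are plausible for both organs: species 1 is only second-ranked by the leaf, but its strong flower score makes it the fused prediction.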
Finally, the last objective of the thesis is to build an application for Vietnamese medicinal plant retrieval based on plant identification. With this application, the knowledge that previously belonged only to botanists can now become popular in the community.
To this end, the concrete objectives are:
Develop a new method for leaf-based plant identification that is able to recognize the plants of interest even in complex background images;
Propose a fusion scheme in multiple-organ plant identification;
Develop an image-based plant search module in a Vietnamese medicinal plant retrieval application.
Context, constraints, and challenges
Our work is based on an assumption that the query images are available. In real applications, we require users to provide images of the to-be-identified plant by directly capturing images in the field or selecting images from existing albums. Throughout this thesis, we use the following terminologies that are defined in the plant identification task of ImageCLEF [11]. Examples of these terminologies are illustrated in Figure 2.
Figure 2 Examples of these terminologies used in the thesis [12]
Image of plant is an image captured from a plant. This image contains one type of organs. In this work, we focus on six main organs including leaf, flower, fruit, branch, stem and entire.
SheetAsBackground leaf images are pictures of leaves in front of a white or colored uniform background produced by a scanner or a camera with a sheet; these images are also named leafscan. The leafscan image can be divided into Scan (scan of a single leaf) and Scan-like (photograph of a single leaf in front of a uniform artificial background).
NaturalBackground images are the directly-captured photographs from the plant including one among 6 types of organ. It is worth noting that NaturalBackground images may contain more than one type of organs.
Observation of a plant is a set of images captured from a single plant by the same person on the same day using the same device and lighting conditions. Figure 3 shows an observation of a plant which contains five images.
Figure 3 One observation of a plant [12]
Automatic plant identification has to face different challenges. The first challenge is the large inter-class similarity and the large intra-class variation. Figure 4(a) illustrates the case of the large inter-class similarity (leaves of two distinct species are very similar), while Figure 4(b) shows an example of the large intra-class variation (leaves of the same species vary significantly due to the growth stage). The second challenge is that the background of the plant images is usually complex, especially for NaturalBackground images. Data imbalance is the third challenge of automatic plant identification, as the distribution of plant species on the planet is diverse. The fourth challenge is the high number of species. To the best of our knowledge, the biggest image dataset of LifeCLEF 2017 contains more than 1.8M images of 10,000 plant species [3]. Finally, plant images are usually captured by different users with different acquisition protocols. Therefore, they have lighting and viewpoint variations and may contain occlusions, clutter, and object deformations. These issues are illustrated in Figure 5 with several species.
Figure 4 (a) Example of large inter-class similarity: leaves of two distinct species are very similar; (b) example of large intra-class variation: leaves of the same species vary significantly due to the growth stage.
Figure 5 Challenges of plant identification (a) Viewpoint variation; (b) Occlusion; (c) Clutter; (d) Lighting variation; (e) color variation of same species
Evaluation metrics
In plant identification, for each query containing one or multiple images of one or several organs, a list of species sorted according to the confidence score of the method/system in the suggested species is provided. In order to evaluate the proposed methods, in this thesis we employ five main metrics.
Precision. To analyze the behavior of classification problems, a confusion matrix is usually provided. Figure 6 illustrates a confusion matrix for a two-class classification problem, where TP (True Positive) and TN (True Negative) represent the correct decisions while FP (False Positive) and FN (False Negative) represent the errors. For a multi-class classification problem, we also build a confusion matrix for each class if we consider that class to be positive and combine the remaining classes into the negative class. From the confusion matrix, we can compute three evaluation metrics, namely Precision, Recall and F-measure, as follows:
Figure 6 Confusion matrix for two-class classification
Precision = TP / (TP + FP)   (2)

Recall = TP / (TP + FN)   (3)

F-measure = 2 × Precision × Recall / (Precision + Recall)   (4)

Score at image level

The fifth metric, denoted S, is the score at image level. This metric is defined and employed as an evaluation metric for plant identification in the LifeCLEF 2015 competition [1]. The value of S is defined as:

S = (1/U) Σ_{u=1..U} (1/P_u) Σ_{p=1..P_u} (1/N_{u,p}) Σ_{n=1..N_{u,p}} s_{u,p,n},   with s_{u,p,n} = 1 / r_{u,p,n}

where U is the number of users, P_u is the number of plant observations of the u-th user, N_{u,p} is the number of pictures of the p-th plant observation of the u-th user, and r_{u,p,n} is the rank of the correct species within the ranked list of images returned by the identification method. This metric allows compensating the long-tail distribution effects occurring in social data (most users provide much less data, only a few people provide huge quantities of data). The value of S ranges from 0 to 1. The greater the value of S, the better the identification method.
Contributions
The dissertation has three main contributions as follows:
Contribution 1: A complex background leaf-based plant identification method has been proposed. The proposed method takes advantage of segmentation with a few interactions from the user to determine the leaf region. The features are then extracted on this region using the representative power of the Kernel Descriptor (KDES). The experimental results obtained on different benchmark datasets have shown that the proposed method outperforms state-of-the-art hand-crafted feature-based methods.
Contribution 2: A fusion scheme for two-organ based plant identification has been introduced. The fusion is an integration between a product rule and a classification-based approach.
Contribution 3: Finally, an image-based plant searching module has been developed and deployed in a Vietnamese medicinal plant retrieval application named VnMed.

General framework and dissertation outline
In this dissertation, we propose a unified framework for plant identification. The proposed framework consists of three main phases, as illustrated in Figure 7. The first phase is to build a dataset based on crowdsourcing, then evaluate the data based on the proposed organ detection to remove non-plant images and classify images by organs. The second phase is plant identification at image level. The third phase is organ combination. By utilizing this framework, we deploy a real application for Vietnamese medicinal plants. The research works in the dissertation are organized into six chapters as follows:
Introduction: This section describes the main motivations and objectives of the study. We also present critical points of the research's context, constraints and challenges that we meet and address in the dissertation. Additionally, the general framework and main contributions of the dissertation are also presented.

Chapter 1: Literature Review: This chapter mainly surveys existing works and approaches proposed for automatic plant identification.
Chapter 2: In this chapter, a method for plant identification based on leaf images is proposed. In the proposed method, to extract the leaf region from images, we propose to apply interactive segmentation. Then, the improved KDES is employed to extract leaf features.
Figure 7 A general framework of plant identification
Chapter 3: This chapter focuses on multi-organ plant identification. We propose a method named RHF (Robust Hybrid Fusion) for determining the result of two-organ identification based on those of single-organ ones.
Chapter 4: In this chapter, we propose a method for organ detection and an application for a Vietnamese medicinal plant retrieval system based on this method.
Conclusion: We give some conclusions and discuss the limitations of the proposed methods. Research directions are also described for future works.
CHAPTER 1
LITERATURE REVIEW
This chapter aims at presenting the existing works proposed for plant identification in the literature. First, we introduce three main approaches, namely manual, semi-automatic and automatic plant identification, in section 1.1. As the thesis follows the automatic plant identification approach, we analyze in detail the related works proposed for this category. According to the images of organs used for plant identification, the works can be divided into single-organ and multiple-organ plant identification approaches. Two sections (section 1.2 and 1.3) in this chapter present different aspects of these approaches. Finally, the current plant collection and identification systems, together with the open issues in botanical data collection and identification, are discussed in section 1.5.
1.1 Plant identification
Plant identification is a process of matching a specimen plant to a known taxon. The name of a plant (common name or scientific name) is a key to access other information of a plant such as images, descriptions, etc. Nowadays, three main approaches are used for plant identification: manual recognition, semi-automatic identification and automatic plant identification based on images of plants.
1.1.1 Manual plant identification
In the manual plant identification approach, botanists observe a plant and use different plant characteristics to identify the species based on a dichotomous key with technical terms. These keys are used to answer a series of questions about one or more attributes of an unknown plant to narrow down the set of candidate species based on the knowledge of botanical classification. A series of answered questions eventually leads to the identification of the desired species. Figure 1.1 shows a group of botany students performing plant identification using the manual approach, while Table 1.1 describes one example of this approach. The manual approach is the most widely used approach in the botany community. However, it is very time consuming. Moreover, the use of a dichotomous key cannot tolerate any error, and the key imposes the choice as well as the order of questions. It is also not suitable for the general public, as selecting and looking up specific botanical terms is a complicated and tedious procedure.
Figure 1.1 Botany students identifying plants using the manual approach [13].
Table 1.1 Example dichotomous key for leaves [14]
2 a. Needles are clustered ......................... Pine
  b. Needles are in singlets ....................... Spruce
3 a. Simple leaves (single leaf) ................... go to 4
  b. Compound leaves (made of "leaflets") .......... go to 7
6 a. Leaf edge is small and tooth-like ............. Elm
  b. Leaf edge is large and thorny ................. Holly
7 a. Leaflets attached at one single point ......... Chestnut
  b. Leaflets attached at multiple points .......... Walnut
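The narrowing process a dichotomous key performs can be sketched as a walk down a binary decision tree: each internal node asks a yes/no question and each leaf names a taxon. The encoding below paraphrases couplets 3, 6 and 7 of Table 1.1 (collapsing the intermediate couplets) and is purely illustrative, not a real botanical key format.

```python
# A dichotomous key modeled as a binary decision tree. Internal nodes are
# (question, if_yes, if_no) triples; leaves are taxon names (strings).

def identify(answers, node):
    """Follow the key until a taxon name (a string) is reached."""
    if isinstance(node, str):
        return node                      # reached a candidate species
    question, if_yes, if_no = node
    return identify(answers, if_yes if answers[question] else if_no)

# Couplets 3, 6 and 7 of Table 1.1, encoded as a tree (intermediate
# couplets collapsed for brevity).
key = ("simple leaf (single leaf)",
       ("leaf edge small and tooth-like", "Elm", "Holly"),
       ("leaflets attached at one single point", "Chestnut", "Walnut"))

print(identify({"simple leaf (single leaf)": True,
                "leaf edge small and tooth-like": True}, key))   # Elm
```

Each answered question halves the candidate set, which is exactly why the order and correctness of answers matter so much in the manual approach.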
1.1.2 Plant identification based on a semi-automatic graphic tool
In the above approach, the scientific terms used are very difficult to remember. As the expected end users of plant identification systems are non-botanists, the system should be intuitive and easy to use. In this approach, terms are represented by icons, making the system easier for many people to understand and use. IDAO [15] is one of the systems belonging to this approach. In IDAO, a Graphical User Interface (GUI) supports end users in describing biological features through iconic symbols. In this way, the visual description of each organ is given through pictures or icons on a graphic interface,
as shown in Figure 1.2. The end user can select organ features by choosing the corresponding icons. Then the system returns the list of predicted plants according to the description selected by the user. This approach is intuitive and
Figure 1.2 (a) Main graphical interface of IDAO; (b), (c), (d) graphical icons for describing characteristics of leaf, fruit and flower, respectively [16].
language independent. However, the disadvantage of this approach is that it requires time to select the appropriate icons. In addition, the number of icons is still limited; therefore, for some plants, the matching icons cannot be found.
1.1.3 Automated plant identification
Nowadays, the strong development of technology, such as mobile devices, digital cameras, powerful computers and networks with large bandwidth, together with advanced research in the computer vision community, makes automated plant identification from plant images feasible. Recently, a number of works have been dedicated to image-based plant identification. The main principle of automatic plant identification based on plant images is to extract the visual characteristics of the plants from image content and predict the name of the plant using image processing, computer vision and machine learning techniques. The input of the methods in this approach can be an image or a set of images of some plant organs. According to the study of Bonnet et al. [4, 5], automatic plant identification results are lower than those of the best experts but approximate those of experienced experts and far exceed those of beginners or amateurs in plant taxonomy. Based on these impressive results, some applications have been deployed and are widely used on mobile devices, such as Pl@ntNet [6], Leafsnap [7], MOSIR [8], etc. Figure 1.3 shows some screens of the Leafsnap and Pl@ntNet applications. With these tools, users simply upload images of a plant to the system, and a list of predicted plants is returned. Automatic plant identification is simple, fast and intuitive for users. However, it remains a challenging research topic, and the accuracy of this approach is still far from user expectations.
A plant may have different organs, and each organ has a different role in determining the plant name. The state-of-the-art methods can be divided into two categories: those based on image(s) of a single organ and those based on multiple organs. The following sections analyze in detail the methods that have been proposed for single-organ and multiple-organ based plant identification.
Figure 1.3 Snapshots of Leafsnap (left) and Pl@ntNet (right) applications.
1.2 Automatic plant identification from images of single organ
This section is dedicated to plant identification from images of a single organ.
1.2.1 Introducing the plant organs
Among the different organs of plants, the most widely used for automatic plant identification, ranked in decreasing order, are leaf, flower, fruit, branch, stem and entire plant.
Leaf: among the different organs of a plant, the leaf is the most widely used because leaves usually exist throughout the year, are numerous, easily collected and usually flat. The review study [10], referring to 120 papers on plant identification, found that 107 studies focus on the leaf organ, and only 15 studies identify plants based on other organs such as flower, fruit or entire plant. Among these studies, the majority use leaf scans (leaf images taken on a simple/uniform background); only 12 studies work with leaf images on a complex background. As a result, a large number of leaf images on simple backgrounds have been collected and made available, such as Flavia, ICL, Swedish, etc. (refer to Table 4.1). Experimental results have shown that using leaf scans often gives the best results in comparison with other organs [17]. There are two main kinds of leaf: the single leaf and the compound leaf, which is composed of a number of leaflets. Figure 1.4 illustrates some examples of single and compound leaves. However, using the leaf for plant identification has certain challenges, as the leaves of the same plant may vary a lot due to weather conditions and stages of development.
Figure 1.4 Some types of leaves: a, b) leaves on simple and complex backgrounds of Acer pseudoplatanus L., c) a single leaf of Cercis siliquastrum L., d) a compound leaf of Sorbus aucuparia L.
Flower: the next most popular organ is the flower because of its highly distinguishing appearance. Moreover, the appearance of a flower is usually stable and varies little with weather conditions, plant development stage, or other factors. There are two kinds of flowers: single flowers and inflorescences. Figure 1.5 gives some types of inflorescence. From the botanical experts' point of view, flower images are a valuable source for the plant identification task [11, 18]. Experimental results also show that recognition results on flowers are often better than those on other organs when working with complex background images [1, 11, 17]. From flower images, different characteristics can be extracted to identify the plants, such as the color, symmetry, number of petals and size of the flower. However, there are some challenges when working with the flower organ. First, flowers usually exist for only a short time, such as a few days or a few weeks in a year, and some plants do not blossom for several years. Second, the color and 3D shape of the flowers of the same plant may vary significantly.
Besides leaf and flower, other organs, namely fruit, stem, branch and entire plant, are also used for plant identification. However, the number of studies on fruit, stem, branch and entire images is still limited [1, 6, 11, 17, 19] because of the challenging issues encountered while working with these organs. Concerning the stem organ, images of the
Figure 1.5 Illustration of flower inflorescence types (structure of the flower(s) on the plant, how they are connected between them and within the plant) [11].
plant stem contain mainly texture characteristics. The different ages of trees also produce very different images of the stem. Figure 1.6 shows some examples of plant stems at different development stages. It is a real challenge to identify plants based on the stem organ alone. For the branch and entire organs, branch images usually contain many other organs, such as leaf and flower, while the entire image is captured from a far viewpoint. Figure 1.7 and Figure 1.8 show some examples of branch and entire images.
In conclusion, leaf and flower are the two most widely used organs in automatic plant identification from images. Besides flower and leaf, the stem, fruit, branch and other organs can also be used for plant identification. However, due to the weak discriminative power of these organs, current methods usually combine them with leaf and flower in order to increase the plant identification performance.
Figure 1.6 The visual diversity of the stem of the Crataegus monogyna Jacq
Figure 1.7 Some examples of branch images.
Figure 1.8 Entire views of Acer pseudoplatanus L.
1.2.2 General model of image-based plant identification
Numerous methods have been proposed for image-based plant identification, and they all share the common model illustrated in Figure 1.9.
The three main steps in image-based plant species identification are preprocessing, feature extraction, and classification/prediction.
The input of image-based plant species identification is an image of plant organs. The aim of image preprocessing is to enhance the image quality so that relevant features are highlighted for the next step. It is an important step in the plant recognition system because it increases the probability of getting the desired output. The tasks in this step usually include enhancing image quality, image normalization and image segmentation.
Figure 1.9 Fundamental steps for image-based plant species identification.
Feature extraction is the process of transforming the input data into a set of features. Features often capture highly distinguishing characteristics of the input image. Extracting features can reduce the size of the information represented in the image while remaining highly distinctive. Feature extraction can be considered the most important step in the identification system, as appropriately selected features secure the identification accuracy. Features used for plant species representation may include both hand-designed and deeply-learned features.
Training is the process of analyzing the extracted features in order to classify samples into groups, and often relies on machine learning techniques. It is also worth noting that in some deep-learning based methods, the feature extraction and training steps are coupled together in a single network.
In the following sections, we analyze in detail the three main steps: preprocessing, feature extraction and training.
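Before going into each step, the three-step model of Figure 1.9 can be sketched end to end. Every component below is a deliberately tiny placeholder (a toy two-dimensional descriptor and a nearest-centroid classifier); real systems substitute segmentation, shape/texture or CNN features, and stronger classifiers such as SVMs at the corresponding stage.

```python
import numpy as np

# Toy end-to-end pipeline: preprocessing -> feature extraction -> classification.

def preprocess(img):
    return img.astype(np.float64) / 255.0        # scale intensities to [0, 1]

def extract_features(img):
    return np.array([img.mean(), img.std()])     # toy 2-D global descriptor

class NearestCentroid:
    """Assign a sample to the class whose feature centroid is closest."""
    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.centroids_ = {c: np.mean([x for x, t in zip(X, y) if t == c], axis=0)
                           for c in self.classes_}
        return self

    def predict(self, x):
        return min(self.classes_,
                   key=lambda c: np.linalg.norm(x - self.centroids_[c]))

rng = np.random.default_rng(0)
dark = [rng.uniform(0, 100, (8, 8)) for _ in range(5)]      # stand-in "species A"
bright = [rng.uniform(155, 255, (8, 8)) for _ in range(5)]  # stand-in "species B"
X = [extract_features(preprocess(im)) for im in dark + bright]
y = ["A"] * 5 + ["B"] * 5

clf = NearestCentroid().fit(X, y)
query = extract_features(preprocess(rng.uniform(0, 100, (8, 8))))
```

The point of the sketch is the interface between stages: preprocessing standardizes the input, the descriptor compresses it to a fixed-length vector, and the classifier operates only on that vector.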
1.2.3 Preprocessing techniques for images of plant
While working with plant images, the images may contain other objects or background besides the plant organs of interest. Before extracting plant characteristics, a preprocessing technique is usually applied to plant images. The most widely used is image segmentation, whose aim is to separate the region of interest (ROI) in the image from the background [20–22]. Plant image segmentation is a challenging topic, especially when working with complex background images. Based on the requirement of user interaction, image segmentation is divided into automatic segmentation (which does not require user manipulation) and semi-automatic or interactive segmentation.
Automatic plant image segmentation: the method in [23] used an automatic marker-controlled watershed segmentation method to segment the leaf from a complex background. With this method, the markers are automatically selected using Otsu thresholding and an erosion operation. This method is suitable when the leaf to be separated is in the center of the image and occupies a large area of the working image. An automatic segmentation method for leaf images based on the combination of spectral and spatial techniques is also proposed in [22]. However, this method is not effective when leaves have more than one dominant color. Other automated segmentation methods for complex leaves with leaflets, based on the Otsu algorithm and k-means, are described in [11, 21]. Besides the leaf, segmentation methods are also used for the flower. In [24], flower regions are automatically determined based on color clustering and domain knowledge.
The advantage of automatic segmentation methods is that they do not require the participation of the user. These techniques usually obtain good results when the objects to be separated are in the center of the image or occupy a significant area. When working with complex background images, these methods cannot guarantee a good segmentation. It is also worth noting that recent deep learning methods such as Mask R-CNN can produce good results for image segmentation; however, these techniques require specific computation resources.
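To make the Otsu criterion used by these automatic methods concrete, the sketch below selects the gray level that maximizes the between-class variance of the two classes it induces. This covers only the threshold-selection step, not the full marker-controlled watershed pipeline of [23].

```python
import numpy as np

# Otsu's method: pick the threshold t maximizing the between-class variance
# sigma_b^2(t) = [mu_T * omega(t) - mu(t)]^2 / [omega(t) * (1 - omega(t))].

def otsu_threshold(img):
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # class-0 probability up to level t
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean up to level t
    mu_t = mu[-1]                              # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b[~np.isfinite(sigma_b)] = 0.0       # levels where a class is empty
    return int(np.argmax(sigma_b))

# Synthetic "leaf on uniform background": dark object on bright paper.
img = np.full((64, 64), 220, dtype=np.uint8)
img[16:48, 16:48] = 60
t = otsu_threshold(img)
mask = img <= t                                # foreground = darker class
```

On real complex-background images this global threshold is only a starting point; that is why [23] refines it with erosion and watershed rather than using the raw threshold mask directly.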
Semi-automatic plant image segmentation: in this approach, the user is required to provide some cues that guide the segmentation process. A number of semi-automatic segmentation methods have been presented for plant images, such as GrabCut [21, 26, 27], interactive mean shift, guided active contour and watershed [21]. In [28, 29], the user draws a region inside the leaf; a model of the leaf color is then estimated and used to compute the distance of each pixel to this model. Next, the segmentation method relies on polygonal model evaluation and active contour segmentation. A similar framework for compound leaves is proposed in [30], where the authors estimate the number and shape of the leaflets; the segmentation technique uses multiple region-based active contours. In [31], the authors developed a semi-automated approach based on the GrabCut method. This study also compares identification on three datasets: non-preprocessed, cropped and segmented. The result achieved on the segmented dataset is the highest, outperforming the two remaining datasets (non-preprocessed and cropped), which demonstrates the importance of segmentation. The work in [32] performs interactive flower segmentation based on color models and graph-cut techniques. A marker-controlled watershed segmentation method is also applied in [33] in order to determine the leaf region. The research in [22] also used the marker-based watershed transform; this method achieved the highest results among the teams using an interactive segmentation method in the ImageCLEF 2012 plant identification competition. Figure 1.10 shows the comparison of the different methods proposed for complex background plant images in ImageCLEF 2012 [21]. The accuracies of the teams that used automatic segmentation are shown in red, while those applying semi-automatic segmentation are in blue. The experimental results show that interactive segmentation often achieves better results than automatic segmentation.
Figure 1.10 Accuracy of plant identification based on leaf images with complex backgrounds in ImageCLEF 2012 [21].
In addition, segmentation methods are used to extract an ROI, after which the minimum bounding box of the object of interest is computed [34, 35]. Other leaf preprocessing operations are geometric transformation and petiole removal. The purpose of the geometric transformation is to standardize the leaf orientation so that the petiole points down and the top of the leaf points up. The petiole is removed to increase the performance of the leaf recognition system [36]. In [28, 29], images are rotated and cropped so that they contain only one leaf of interest, with its apex pointing to the top of the image. Thanks to the effectiveness of interactive segmentation, this technique is also applied in this thesis for complex background leaf-based plant identification.
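The interactive scheme described above for [28, 29], where the user marks a region inside the leaf and pixels are scored against a color model fitted to it, can be sketched as follows. The function names, the Gaussian color model, and the distance threshold are illustrative assumptions; the cited papers additionally refine the result with polygon models and active contours.

```python
import numpy as np

# Interactive color-model segmentation sketch: fit a Gaussian to user-marked
# pixels, then keep every pixel whose squared Mahalanobis distance is small.

def fit_color_model(pixels):
    mean = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False) + 1e-6 * np.eye(3)   # regularized
    return mean, np.linalg.inv(cov)

def mahalanobis_mask(img, mean, inv_cov, thresh=9.0):
    diff = img.reshape(-1, 3) - mean
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)      # squared distances
    return (d2 < thresh).reshape(img.shape[:2])

rng = np.random.default_rng(1)
img = rng.normal(200, 5, (40, 40, 3))                 # bright background
img[10:30, 10:30] = rng.normal(80, 5, (20, 20, 3))    # darker "leaf" patch

scribble = img[15:25, 15:25].reshape(-1, 3)           # user-marked interior region
mean, inv_cov = fit_color_model(scribble)
mask = mahalanobis_mask(img, mean, inv_cov)
```

The single user scribble is what makes the method semi-automatic: it anchors the color model to the right object, which is exactly the cue automatic methods lack on complex backgrounds.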
1.2.4 Feature extraction
The feature extraction methods can be categorized into two groups: hand-designed features and deeply-learned features. The former group requires expert knowledge for appropriate feature selection for the problem to be solved, while the latter tries to learn the features from plant images.
1.2.4.1 Hand-designed features
Hand-designed features are widely used in plant identification. They include color, texture, shape and organ-specific features. For example, a leaf has organ-specific features such as vein structure, margin and teeth. Table 1.2 summarizes the hand-designed features used for plant identification. This table shows that the shape of the leaf plays the most important role, while shape and color are also important features for the flower. For the stem, texture features are often extracted [35]. Table 1.2 also shows that studies often combine two or more feature types for each organ because no single feature is strong enough to separate all categories [37, 38].
In [32], the authors extracted different types of features, such as HSV values, MR8 filter responses, Scale-Invariant Feature Transform (SIFT) and Histogram of Oriented Gradients (HOG), on a dataset consisting of 17 flower categories. They then used an SVM classifier combining different linearly weighted kernels. They evaluated and selected the optimal features and applied them to a dataset of 102 species, achieving a good recognition rate. In [39], the authors also utilized color and shape features extracted from flower images; to discriminate species, they used Principal Component Analysis (PCA) on different types of flowers. In [40], the authors extracted HOG features and then employed an SVM for classification. In [24], the authors proposed a flower image retrieval tool based on ROIs. They use the color histogram of a flower region and two shape-based features, the Centroid-Contour Distance and the Angle Code Histogram, to characterize the shape of a flower; for evaluation, they used a dataset of 14 plant species. Aakif et al. [41] focused on extracting three different feature types: morphological features, Fourier descriptors and shape-defining features. The morphological features are aspect ratio, eccentricity, roundness and convex hull. An artificial neural network (ANN) is then used to classify the feature vectors; the results achieved 96% accuracy on both the Flavia and ICL datasets. The authors in [42] also proposed to extract features from the contours of a leaf, including geometrical features and moment features, for plant identification. Du et al. [43] extracted leaf shape information with a Pyramid of Histograms of Orientation Gradients (PHOG): first, Canny edges are extracted; then the orientation gradients are calculated on the grids of the image at all pyramid levels; finally, at each level of the pyramid, the gradients of each grid are concatenated. The authors in [44] proposed a novel contour-based shape descriptor to extract the shape geometry. This descriptor is invariant to translation, rotation, scaling and symmetry, and the method is easy to implement and fast. In [45], the authors employed leaf vein and shape for plant leaf identification; in this study, 21 leaf features were extracted, and the proposed system obtains an accuracy of 97.19% on a leaf image dataset of 32 plant species. In [38], Mounie et al. combined leaf salient points and leaf contour
Table 1.2 Methods of plant identification based on hand-designed features.

| Ref., year | Organ | Preprocessing | Features | Classifier |
|---|---|---|---|---|
| [23], 2008 | Leaf | Marker-controlled watershed | Zernike moments | Hypersphere classifier |
| [28], 2012 | Leaf | Model-based leaf segmentation | Shape | Random Forest |
| [44], 2012 | Leaf | - | Contour-based shape descriptor | Nearest neighbor |
| [26], 2012 | Leaf | Interactive | Complex network | Random Forest |
| [22], 2014 | Leaf | Interactive | Shape, texture | Support Vector Machine |
| [50], 2014 | Leaf | Kernelized fuzzy C-means | Geometric, tooth | - |
| [41], 2015 | Leaf | Threshold | Shape-defining features, Fourier descriptor | Neural network |
| [51], 2015 | Leaf | Threshold | Shape, morphology, Histograms of Curvature over Scale | Support Vector Machine |
| [52], 2015 | Leaf | - | Hand-crafted shape, statistical features | Random Forest |
| [53], 2015 | Leaf | - | Multiscale-arch-height descriptor | Nearest neighbor |
| [54], 2015 | Leaf | Mixture-of-Gaussians | Inner-distance shape context | Nearest neighbor |
| [55], 2016 | Leaf | - | Relative sub-image sparse coefficient, gray-level co-occurrence matrix | Nearest neighbor |
| [56], 2016 | Leaf | Region of interest | Haralick texture, Gabor features, shape, color histograms, co-occurrence matrices | Fuzzy Relevance Vector Machine |
| [36], 2016 | Leaf | - | Entropy sequence, Zernike moments, Hu's invariants, aspect ratio, rectangularity, form factor, circularity | Support Vector Machine |
| [31], 2017 | Leaf | GrabCut | Convolutional neural network | - |
| [57], 2006 | Flower | Graph cut | Color, shape, texture | Nearest neighbor |
| [58], 2008 | Flower | Graph cut | Colour, histogram of gradients | Support Vector Machine |
| [18], 2011 | Flower | Interactive | Color and shape features of the whole flower region / pistil and stamen area | Weighted Euclidean distance |
| [59], 2010 | Fruit | - | Colour, texture | Minimum distance classifier |
| [19], 2012 | Fruit | Split-and-merge | Colour histogram | Support Vector Machine |
| [60], 2006 | Stem | - | Colour, texture | Radial basis probabilistic network, Support Vector Machine |
Trang 40descriptions by applying a late fusion method Kernel descriptor presented by Bo et al.
[46] has been proved to be robust for many object recognition applications Re-cently,
Le et al [47] have proposed to use kernel descriptor for leaf identification, then theSVM classifier is applied Experiments were conducted on 2 datasets: 32-speciesFlavia dataset and a subset of ImageCLEF 2013 consisting of simple background leafimages of 126 species The identification accuracy achieved with Flavia and Image-CLEF 2013 datasets is 97.5% and 58.0%, respectively This descriptor outperformsthe state of the art descriptors such as SURF For Flavia dataset, this methodachieved very good results However, the results of ImageCLEF 2013 are still limiteddue to the high number of species and the complicated background The method of Le
et al is limited to simple background images and the kernel descriptor is not invariant
to scale and rotation This thesis will overcome the limitations of this method andapply on complex background leaf images
In conclusion, hand-designed features obtain promising results on different datasets. However, the accuracy of hand-designed features decreases significantly when working with a large number of species or complex background images.
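A few of the morphological descriptors recurring in the works above (aspect ratio, rectangularity, form factor) can be sketched directly from a binary leaf mask. The perimeter here is a crude boundary-pixel count rather than the contour tracing used in published systems, so the exact values are illustrative.

```python
import numpy as np

# Hand-designed morphological shape features computed from a binary mask.

def shape_features(mask):
    ys, xs = np.nonzero(mask)
    h, w = ys.ptp() + 1, xs.ptp() + 1          # bounding-box height and width
    area = int(mask.sum())
    # Boundary pixels: foreground with at least one 4-neighbor outside the mask.
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    perimeter = int((mask & ~interior).sum())
    return {
        "aspect_ratio": max(h, w) / min(h, w),
        "rectangularity": area / (h * w),                  # 1.0 for a filled box
        "form_factor": 4 * np.pi * area / perimeter ** 2,  # ~1 for a disc
    }

mask = np.zeros((64, 64), dtype=bool)
mask[12:52, 22:42] = True                      # 40 x 20 filled rectangle
f = shape_features(mask)
```

Descriptors like these are cheap and interpretable, which explains their popularity in Table 1.2, but they also illustrate the limitation noted above: a handful of scalar shape statistics cannot separate thousands of species.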
1.2.4.2 Deeply-learned features
Deep learning is a special branch of machine learning that has become popular in the last decade, as scientists have been able to take advantage of the powerful computing capability of modern computers as well as the huge amount of data (images, audio, text, etc.) on the internet. Recently, learning feature representations using a Convolutional Neural Network (CNN) has shown a number of successes in different topics in the field of computer vision, such as object detection, segmentation and image classification [48, 49]. A CNN can automatically learn the features from the data without requiring the features to be determined in advance. The CNN learns a feature hierarchy all the way from pixels to a classifier; each layer extracts features from the output of the previous layer. The first layers in the network are very simple and are used to extract lines, curves or blobs in the input image. This information is used as input for the next layer, which has the more difficult task of extracting the components of the object in the image (Figure 1.11). By learning information from images through many different layers, these methods help computers understand complex data as multiple layers of simple information built up step by step. That is why they are called deep learning methods.
Some famous CNN architectures are AlexNet [49], VGG [62], GoogLeNet [63] and ResNet [64]. Basically, a CNN includes convolutional layers, non-linear activation (ReLU, Rectified Linear Unit) layers, pooling layers and fully-connected layers [65] (see Figure 1.12).
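A minimal numpy forward pass through these layer types, one convolution, ReLU, and 2x2 max pooling, shows how each layer transforms the output of the previous one. This is purely illustrative: real CNNs such as AlexNet or VGG stack many such layers over multi-channel tensors and learn the filter weights by backpropagation rather than fixing them by hand.

```python
import numpy as np

# One convolution -> ReLU -> max-pool step, written out explicitly.

def conv2d(x, kernel):
    """Valid (no-padding) 2-D cross-correlation of x with a single kernel."""
    kh, kw = kernel.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (x[i:i + kh, j:j + kw] * kernel).sum()
    return out

def relu(x):
    return np.maximum(x, 0)               # non-linear activation

def max_pool(x, size=2):
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

img = np.zeros((8, 8))
img[:, 4:] = 1.0                          # image containing a vertical edge
edge_filter = np.array([[-1.0, 1.0]])     # hand-fixed left-to-right step detector
feat = max_pool(relu(conv2d(img, edge_filter)))
```

The hand-fixed edge filter plays the role of the "lines and curves" detectors that the first learned layers of a CNN converge to; deeper layers would take `feat` as input and combine such responses into object parts.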
Convolutional layer: This is the most important component in the CNN, which also represents the idea of building local connections instead of connecting all the