HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
NGUYEN THI THANH NHAN
INTERACTIVE AND MULTI-ORGAN
BASED PLANT SPECIES IDENTIFICATION
1. Assoc. Prof. Dr. Le Thi Lan
2. Prof. Dr. Hoang Van Sam
Hanoi − 2020
DECLARATION OF AUTHORSHIP
I, Nguyen Thi Thanh Nhan, declare that this dissertation entitled "Interactive and multi-organ based plant species identification", and the work presented in it, is my own.
Hanoi, May 2020
PhD Student
Nguyen Thi Thanh Nhan
SUPERVISORS
First of all, I would like to thank my supervisors, Assoc. Prof. Dr. Le Thi Lan at the International Research Institute MICA, Hanoi University of Science and Technology, and Assoc. Prof. Dr. Hoang Van Sam at Vietnam National University of Forestry, for their inspiration, guidance, and advice. Their guidance helped me throughout the research and the writing of this dissertation.
Besides my advisors, I would like to thank Assoc. Prof. Dr. Vu Hai and Assoc. Prof. Dr. Tran Thi Thanh Hai for their valuable discussions. Special thanks to my friends and colleagues at MICA, Hanoi University of Science and Technology: Hoang Van Nam, Nguyen Hong Quan, Nguyen Van Toi, Duong Nam Duong, Le Van Tuan, Nguyen Huy Hoang, and Do Thanh Binh, for their technical support. They have assisted me with much of the work.
As a Ph.D. student of the 911 program, I would like to thank this program for its financial support. I also gratefully acknowledge the financial support for attending conferences from the Collaborative Research Program for Common Regional Issue (CRC) funded by the ASEAN University Network (AUN/SEED-Net), under the grant reference HUST/CRC/1501, and from NAFOSTED (grant number 106.06-2018.23).
Special thanks to my family, to my parents-in-law who took care of my family and
1.1 Plant identification
1.1.1 Manual plant identification
1.1.2 Plant identification based on semi-automatic graphic tool
1.1.3 Automated plant identification
1.2 Automatic plant identification from images of single organ
1.2.1 Introducing the plant organs
1.2.2 General model of image-based plant identification
1.2.3 Preprocessing techniques for images of plant
1.2.4 Feature extraction
1.2.4.1 Hand-designed features
1.2.4.2 Deeply-learned features
1.2.5 Training methods
1.3 Plant identification from images of multiple organs
1.3.1 Early fusion techniques for plant identification from images of multiple organs
1.3.2 Late fusion techniques for plant identification from images of multiple organs
1.4 Plant identification studies in Vietnam
1.5 Plant data collection and identification systems
1.6 Conclusions
2 LEAF-BASED PLANT IDENTIFICATION METHOD BASED ON
2.1 The framework of leaf-based plant identification method
2.2 Interactive segmentation
2.3 Feature extraction
2.3.1 Pixel-level features extraction
2.3.2 Patch-level features extraction
2.3.2.1 Generate a set of patches from an image with adaptive size
2.3.2.2 Compute patch-level feature
2.3.3 Image-level features extraction
2.3.4 Time complexity analysis
2.4 Classification
2.5 Experimental results
2.5.1 Datasets
2.5.1.1 ImageCLEF 2013 dataset
2.5.1.2 Flavia dataset
2.5.1.3 LifeCLEF 2015 dataset
2.5.2 Experimental results
2.5.2.1 Results on ImageCLEF 2013 dataset
2.5.2.2 Results on Flavia dataset
2.5.2.3 Results on LifeCLEF 2015 dataset
2.6 Conclusions
3 FUSION SCHEMES FOR MULTI-ORGAN BASED PLANT IDENTIFICATION
3.1 Introduction
3.2 The proposed fusion scheme RHF
3.3 The choice of classification model for single organ plant identification
3.4 Experimental results
3.4.1 Dataset
3.4.2 Single organ plant identification results
3.4.3 Evaluation of the proposed fusion scheme in multi-organ plant identification
3.5 Conclusion
4 TOWARDS BUILDING AN AUTOMATIC PLANT RETRIEVAL BASED ON PLANT IDENTIFICATION
4.1 Introduction
4.2 Challenges of building automatic plant identification systems
4.3 The framework for building automatic plant identification system
4.4 Plant organ detection
4.5 Case study: Development of image-based plant retrieval in VnMed application
4.6 Conclusions
CONCLUSIONS AND FUTURE WORKS
4.6.1 Short term
4.6.2 Long term
4 CBF Classification Base Fusion
5 CNN Convolutional Neural Network
6 CNNs Convolutional Neural Networks
7 CPU Central Processing Unit
8 CMC Cumulative Match Characteristic Curve
15 GPU Graphics Processing Unit
16 GUI Graphic User Interface
17 HOG Histogram of Oriented Gradients
18 ILSVRC ImageNet Large Scale Visual Recognition Competition
19 KDES Kernel DEScriptors
22 L-SVM Linear Support Vector Machine
23 MCDCNN Multi Column Deep Convolutional Neural Networks
26 OPENCV OPEN source Computer Vision Library
28 PCA Principal Component Analysis
29 PNN Probabilistic Neural Network
30 QDA Quadratic Discriminant Analysis
31 RAM Random Access Memory
32 ReLU Rectified Linear Unit
33 RHF Robust Hybrid Fusion
35 ROI Region Of Interest
36 SIFT Scale-Invariant Feature Transform
38 SURF Speeded Up Robust Features
39 SVM Support Vector Machine
40 SVM-RBF Support Vector Machine-Radial Basis Function kernel
7 $\mathbb{R}^d$ Set of real numbers with d dimensions
27 $\theta(z)$ The orientation of the gradient vector at pixel z
28 $\tilde{\theta}(z)$ The normalized gradient vector
30 $\arg\max(x)$ It indicates the element that reaches its maximum value
32 $x^T$ Transposition of vector x
$\prod$ Product of all values in range of series
35 $s_i(I_k)$ The confidence score of the i-th plant species when using image $I_k$
37 C The number of species in dataset
38 $k_{\tilde{m}}$ The gradient magnitude kernel
39 $k_o$ The orientation kernel
41 $\tilde{m}(z)$ The normalized gradient magnitude
LIST OF TABLES
Table 1.1 Example dichotomous key for leaves [14]
Table 1.2 Methods of plant identification based on hand-designed features
Table 1.3 A summary of available crowdsourcing systems for plant information collection
Table 1.4 The highest results of the contest obtained with the same recognition approach using hand-crafted feature
Table 2.1 Leaf/leafscan dataset of LifeCLEF 2015
Table 2.2 Accuracy obtained in six experiments with ImageCLEF 2013 dataset
Table 2.3 Precision, Recall and F-measure in improved KDES with interactive segmentation for ImageCLEF 2013 dataset
Table 2.4 Comparison of the improved KDES + SVM with the state-of-the-art hand-designed features-based methods on Flavia dataset
Table 2.5 Precision, Recall and F-measure of the proposed method for Flavia dataset
Table 3.1 An example of test phase results and the retrieved plant list determination using the proposed approach
organs with different fusion schemes in case of using AlexNet. The best result for each pair of organs is in bold
Table 3.5 Obtained accuracy at rank-1, rank-5 when combining each pair of organs with different fusion schemes in case of using ResNet. The best result for each pair of organs is in bold
Table 3.6 Obtained accuracy at rank-1, rank-5 when combining each pair of organs with different fusion schemes in case of using GoogLeNet. The best result for each pair of organs is in bold
Table 3.7 Comparison of the proposed fusion schemes with the state-of-the-art method named MCDCNN [79]. The best result for each pair of organs is in bold
Table 3.8 Rank number (k) where 99% accuracy rate is achieved in case of using AlexNet. The best result is in bold
Table 3.9 Rank number (k) to achieve a 99% accuracy rate in case of using ResNet. The best result is in bold
Table 4.1 Plant images dataset using conventional approaches
Table 4.2 Plant image datasets built by crowdsourcing data collection tools
Table 4.3 Dataset used for evaluating organ detection method
Table 4.4 The organ detection performance of the GoogLeNet with different weights initialization
Table 4.5 Confusion matrix for plant organ detection obtained (%)
Table 4.6 Precision, Recall and F-measure for organ detection with LifeCLEF 2015 dataset
Table 4.9 Results for Vietnamese medicinal plant identification
Figure 6 Confusion matrix for two-class classification
Figure 7 A general framework of plant identification
Figure 1.1 Botany students identifying plants using manual approach [13]
Figure 1.2 (a) Main graphical interface of IDAO; (b), (c), (d) Graphical icons for describing characteristics of leaf, fruit and flower respectively [16]
Figure 1.3 Snapshots of Leafsnap (left) and Pl@ntNet (right) applications
Figure 1.4 Some types of leaves: a,b) leaves on simple and complex back-
Figure 1.5 Illustration of flower inflorescence types (structure of the flower(s) on the plant, how they are connected between them and within the plant) [11]
Figure 1.6 The visual diversity of the stem of the Crataegus monogyna Jacq.
Figure 1.7 Some examples of branch images
Figure 1.8 The entire views for Acer pseudoplatanus L.
Figure 1.9 Fundamental steps for image-based plant species identification
Figure 1.10 Accuracy of plant identification based on leaf images on complex background in the ImageCLEF 2012 [21]
Figure 1.11 Feature visualization of convolutional net trained on ImageNet from [61]
Figure 1.12 Architecture of a Convolutional Neural Network
Figure 1.13 Hyperplane separates data samples into 2 classes
Figure 1.14 Two fusion approaches, (a) early fusion, (b) late fusion
Figure 1.15 Early fusion method in [77]
Figure 1.16 Different types of fusion strategies [78]
Figure 1.17 Some snapshot images of Pl@ntNet
Figure 1.18 Obtained results on three flower datasets. Identification rate reduces when the number of species increases
Figure 1.19 Comparing the performances of datasets consisting of 50 species. Blue bar: The performances on original dataset collected from LifeCLEF; Red bar: Performances with enriched datasets. The species on two datasets are identical
Figure 2.1 The complex background leaf image plant identification framework
Figure 2.2 The interactive segmentation scheme
Figure 2.3 Standardize the direction of leaf. (a): leaf image after segmentation; (b): Convert to binary image; (c): Define leaf boundary using Canny filter; (d): Standardized image direction
Figure 2.4 Examples of leafscan and leaf: the first row are raw images, the second row are images after applying corresponding pre-processing techniques
same leaf with different sizes are divided using adaptive patch
Figure 2.6 An example of patches and cells in an image and how to convert adaptive cells
Figure 2.7 Construction of image-level feature concatenating feature vectors of cells in layers of hand pyramid structure
Figure 2.8 32 images of 32 species of Flavia dataset
Figure 2.9 Interactive segmentation developed for mobile devices. Top left: original image, top right: markers, bottom right: boundary with Watershed, bottom left: segmented leaf
Figure 2.10 Some imprecise results of image segmentation
Figure 2.11 Detail accuracies obtained on ImageCLEF 2013 dataset in our experiments. For some classes such as Mespilus germanica, the obtained accuracy in the 4 experiments is 0%
Figure 2.14 Detailed scores obtained for Leafscan [1], our team's name is Mica
Figure 2.15 Detailed scores obtained for all organs [1], our team's name is Mica
Figure 3.1 An example of two plant species that are similar in leaf but different in flower (left), and those that are similar in leaf and different in fruits
Figure 3.2 The framework for multi-organ plant identification
Figure 3.3 Explanation for positive and negative samples
Figure 3.4 Illustration of positive and negative samples definition. With
positive and negative samples
Figure 3.6 The process of computing the corresponding positive probabilities for a query using the RHF method
Figure 3.7 AlexNet architecture taken from [49]
Figure 3.8 ResNet50 architecture taken from [143]
Figure 3.9 A schematic view of GoogLeNet architecture [63]
Figure 3.10 Single organ plant identification
Figure 3.11 Comparison of identification results using leaf, flower, and both leaf and flower images. The first column is query images. The second column shows top 5 species returned by the classifier. The third column is the corresponding confidence score for each species. The name of
Figure 4.4 Some images of data collection for two species: (a) Camellia sinensis, (b) Terminalia catappa. First row shows images collected by manual image acquisitions, second row shows images collected by crowdsourcing
Figure 4.5 Some examples of wrong identification
Figure 4.6 Visualization of the prediction of GoogLeNet used for plant organ detection. Red pixels are evidence for a class, and blue ones against it
Figure 4.7 Detection results of the GoogLeNet with different classification methods at the first rank (k=1)
Figure 4.8 Results obtained by the proposed GoogLeNet and the method in [7] for six organs
Figure 4.9 Architecture of Vietnamese medicinal plant search system [127]
cation; d) top five returned results
Figure 4.11 Data distribution of 596 Vietnamese medicinal plants
Figure 4.12 Illustration of image-based plant retrieval in VnMed
Motivation
Plants play an important part in the ecosystem. They provide oxygen, food, fuel, medicine, and wood, and help to reduce air pollution and prevent soil erosion. Good knowledge of flora makes it possible to improve agricultural productivity, protect biodiversity,
the best experts but approximate to the experienced experts and far exceeds those of
Figure 1 Automatic plant identification.
Objective
The main aim of this thesis is to overcome the second limitation of automatic plant identification (low recognition accuracy) by proposing novel and robust methods for plant recognition. For this, we first focus on improving the recognition accuracy with more realistic images (e.g., having a complex background and taken in different lighting conditions).
Second, taking into consideration that using one sole organ for plant identification
Finally, the last objective of the thesis is to build an application for Vietnamese medicinal plant retrieval based on plant identification. Through this application, knowledge that previously belonged only to botanists can become accessible to the community.
To this end, the concrete objectives are:
Develop a new method for leaf-based plant identification that is able to recognize
the plants of interest even in complex background images;