Acknowledgement
I am deeply indebted to my supervisor, Dr Huang Zhiyong, for his precious guidance,
continuous support, and encouragement throughout my thesis. I also want to thank
Dr Tong San Koh of NTU for discussions, and Dr Wee Kheng Leow and Dr Alan
Cheng Holun of NUS for their detailed comments and suggestions.
Table of Contents
Acknowledgement
Table of Contents
Summary
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Thesis Organization
Chapter 2 Literature Review
2.1 Image Registration in Theory
2.1.1 Applications
2.1.2 Standard image registration stages
2.2 Registration Methods
2.2.1 Area based methods
2.2.2 Feature-based methods
2.2.3 Recent registration methods
Chapter 3 Image Registration
3.1 Algorithm Overview
3.2 Feature points detection
3.2.1 Feature point position extraction
3.2.2 Feature point orientation estimation
3.3 Feature points matching
3.3.1 Define a feature descriptor
3.3.2 Local structure matching
3.3.3 Global structure matching
3.3.4 Eliminating the low-quality matching pairs
3.3.5 Performance analysis
3.4 Transformation model estimation
Chapter 4 Experimental Results
4.1 Results of local structure matching
4.2 Results of global structure matching
4.3 Registration results on various images
Chapter 5 Conclusions and Further works
Bibliography
Summary
In this thesis, we propose a novel feature-based image registration method that uses both
the local and global structures of the feature points. To address various imaging
conditions, we improve the local structure matching method. Compared to
conventional feature-based image registration methods, our method gains robustness by
ensuring that reliable feature points are selected and used in the registration
process. We have successfully applied our method to images taken under different conditions.
List of Tables
Table 1: Comparison of two local structure matching methods
Table 2: The registration results on 8 pairs of images in Figures 13-20
List of Figures
Figure 1: System diagram of the feature points matching method
Figure 2: Feature point i represented by a feature vector f_i = (x_i, y_i, ϕ_i)
Figure 3: The local spatial relation between two feature points f_i and f_j
Figure 4: Spurious or dropped feature points in the neighborhood will result in an
invalid local structure for matching
Figure 5: The local structure matching on images with geometry transformations
Figure 6: The local structure matching on images with large temporal difference
Figure 7: The local structure matching on images with image distortions (highly
JPEG compressed)
Figure 8: The local structure matching on images from different sensors
Figure 9: The matching pairs detected from the global structure matching in cue1
Figure 10: The matching pairs detected from the global structure matching in cue2
Figure 11: The matching pairs obtained from intersection of results in cue1 and cue2
Figure 12: The final matching pair set after cross-validation
Figure 13: Registration of high resolution images
Figure 14: Registration of urban images from different sensors
Figure 15: Registration of two Amazon region images from Radar, JERS-1 with two year difference
Figure 16: Registration of Landsat images with four year difference and associated rotation
Figure 17: Registration of Amazon region image with deforestations
Figure 18: Registration of images with high temporal changes
Figure 19: Registration of images with compression distortions
Figure 20: Registration of retina images with associated rotation and translation
Chapter 1
Introduction
1.1 Motivation
Image registration is the process of matching two or more images of the same scene
taken at different times, from different viewpoints, or by different sensors. It
geometrically aligns the input image and the reference image. Image registration is
widely used in many applications, such as image mosaicking, aerial image analysis,
medical imaging, stereo vision, automated cartography, motion analysis, and the
recovery of the 3D characteristics of a scene [1]. In general, most large systems
that evaluate images require image registration as an intermediate step [2].
In this thesis we propose and implement a feature-based image registration
algorithm. The images under consideration are roughly of the same scale (but not
necessarily the same size). Here we adapt Jiang and Yau’s fingerprint minutiae
matching algorithm [3]. In [3], Jiang and Yau first establish a feature descriptor
that fulfills four important conditions: 1) invariance (the descriptions of the
corresponding features from the reference and sensed images have to be the same), 2)
uniqueness (two different features should have different descriptions), 3) stability
(the description of a feature that is slightly deformed in an unknown manner
should be close to the description of the original feature), and 4) independence (if
the feature description is a vector, its elements should be functionally independent)
[2]. Then they propose a simple and efficient fingerprint minutiae matching
algorithm based on the ‘local’ and ‘global’ structures of fingerprint minutiae. (Note
that the so-called ‘global structure’ in [3] is still a local structure because it is local to
the position of a feature. It should be called an ‘absolute feature’. In contrast, a better
name for ‘local structure’ is ‘relative feature’. In this thesis, we keep the names
‘local structure’ and ‘global structure’ for consistency with [3].) However, this
algorithm is only suitable for fingerprint images under rotation and translation
transformations. We improve the local and global structure matching methods in [3]
so that we can obtain a set of applicable corresponding feature points for general
images taken under various imaging conditions, such as images taken at different
times, from highly different viewpoints, or by different sensors. The proposed
feature matching method can also be applied to images with compression distortion,
object movement, or high deformations.
1.2 Contributions
Based on the fingerprint minutiae matching algorithm presented in [3], we propose
and implement a feature-based registration algorithm. Our major contributions are in
the feature matching part.
We improve the local structure matching method in [3] for image registration.
As a result, we can handle cases where the images have significant scene changes such
as object movement, growth, or deformation. In these cases, the local structure
matching method in [3] is not effective. We provide a more reliable local structure
matching so that the two best-matched local structure pairs are correctly computed
under various imaging conditions, such as images taken at different times, by
different sensors, and from highly different viewpoints. The improved matching
method can also be applied to images with compression distortion, object
movement, or high deformations.
We implement the method in a software system and conduct various experiments
with applicable results.
1.3 Thesis Organization
The rest of this thesis is organized as follows. In Chapter 2, we give a short review of
related work. In Chapter 3, we present our image registration algorithm, of which
the reliable feature points matching algorithm is our major concern. In Chapter 4, a
series of experiments is performed to evaluate the performance of our registration
algorithm. Finally, our work is summarized in Chapter 5.
Chapter 2
Literature Review
2.1 Image Registration in Theory
2.1.1 Applications
Image registration is widely used in remote sensing, medical imaging, computer
vision, etc. In general, according to the manner of image acquisition, the
applications of image registration can be divided into four main groups [1].
Different viewpoints (multi-view analysis). Images of the same scene are acquired
from different viewpoints. The aim is to gain a larger 2D view or a 3D
representation of the scanned scene. Examples of applications include remote
sensing (mosaicking of images of the surveyed area) and computer vision (shape
recovery, i.e. shape from stereo).
Different times (multi-temporal analysis). Images of the same scene are acquired
at different times, often at regular time intervals, and possibly under different
conditions. The aim is to find and evaluate changes in the scene between the
consecutive image acquisitions. Examples of applications include remote sensing,
computer vision (automatic change detection for security monitoring), and medical
imaging (monitoring of healing therapy, monitoring of tumor evolution).
Different sensors (multi-modal analysis). Images of the same scene are acquired
by different sensors. The aim is to integrate the information obtained from different
source streams to gain a more complex and detailed scene representation. Examples of
applications include remote sensing (fusion of information from sensors with
different characteristics, such as panchromatic images offering better spatial
resolution, color/multi-spectral images with better spectral resolution, or radar
images independent of cloud cover and solar illumination) and medical
imaging (combination of sensors showing the anatomical structure, like MRI or CT,
with sensors showing functional and metabolic activities, like PET, SPECT or MRS).
Results can be applied, for instance, in radiotherapy and nuclear medicine.
Scene to model registration. An image of a scene and a model of the scene are registered.
The model can be a computer representation of the scene, for instance a map, or another
scene with similar content. The aim is to localize the acquired image in the
scene/model and to compare them. Examples of applications include remote
sensing (registration of aerial or satellite data into maps) and medical
imaging (comparison of the patient’s image with digital anatomical atlases,
specimen classification).
2.1.2 Standard image registration stages
Due to the diversity of image registration applications and the various types of
image variation stated above, it is impossible to design a universal method
applicable to all registration tasks. However, the standard image registration
technique usually consists of the following three stages.
Feature detection. Features are salient structures or distinctive objects in the image.
These features can be represented by their point representatives such as centers of
gravity, line intersections, and corners. In this stage, features are manually or, preferably,
automatically detected. Usually the physical interpretability of the features is required.
The major problem in this stage is to decide what kind of feature is applicable to the
given task. The feature sets detected in the sensed image and the reference image should
have enough common elements, and the detection method should not be sensitive to
the assumed image variations.
Feature matching. The features detected in the sensed and reference images are
matched in this stage. Various feature descriptors and similarity measures are
employed for this purpose. The two major categories of feature matching are
area-based and feature-based methods. Area-based methods, sometimes called
correlation-like methods, usually use a window to determine a matched location
by means of the correlation technique. Area-based methods deal with the images without
attempting to detect salient objects. They are preferable when the images do not
have enough prominent details and distinctive objects. Feature-based methods, in
contrast, extract common features such as curvature, moments, areas, or line
segments to perform accurate registration. They are typically applied when the local
structural information is more important than the information carried by the image
intensities. They are applicable to images of completely different nature (such as an aerial
photograph and a map) and can handle complex image distortions.
In the feature matching stage, problems caused by incorrect feature detection or by
image degradations can arise. Physically corresponding features can be missed due
to different imaging conditions or due to the different spectral sensitivity of the sensors.
The choice of the feature description and similarity measure has to consider these
factors. There are several conditions that a good feature descriptor should fulfill [2].
The most important ones are invariance (the feature descriptor should be invariant
to the assumed image degradations), uniqueness (two different features should have
different descriptions), stability (the description of a feature should be sufficiently
stable to tolerate slight unexpected feature variations and noise), and independence
(the elements of a vector feature descriptor should be functionally independent). The
matching algorithm in the space of invariants should be robust and efficient.
Transformation model estimation and image resampling. In the last stage, the
type and parameters of the mapping function are estimated from the feature
correspondences established in the previous stage. By applying the spatial mapping and
interpolation, the sensed image is resampled onto the reference image. Image values
at non-integer coordinates are computed by an appropriate interpolation technique.
Two major problems need to be considered in this stage. Firstly, the type of
the mapping function should be chosen correctly. If no a priori
information is available, the model should be flexible enough to handle all possible
image transformations. Secondly, there are differences between the two images that
we would like to detect. Therefore, a decision must be made in this stage about which
types of image variation are of interest.
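As an illustration of computing image values at non-integer coordinates, the following minimal sketch uses bilinear interpolation. Bilinear interpolation is only one possible choice and is an assumption here, not necessarily the technique used in this thesis; the function name `bilinear` and the tiny test image are hypothetical.

```python
def bilinear(image, x, y):
    """Interpolate image (a list of rows) at real-valued column x and row y.

    Assumes 0 <= x <= width - 1 and 0 <= y <= height - 1.
    """
    x0, y0 = int(x), int(y)                      # top-left integer neighbour
    x1 = min(x0 + 1, len(image[0]) - 1)          # clamp at the right border
    y1 = min(y0 + 1, len(image) - 1)             # clamp at the bottom border
    ax, ay = x - x0, y - y0                      # fractional offsets in [0, 1)
    top = (1 - ax) * image[y0][x0] + ax * image[y0][x1]
    bottom = (1 - ax) * image[y1][x0] + ax * image[y1][x1]
    return (1 - ay) * top + ay * bottom


# Hypothetical 2 x 2 image: the centre value is the mean of the four pixels.
image = [[0, 10],
         [20, 30]]
centre = bilinear(image, 0.5, 0.5)
```

When a sensed image is resampled, a function of this kind would be evaluated at the mapped position of every pixel of the reference grid.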
2.2 Registration Methods
Current automated registration techniques can be classified into two broad
categories: area-based and feature-based.
2.2.1 Area based methods
Area-based methods, sometimes called correlation-like methods, merge the feature
detection step with the feature matching step. Instead of attempting to detect salient
objects, windows of predefined size (or even entire images) are used for the
correspondence estimation.
Area-based methods usually use a small window of points to determine a
matched location by means of the correlation technique [4]. Window correspondence is
based on the similarity measure between two given windows in the sensed
image and the reference image. The most commonly used measure of similarity is
normalized cross-correlation. Other useful similarity measures are the correlation
coefficient and sequential-similarity detection [1]. In normalized
cross-correlation, the measure of similarity is computed for window pairs from the
sensed and reference images and its maximum is sought. The window pairs for
which the maximum is achieved are set as the corresponding ones. Although
cross-correlation based registration can exactly align only mutually translated images,
it can also be successfully applied when slight rotation and scaling are present.
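The window search described above can be sketched as follows. This is a minimal illustration, not the implementation of any cited method: it assumes the symmetric form of normalized cross-correlation (the sum of products divided by the square root of the product of the two window energies), and the names `ncc` and `best_match` as well as the small test image are hypothetical.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-sized windows (lists of rows)."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    num = sum(x * y for x, y in zip(xs, ys))
    den = math.sqrt(sum(x * x for x in xs) * sum(y * y for y in ys))
    return num / den if den > 0 else 0.0          # empty (all-zero) windows score 0


def best_match(image, template):
    """Slide the template over the image; return (row, col, score) of the maximum."""
    th, tw = len(template), len(template[0])
    best = (0, 0, -1.0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            window = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(window, template)
            if score > best[2]:
                best = (r, c, score)
    return best


# Hypothetical data: the template is cut out of the image at row 1, column 1.
image = [
    [1, 2, 1, 0, 3],
    [0, 9, 4, 1, 2],
    [2, 5, 8, 0, 1],
    [1, 0, 2, 3, 1],
]
template = [[9, 4], [5, 8]]
row, col, score = best_match(image, template)
```

The search only covers translations, which matches the remark that cross-correlation aligns mutually translated images exactly and tolerates only slight rotation and scaling.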
Another useful property of correlation is given by the Correlation theorem. The
Correlation theorem states that the Fourier transform of the correlation of two
images is the product of the Fourier transform of one image and the complex
conjugate of the Fourier transform of the other. This theorem gives an alternative way
to compute the correlation between images. The Fourier transform is simply another
way to represent the image function. Instead of representing the image in the spatial
domain, as we normally do, the Fourier transform represents the same information in
the frequency domain. It can be computed efficiently for images using the Fast
Fourier Transform (FFT). Hence, an important reason why the correlation metric is
chosen in many registration problems is that the Correlation theorem enables it
to be computed efficiently, with existing, well-tested programs using the FFT (and
occasionally in hardware using specialized optics). The use of the FFT becomes
most beneficial in cases where the image and the template to be tested are large.
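The Correlation theorem can be checked numerically on a small example. The sketch below works on 1D real signals for brevity (images would use a 2D transform) and assumes circular correlation and a power-of-two length; the helper names `fft`, `correlate_fft` and `correlate_direct` are hypothetical, and a practical system would call a tested FFT library rather than this short recursive routine.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. The inverse is unscaled."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def correlate_fft(f, g):
    """Circular correlation c[s] = sum_n f[n + s] * g[n], computed as F * conj(G)."""
    F, G = fft(f), fft(g)
    product = [a * b.conjugate() for a, b in zip(F, G)]
    n = len(f)
    return [v.real / n for v in fft(product, inverse=True)]


def correlate_direct(f, g):
    """The same correlation computed directly in the spatial domain."""
    n = len(f)
    return [sum(f[(i + s) % n] * g[i] for i in range(n)) for s in range(n)]


# Hypothetical signals: f is g shifted circularly by 3 samples.
g = [1, 2, 3, 4, 0, 0, 0, 0]
f = [0, 0, 0, 1, 2, 3, 4, 0]
c_fft = correlate_fft(f, g)
c_direct = correlate_direct(f, g)
shift = max(range(len(c_fft)), key=lambda s: c_fft[s])   # location of the peak
```

Both routes give the same correlation values, and the peak position recovers the shift between the two signals. The direct computation costs on the order of n squared operations, whereas the FFT route costs on the order of n log n, which is why it pays off for large images and templates.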
Area-based methods are preferable when the images do not have many
prominent details and the distinctive information is provided by gray levels/colors rather
than by local shapes and structure [5]. The limitations of the area-based methods
are:
(1) The rectangular window, which is most often used, suits the registration of
images that locally differ only by a translation. If the images are deformed by more
complex transformations, this type of window is not able to cover the same parts
of the scene in the reference and sensed images (the rectangle can be transformed into
some other shape). Several authors have proposed using a circular window for
mutually rotated images. However, the comparability of such simple-shaped
windows is also violated if more complicated geometric deformations (similarity,
perspective transforms, etc.) are present between the images.
(2) Another disadvantage of the area-based methods concerns the ‘remarkableness’ of
the window content. There is a high probability that a window containing a smooth
area without any prominent details will be matched incorrectly with other smooth
areas in the reference image due to its non-saliency. The features for registration
should preferably be detected in distinctive parts of the image. Windows, whose
selection is often not based on an evaluation of their content, may not have this property.
(3) Classical area-based methods like cross-correlation (CC) use image intensities
directly for matching, without any structural analysis. Consequently, they are
sensitive to intensity changes introduced, for instance, by noise, varying
illumination, and/or the use of different sensor types.
(4) Typically the cross-correlation between the image and the template is computed
for each allowable transformation of the template. The transformation whose
cross-correlation is the largest specifies how the template can be optimally registered
to the image. This is the standard approach when the allowable transformations
include a small range of translations, rotations, and scale changes; the template is
translated, rotated, and scaled for each possible translation, rotation, and scale of
interest. As the number of transformations grows, however, the computational costs
quickly become unmanageable. The correlation methods are therefore generally limited to
registration problems in which the images are misaligned only by a small rigid or
affine transformation.
2.2.2 Feature-based methods
Two tasks generally need to be handled in feature-based techniques:
feature extraction and feature matching. For feature extraction, the aim is to detect
two sets of features in the reference and sensed images, represented by feature
points (points themselves, end points or centers of line features, centers of gravity of
regions, etc.). A variety of image segmentation techniques have been used for the
extraction of edge and boundary features, such as the Canny operator, the Laplacian
of Gaussian (LoG) operator, the thresholding technique in [6], the classification
method in [7], the region growing in [8], and the wavelet transformations in [9]. In
feature matching, the aim is to find the pair-wise correspondence between the two
feature sets using their spatial relations or various descriptors of the features. Feature
correspondence is established based on the characteristics of the detected features.
Existing feature-matching algorithms include binary correlation, distance transform,
Chamfer matching, structural matching, chain-code correlation, and distance of
invariant moments [8]. In most existing feature-based techniques, the crucial point is
to have discriminative and robust feature descriptors that are invariant to the assumed image
variations.
Feature-based methods are typically applied when the local information is more
significant than the information carried by the image intensities. In contrast to the
area-based methods, the feature-based methods do not work directly with the image
intensities. The features represent information on a higher level. This property makes
feature-based methods suitable for handling complex image distortions (such as images
with illumination changes) and applicable to images of completely different nature
(such as multi-sensor analysis). However, the limitation of the feature-based
methods is that the features may be hard to detect or unstable in time; for example, some
medical images lack distinctive objects.
2.2.3 Recent registration methods
Among the recent works, we focus on two classes of methods that appear most
appropriate for the general-purpose registration problem.
1) Keypoint Indexing Methods:
Keypoint methods have received growing attention recently because of their
tolerance to low image overlap and image scale changes. Keypoint indexing
methods begin with keypoint detection and localization, followed by the
extraction of an invariant descriptor from the intensities around the keypoint. Finally,
the extracted invariant descriptor is used by indexing methods to match
keypoints between images.
Existing extraction algorithms are based on approaches such as the
Laplacian-of-Gaussian operator [10], Harris corners [11], information theory [12],
and intensity region stability measures [13]. They are usually invariant to 2D
similarity or affine transformations of the image, as well as linear changes in
intensity. For example, in [10], distinctive invariant features are extracted from
images and can be used to perform reliable matching between different views of an
object or scene. The features are invariant to image scale and rotation, and are
shown to provide robust matching across a substantial range of affine distortion,
change in 3D viewpoint, addition of noise, and change in illumination.
2) ICP
ICP (Iterative Closest Point) is based on point features, where the “points” may be raw
measurements such as (x, y, z) values from range images, intensity points in
three-dimensional medical images [14], and edge elements, corners and interest points [15]
that locally summarize the geometric structure of the images. Starting from an initial estimate,
the ICP algorithm iteratively (a) maps features from the sensed image to the
reference image, (b) finds the closest reference image point for each mapped feature, and (c)
re-estimates the transformation based on these temporary correspondences.
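Steps (a) to (c) can be sketched for 2D points as follows. This is a minimal illustration under stated assumptions, not the algorithm of [14] or [15]: the transformation is assumed to be a 2D rotation plus translation, the initial estimate is the identity, there is no outlier rejection, and the names `estimate_rigid`, `apply_rigid` and `icp` as well as the test points are hypothetical.

```python
import math


def apply_rigid(points, theta, tx, ty):
    """Rotate points by theta and translate them by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def estimate_rigid(src, dst):
    """Least-squares rotation and translation mapping src[i] onto dst[i]."""
    n = len(src)
    sx, sy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    dx, dy = sum(q[0] for q in dst) / n, sum(q[1] for q in dst) / n
    a = sum((p[0] - sx) * (q[0] - dx) + (p[1] - sy) * (q[1] - dy)
            for p, q in zip(src, dst))            # cosine term
    b = sum((p[0] - sx) * (q[1] - dy) - (p[1] - sy) * (q[0] - dx)
            for p, q in zip(src, dst))            # sine term
    theta = math.atan2(b, a)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)


def icp(sensed, reference, iterations=10):
    theta, tx, ty = 0.0, 0.0, 0.0                 # initial estimate: identity
    for _ in range(iterations):
        mapped = apply_rigid(sensed, theta, tx, ty)                      # step (a)
        closest = [min(reference,
                       key=lambda q: (q[0] - m[0]) ** 2 + (q[1] - m[1]) ** 2)
                   for m in mapped]                                      # step (b)
        theta, tx, ty = estimate_rigid(sensed, closest)                  # step (c)
    return theta, tx, ty


# Hypothetical data: the reference is the sensed set after a small known motion.
sensed = [(0.0, 0.0), (10.0, 0.0), (10.0, 6.0), (0.0, 8.0), (4.0, 3.0)]
reference = apply_rigid(sensed, 0.05, 0.3, -0.2)
theta, tx, ty = icp(sensed, reference)
```

The example converges because the motion is small compared with the spacing of the points, so the closest-point correspondences are correct from the first iteration. With a poor initial estimate the closest points would be wrong and the iteration could settle on an incorrect alignment.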
The Dual-Bootstrap ICP (DB-ICP) algorithm [16] builds on the ICP algorithm. DB-ICP
begins with an initial transformation estimate and initial matching regions from the
two images, obtained by keypoint matching. The algorithm iterates among the
following three steps: (1) refining the current transformation in the current “bootstrap”
region by symmetric matching, (2) applying model selection to determine whether a more
sophisticated model may be used, and (3) expanding the region, growing inversely
proportional to the uncertainty of the mapping on the region boundary. The
framework of this algorithm has also been described elsewhere for other image
registration tasks, such as aerial images taken under different lighting conditions.
The advantages of the Dual-Bootstrap ICP algorithm include:
(1) In comparison to current image registration algorithms, it handles lower image
overlaps, image changes and poor image quality, all of which reduce the number of
common landmarks between images. Moreover, by effectively exploiting the
vascular structure during the dual-bootstrap procedure, it avoids the need for
expensive global search techniques.
(2) In comparison with current indexing-based initialization methods and
minimal-subset random sampling methods, Dual-Bootstrap ICP has the major
advantage of requiring fewer initial correspondences. This is because it starts from an
initial low-order transformation that must only be accurate in small initial regions.
(3) Instead of matching globally, which could require simultaneous consideration of
multiple matches, Dual-Bootstrap ICP uses region and model bootstrapping to
resolve matching ambiguities.
However, one common problem with DB-ICP [17] is that ICP has a narrow domain
of convergence and therefore must be initialized relatively accurately.
Chapter 3
Image Registration
In this chapter we present a new image registration algorithm based on the local and
global structures of the feature points. We apply both the local and global structure
matching methods in [3] to image registration. Moreover, we improve the flexibility
of the local structure matching method to handle various image variations, and
increase the accuracy of the global structure matching method in the correspondence
estimation. The major techniques of feature point matching are summarized in
section 3.3.
To make the algorithm more flexible, we propose a new local structure matching
method in section 3.3.2 to handle cases where the images have significant scene
changes or distortions. The proposed matching method provides more reliable local
structure matching so that the two best-matched local structure pairs are correctly
computed under various imaging conditions. Furthermore, to improve the accuracy of
the feature points matching, we employ consistency checking and cross-validation in
our feature points matching method: we first perform global structure matching in
two cues to eliminate false matches in section 3.3.3, and then employ
cross-validation to eliminate low-quality matches in section 3.3.4.
Chapter 3 is organized as follows. After an overview of our algorithm in section
3.1, we discuss in section 3.2 how to extract the positions of a set of feature points
and how to estimate their orientations. In section 3.3, we find correct matching pairs
between two partially overlapping images. Based on the matching pairs found in
section 3.3, we derive the correct transformation between the two target images in
section 3.4.
3.1 Algorithm Overview
The standard point mapping registration method usually consists of three stages:
feature points detection, feature points matching, and transformation model
estimation.
In the feature detection stage, the positions of a set of feature points are extracted by the
OpenCV function GoodFeaturesToTrack [18], which computes the ‘goodness’ of a
feature point using the eigenvalue of a matrix formed from the intensities of the
pixels in a neighborhood of the feature point. Then the orientations of those feature
points are estimated by a least mean square estimation method. After the orientation
field of an input image is estimated, we calculate the reliability level of the
orientation data. For each feature point, if the reliability level of its orientation field
is below a certain threshold, the feature point is eliminated from the feature
point set, so that we only keep the feature points with reliable orientation estimates.
The details of feature point position extraction and orientation estimation are
discussed in sections 3.2.1 and 3.2.2, respectively.
Feature points matching is our major concern in the registration, since the accuracy
of the feature points matching lays the foundation for accurate registration. In the
feature points matching stage, both the local and the global structure matching
proposed in Jiang and Yau’s fingerprint matching method [3] are applied. The
fingerprint matching method in [3] attempts to automate a human expert’s behavior
in the process of aligning two fingerprints. When comparing two fingerprints, a human
expert usually first examines the local positional relations between the
minutiae (referred to as the local structure of a minutia), and then aligns the two fingerprints
using the unique global position structures of the whole image. In our approach, we
not only adapt the local and global structure matching methods in [3] so that they
can be applied to general images, but also improve both of them so that they are more
flexible and reliable. To make the local structure matching more flexible for various
imaging variations, we improve the local structure matching method to handle
cases where the images have significant scene changes or distortions. Furthermore,
consistency checking and cross-validation are both employed in our feature points
matching method to guarantee the reliability of the estimated feature correspondences.
We first perform consistency checking in two cues to eliminate false matching
pairs, and then employ cross-validation to eliminate low-quality matching pairs.
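[Figure 1: System diagram of the feature points matching method. Recoverable labels: feature descriptor F_ij; local structure matching (direct matching for slight deformation, complex matching for serious deformation); best-matched pairs {f_sp ↔ f_tq} and {f_su ↔ f_tv}; matching pair set MP1.]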
The main procedure of our feature points matching algorithm is shown in Figure 1.
At the beginning, we adapt the feature descriptor F_ij defined in [3] to describe the
spatial relation between the feature points f_i and f_j by their relative distance, radial
angle and orientation difference. Thus, for every feature point f_i, a local structure LS_i
is formed from the spatial relations between f_i and its k nearest neighbors. The details of
the feature descriptor are given in section 3.3.1.
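The three components of the descriptor can be sketched as follows. This is only an illustration: the exact definition is the one given in section 3.3.1 and in [3], and the angle conventions used here (radial angle measured relative to the orientation of f_i, all differences wrapped into [-pi, pi]) are assumptions, as are the names `wrap`, `descriptor` and `local_structure`.

```python
import math


def wrap(angle):
    """Wrap an angle difference into the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def descriptor(fi, fj):
    """Relative distance, radial angle and orientation difference of fj w.r.t. fi.

    Each feature is a tuple (x, y, phi) with phi the orientation in radians.
    """
    xi, yi, phi_i = fi
    xj, yj, phi_j = fj
    d = math.hypot(xj - xi, yj - yi)
    theta = wrap(math.atan2(yj - yi, xj - xi) - phi_i)   # direction to fj seen from fi
    phi = wrap(phi_j - phi_i)
    return d, theta, phi


def local_structure(i, features, k):
    """Descriptors between feature i and its k nearest neighbours, nearest first."""
    others = [f for j, f in enumerate(features) if j != i]
    return sorted(descriptor(features[i], f) for f in others)[:k]


def move(f, alpha, tx, ty):
    """Rotate a feature by alpha and translate it by (tx, ty)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return (c * f[0] - s * f[1] + tx, s * f[0] + c * f[1] + ty, f[2] + alpha)


# Hypothetical features: the descriptor is unchanged by rotation and translation.
features = [(0.0, 0.0, 0.3), (3.0, 4.0, 1.0), (6.0, 1.0, -0.2)]
before = descriptor(features[0], features[1])
after = descriptor(move(features[0], 0.8, 5.0, -2.0), move(features[1], 0.8, 5.0, -2.0))
neighbours = local_structure(0, features, 2)
```

Because every component is measured relative to f_i, the descriptor does not change when both features undergo the same 2D rotation and translation, which is the invariance property required of a descriptor in section 1.1.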
Then, given two feature point sets F_s = {f_s1, …, f_sn} and F_t = {f_t1, …, f_tm}, the
local structure matching is performed to find the two best-matched local structure
pairs {f_sp ↔ f_tq} and {f_su ↔ f_tv}. Since the local structure matching method
proposed in [3] can only be applied to images with simple geometric transformations and
slight distortions, we propose a more complex local structure matching method in
section 3.3.2 to handle complex image variations. The basic idea of our proposed
method is as follows: when comparing the local structures of two feature points,
instead of simply matching their k nearest neighbors in order of their relative
distances as in [3], we compute the similarity of the two local structures only according to
those neighbors that are matched. We discuss how to qualify two neighbors as matched in
section 3.3.2. Employing this method, we can provide applicable local structure
matching for images with complex variations, such as high deformations or object
movements.
Assume that we obtain two best-matched local structure pairs, say
{f_sp ↔ f_tq} and {f_su ↔ f_tv}, from the local structure matching. Either one of them
can serve as a reliable correspondence between the two feature point sets F_s and F_t. All
other feature points in F_s and F_t are converted to the polar coordinate system with
respect to the corresponding reference pair. Here we perform the global
structure matching in two cues as a consistency check: only those correspondences
found by both cues are considered valid matches, and the other candidate points are
excluded from further processing. As shown in Figure 1, the best-matched local
structure pair {f_sp ↔ f_tq} is input to cue1 to provide the correspondence for aligning
the global structure of the feature points, and a matching pair set MP1 is generated
from cue1, while the other best-matched local structure pair {f_su ↔ f_tv} is input to
cue2 to generate a matching pair set MP2. Only those pairs generated by both
cues are considered valid matching pairs. The global matching method is
presented in section 3.3.3.
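The two-cue consistency check can be sketched as follows. This is a simplified illustration, not the method of section 3.3.3: the matching rule (agreement of the polar coordinates within fixed tolerances) and the tolerance values are assumptions, and the names `to_polar`, `global_match` and `move` are hypothetical.

```python
import math


def wrap(angle):
    """Wrap an angle difference into the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def to_polar(f, ref):
    """Polar coordinates of feature f = (x, y, phi) relative to the reference feature."""
    dx, dy = f[0] - ref[0], f[1] - ref[1]
    r = math.hypot(dx, dy)
    angle = wrap(math.atan2(dy, dx) - ref[2]) if r > 0 else 0.0
    return r, angle, wrap(f[2] - ref[2])


def global_match(Fs, Ft, ref_s, ref_t, tol=(2.0, 0.1, 0.1)):
    """One cue: index pairs (i, j) whose polar coordinates agree within tol."""
    pairs = set()
    for i, fs in enumerate(Fs):
        ps = to_polar(fs, Fs[ref_s])
        for j, ft in enumerate(Ft):
            pt = to_polar(ft, Ft[ref_t])
            if (abs(ps[0] - pt[0]) <= tol[0]
                    and abs(wrap(ps[1] - pt[1])) <= tol[1]
                    and abs(wrap(ps[2] - pt[2])) <= tol[2]):
                pairs.add((i, j))
    return pairs


def move(f, alpha, tx, ty):
    """Rotate a feature by alpha and translate it by (tx, ty)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return (c * f[0] - s * f[1] + tx, s * f[0] + c * f[1] + ty, f[2] + alpha)


# Hypothetical data: Ft is Fs rotated and translated, with the last point replaced
# by an unrelated (spurious) feature.
Fs = [(0.0, 0.0, 0.1), (10.0, 0.0, 0.5), (10.0, 6.0, 1.0), (0.0, 8.0, -0.4)]
Ft = [move(f, 0.3, 5.0, 5.0) for f in Fs[:3]] + [(50.0, 50.0, 2.0)]

MP1 = global_match(Fs, Ft, ref_s=0, ref_t=0)   # cue1: first best-matched pair
MP2 = global_match(Fs, Ft, ref_s=1, ref_t=1)   # cue2: second best-matched pair
valid = MP1 & MP2                              # keep only pairs found by both cues
```

Taking the intersection of the two sets is what makes the check conservative: a pair that aligns with one reference pair only by coincidence is unlikely to align with the other one as well.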
After we obtain a number of matching pairs from the global structure matching, we
apply a validation step to eliminate low-quality matching pairs by
cross-validation. The details of eliminating the low-quality matching pairs are
presented in section 3.3.4.
After obtaining a set of correct matching pairs, we can decide the transformation
Trang 32between two images using QR factorization We discuss how to derive the correct
transformations in section 3.4
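As an illustration of how a QR factorization can yield a transformation from matching pairs, the following sketch solves a linear least-squares problem with a modified Gram-Schmidt QR. It assumes an affine model, which is only one possible choice; the model actually used is the one defined in section 3.4. The function names are hypothetical, and the matched points are assumed not to be all collinear, since the factorization would otherwise divide by zero.

```python
import math

def qr_least_squares(A, b):
    """Solve min ||A x - b|| using a modified Gram-Schmidt QR factorization."""
    m, n = len(A), len(A[0])
    Q = [[A[i][j] for i in range(m)] for j in range(n)]  # columns of A, turned into Q
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j):
            R[k][j] = sum(qk * v for qk, v in zip(Q[k], Q[j]))
            Q[j] = [v - R[k][j] * qk for qk, v in zip(Q[k], Q[j])]
        R[j][j] = math.sqrt(sum(v * v for v in Q[j]))
        Q[j] = [v / R[j][j] for v in Q[j]]
    y = [sum(q * bi for q, bi in zip(Q[j], b)) for j in range(n)]  # Q^T b
    x = [0.0] * n
    for j in reversed(range(n)):  # back substitution on R x = Q^T b
        x[j] = (y[j] - sum(R[j][k] * x[k] for k in range(j + 1, n))) / R[j][j]
    return x

def estimate_affine(pairs):
    """pairs: list of ((x, y), (x2, y2)). Returns (a, b, c, d, e, f) such that
    x2 = a*x + b*y + c and y2 = d*x + e*y + f in the least-squares sense."""
    A = [[x, y, 1.0] for (x, y), _ in pairs]
    row_x = qr_least_squares(A, [p[1][0] for p in pairs])
    row_y = qr_least_squares(A, [p[1][1] for p in pairs])
    return tuple(row_x + row_y)
```

With more than three matching pairs the system is overdetermined, so the residual of the fit also gives a rough indication of how well the chosen model explains the matches.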
3.2 Feature points detection
In this section, we describe in detail how to extract the positions of a set of
feature points using eigenvalues, and how to estimate the orientation of those feature
points by a least mean square estimation method.
3.2.1 Feature point position extraction
Features are salient structures in the images, such as significant regions, lines, or
points. Typical features are corners, line intersections, points on curves
with high curvature, high-variance points, and local extrema of the wavelet
transform. In our approach, we employ the OpenCV function GoodFeaturesToTrack
[18] to extract feature point positions. In OpenCV, the function
GoodFeaturesToTrack is designed to find corners by computing the 'goodness' of a
feature point using Tomasi's algorithm. This algorithm computes the eigenvalues of
a matrix formed from the intensity derivatives of the pixels in a neighborhood of a feature
point and uses the smaller eigenvalue as the quality measure.
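The eigenvalue criterion can be illustrated with a minimal sketch. It is not the OpenCV implementation: it assumes a grey-level image stored as a list of rows, central-difference gradients instead of OpenCV's derivative filters, and a hypothetical function name `corner_quality`.

```python
import math

def corner_quality(img, r, c, half=1):
    """Smaller eigenvalue of the 2x2 gradient matrix summed over a window.

    The window has (2*half + 1) pixels per side and gradients are central
    differences, so (r, c) must lie at least half + 1 pixels inside the image.
    """
    sxx = sxy = syy = 0.0
    for i in range(r - half, r + half + 1):
        for j in range(c - half, c + half + 1):
            gx = (img[i][j + 1] - img[i][j - 1]) / 2.0
            gy = (img[i + 1][j] - img[i - 1][j]) / 2.0
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] in closed form; return the smaller one.
    return 0.5 * ((sxx + syy) - math.sqrt((sxx - syy) ** 2 + 4.0 * sxy * sxy))

# Usage: a corner scores high, while a straight edge and a flat patch score zero.
corner = [[1.0 if i >= 3 and j >= 3 else 0.0 for j in range(7)] for i in range(7)]
edge = [[1.0 if j >= 3 else 0.0 for j in range(7)] for i in range(7)]
flat = [[1.0] * 7 for _ in range(7)]
scores = [corner_quality(im, 3, 3) for im in (corner, edge, flat)]
```

The smaller eigenvalue is large only when the gradients in the window vary in two independent directions, which is why an edge, whose gradients all point the same way, is rejected.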
3.2.2 Feature point orientation estimation
A number of methods have been proposed for orientation
estimation of feature points. In our system, we apply the least mean square
estimation algorithm proposed in [19], [20]. The steps for calculating the orientation
at pixel (i, j) are as follows:
1. Divide the input image into blocks of size W × W.
2. For each pixel in the block, calculate the image gradients G_x and G_y with the
Sobel operator, where G_x and G_y are the gradient magnitudes in the x and y directions,
respectively. The horizontal Sobel operator is used to compute G_x and the vertical
Sobel operator is used to compute G_y.
3. Estimate the local orientation at each pixel (i, j) by finding the principal axis of
variation in the image gradients:
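\[
V_x(i,j) = \sum_{u=i-W/2}^{i+W/2} \; \sum_{v=j-W/2}^{j+W/2} \left( G_x^2(u,v) - G_y^2(u,v) \right) \tag{3.1}
\]
\[
V_y(i,j) = \sum_{u=i-W/2}^{i+W/2} \; \sum_{v=j-W/2}^{j+W/2} 2\, G_x(u,v)\, G_y(u,v) \tag{3.2}
\]
\[
\theta(i,j) = \frac{1}{2} \tan^{-1}\!\left( \frac{V_y(i,j)}{V_x(i,j)} \right) \tag{3.3}
\]
where θ(i, j) is the least square estimate of the local orientation at the block
centered at pixel (i, j).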
4. Smooth the orientation field in a local neighborhood using a Gaussian filter. The
orientation image is first converted into a continuous vector field, which is defined as
\[
U_x(i,j) = \cos\bigl(2\theta(i,j)\bigr), \qquad U_y(i,j) = \sin\bigl(2\theta(i,j)\bigr)
\]
where U_x and U_y are the x and y components of the vector field, respectively. After
the vector field has been computed, we smooth it with a Gaussian
low-pass filter of size w' × w', which yields the smoothed components U'_x and U'_y.
5. The final smoothed orientation field O at pixel (i, j) is defined as:
\[
O(i,j) = \frac{1}{2} \tan^{-1}\!\left( \frac{U'_y(i,j)}{U'_x(i,j)} \right)
\]
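The steps above can be sketched as follows. The sketch assumes the common double-angle formulation, with V_x the sum of G_x² − G_y² and V_y the sum of 2 G_x G_y over the block, which may differ in sign or axis convention from the formulas in [19], [20]. It uses `atan2` instead of a plain arctangent so that the quadrant is resolved. The function names are hypothetical, and the weights passed to the smoothing step stand in for the Gaussian filter coefficients.

```python
import math

def sobel(img, r, c):
    """Sobel gradients (G_x, G_y) at an interior pixel of img (a list of rows)."""
    gx = (img[r - 1][c + 1] + 2 * img[r][c + 1] + img[r + 1][c + 1]
          - img[r - 1][c - 1] - 2 * img[r][c - 1] - img[r + 1][c - 1])
    gy = (img[r + 1][c - 1] + 2 * img[r + 1][c] + img[r + 1][c + 1]
          - img[r - 1][c - 1] - 2 * img[r - 1][c] - img[r - 1][c + 1])
    return gx, gy

def block_orientation(gradients):
    """Least-squares orientation of a block from its (G_x, G_y) samples."""
    vx = sum(gx * gx - gy * gy for gx, gy in gradients)
    vy = sum(2.0 * gx * gy for gx, gy in gradients)
    return 0.5 * math.atan2(vy, vx)

def smooth_orientation(thetas, weights):
    """Weighted average of orientations, taken in the double-angle domain."""
    ux = sum(w * math.cos(2.0 * t) for t, w in zip(thetas, weights))
    uy = sum(w * math.sin(2.0 * t) for t, w in zip(thetas, weights))
    return 0.5 * math.atan2(uy, ux)
```

Averaging in the double-angle domain matters because an orientation is only defined modulo π: the angles 1.5 and −1.5 rad describe almost the same axis, yet their arithmetic mean is 0, whereas `smooth_orientation([1.5, -1.5], [1.0, 1.0])` returns an angle of magnitude π/2.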
In the feature point matching process, the orientation of a feature point is the most
important criterion for feature measurement, so the reliability of the orientation
estimate matters. To measure the reliability of the orientation data, we first
calculate the area moment of inertia about the orientation axis found, which is the
minimum inertia, and then the area moment of inertia about the perpendicular axis, which is
the maximum inertia. Here I_min and I_max denote the minimum and maximum inertia, respectively. If the
ratio of the minimum to the maximum inertia is close to one, we have little orientation
information. We therefore calculate the reliability of the orientation at pixel (i, j) from this ratio.
For each feature point, if the reliability of its orientation field is below a certain threshold, it
is eliminated from the feature point set before the subsequent feature point
matching stage.
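A minimal sketch of such a reliability measure is given below. It assumes the common choice 1 − I_min / I_max, which is close to one for a well-defined orientation and close to zero when the two inertias are nearly equal. It also assumes that the inertias are computed from the gradient samples of a block. The function names and the threshold value 0.5 are hypothetical.

```python
import math

def orientation_reliability(gradients):
    """Return 1 - I_min / I_max for a block of (G_x, G_y) samples.

    I_min is the moment of inertia of the gradient samples about the estimated
    orientation axis and I_max is the moment about the perpendicular axis.
    """
    vx = sum(gx * gx - gy * gy for gx, gy in gradients)
    vy = sum(2.0 * gx * gy for gx, gy in gradients)
    theta = 0.5 * math.atan2(vy, vx)
    c, s = math.cos(theta), math.sin(theta)
    i_min = sum((gy * c - gx * s) ** 2 for gx, gy in gradients)
    i_max = sum((gx * c + gy * s) ** 2 for gx, gy in gradients)
    if i_max == 0.0:  # flat block: no gradient, hence no orientation
        return 0.0
    return 1.0 - i_min / i_max

def keep_reliable(points, reliabilities, threshold=0.5):
    """Drop feature points whose orientation reliability is below the threshold."""
    return [p for p, rel in zip(points, reliabilities) if rel >= threshold]
```

For gradients that all lie along one line the measure is one, and for gradients spread evenly over all directions it is zero, so thresholding it removes exactly the feature points whose orientation would be arbitrary.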
3.3 Feature points matching
The main procedure of our feature point matching algorithm is shown in Figure 1.
As shown in Figure 1, there are four major steps in our matching algorithm: defining
an invariant feature descriptor to describe the local positional relations between two
feature points; local structure matching to get the best-matched local structure pairs;
global structure matching to get a set of matching pairs; and cross-validation to
eliminate the low-quality matching pairs.
3.3.1 Define a feature descriptor
Each feature point i detected above can be represented by a feature vector f_i as:
\[
f_i = (x_i, y_i, \phi_i)
\]
where (x_i, y_i) is its coordinate and ϕ_i is its orientation (see Figure 2). The feature vector f_i
represents the absolute structure of a feature point. In this thesis, however, we adopt the
name from [3] and call f_i the 'global structure' of the feature point i.
Figure 2: Feature point i represented by a feature vector f_i = (x_i, y_i, ϕ_i)
The global characteristics x_i, y_i, ϕ_i of a feature point depend on the rotation
and translation of the image. However, a feature point can be described with
rotation- and translation-invariant features by using some other feature points in the
same image. (Unless otherwise specified, rotation and translation in this thesis stand for
2D rotation and 2D translation only.) Here we adopt the feature descriptor F_ij
defined in [3] to describe the local positional relations between two feature points f_i
and f_j by their relative distance d_ij, radial angle θ_ij, and orientation difference ϕ_ij
(see Figure 3) by equation (3.13):
From Figure 3, we see that a feature point f_i together with the feature descriptor F_ij
uniquely determines the position and orientation of another feature point f_j
(uniqueness). For any feature point pair, the relative distance d_ij, radial angle
θ_ij, and orientation difference ϕ_ij are invariant to 2D rotation and translation of the
image (invariance). Moreover, the elements in the vector F_ij are functionally
3.3.2 Local structure matching
Employing the feature descriptor described in section 3.3.1, for every feature point f_i,
a local structure LS_i can be formed from the spatial relations between the feature point
f_i and its k-nearest neighbors:
\[
LS_i = \left( F_{i1}^{T},\; F_{i2}^{T},\; \ldots,\; F_{ik}^{T} \right)
\]
where F_ij is the feature descriptor describing the local positional relations between
two feature points f_i and f_j defined in equation (3.13), and F_ij^T is the transpose of F_ij.
Note that the entries of LS_i are sorted in ascending order of the relative distance between
the feature point i and each of its neighbors.
It is easy to see that the local structure feature vector LS_i is independent of 2D
rotation and translation of the image, so it can be used directly for matching. In
local structure matching, given two feature sets Fs = {f_s1, …, f_sn} and Ft = {f_t1, …, f_tm},
where Fs and Ft consist of all feature points detected in the sensed image s and the
reference image t, respectively, the aim is to find two best-matched local structure
pairs {f_sp ↔ f_tq} and {f_su ↔ f_tv} to serve as the corresponding reference pairs
later in the global structure matching stage. We need two
best-matched local structure pairs in order to perform a consistency check, which we
explain in more detail in section 3.3.3.
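The descriptor of section 3.3.1 and the local structure defined above can be sketched as follows. The sketch assumes that a feature point is a tuple (x, y, phi), that the radial angle is measured against the orientation of f_i, and that the orientation difference is taken as ϕ_j − ϕ_i wrapped into [−π, π]; the sign conventions in [3] may differ. The function names are hypothetical.

```python
import math

def wrap(angle):
    """Wrap an angle into [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def descriptor(fi, fj):
    """F_ij = (d_ij, theta_ij, phi_ij) for feature points given as (x, y, phi)."""
    dx, dy = fj[0] - fi[0], fj[1] - fi[1]
    d = math.hypot(dx, dy)                      # relative distance
    theta = wrap(math.atan2(dy, dx) - fi[2])    # radial angle, relative to phi_i
    phi = wrap(fj[2] - fi[2])                   # orientation difference
    return d, theta, phi

def local_structure(points, i, k):
    """Descriptors from point i to its k nearest neighbours, nearest first."""
    others = [descriptor(points[i], p) for j, p in enumerate(points) if j != i]
    return sorted(others, key=lambda f: f[0])[:k]
```

Rotating and translating all points by the same 2D motion leaves every entry of `local_structure(points, i, k)` unchanged up to rounding, which is the invariance property that allows local structures from the two images to be compared directly.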
Here we have two ways to measure the similarity level of two local structures,
depending on the complexity of the image variations. If there are only rotations and
translations between the images and the image distortions are slight, we
employ the direct local structure matching method proposed in [3] for its efficiency.
Otherwise, we use our more complex local structure matching method, which handles the
case where the sensed image and the reference image of the same scene have
only a few similar local structure pairs. We present both methods below.
Direct local structure matching [3]
Suppose LS_i and LS_j are the local structure feature vectors of the feature point i
from the sensed image s and the feature point j from the reference image t, respectively.
A similarity level between two feature points i and j is defined as
, ( , )