An image registration method based on the local and global structures

Acknowledgement

I am deeply indebted to my supervisor, Dr Huang Zhiyong, for his precious guidance, continuous support, and encouragement throughout my thesis. I also want to thank Dr Tong San Koh of NTU for discussions, and Dr Wee Kheng Leow and Dr Alan Cheng Holun of NUS for their detailed comments and suggestions.

Table of Contents

Acknowledgement

Table of Contents

Summary

List of Tables

List of Figures

Chapter 1 Introduction

1.1 Motivation

1.2 Contributions

1.3 Thesis Organization

Chapter 2 Literature Review

2.1 Image Registration in Theory

2.1.1 Applications

2.1.2 Standard image registration stages

2.2 Registration Methods

2.2.1 Area-based methods

2.2.2 Feature-based methods

2.2.3 Recent registration methods

Chapter 3 Image Registration

3.1 Algorithm Overview

3.2 Feature points detection

3.2.1 Feature point position extraction

3.2.2 Feature point orientation estimation

3.3 Feature points matching

3.3.1 Define a feature descriptor

3.3.2 Local structure matching

3.3.3 Global structure matching

3.3.4 Eliminating the low-quality matching pairs

3.3.5 Performance analysis

3.4 Transformation model estimation

Chapter 4 Experimental Results

4.1 Results of local structure matching

4.2 Results of global structure matching

4.3 Registration results on various images

Chapter 5 Conclusions and Further works

Bibliography

Summary

In this thesis, we propose a novel feature-based image registration method that uses both the local and global structures of the feature points. To address various imaging conditions, we improve the local structure matching method. Compared to conventional feature-based image registration methods, our method is robust because it guarantees that reliable feature points are selected and used in the registration process. We have successfully applied our method to images acquired under different conditions.

List of Tables

Table 1: Comparison of two local structure matching methods

Table 2: The registration results on 8 pairs of images in Figures 13-20

List of Figures

Figure 1: System diagram of the feature points matching method

Figure 2: Feature point i represented by a feature vector f_i = (x_i, y_i, ϕ_i)

Figure 3: The local spatial relation between two feature points f_i and f_j

Figure 4: Spurious or dropped feature points in the neighborhood will result in an invalid local structure for matching

Figure 5: The local structure matching on images with geometry transformations

Figure 6: The local structure matching on images with large temporal difference

Figure 7: The local structure matching on images with image distortions (highly JPEG compressed)

Figure 8: The local structure matching on images from different sensors

Figure 9: The matching pairs detected from the global structure matching in cue1

Figure 10: The matching pairs detected from the global structure matching in cue2

Figure 11: The matching pairs obtained from intersection of results in cue1 and cue2

Figure 12: The final matching pair set after cross-validation

Figure 13: Registration of high resolution images

Figure 14: Registration of urban images from different sensors

Figure 15: Registration of two Amazon region images from Radar, JERS-1 with two year difference

Figure 16: Registration of Landsat images with four year difference and associated rotation

Figure 17: Registration of Amazon region image with deforestations

Figure 18: Registration of images with high temporal changes

Figure 19: Registration of images with compression distortions

Figure 20: Registration of retina images with associated rotation and translation

Chapter 1

Introduction

1.1 Motivation

Image registration is the process of matching two or more images of the same scene taken at different times, from different viewpoints, or by different sensors. It geometrically aligns the input image and the reference image. Image registration is widely used in many applications, such as image mosaicking, aerial image analysis, medical imaging, stereo vision, automated cartography, motion analysis, and the recovery of the 3D characteristics of a scene [1]. In general, most large systems that evaluate images require image registration as an intermediate step [2].

In this thesis we propose and implement a feature-based image registration algorithm. The images under consideration are roughly of the same scale (but not necessarily the same size). Here we adapt Jiang and Yau’s fingerprint minutiae matching algorithm [3]. In [3], Jiang and Yau first establish a feature descriptor that fulfills four important conditions: 1) invariance (the descriptions of the corresponding features from the reference and sensed images have to be the same), 2) uniqueness (two different features should have different descriptions), 3) stability (the description of a feature that is slightly deformed in an unknown manner should be close to the description of the original feature), and 4) independence (if the feature description is a vector, its elements should be functionally independent) [2]. Then they propose a simple and efficient fingerprint minutiae matching algorithm based on the ‘local’ and ‘global’ structures of fingerprint minutiae. (Note that the so-called ‘global structure’ in [3] is still a local structure because it is local to the position of a feature. It should be called an ‘absolute feature’. In contrast, a better name for ‘local structure’ is ‘relative feature’. In this thesis, we keep the names ‘local structure’ and ‘global structure’ for consistency with [3].) However, this algorithm is only suitable for fingerprint images under rotation and translation. We improve the local and global structure matching methods in [3] so that we can obtain a set of applicable corresponding feature points for general images taken under various imaging conditions, such as images taken at different times, from highly different viewpoints, or by different sensors. The proposed feature matching method can also be applied to images with compression distortion, object movement, or high deformations.

1.2 Contributions

Based on the fingerprint minutiae matching algorithm presented in [3], we propose and implement a feature-based registration algorithm. Our major contributions are in the feature matching part.

We improve the local structure matching method in [3] for image registration, so that we can handle cases where the images have significant scene changes such as object movement, growth, or deformation. In these cases, the local structure matching method in [3] is not effective. We provide a more reliable local structure matching so that two best-matched local structure pairs are correctly computed under various imaging conditions, such as images taken at different times, by different sensors, and from highly different viewpoints. The improved matching method can also be applied to images with compression distortion, object movement, or high deformations.

We implement the method in a software system and conduct various experiments with applicable results.

1.3 Thesis Organization

The rest of this thesis is organized as follows. In Chapter 2, we give a short review of related work. In Chapter 3, we present our image registration algorithm, of which the reliable feature point matching algorithm is our major concern. In Chapter 4, a series of experiments is performed to evaluate the performance of our registration algorithm. Finally, our work is summarized in Chapter 5.

Chapter 2

Literature Review

2.1 Image Registration in Theory

2.1.1 Applications

Image registration is widely used in remote sensing, medical imaging, computer vision, etc. In general, according to the manner of image acquisition, the applications of image registration can be divided into four main groups [1].

Different viewpoints (multi-view analysis). Images of the same scene are acquired from different viewpoints. The aim is to gain a larger 2D view or a 3D representation of the scanned scene. Examples of applications include remote sensing (mosaicking of images of the surveyed area) and computer vision (shape recovery, i.e. shape from stereo).

Different times (multi-temporal analysis). Images of the same scene are acquired at different times, often at regular time intervals, and possibly under different conditions. The aim is to find and evaluate changes in the scene between the consecutive image acquisitions. Examples of applications include remote sensing, computer vision (automatic change detection for security monitoring), and medical imaging (monitoring of healing therapy and of tumor evolution).

Different sensors (multi-modal analysis). Images of the same scene are acquired by different sensors. The aim is to integrate the information obtained from different source streams to gain a more complex and detailed scene representation. Examples of applications include remote sensing (fusion of information from sensors with different characteristics, such as panchromatic images offering better spatial resolution, color/multi-spectral images with better spectral resolution, or radar images independent of cloud cover and solar illumination) and medical imaging (combination of sensors showing the anatomical structure, like MRI or CT, with sensors showing functional and metabolic activities, like PET, SPECT, or MRS). Results can be applied, for instance, in radiotherapy and nuclear medicine.

Scene to model registration. An image of a scene and a model of the scene are registered. The model can be a computer representation of the scene, for instance a map, or another scene with similar content. The aim is to localize the acquired image in the scene/model and to compare them. Examples of applications include remote sensing (registration of aerial or satellite data into maps) and medical imaging (comparison of the patient’s image with digital anatomical atlases, specimen classification).

2.1.2 Standard image registration stages

Due to the diversity of image registration applications and the various types of image variation described above, it is impossible to design a universal method applicable to all registration tasks. However, the standard image registration technique usually consists of three stages, as follows.

Feature detection. Features are salient structures or distinctive objects in the image. These features can be represented by their point representatives, such as centers of gravity, line intersections, and corners. In this stage, features are manually or, preferably, automatically detected. Usually the physical interpretability of the features is required. The major problem in this stage is to decide what kind of feature is appropriate for the given task. The feature sets detected in the sensed image and the reference image should have enough common elements, and the detection method should not be sensitive to the assumed image variations.

Feature matching. The features detected in the sensed and reference images are matched in this stage. Various feature descriptors and similarity measures are employed for this purpose. The two major categories of feature matching are area-based and feature-based methods. Area-based methods, sometimes called correlation-like methods, usually use a window to determine a matched location by a correlation technique. Area-based methods deal with the images without attempting to detect salient objects. They are preferable when the images do not have enough prominent details and distinctive objects. Feature-based methods, in contrast, extract common features such as curvature, moments, areas, or line segments to perform accurate registration. They are typically applied when the local structural information is more important than the information carried by the image intensities. They are applicable to images of completely different nature (like an aerial photograph and a map) and can handle complex image distortions.

In the feature matching stage, problems caused by incorrect feature detection or by image degradation can arise. Physically corresponding features can be missed due to different imaging conditions or different spectral sensitivities of the sensors. The choice of the feature description and similarity measure has to take these factors into account. There are several conditions that a good feature descriptor should fulfill [2]. The most important ones are invariance (the feature descriptor should be invariant to the assumed image degradations), uniqueness (two different features should have different descriptions), stability (the description of a feature should be sufficiently stable to tolerate slight unexpected feature variations and noise), and independence (the elements of a vector feature descriptor should be functionally independent). The matching algorithm in the space of invariants should be robust and efficient.

Transformation model estimation and image resampling. In the last stage, the type and parameters of the mapping function are estimated from the feature correspondences obtained in the previous stage. By applying the spatial mapping and interpolation, the sensed image is resampled onto the reference image. Image values at non-integer coordinates are computed by an appropriate interpolation technique. Two major problems need to be considered in this stage. Firstly, the type of the mapping function should be chosen correctly. If no a priori information is available, the model should be flexible enough to handle all possible image transformations. Secondly, there are differences between the two images that we would like to detect. Therefore, the decision about which types of image variation are the variations of interest must be made in this stage.

2.2 Registration Methods

The current automated registration techniques can be classified into two broad categories: area-based and feature-based.

2.2.1 Area-based methods

Area-based methods, sometimes called correlation-like methods, merge the feature detection step with the feature matching step. Instead of attempting to detect salient objects, windows of predefined size (or even entire images) are used for the correspondence estimation.

Area-based methods usually use a small window of points to determine a matched location by a correlation technique [4]. Window correspondence is based on a similarity measure between two given windows in the sensed image and the reference image. The most commonly used similarity measure is normalized cross-correlation. Other useful similarity measures are the correlation coefficient and sequential similarity detection [1]. In normalized cross-correlation, the similarity measure is computed for window pairs from the sensed and reference images, and its maximum is searched for. The window pairs for which the maximum is achieved are taken as the corresponding ones. Although cross-correlation based registration can exactly align only mutually translated images, it can also be successfully applied when slight rotation and scaling are present.
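The window search described above can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the function names `ncc` and `best_match` and the tiny test image are hypothetical, images are plain nested lists, and the measure is the plain normalized form (subtracting the window means first would give the correlation coefficient instead).

```python
import math

def ncc(window, template):
    """Normalized cross-correlation of two equally sized 2D lists."""
    w = [v for row in window for v in row]
    t = [v for row in template for v in row]
    num = sum(a * b for a, b in zip(w, t))
    den = math.sqrt(sum(a * a for a in w) * sum(b * b for b in t))
    # A window with no energy (all zeros) cannot be normalized; score it 0.
    return num / den if den > 0 else 0.0

def best_match(image, template):
    """Slide the template over the image and return ((row, col), score)
    of the window position where the similarity is largest."""
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, -1.0
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            window = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(window, template)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score

# Hypothetical example: embed the template in an otherwise empty 6x6 image.
template = [[1, 2], [3, 4]]
image = [[0] * 6 for _ in range(6)]
for dr in range(2):
    for dc in range(2):
        image[2 + dr][3 + dc] = template[dr][dc]

position, score = best_match(image, template)
print(position, score)
```

Because only translations of the window are tried, the sketch also shows why the measure aligns mutually translated images exactly but degrades under rotation or scaling.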

Another useful property of correlation is given by the correlation theorem. The correlation theorem states that the Fourier transform of the correlation of two images is the product of the Fourier transform of one image and the complex conjugate of the Fourier transform of the other. This theorem gives an alternative way to compute the correlation between images. The Fourier transform is simply another way to represent the image function: instead of representing the image in the spatial domain, as we normally do, it represents the same information in the frequency domain. It can be computed efficiently for images using the Fast Fourier Transform (FFT). Hence, an important reason why the correlation metric is chosen in many registration problems is that the correlation theorem enables it to be computed efficiently with existing, well-tested programs using the FFT (and occasionally in hardware using specialized optics). The use of the FFT is most beneficial when the image and the template to be tested are large.
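The correlation theorem can be checked numerically on a small example. The sketch below is an illustration under simplifying assumptions: it works on 1D real signals with circular (wrap-around) correlation, and it uses a naive O(N^2) discrete Fourier transform so that it needs only the standard library; a real system would use an FFT routine. All function names are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for m in range(n):
        s = sum(x[k] * cmath.exp(sign * 2j * math.pi * m * k / n)
                for k in range(n))
        out.append(s / n if inverse else s)
    return out

def correlation_direct(f, g):
    """Circular cross-correlation c[k] = sum_i f[i + k] * g[i]."""
    n = len(f)
    return [sum(f[(i + k) % n] * g[i] for i in range(n)) for k in range(n)]

def correlation_fourier(f, g):
    """Same correlation via the theorem: DFT(c) = DFT(f) * conj(DFT(g))."""
    F, G = dft(f), dft(g)
    product = [a * b.conjugate() for a, b in zip(F, G)]
    return [v.real for v in dft(product, inverse=True)]

# f is g shifted to the right by 3 samples (with wrap-around).
g = [1, 2, 3, 4, 0, 0, 0, 0]
f = [0, 0, 0, 1, 2, 3, 4, 0]

direct = correlation_direct(f, g)
fourier = correlation_fourier(f, g)
shift = direct.index(max(direct))
print(shift)
```

Both routes give the same correlation values up to rounding, and the position of the maximum recovers the shift between the two signals, which is the property that translation-only registration relies on.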

Area-based methods are preferable when the images do not have enough prominent details and the distinctive information is provided by gray levels/colors rather than by local shapes and structure [5]. The limitations of area-based methods are:

(1) The rectangular window, which is most often used, suits the registration of images that locally differ only by a translation. If the images are deformed by more complex transformations, this type of window is not able to cover the same parts of the scene in the reference and sensed images (the rectangle can be transformed to some other shape). Several authors have proposed using a circular window for mutually rotated images. However, the comparability of such simple-shaped windows is also violated if more complicated geometric deformations (similarity, perspective transforms, etc.) are present between the images.

(2) Another disadvantage of area-based methods concerns the ‘remarkableness’ of the window content. There is a high probability that a window containing a smooth area without any prominent details will be matched incorrectly with other smooth areas in the reference image due to its non-saliency. The features for registration should preferably be detected in distinctive parts of the image. Windows, whose selection is often not based on an evaluation of their content, may not have this property.

(3) Classical area-based methods like cross-correlation (CC) match image intensities directly, without any structural analysis. Consequently, they are sensitive to intensity changes introduced, for instance, by noise, varying illumination, and/or the use of different sensor types.

(4) Typically, the cross-correlation between the image and the template is computed for each allowable transformation of the template. The transformation whose cross-correlation is the largest specifies how the template can be optimally registered to the image. This is the standard approach when the allowable transformations include a small range of translations, rotations, and scale changes; the template is translated, rotated, and scaled for each possible translation, rotation, and scale of interest. As the number of transformations grows, however, the computational cost quickly becomes unmanageable. The correlation methods are therefore generally limited to registration problems in which the images are misaligned only by a small rigid or affine transformation.

2.2.2 Feature-based methods

Two tasks generally need to be handled in feature-based techniques: feature extraction and feature matching. In feature extraction, the aim is to detect two sets of features in the reference and sensed images, represented by feature points (points themselves, end points or centers of line features, centers of gravity of regions, etc.). A variety of image segmentation techniques have been used to extract edge and boundary features, such as the Canny operator, the Laplacian of Gaussian (LoG) operator, the thresholding technique in [6], the classification method in [7], the region growing in [8], and the wavelet transformations in [9]. In feature matching, the aim is to find the pairwise correspondence between the two feature sets using their spatial relations or various feature descriptors. Feature correspondence is established based on the characteristics of the detected features. Existing feature matching algorithms include binary correlation, distance transform, Chamfer matching, structural matching, chain-code correlation, and distance of invariant moments [8]. In most existing feature-based techniques, the crucial point is to have a discriminative and robust feature descriptor that is invariant to the assumed image variations.

Feature-based methods are typically applied when the local information is more significant than the information carried by the image intensities. In contrast to area-based methods, feature-based methods do not work directly with the image intensities; the features represent information at a higher level. This property makes feature-based methods suitable for handling complex image distortions (such as illumination changes) and applicable to images of completely different nature (such as in multi-sensor analysis). However, the limitation of feature-based methods is that the features may be hard to detect or unstable in time, as in some medical images that lack distinctive objects.

2.2.3 Recent registration methods

Among the recent works, we focus on two classes of methods that appear most appropriate for the general-purpose registration problem.

1) Keypoint Indexing Methods:

Keypoint methods have received growing attention recently because of their tolerance to low image overlap and image scale changes. Keypoint indexing methods begin with keypoint detection and localization, followed by the extraction of an invariant descriptor from the intensities around each keypoint. Finally, the extracted invariant descriptors are used by indexing methods to match keypoints between images.

Existing extraction algorithms are based on approaches such as the Laplacian-of-Gaussian operator [10], Harris corners [11], information theory [12], and intensity region stability measures [13]. They are usually invariant to 2D similarity or affine transformations of the image, as well as to linear changes in intensity. For example, in [10], distinctive invariant features are extracted from images and can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination.

2) Iterative Closest Point (ICP):

ICP is based on point features, where the “points” may be raw measurements such as (x, y, z) values from range images, intensity points in three-dimensional medical images [14], or edge elements, corners, and interest points [15] that locally summarize the geometric structure of the images. Starting from an initial estimate, the ICP algorithm iteratively (a) maps features from the sensed image to the reference image, (b) finds the closest reference image point for each mapped feature, and (c) re-estimates the transformation based on these temporary correspondences.
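Steps (a) to (c) can be sketched for the simplest case. The code below is an illustration, not the algorithm of any cited paper: it assumes 2D points, a rigid model (rotation plus translation), a brute-force nearest-neighbor search, and an identity transformation as the initial estimate. The names `apply`, `fit_rigid`, and `icp` and the point data are hypothetical.

```python
import math

def apply(theta, tx, ty, p):
    """Rotate point p by theta, then translate it by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty)

def fit_rigid(src, dst):
    """Closed-form least-squares rigid transform mapping src[i] to dst[i]."""
    n = len(src)
    sx, sy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    dx, dy = sum(q[0] for q in dst) / n, sum(q[1] for q in dst) / n
    num = den = 0.0
    for p, q in zip(src, dst):
        px, py = p[0] - sx, p[1] - sy
        qx, qy = q[0] - dx, q[1] - dy
        num += px * qy - py * qx
        den += px * qx + py * qy
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def icp(sensed, reference, iterations=20):
    theta, tx, ty = 0.0, 0.0, 0.0              # initial estimate: identity
    for _ in range(iterations):
        # (a) map the sensed points with the current estimate
        mapped = [apply(theta, tx, ty, p) for p in sensed]
        # (b) closest reference point for each mapped point
        matches = [min(reference,
                       key=lambda q: (q[0] - m[0]) ** 2 + (q[1] - m[1]) ** 2)
                   for m in mapped]
        # (c) re-estimate the transformation from these correspondences
        theta, tx, ty = fit_rigid(sensed, matches)
    return theta, tx, ty

reference = [(0, 0), (4, 0), (0, 3), (5, 5), (-3, 2), (2, -4)]
true_theta, true_tx, true_ty = 0.1, 0.3, -0.2
# Build the sensed points by undoing the true transform on the reference.
sensed = [apply(-true_theta, 0.0, 0.0, (x - true_tx, y - true_ty))
          for x, y in reference]

theta, tx, ty = icp(sensed, reference)
print(theta, tx, ty)
```

The example uses a small misalignment on purpose. With a large initial error the nearest-neighbor step in (b) pairs the wrong points, which is the narrow domain of convergence that makes accurate initialization necessary.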

The Dual-Bootstrap ICP (DB-ICP) algorithm [16] builds on the ICP algorithm. DB-ICP begins with an initial transformation estimate and initial matching regions in the two images, obtained by keypoint matching. The algorithm iterates over the following three steps: (1) refining the current transformation in the current “bootstrap” region by symmetric matching, (2) applying model selection to determine whether a more sophisticated model may be used, and (3) expanding the region, with growth inversely proportional to the uncertainty of the mapping on the region boundary. The framework of this algorithm has also been described elsewhere for other image registration tasks, such as the registration of aerial images taken under different lighting conditions.

The advantages of the Dual-Bootstrap ICP algorithm include:

(1) In comparison with current image registration algorithms, it handles lower image overlap, image changes, and poor image quality, all of which reduce the number of common landmarks between images. Moreover, by effectively exploiting the vascular structure during the dual-bootstrap procedure, it avoids the need for expensive global search techniques.

(2) In comparison with current indexing-based initialization methods and minimal-subset random sampling methods, Dual-Bootstrap ICP has the major advantage of requiring fewer initial correspondences. This is because it starts from an initial low-order transformation that must only be accurate in small initial regions.

(3) Instead of matching globally, which could require simultaneous consideration of multiple matches, Dual-Bootstrap ICP uses region and model bootstrapping to resolve matching ambiguities.

However, one common problem with DB-ICP [17] is that ICP has a narrow domain of convergence and therefore must be initialized relatively accurately.

Chapter 3

Image Registration

In this chapter we present a new image registration algorithm based on the local and global structures of the feature points. We apply both the local and global structure matching methods in [3] to image registration. Moreover, we improve the flexibility of the local structure matching method so that it handles various image variations, and we increase the accuracy of the global structure matching method in the correspondence estimation. The major techniques of feature point matching are presented in section 3.3.

To make the algorithm more flexible, we propose a new local structure matching method in section 3.3.2 to handle cases where the images have significant scene changes or distortions. The proposed matching method provides more reliable local structure matching, so that two best-matched local structure pairs are correctly computed under various imaging conditions. Furthermore, to improve the accuracy of the feature point matching, we employ consistency checking and cross-validation in our feature point matching method: we first perform global structure matching in two cues to eliminate false matches (section 3.3.3), and then employ cross-validation to eliminate low-quality matches (section 3.3.4).

Chapter 3 is organized as follows. After an overview of our algorithm in section 3.1, we discuss in section 3.2 how to extract the positions of a set of feature points and how to estimate their orientations. In section 3.3, we find correct matching pairs between two partially overlapping images. Based on the matching pairs found in section 3.3, we derive the transformation between the two target images in section 3.4.

3.1 Algorithm Overview

The standard point mapping registration method usually consists of three stages: feature point detection, feature point matching, and transformation model estimation.

In the feature detection stage, the positions of a set of feature points are extracted by the OpenCV function GoodFeaturesToTrack [18], which computes the ‘goodness’ of a feature point using the eigenvalues of a matrix formed from the intensities of the pixels in a neighborhood of the feature point. Then the orientations of those feature points are estimated by a least mean square estimation method. After the orientation field of an input image is estimated, we calculate the reliability level of the orientation data. Any feature point whose orientation reliability level is below a certain threshold is eliminated from the feature point set, so that we only keep the feature points with reliable orientation estimates. The details of feature point position extraction and orientation estimation are discussed in sections 3.2.1 and 3.2.2, respectively.

Feature point matching is our major concern in the registration, since the accuracy of the feature point matching lays the foundation for accurate registration. In the feature point matching stage, both the local and the global structure matching proposed in Jiang and Yau’s fingerprint matching method [3] are applied. The fingerprint matching method in [3] attempts to automate a human expert’s behavior in the process of aligning two fingerprints. When comparing two fingerprints, a human expert typically first examines the local positional relations between the minutiae (referred to as the local structure of a minutia) manually, and then aligns the two fingerprints using the unique global position structures of the whole image. In our approach, we not only adapt the local and global structure matching methods in [3] so that they can be applied to general images, but also improve both of them so that they are more flexible and reliable. To make the local structure matching more flexible with respect to various imaging variations, we improve it to handle cases where the images have significant scene changes or distortions. Furthermore, consistency checking and cross-validation are both employed in our feature point matching method to guarantee the reliability of the estimated feature correspondences. We first perform consistency checking in two cues to eliminate false matching pairs, and then employ cross-validation to eliminate low-quality matching pairs.

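[Figure 1: System diagram of the feature points matching method. Recoverable labels: feature descriptor F_ij; local structure matching (direct matching for slight deformation, complex matching for serious deformation); best-matched pairs {f_sp ↔ f_tq} and {f_su ↔ f_tv}; matching pair set MP1.]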

The main procedure of our feature point matching algorithm is shown in Figure 1. First, we adapt the feature descriptor F_ij defined in [3] to describe the spatial relation between the feature points f_i and f_j by their relative distance, radial angle, and orientation difference. Then, for every feature point f_i, a local structure LS_i is formed from the spatial relations between f_i and its k nearest neighbors. The details of the feature descriptor are given in section 3.3.1.
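A minimal sketch of such a relative descriptor is given below. The exact definition belongs to section 3.3.1, so the formulas here are assumptions: a feature point is taken to be a tuple (x, y, ϕ), the radial angle is measured from the orientation of f_i to the direction of f_j, and all angles are wrapped to [-π, π]. The helper names `wrap`, `descriptor`, and `move` are hypothetical.

```python
import math

def wrap(angle):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def descriptor(fi, fj):
    """Assumed form of F_ij: (relative distance, radial angle,
    orientation difference) of f_j as seen from f_i."""
    xi, yi, phi_i = fi
    xj, yj, phi_j = fj
    dx, dy = xj - xi, yj - yi
    distance = math.hypot(dx, dy)
    radial_angle = wrap(math.atan2(dy, dx) - phi_i)
    orientation_diff = wrap(phi_j - phi_i)
    return distance, radial_angle, orientation_diff

def move(f, alpha, tx, ty):
    """Rotate a feature point by alpha and translate it by (tx, ty)."""
    x, y, phi = f
    c, s = math.cos(alpha), math.sin(alpha)
    return (c * x - s * y + tx, s * x + c * y + ty, wrap(phi + alpha))

fi, fj = (0.0, 0.0, 0.0), (3.0, 4.0, math.pi / 2)
before = descriptor(fi, fj)
# Apply the same rotation and translation to both points.
after = descriptor(move(fi, 0.7, 10.0, -5.0), move(fj, 0.7, 10.0, -5.0))
print(before)
print(after)
```

Since all three components are measured relative to f_i, the descriptor is unchanged when both points undergo the same rotation and translation, which is the invariance that the local structure matching relies on.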

Then, given two feature point sets Fs = {f_s1, …, f_sn} and Ft = {f_t1, …, f_tm}, local structure matching is performed to find the two best-matched local structure pairs {f_sp ↔ f_tq} and {f_su ↔ f_tv}. Since the local structure matching method proposed in [3] can only be applied to images with simple geometric transformations and slight distortions, we propose a more complex local structure matching method in section 3.3.2 to handle complex image variations. The basic idea of our proposed method is as follows: when comparing the local structures of two feature points, instead of simply matching their k nearest neighbors in order of their relative distances as in [3], we compute the similarity of the two local structures only from those neighbors that are matched. We discuss how to decide whether two neighbors are matched in section 3.3.2. With this method, we can provide applicable local structure matching for images with complex variations, such as high deformations or object movements.

Assume that we obtain two best-matched local structure pairs, say {f_sp ↔ f_tq} and {f_su ↔ f_tv}, from the local structure matching. Either one of them can serve as a reliable correspondence between the two feature point sets Fs and Ft. All other feature points in Fs and Ft are converted to a polar coordinate system with respect to the corresponding reference pair. We perform the global structure matching in two cues as a consistency check: only the correspondences found by both cues are considered valid matches, and the other candidate points are excluded from further processing. As shown in Figure 1, the best-matched local structure pair {f_sp ↔ f_tq} is input to cue1 to provide the correspondence for aligning the global structure of the feature points, and cue1 generates a matching pair set MP1; the other best-matched local structure pair {f_su ↔ f_tv} is input to cue2, which generates a matching pair set MP2. Only the pairs generated by both cues are considered valid matching pairs. The global matching method is presented in section 3.3.3.
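As a rough illustration of the two-cue consistency check, the following Python sketch converts feature points to polar coordinates around a reference feature and keeps only the matching pairs produced by both cues. The function names, the (x, y, ϕ) tuple layout, the representation of a matching pair as an index tuple, and the choice to measure angles relative to the reference orientation are assumptions made for illustration; the actual procedure is the one given in section 3.3.3.

```python
import math


def wrap_angle(a):
    """Map an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def to_polar(points, ref):
    """Express (x, y, phi) points in polar form relative to a reference feature.

    Assumption: the angular coordinate and the orientation are both measured
    relative to the orientation of the reference feature.
    """
    x0, y0, phi0 = ref
    polar = []
    for x, y, phi in points:
        radius = math.hypot(x - x0, y - y0)
        angle = wrap_angle(math.atan2(y - y0, x - x0) - phi0)
        polar.append((radius, angle, wrap_angle(phi - phi0)))
    return polar


def consistent_pairs(mp1, mp2):
    """Keep only the matching pairs (s_index, t_index) found by both cues."""
    return sorted(set(mp1) & set(mp2))


# Hypothetical outputs of cue1 and cue2.
mp1 = [(0, 4), (1, 2), (3, 5)]
mp2 = [(1, 2), (3, 5), (6, 0)]
valid = consistent_pairs(mp1, mp2)
```

Taking the set intersection mirrors the rule that a pair is valid only if both cues generate it.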

After we obtain a number of matching pairs from the global structure matching, we apply a validation step that eliminates the low-quality matching pairs by cross-validation. The details are presented in section 3.3.4.

After obtaining a set of correct matching pairs, we can determine the transformation between the two images using QR factorization. We discuss how to derive the transformation in section 3.4.
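To make the role of QR factorization concrete, the sketch below fits a transformation to matching pairs by least squares. It is only an illustration under stated assumptions: the affine model, the function names, and the use of modified Gram-Schmidt to obtain the QR factors are choices made here, not details taken from section 3.4.

```python
import math


def qr_decompose(a):
    """Thin QR of an m x n matrix (list of rows) by modified Gram-Schmidt.

    Returns the columns of Q and the upper-triangular R.
    Assumes full column rank (e.g. the points are not all collinear).
    """
    m, n = len(a), len(a[0])
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [a[i][j] for i in range(m)]
        for k, q in enumerate(q_cols):
            r[k][j] = sum(q[i] * v[i] for i in range(m))
            v = [v[i] - r[k][j] * q[i] for i in range(m)]
        r[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / r[j][j] for x in v])
    return q_cols, r


def solve_least_squares(a, b):
    """Solve min ||a x - b|| via R x = Q^T b and back substitution."""
    q_cols, r = qr_decompose(a)
    n = len(r)
    qtb = [sum(q[i] * b[i] for i in range(len(b))) for q in q_cols]
    x = [0.0] * n
    for j in reversed(range(n)):
        s = qtb[j] - sum(r[j][k] * x[k] for k in range(j + 1, n))
        x[j] = s / r[j][j]
    return x


def estimate_affine(pairs):
    """pairs: list of ((x, y), (x2, y2)).

    Returns two coefficient rows (a, b, c) such that
    x2 = a*x + b*y + c for the first row, and likewise y2 for the second.
    """
    design = [[x, y, 1.0] for (x, y), _ in pairs]
    row_x = solve_least_squares(design, [p[1][0] for p in pairs])
    row_y = solve_least_squares(design, [p[1][1] for p in pairs])
    return row_x, row_y
```

With more than three matching pairs the system is overdetermined, which is why a least-squares solver is used rather than a direct inverse.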

3.2 Feature points detection

In this section, we describe in detail how to extract the positions of a set of feature points using an eigenvalue-based criterion, and how to estimate the orientation of those feature points by a least mean square estimation method.

3.2.1 Feature point position extraction

Features are salient structures in the images, such as significant regions, lines, or points. Typical features are corners, line intersections, points on curves with high curvature, high-variance points, and local extrema of the wavelet transform. In our approach, we employ the OpenCV function GoodFeaturesToTrack [18] to extract feature point positions. This function is designed to find corners by computing the 'goodness' of a feature point using Tomasi's algorithm, which computes the minimum eigenvalue of a matrix formed from the intensity gradients of the pixels in a neighborhood of the feature point.
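The eigenvalue criterion can be sketched in a few lines of Python. The snippet below is not the OpenCV implementation; it only illustrates, under the assumption that the matrix is the 2×2 sum of gradient outer products over the neighborhood, why the smaller eigenvalue is large at corners and near zero on flat regions and straight edges. The function name and the toy gradient lists are hypothetical.

```python
import math


def min_eigenvalue_score(gradients):
    """Smaller eigenvalue of the 2x2 gradient matrix of a neighborhood.

    gradients: list of (gx, gy) intensity derivatives, one pair per pixel.
    """
    a = b = c = 0.0
    for gx, gy in gradients:
        a += gx * gx
        b += gx * gy
        c += gy * gy
    # The eigenvalues of [[a, b], [b, c]] are mean - radius and mean + radius.
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius


flat = [(0.0, 0.0)] * 9                          # no intensity change
edge = [(1.0, 0.0)] * 9                          # change in one direction only
corner = [(1.0, 0.0)] * 4 + [(0.0, 1.0)] * 5     # change in two directions

scores = [min_eigenvalue_score(g) for g in (flat, edge, corner)]
```

Only the corner-like neighborhood gets a clearly positive score, which is the property a corner detector of this kind thresholds on.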


3.2.2 Feature point orientation estimation

A number of methods have been proposed for estimating the orientation of feature points. In our system, we apply the least mean square estimation algorithm proposed in [19] [20]. The steps for calculating the orientation at pixel (i, j) are as follows:

1. Divide the input image into blocks of size W × W.

2. For each pixel in the block, calculate the image gradients G_x and G_y with the Sobel operator, where G_x and G_y are the gradient magnitudes in the x and y directions, respectively. The horizontal Sobel operator is used to compute G_x and the vertical Sobel operator is used to compute G_y.

3. Estimate the local orientation at each pixel (i, j) by finding the principal axis of variation in the image gradients, with the gradient sums taken over the W × W block that extends W/2 pixels to either side of (i, j) (equation (3.1) and the equations that follow it). The result, θ(i, j), is the least square estimate of the local orientation at the block centered at pixel (i, j).

4. Smooth the orientation field in a local neighborhood using a Gaussian filter. The orientation image is first converted into a continuous vector field with x and y components U_x and U_y, respectively. After the vector field has been computed, we smooth it with a Gaussian low-pass filter of size w' × w', which gives the smoothed components U'_x and U'_y.

5. The final smoothed orientation field O at pixel (i, j) is defined as:

\[ O(i, j) = \frac{1}{2} \tan^{-1}\left( \frac{U'_y(i, j)}{U'_x(i, j)} \right) \]
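The steps above can be sketched as follows. This is a minimal illustration under assumptions: it uses the commonly cited doubled-angle form of the least-squares estimate, it returns the dominant gradient direction (whether the thesis takes this direction or its perpendicular is not stated here), and the function names and the `weights` argument standing in for the Gaussian kernel are hypothetical.

```python
import math


def block_orientation(gradients):
    """Least-squares dominant gradient direction of a block, in [0, pi).

    gradients: list of (gx, gy) pairs for the pixels of the block.
    """
    v_x = sum(gx * gx - gy * gy for gx, gy in gradients)
    v_y = sum(2.0 * gx * gy for gx, gy in gradients)
    return (0.5 * math.atan2(v_y, v_x)) % math.pi


def smooth_orientation(thetas, weights):
    """Weighted average of orientations through the doubled-angle vector field.

    Averaging cos(2*theta) and sin(2*theta) instead of theta itself avoids
    the wrap-around between orientations just above 0 and just below pi.
    """
    u_x = sum(w * math.cos(2.0 * t) for t, w in zip(thetas, weights))
    u_y = sum(w * math.sin(2.0 * t) for t, w in zip(thetas, weights))
    return (0.5 * math.atan2(u_y, u_x)) % math.pi


# Gradients that all lie along the 30 degree direction (the sign may flip).
angle = math.radians(30)
grads = [(m * math.cos(angle), m * math.sin(angle)) for m in (1.0, 2.0, -3.0)]
theta = block_orientation(grads)

# Two nearly identical orientations on either side of the 0 / pi wrap.
smoothed = smooth_orientation([0.05, math.pi - 0.05], [1.0, 1.0])
```

A plain arithmetic mean of the two angles in the last line would give π/2, which is the worst possible answer; the doubled-angle average stays next to 0 (equivalently π).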

In the feature point matching process, the orientation of a feature point is the most important criterion for feature measurement, so the reliability of the orientation estimate matters. To measure this reliability, we first calculate the area moment of inertia about the estimated orientation axis, which is the minimum inertia I_min, and then the moment about the perpendicular axis, which is the maximum inertia I_max. If the ratio of the minimum to the maximum inertia is close to one, we have little orientation information. We therefore calculate the reliability of the orientation at pixel (i, j) from I_min and I_max. For each feature point, if the reliability of its orientation field is below a certain threshold, the point is eliminated from the feature point set before the subsequent feature point matching stage.
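One plausible way to turn the two moments into a reliability score is shown below. The specific form, one minus the ratio I_min / I_max, is an assumption chosen to agree with the statement that a ratio close to one means little orientation information; it is not necessarily the formula used in this thesis. The function name and the sample threshold are likewise hypothetical.

```python
import math


def orientation_reliability(gradients, eps=1e-12):
    """Reliability in [0, 1] from the minimum and maximum moments of inertia.

    gradients: list of (gx, gy) pairs in the neighborhood of the pixel.
    The moments about the principal axis and its perpendicular are the
    eigenvalues of the 2x2 gradient matrix [[a, b], [b, c]].
    """
    a = sum(gx * gx for gx, gy in gradients)
    b = sum(gx * gy for gx, gy in gradients)
    c = sum(gy * gy for gx, gy in gradients)
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    i_min = mean - radius
    i_max = mean + radius
    if i_max <= eps:
        return 0.0  # flat region: no orientation information at all
    return 1.0 - i_min / i_max


def keep_reliable(neighborhoods, threshold=0.5):
    """Indices of feature points whose reliability reaches the threshold."""
    return [k for k, g in enumerate(neighborhoods)
            if orientation_reliability(g) >= threshold]
```

A neighborhood whose gradients all point the same way scores close to 1, while one with gradients spread evenly over two perpendicular directions scores close to 0 and would be discarded.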

3.3 Feature points matching

The main procedure of our feature point matching algorithm is shown in Figure 1. There are four major steps: defining an invariant feature descriptor that describes the local positional relations between two feature points; local structure matching to obtain the best-matched local structure pairs; global structure matching to obtain a set of matching pairs; and cross-validation to eliminate the low-quality matching pairs.

3.3.1 Define a feature descriptor

Each feature point i detected above can be represented by a feature vector f_i:

\[ f_i = (x_i, y_i, \phi_i) \]

where (x_i, y_i) is its coordinate and ϕ_i is its orientation (see Figure 2). The feature vector f_i represents a feature point's absolute structure. However, in this thesis we adopt the name from [3], so f_i is called the 'global structure' of the feature point i.

Figure 2: A feature point i represented by the feature vector f_i = (x_i, y_i, ϕ_i)

The global characteristics x_i, y_i, and ϕ_i of a feature point depend on the rotation and translation of the image. However, a feature point can be described by rotation- and translation-invariant features that relate it to other feature points in the same image. (Unless otherwise specified, rotation and translation in this thesis mean 2D rotation and 2D translation only.) Here we adapt the feature descriptor F_ij defined in [3], given by equation (3.13), to describe the local positional relations between two feature points f_i and f_j by their relative distance d_ij, radial angle θ_ij, and orientation difference ϕ_ij (see Figure 3).

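A descriptor of this kind can be sketched in Python as follows. The sign conventions, the angle range, and the function names are assumptions made for illustration and may differ from equation (3.13); the point of the sketch is that all three components are unchanged when the same 2D rotation and translation are applied to both feature points.

```python
import math


def wrap_angle(a):
    """Map an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def descriptor(f_i, f_j):
    """Relative distance, radial angle and orientation difference of f_j
    as seen from f_i. Each feature is a tuple (x, y, phi)."""
    x_i, y_i, phi_i = f_i
    x_j, y_j, phi_j = f_j
    d_ij = math.hypot(x_j - x_i, y_j - y_i)
    # Direction from f_i to f_j, measured relative to the orientation of f_i.
    theta_ij = wrap_angle(math.atan2(y_j - y_i, x_j - x_i) - phi_i)
    phi_ij = wrap_angle(phi_j - phi_i)
    return d_ij, theta_ij, phi_ij


def transform(f, angle, tx, ty):
    """Apply a 2D rotation by `angle` followed by a translation (tx, ty)."""
    x, y, phi = f
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + tx, s * x + c * y + ty, wrap_angle(phi + angle))


f_i = (10.0, 20.0, 0.3)
f_j = (40.0, -5.0, 2.0)
before = descriptor(f_i, f_j)
after = descriptor(transform(f_i, 1.1, 7.0, -3.0),
                   transform(f_j, 1.1, 7.0, -3.0))
```

Here `before` and `after` agree up to floating-point error, because the rotation angle cancels in both angular differences and the translation cancels in the coordinate differences.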

From Figure 3, we see that a feature point f_i together with the feature descriptor F_ij uniquely determines the position and orientation of another feature point f_j (uniqueness). For any feature point pair, the relative distance d_ij, radial angle θ_ij, and orientation difference ϕ_ij are invariant to 2D rotation and translation of the image (invariance). Moreover, the elements in the vector F_ij are functionally …

3.3.2 Local structure matching

Employing the feature descriptor described in section 3.3.1, for every feature point f_i, a local structure LS_i can be formed from the spatial relations between f_i and its k-nearest neighbors:

\[ LS_i = (F_{i1}^T, F_{i2}^T, \ldots, F_{ik}^T) \]

where F_ij is the feature descriptor describing the local positional relations between the feature point f_i and its neighbor f_j, defined in equation (3.13), and F_ij^T is the transpose of F_ij. Note that LS_i is ordered ascendingly by the relative distance between the feature point i and each of its neighbors.
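Building such a local structure amounts to computing the descriptor from f_i to every other feature point, sorting by relative distance, and keeping the first k entries. The sketch below assumes the (x, y, ϕ) tuple layout and the descriptor conventions described in its comments; the helper names are hypothetical, and a brute-force neighbor search is used for clarity rather than efficiency.

```python
import math


def wrap_angle(a):
    """Map an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def relation(f_i, f_j):
    """(relative distance, radial angle, orientation difference) of f_j
    relative to f_i. The sign conventions are an assumption."""
    x_i, y_i, phi_i = f_i
    x_j, y_j, phi_j = f_j
    d = math.hypot(x_j - x_i, y_j - y_i)
    theta = wrap_angle(math.atan2(y_j - y_i, x_j - x_i) - phi_i)
    return d, theta, wrap_angle(phi_j - phi_i)


def local_structure(features, i, k):
    """Descriptors of the k nearest neighbors of features[i],
    ordered by ascending relative distance."""
    f_i = features[i]
    relations = [relation(f_i, f_j)
                 for j, f_j in enumerate(features) if j != i]
    relations.sort(key=lambda rel: rel[0])
    return relations[:k]


features = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.5), (1.0, 0.0, 1.0), (0.0, 2.0, -0.5)]
ls_0 = local_structure(features, 0, 2)
```

For the sample points, `ls_0` holds the descriptors of the neighbors at distances 1 and 2, in that order; the point at distance 3 is left out because k = 2.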

It is easy to see that the local structure feature vector LS_i is independent of the 2D rotation and translation of the image, so it can be used directly for matching. In local structure matching, given two feature sets Fs = {f_s1, …, f_sn} and Ft = {f_t1, …, f_tm}, where Fs and Ft consist of all feature points detected in the sensed image s and the reference image t, respectively, the aim is to find the two best-matched local structure pairs {f_sp ↔ f_tq} and {f_su ↔ f_tv}, which later serve as the corresponding reference pairs in the global structure matching stage. We need two best-matched local structure pairs in order to perform a consistency check, which we explain in more detail in section 3.3.3.


We have two ways to measure the similarity of two local structures, depending on the complexity of the image variations. When the images differ only by rotation and translation and the image distortions are slight, we employ the direct local structure matching method proposed in [3] for its efficiency. Otherwise, we use the more elaborate local structure matching method that we propose for the case where the sensed image and the reference image of the same scene have only a few similar local structure pairs. We present both methods below.

Direct local structure matching [3]

Suppose LS_i and LS_j are the local structure feature vectors of the feature point i from the sensed image s and the feature point j from the reference image t, respectively. A similarity level between the two feature points i and j is defined as

, ( , )
