Innovations in Robot Mobility and Control - Srikanta Patnaik et al (Eds), Part 6


2.4.3 Landmark Recognition

In autonomous robot navigation, it is critical that the vision system achieves reliable and robust visual landmark recognition in real time. Faulty landmark recognition leads to a 'the robot is lost' situation, in which the robot loses its perception of its current location in the environment. In general, faulty recognitions are caused by image distortions. Therefore, the challenge is to develop techniques that overcome image distortions due to noise introduced through the wireless video link. Furthermore, as the robot navigates, the size and shape of a landmark change constantly. These changes are directly proportional to the robot's speed and its approach angle with respect to the target landmark. The following sections describe the techniques used to overcome image distortions and changes in a landmark's size and shape.

2.4.3.1 Distortion Invariant Landmark Recognition

The SVALR architecture recognises landmarks based on their shapes. Therefore, if the shape of a landmark is affected by noise that distorts the image, changing the apparent size and shape of the landmark, recognition will fail. The architecture employs two concepts, band transformation and shape attraction, to overcome image distortions and small changes in the landmark's size and shape.

The central idea of band transformation is to thicken the shape of the landmark by means of a Gaussian filter [56] or an averaging mask [57] using eq. 2.6. This produces a blurred edge image. The blurred image is then subjected to a shape attraction process, which uses the memory template to selectively attract the corresponding edge activities in the blurred shape and project them onto the original undistorted shape. The concept of shape attraction is further illustrated in Fig 2.13.

IB(i, j) = \left[ \sum_{r} \sum_{c} I(i + r,\, j + c) \right] / (r \times c)    (2.6)

where IB is the blurred image, r and c define the size of the averaging window, and I is the input edge image.
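As a rough illustration of the band transformation step, the following sketch blurs an edge image with an averaging window (here 5x5, the window size used later in the chapter). The function and parameter names are illustrative and not taken from the original implementation.

import numpy as np

def band_transform(edge_image, window=5):
    # Thicken (blur) an edge image by averaging over a window x window
    # neighbourhood, in the spirit of eq. 2.6 (illustrative sketch).
    pad = window // 2
    padded = np.pad(edge_image.astype(float), pad, mode="constant")
    blurred = np.zeros(edge_image.shape, dtype=float)
    for i in range(edge_image.shape[0]):
        for j in range(edge_image.shape[1]):
            region = padded[i:i + window, j:j + window]
            blurred[i, j] = region.sum() / (window * window)
    return blurred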

2.4.3.2 Size and View Invariant Landmark Recognition

The SVALR architecture must recognise landmarks that continuously change in size and shape during navigation. This led to the development of a simultaneous multiple-memory image search (SMIS) mechanism, which is capable of providing real-time size and view invariant visual landmark recognition [58]. The central idea of the SMIS mechanism is to pre-store multiple memory images of each landmark, at different sizes and from different views, and to compare each input image with these multiple memory templates.

Fig 2.13 The shape attraction process (band transformation stage followed by the shape attraction stage)

Through experiment it was found that the landmark's size is directly proportional to the distance between the landmark and the robot. As a result, the shape attraction method provides a small detectable region for each memory image stored in memory, as illustrated in Fig 2.14(a). The first memory image is taken from a distance, K1, away from the landmark. This provides a detectable region around the location, X1, and a detectable angle, α. Thus, multiple memory images can be selected with adjacent detectable regions joined together to provide landmark recognition over larger distances and hence larger changes in the landmark's size. Therefore, storing multiple memory images of different landmark sizes, with their detectable regions joined together, provides the system with fully size invariant landmark recognition. The number of memory images required depends on the rate of change in the landmark's size, which is directly proportional to the robot's speed.
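To make the idea concrete, here is a minimal sketch of how many memory images would be needed to cover an approach of a given length, assuming each stored memory image contributes a detectable region of a known length (both values are hypothetical inputs, not figures from the chapter):

import math

def memory_images_needed(approach_distance, detectable_region_length):
    # Adjacent detectable regions are joined end to end, so the approach
    # distance must be covered by enough regions (illustrative sketch).
    return math.ceil(approach_distance / detectable_region_length)

# e.g. a 6 m approach with 1.5 m detectable regions needs 4 memory images
print(memory_images_needed(6.0, 1.5))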

Similarly, each memory image provides a detection angle, α. Therefore, multiple views covering 360° around the landmark, with the angle between adjacent views equal to α, are stored for each landmark to provide fully view invariant landmark recognition, as shown in Fig 2.14(b). The number of views required to cover 360° is given by eq. 2.7.


Number of views = 360° / α    (2.7)

Fig 2.14 The SMIS mechanism for achieving size and view invariant landmark recognition: (a) size invariant landmark recognition using two memory images, M1 and M2, with their detectable regions around X1 and X2 (the first taken at distance K1 from the landmark, each with detectable angle α) joined together; (b) view invariant landmark recognition using view memory images, with the view selector choosing views from the robot's headings

The central idea of the SMIS mechanism is to search for multiple memory images simultaneously. This allows the SVALR architecture to recognise landmarks of different sizes and from different views. However, the SMIS mechanism is very computationally intensive, as many views are evaluated simultaneously. Therefore the SMIS mechanism employs a view selector to select a limited number of views for the searching process, reducing the computational requirement. The view selector determines the appropriate views based on the robot's heading, which is provided by the magnetic compass on board the robot via the wireless data link, as illustrated at the top of Fig 2.14(b). As a result, only the current view and its two adjacent views (left and right) are activated, instead of simultaneously searching through all the views associated with a landmark.
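A minimal sketch of the view selection idea, assuming eight stored views spaced evenly around the landmark and a compass heading in degrees; the function name, the number of views and the heading convention are assumptions for illustration.

def select_views(heading_deg, num_views=8):
    # Angular spacing between stored views (eq. 2.7 in reverse).
    step = 360.0 / num_views
    # Index of the stored view closest to the current robot heading.
    current = int(round(heading_deg / step)) % num_views
    # Activate only the current view and its left and right neighbours.
    return [(current - 1) % num_views, current, (current + 1) % num_views]

# Example: a heading of 100 degrees with eight stored views selects views 1, 2 and 3.
print(select_views(100.0))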


2.4.3.3 Light Invariant Recognition

The SVALR architecture processes input images based on the pre-processed edge, or boundary, information produced by an edge detection stage. Therefore, the efficiency of the architecture depends directly on the quality of the edge information obtained. Common edge detection methods, such as the Sobel, Prewitt and Robinson edge detectors, all detect edges based on the difference between the sums of pixel values in the left and right regions of a target pixel. This is generally achieved by applying an appropriate edge detection convolution mask. The strength of the edges detected using these methods is directly affected by the amount of light in the environment: changes in light intensity have an immediate impact on the strength of the edges obtained at the edge detection stage. This section describes a new edge detection method, named contrast-based edge detection, which enables the SVALR architecture to recognise landmarks under different lighting conditions.

Contrast-based edge detection is developed from Grossberg's theory of shunting competitive neural networks [59, 60]. The equation for the dynamic competition of biological neurons is given in eq. 2.8, where A is the rate of decay, B and D are constants that specify the range of neuron activities, and Eij and Cij are the excitatory and inhibitory inputs respectively.

\frac{dx_{ij}}{dt} = -A x_{ij} + (B - x_{ij}) E_{ij} - (x_{ij} + D) C_{ij}    (2.8)

At equilibrium, dx_{ij}/dt = 0, and the steady-state condition for the neuron activity x_{ij} is given in eq. 2.9.

-A x_{ij} + (B - x_{ij}) E_{ij} - (x_{ij} + D) C_{ij} = 0    (2.9)

x_{ij} = \frac{B E_{ij} - D C_{ij}}{A + E_{ij} + C_{ij}}    (2.10)
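The step from eq. 2.9 to eq. 2.10 is straightforward algebra; a worked reconstruction (not reproduced from the book) is:

\begin{aligned}
-A x_{ij} + (B - x_{ij}) E_{ij} - (x_{ij} + D) C_{ij} &= 0 \\
-A x_{ij} + B E_{ij} - x_{ij} E_{ij} - x_{ij} C_{ij} - D C_{ij} &= 0 \\
x_{ij}\,(A + E_{ij} + C_{ij}) &= B E_{ij} - D C_{ij} \\
x_{ij} &= \frac{B E_{ij} - D C_{ij}}{A + E_{ij} + C_{ij}}
\end{aligned}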

In order to design the contrast-based edge detection, the Cij and Eij terms are replaced with the left and right columns of an edge detection mask instead of the excitatory and inhibitory inputs of the dynamic competitive neurons, as shown in Fig 2.15. Since B and D are constants, let B = D = 1; this gives the contrast-based edge detection equation shown in eq. 2.11.

x_{ij} = \frac{\left| \sum E_{ij} I_{ij} \right| - \left| \sum C_{ij} I_{ij} \right|}{A + \left| \sum E_{ij} I_{ij} \right| + \left| \sum C_{ij} I_{ij} \right|}    (2.11)

where the summations run over the edge detection mask, A is a small constant that prevents division by zero, and Iij is the input grey level image. Note that both Sobel and Robinson edge detection masks can be used in the contrast-based edge detection.

In general, contrast-based edge detection uses a conventional edge detection convolution mask to detect the difference between the neighbouring left and right regions of a target pixel. The calculated difference is then divided by the total sum of all edge activities from both the left and right regions within the edge detection mask.

Fig 2.15 Contrast-based vertical edge detection masks
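The sketch below illustrates this scheme for a vertical, Sobel-style mask: the weighted left and right columns of the neighbourhood play the roles of Cij and Eij, and their difference is normalised by the total edge activity in both columns plus a small constant A. The column weights and the value of A are assumptions for illustration, not the exact values from Fig 2.15.

import numpy as np

def contrast_edge_vertical(image, A=1.0):
    # Contrast-based vertical edge detection in the spirit of eq. 2.11.
    img = image.astype(float)
    h, w = img.shape
    col = np.array([1.0, 2.0, 1.0])  # Sobel-style column weights (assumed)
    out = np.zeros_like(img)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            left = col @ img[i - 1:i + 2, j - 1]   # C-like term (left column)
            right = col @ img[i - 1:i + 2, j + 1]  # E-like term (right column)
            # Difference normalised by total edge activity; for non-negative
            # grey levels this matches the absolute-value form of eq. 2.11.
            out[i, j] = (right - left) / (A + abs(right) + abs(left))
    return out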

2.4.3.4 Final SVALR Architecture

The final SVALR architecture is illustrated in Fig 2.16. Initially, a grey level image is pre-processed using the contrast-based edge detection to generate an edge image. This image is blurred using a 5x5 averaging window to achieve distortion invariant and small size and view invariant landmark recognition through shape attraction. A window-based searching mechanism is employed to search the entire blurred input image for a target landmark. The search window is 50x50 pixels, and each region within the search window is processed in the pre-attentive stage using both the ROI and the signature thresholds, as illustrated at the bottom of Fig 2.16. The selected regions are passed into the attentive stage, where they are modulated by the memory feedback modulation given in eq. 2.5. Then, lateral competition between pixels within the selected region is achieved by applying L2 normalisation. This results in a filtering effect, which enhances common edge activities and suppresses unaligned features between the memory image and the input region, achieving object-background separation.
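The attentive-stage processing of a single selected 50x50 region might look roughly like the sketch below. The exact form of the memory feedback modulation (eq. 2.5) is not reproduced in this part of the chapter, so a simple multiplicative bias by the memory template is assumed here purely for illustration; the L2 normalisation step follows the description above.

import numpy as np

def attend_region(region, memory, gain=1.0):
    # Assumed multiplicative memory feedback modulation (eq. 2.5 is not
    # reproduced here), followed by L2 normalisation of the region.
    modulated = region * (1.0 + gain * memory)
    norm = np.linalg.norm(modulated)
    return modulated / norm if norm > 0.0 else modulated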

The SMIS mechanism selects the appropriate views based on the robot's heading, as illustrated at the top of Fig 2.16. It was found that the SVALR architecture requires a minimum of two memory images for each view and eight views per landmark to achieve size and view invariant landmark recognition respectively; this holds at a moderate robot speed. The selected memory images are compared with the selected input region. The match between a selected input region and the corresponding memory images is determined based on two criteria.

Firstly, each selected input region is compared separately with the two selected memory images (belonging to one view) using the cosine between the two 2-D arrays. The cosine comparison produces a match value ranging from 0 to 1, where 1 is a 100% match, and this value is evaluated against a match threshold of 90%. If either result is greater than the match threshold, the second criterion is evaluated. The second criterion is based on the concept of top-down expectancy from physiological studies: based on the given map, the landmark is expected to appear at a certain distance and direction. These two constraints are used to further enhance the robustness of the landmark recognition stage. Therefore, a match only occurs when the robot has travelled a minimum required distance and is heading in approximately the expected direction.
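A sketch of the two matching criteria, assuming the input region and memory images are 2-D arrays of equal size; the 0.9 match threshold comes from the text, while the heading tolerance and the function names are placeholder assumptions.

import numpy as np

def cosine_match(region, memory):
    # Cosine between two 2-D arrays, flattened to vectors.
    a, b = region.ravel(), memory.ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0.0 else 0.0

def landmark_found(region, memories, travelled, expected_distance,
                   heading, expected_heading,
                   match_threshold=0.9, heading_tolerance=20.0):
    # Criterion 1: cosine match against the memory images of the selected view.
    if max(cosine_match(region, m) for m in memories) < match_threshold:
        return False
    # Criterion 2: top-down expectancy - minimum travelled distance and an
    # approximately expected heading (tolerance is an assumed value).
    heading_error = abs((heading - expected_heading + 180.0) % 360.0 - 180.0)
    return travelled >= expected_distance and heading_error <= heading_tolerance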

2.5 Results

The autonomous mobile robot was evaluated in an indoor laboratory environment. The robot is provided with a topological map, which consists of the relative directions and approximate distances between objects placed on the laboratory floor. A number of autonomous navigation trials were conducted to evaluate the ability of the SVALR architecture to recognise landmarks in clean and in cluttered, complex backgrounds, and under different lighting conditions. Four trials were selected for discussion in this chapter.


Fig 2.16 The Selective Visual Attention Landmark Recognition architecture, comprising the pre-attentive stage, the attentive stage and the SMIS mechanism

In the first trial, the objects chosen to serve as landmarks were placed in front of clean backgrounds at the critical points where the robot needs to make a turn. During navigation, each input image is pre-processed by the contrast-based edge detection and then blurred using a 5x5 averaging window to achieve distortion invariant landmark recognition, as shown in Fig 2.17(b) and Fig 2.17(c) respectively. The landmark search and recognition is performed on the blurred image, where each 50x50 region is compared with the memory image.


The results are converted into the range 0-255 and displayed as an image in Fig 2.17(d). The region with the highest intensity represents the best match with the memory image. The dot indicated by an arrow at the bottom centre of Fig 2.17(d) highlights the location of the maximum match value, which is greater than the match threshold (the location where the object is found). This location is sent to the robot via the wireless data link. The navigation algorithm on board the robot then uses the provided map to perform self-localisation and move toward the next visual landmark. In this trial, the results show that the SVALR architecture is capable of recognising all visual landmarks in clean backgrounds, successfully performing self-localisation and autonomous navigation. Finally, the black regions in Fig 2.17(d) are the ones that have been skipped by the pre-attentive stage, which speeds up the landmark search significantly [55].
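As a small illustration of how such a result image can be produced, the sketch below scales match values in [0, 1] to an 8-bit display image; regions skipped by the pre-attentive stage are simply left at zero (black). This is an assumed reconstruction of the display step, not the authors' code.

import numpy as np

def match_map_to_image(match_map):
    # Scale match values in [0, 1] to 0-255 for display; skipped regions
    # keep their initial value of 0 and therefore appear black.
    return np.clip(match_map * 255.0, 0, 255).astype(np.uint8)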

In the second trial, each landmark is placed in front of a complex background with many other objects behind it, to demonstrate the ability of the SVALR architecture to recognise landmarks in cluttered backgrounds. Incoming images are processed as described previously, with the landmark recognition results illustrated in Fig 2.18, which shows a sample processed frame during navigation. The dot indicated by the arrow highlights the location where the landmark was found in Fig 2.18(d). The robot is able to traverse the specified route, detecting all visual landmarks embedded in complex backgrounds.

In the third trial, the same experimental setup as in trial two was used, except that all the lights in the laboratory were turned off while the windows remained open, to simulate a sudden change in imaging conditions. All landmarks are placed in complex, cluttered backgrounds. A sample processed frame during the navigation is illustrated in Fig 2.19. Again, the system is able to successfully traverse the route, recognising all landmarks embedded in cluttered backgrounds under insufficient lighting conditions.

2.6 Conclusion

This chapter has provided an insight into vision-based autonomous robot navigation, focusing on monocular vision and navigation by 2-D landmark recognition in clean and cluttered backgrounds as well as under different lighting conditions. The essential components of monocular vision systems are described in detail, including maps, data acquisition, feature extraction, landmark recognition and self-localisation. A 2-D landmark recognition architecture named selective visual attention landmark recognition (SVALR) is then proposed, based on a detailed analysis of how the Adaptive Resonance Theory (ART) model may be extended to provide a real-time neural network with more powerful attentional mechanisms. This led to the development of the Selective Attention Adaptive Resonance Theory (SAART) neural network, which uses the established memory to selectively bias the competitive processing at the input, enabling landmark recognition in cluttered backgrounds. Due to its dynamic nature, SAART is very computationally intensive. Therefore the main concept in the SAART network, top-down presynaptic facilitation, is re-engineered as the memory feedback modulation (MFM) mechanism. Combining the MFM mechanism with a standard image processing architecture led to the development of the SVALR architecture.

A robot platform was developed to demonstrate the applicability of the SVALR architecture to autonomous vision-based robot applications. A SMIS mechanism was added to the SVALR architecture to cope with image distortions due to the wireless video link and with the dynamic changes in landmark size and shape. The SMIS mechanism uses the concepts of band transformation and shape attraction to achieve distortion invariant and small size and view invariant landmark recognition. The experiments show that the SVALR architecture is capable of autonomously navigating the laboratory environment, using the recognition of visual landmarks and a topological map to perform self-localisation. The SVALR architecture achieves real-time 2-D landmark recognition in both clean and complex cluttered backgrounds as well as under different lighting conditions.

The performance of the SVALR architecture is based on the assumptions that visual landmarks are not occluded and that only one landmark is searched for and recognised at a time; thus the problems of partial and multiple landmark recognition have not been addressed. Furthermore, the robot platform was designed and implemented with the primary purpose of validating the SVALR architecture, and therefore omits obstacle avoidance capability. In addition, the memory used in the SVALR architecture is pre-selected prior to navigation and cannot be changed dynamically. This gives rise to the need for an obstacle avoidance capability and for an adaptive mechanism that provides the SVALR architecture with some means of learning, so that it can cope with real-life situations where landmarks move or change their shape and orientation dynamically. These problems remain for future research.


Fig 2.17 A processed frame from the first trial, encountered in the navigational phase: (a) grey level image, (b) Sobel edge image, (c) blurred image using a 5x5-averaging mask and (d) the degree of match of the input image with the memory image at each location

Fig 2.18 A processed frame from the laboratory environment, encountered in the navigational phase: (a) grey level image, (b) Sobel edge image, (c) blurred image using a 5x5-averaging mask and (d) the degree of match of the input image with the memory image at each location
