
Robot Localization and Map Building, Part 9

Robust Global Urban Localization Based on Road Maps (continued)


trajectoryi t' , t' k ,k, where the value  defines some horizon of time This

trajectory ends exactly at the particle instance, i.e i  i

k

The estimated dead-reckoning trajectory is usually defined in a different coordinate system

as it is the result of an independent process The important aspect of the dead-reckoning

estimate is that its path has good quality in relative terms, i.e locally Its shape is, after

proper rotation and translation, similar to the real path of the vehicle expressed in a

different coordinate frame

If the dead-reckoning estimate is expressed as the path i t' x t y t     ' ,  ' , t'  then the

process to associate it to an individual particle and to express it in the global coordinate

frame is performed according to:

k

 is the heading of the particle i

k

X and  k

is the heading of the dead-reckoning path at time k

Clearly, at time 't k the difference x t' ,y t'  must be 0,0 , and t' t k'  k 0
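As a concrete illustration, here is a minimal sketch of this association step, assuming the dead-reckoning path is given as arrays `x`, `y`, `theta` whose last sample corresponds to time $t_k$; array and variable names are illustrative, not taken from the original:

```python
import numpy as np

def associate_path(x, y, theta, particle_pose):
    """Express a dead-reckoning path in the global frame of one particle.

    x, y, theta: arrays sampling the dead-reckoning path over [t_k - tau, t_k];
    the last sample corresponds to time t_k.
    particle_pose: (X_k, Y_k, phi_k), the particle's global position and heading.
    """
    X_k, Y_k, phi_k = particle_pose
    dpsi = phi_k - theta[-1]          # rotation aligning path heading to particle heading
    c, s = np.cos(dpsi), np.sin(dpsi)
    dx, dy = x - x[-1], y - y[-1]     # shift so the path ends at the origin
    X = X_k + c * dx - s * dy         # rotate, then translate onto the particle
    Y = Y_k + s * dx + c * dy
    return X, Y
```

By construction the associated path ends exactly at the particle's position, which is the boundary condition stated above.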

Fig. 3 (left) shows a dead-reckoning path and how it would be used to define the associated paths of two hypothetical particles. The associated paths in Fig. 3 (right) are just versions of the original path adequately translated and rotated to match the particles' positions and headings.

Fig. 3. The left picture shows a dead-reckoning path, expressed in a coordinate frame defined by the position and heading of the last point (red square). The right picture shows the same path associated to two arbitrary particles, expressed in a common coordinate frame.

The new likelihood of a particle is now evaluated through the likelihood of its associated path with respect to the road map:

$$
w_i = \frac{1}{l_s} \int_{0}^{l_s} \Lambda\!\left(\zeta_i(s)\right)\, ds,
$$

where $\zeta_i(s)$ is the path expressed as a function of its intrinsic parameter $s$, $l_s$ is the length of integration over the path, and $\Lambda(\cdot)$ is the base likelihood. The integration over the hypothetical path can be well approximated by a discrete summation.
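A sketch of that discrete approximation, assuming the base likelihood is available as a grid over the ROI (function and variable names are illustrative):

```python
def path_likelihood(X, Y, base_likelihood, origin, cell_size):
    """Approximate the path-likelihood integral by a sum over path samples.

    base_likelihood: 2D grid of the base likelihood over the ROI.
    origin, cell_size: map metric coordinates to grid indices.
    Returns the average base likelihood along the associated path.
    """
    ox, oy = origin
    total, n = 0.0, len(X)
    for px, py in zip(X, Y):
        col = int((px - ox) / cell_size)
        row = int((py - oy) / cell_size)
        if 0 <= row < base_likelihood.shape[0] and 0 <= col < base_likelihood.shape[1]:
            total += base_likelihood[row, col]
        # samples falling outside the ROI contribute zero likelihood
    return total / n
```

If the path samples are not uniformly spaced along the arc length, each term should additionally be weighted by its segment length $\Delta s$.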

Fig. 4 shows an example of a base likelihood (shown as a grayscale image) and particles that represent the pose of the vehicle together with their associated paths (in cyan). The particles' positions and headings are represented by blue arrows. The red arrow and the red path correspond to one of the most likely hypotheses.

By applying observations that consider the hypothetical past path of the particle, the out-of-map problem is mitigated (although not solved completely) for transition situations. The transition between being on the known map and going completely out of it (i.e. current pose and recent path are out of the map) can be performed safely by considering an approach based on hysteresis.

The approach is summarized as follows: if the maximum individual path likelihood (the likelihood of the particle with maximum likelihood) is higher than $K_H$, then the process keeps all particles with likelihood $\geq K_L$. These thresholds are defined such that $100\% \geq K_H > K_L \geq 0\%$. If the maximum likelihood is $< K_H$, then the process keeps all the particles and continues the processing in pure prediction mode. Usual values for these thresholds are $K_H = 70\%$ and $K_L = 60\%$.
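In code, the hysteresis rule might look as follows (a sketch; the threshold values are the usual ones quoted above, with likelihoods as fractions):

```python
K_H, K_L = 0.70, 0.60   # usual threshold values from the text

def resample_with_hysteresis(particles, likelihoods):
    """Keep-or-predict decision based on the hysteresis thresholds."""
    if max(likelihoods) > K_H:
        # inside the map: keep only particles above the lower threshold
        survivors = [p for p, w in zip(particles, likelihoods) if w >= K_L]
        return survivors, True      # True: apply measurement updates
    # out of the map: keep everything and run in pure prediction mode
    return particles, False
```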


Fig. 4. A synthetic example. The region of interest (ROI) is a rectangle of 200 meters by 200 meters. A set of particles and their associated paths are superimposed on an image of base likelihood (axes in meters).

In the synthetic example shown in Fig. 4, the region of interest (ROI) is a rectangle of 200 meters by 200 meters. This ROI is big enough to contain the current population of particles and their associated paths.

Although all the particles are located on the road (high base likelihood), many of their associated paths abandon the zones of high base likelihood. The most likely particles are those whose paths are mostly contained in the nominal zones. The remarkable effect of a wrong heading can be seen: it rotates the associated path and makes it abandon the zones of high base likelihood (i.e. the road sections in gray).

Some particles have current poses that escape the dark gray region (high base likelihood zones); however, their associated paths are mostly contained in the roads. That means the real vehicle could actually be abandoning the road. This situation is repeated in Fig. 5 as well, where all the particles are located outside the nominal road although many of them have paths that match the map constraints.

When the filter infers that the vehicle has been outside the map for sufficient time (i.e. no particles show a relevant part of their paths consistent with the map), no updates are performed on the particles, i.e. the filter works in pure prediction mode.

When the vehicle enters the known map and eventually some particles achieve the required path likelihood, i.e. higher than $K_H$, then the filter will start to apply the updates on the particles.

However, this synchronization is not immediate. There could be some delay until some associated paths are consistent with the map: the fact that a particle is well inside the road does not mean that its likelihood is high. It needs a relevant fraction of its associated path history to match the road map in order to be considered "inside the map".

This policy clearly immunizes the filter against bias when incorrect particles are temporarily on valid roads.


4 Experimental Results

Long-term experiments have been performed in urban areas of Sydney. The road maps were created with an ad-hoc Matlab tool that allowed users to define segments on top of a satellite image obtained from Google Earth. These road maps were low-quality representations of the roads. This disregard for the quality of the road map definition was intentional, with the goal of exposing the approach to realistic and difficult conditions. Fig. 7 and Fig. 8 show the road map used in the estimation process; Fig. 2 shows part of the used road map as well.

The dead-reckoning process was based on the fusion of speed and heading-rate measurements. The heading rate was provided by low-cost three-dimensional gyroscopes. A diversity of additional sensors was available on the platform (PAATV/UTE project), although those were not used in the estimation process and results presented in this paper. All the experiments and realistic simulations have validated the satisfactory performance of the approach.
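For illustration, a minimal planar dead-reckoning integrator of this kind (Euler integration at a fixed sampling period; names and the sampling scheme are assumptions, not details from the paper). The gyro drift the text mentions is exactly what accumulates through the heading integration:

```python
import numpy as np

def dead_reckon(speed, heading_rate, dt, pose0=(0.0, 0.0, 0.0)):
    """Integrate speed and heading-rate measurements into a 2D path."""
    x, y, theta = pose0
    xs, ys, thetas = [x], [y], [theta]
    for v, w in zip(speed, heading_rate):
        theta += w * dt                 # integrate the gyro heading rate
        x += v * dt * np.cos(theta)     # advance along the current heading
        y += v * dt * np.sin(theta)
        xs.append(x); ys.append(y); thetas.append(theta)
    return np.array(xs), np.array(ys), np.array(thetas)
```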

Figures 7, 8 and 9 present the position estimates resulting from the estimation process. Those are shown in red (Figure 7) or in yellow (Figures 8 and 9) and are superimposed on the road map. In some parts of the test the vehicle went temporarily outside the known map. Although there was no predefined map on those sections, it was possible to infer that the estimator performed adequately: from the satellite image and the superimposed estimated path, a human can see that the estimated path is actually on a road not defined in the a priori map (Fig. 9).



It is difficult to define a true path against which to compare the estimated solution. This is because the estimator is intended to provide permanent global localization with a quality usually similar to that of a GPS. Figures 10, 11 and 12 present the estimated positions and corresponding GPS estimates, although the latter were frequently affected by multipath and other problems.

5 Conclusions and Future Work

This paper presented a method to perform global localization in urban environments using segment-based maps together with particle filters. In the proposed approach the likelihood function is locally generated as a grid derived from segment-based maps. The scheme can efficiently assign weights to the particles in real time, with minimal memory requirements and without any additional pre-filtering procedures. Multi-hypothesis cases are handled transparently by the filter. A path-based observation model is developed as an extension to consistently deal with out-of-map navigation cases. This feature is highly desirable since the map can be incomplete, or the vehicle can actually be located outside the boundaries of the provided map.

The system behaves like a virtual GPS, providing accurate global localization without using an actual GPS.

Experimental results have shown that the proposed architecture works robustly in urban environments using segment-based road maps. These particular maps provide road network connectivity in the context of the DARPA Urban Challenge. However, the proposed architecture is general and can be used with any kind of segment-based or topological a priori map.

The filter is able to provide consistent localization, for extended periods of time and long traversed courses, using only rough dead-reckoning input (affected by considerable drift) and the RNDF map.

The system performs robustly in a variety of circumstances, including extreme situations such as tunnels, where GPS-based positioning would not render any solution at all.

The continuation of this work involves different lines of research and development. One of them is the implementation of this approach as a robust and reliable module ready to be used as a localization resource by other systems. However, this process should be flexible enough to allow integration with other sources of observations, such as biased compass measurements and even sporadic GPS measurements.

Other necessary and interesting lines are related to the initialization of the estimation process, particularly for cases where the robot starts at a completely unknown position. Defining a huge local area for the definition of the likelihood (and spreading a population of particles in it) is not feasible in real time. We are investigating efficient and practical solutions for that issue.

Another area of relevance is the application of longer paths in the evaluation of the path likelihood. In the current implementation we consider a deterministic path, i.e. we exploit the fact that for short paths the dead-reckoning presents low uncertainty, to the degree of allowing us to treat the recent path as a deterministic entity. In order to extend the path validity we need to model the path in a stochastic way, i.e. by a PDF. Although this concept is mathematically easy to define and understand, it implies considerable additional computational cost.

Finally, the observability of the estimation process can be increased by considering additional sources of observation, such as the detection of road intersections. These additional observations would improve the observability of the process, particularly when the vehicle does not perform turning maneuvers for long periods.

[Figure: "Likelihood in selected region"]

[Figure: "Estimated Path and Known Map"]


Fig. 8. Estimated path (in yellow) for one of the experiments. The known map (cyan) and a satellite image of the region are included in the picture.


Fig. 12. A close inspection shows interesting details. The estimates are provided at a higher frequency than the GPS (5 Hz). The GPS presents jumps, and the road segment appears as a continuous piecewise line (in blue); both sources of information are unreliable if used individually.

6 References

S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A Tutorial on Particle Filters for Online Nonlinear/Non-Gaussian Bayesian Tracking," IEEE Transactions on Signal Processing, vol. 50, no. 2, pp. 174–188, 2002.

F. Dellaert, D. Fox, W. Burgard, and S. Thrun, "Monte Carlo Localization for Mobile Robots," in International Conference on Robotics and Automation. Detroit: IEEE, 1999.

D. Fox, S. Thrun, W. Burgard, and F. Dellaert, "Particle Filters for Mobile Robot Localization," in Sequential Monte Carlo Methods in Practice, A. Doucet, N. de Freitas, and N. Gordon, Eds. New York: Springer, 2001.

J. E. Guivant and E. M. Nebot, "Optimization of the Simultaneous Localization and Map-Building Algorithm for Real-Time Implementation," IEEE Transactions on Robotics and Automation, vol. 17, no. 3, pp. 242–257, 2001.

J. Guivant and R. Katz, "Global Urban Localization Based on Road Maps," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). San Diego, CA: IEEE, Oct. 2007, pp. 1079–1084. ISBN: 978-1-4244-0912-9.

J. J. Leonard and H. F. Durrant-Whyte, "Simultaneous Map Building and Localization for an Autonomous Mobile Robot," in IEEE/RSJ International Workshop on Intelligent Robots and Systems. IEEE, 1991.

L. Liao, D. Fox, J. Hightower, H. Kautz, and D. Schultz, "Voronoi Tracking: Location Estimation Using Sparse and Noisy Sensor Data," in IEEE/RSJ International Workshop on Intelligent Robots and Systems. Las Vegas, USA: IEEE, 2003.

M. E. El Najjar and P. Bonnifait, "A Road-Matching Method for Precise Vehicle Localization Using Belief Theory and Kalman Filtering," Autonomous Robots, vol. 19, no. 2, pp. 173–191, 2005.

S. M. Oh, S. Tariq, B. N. Walker, and F. Dellaert, "Map-Based Priors for Localization," in IEEE/RSJ International Workshop on Intelligent Robots and Systems. Sendai, Japan: IEEE, 2004.

G. Taylor and G. Blewitt, "Road Reduction Filtering Using GPS," in 3rd AGILE Conference on Geographic Information Science, Helsinki, Finland, 2000.

ACFR, The University of Sydney and LCR, Universidad Nacional del Sur, "PAATV/UTE Projects," Sydney, Australia, 2006.

Defence Advanced Research Projects Agency (DARPA), "DARPA Urban Challenge," http://www.darpa.mil/grandchallenge/, 2006.

Google, "Google Earth," http://earth.google.com/, 2007.


Object Localization using Stereo Vision

Sai Krishna Vuppala
Institute of Automation, University of Bremen, Germany

1 Introduction

Computer vision is the science and technology of machines that see. The theoretical explanation of the optics of the eye and the knowledge that images form at the rear of the eyeball were provided by Kepler and Scheiner, respectively, in the 17th century (Lin, 2002). The field of computer vision started emerging in the second half of the 20th century; since then researchers have been trying to develop methods and systems that imitate the biological vision process and thereby increase machine intelligence for many possible applications in the real world. The theoretical knowledge behind the perception of the 3D real world using multiple vision systems has been published in various literature; (Hartley & Zisserman, 2000), (Klette et al., 1998) and (Steger, 2007) are a few well-known examples. As there are numerous applications of computer vision in service, industrial, surveillance, surgical, and other automation sectors, researchers have been publishing numerous methods and systems that address their specific goals. Most of these researchers intend their methods and systems to be general with respect to aspects such as functionality, robustness, time effectiveness, and safety. Though practical limitations hinder researchers and developers in achieving these goals, many are getting ahead, providing solutions to known and predicted problems.

The field of computer vision supports the field of robotics with many vision-based applications. In service robotics, action interpretation and object manipulation are a few examples in which computer vision supports humans. According to (Taylor & Kleeman, 2006), three things are almost certain about universal service robots of the future: many will have manipulators (and probably legs!), most will have cameras, and almost all will be called upon to grasp, lift, carry, stack or otherwise manipulate objects in our environment. Visual perception and coordination in support of robotic grasping is thus a vital area of research for the progress of universal service robots. Service robotic tasks are usually specified at a supervisory level with reference to general actions and classes of objects. An example of such a task would be: Please get 'the bottle' from 'the fridge'. Here, getting the bottle is the intended action, 'bottle' belongs to the class of objects that are manipulated, and 'fridge' belongs to the class of objects in/on which the objects to be manipulated lie. This manuscript addresses object manipulation tasks using stereo vision for service robotics applications. The motivation for this work stems from the needs of the service robotic systems FRIEND II and FRIEND III


(Functional Robot with dexterous arm and user-frIENdly interface for Disabled people), which are being developed at the IAT (Institute of Automation, University of Bremen, Germany). The systems FRIEND II and FRIEND III are shown in figure 1 a) and b). The objects of interest to be manipulated are shown in figure 2.

Fig. 1. Rehabilitation robotic systems: a) FRIEND II, b) FRIEND III.

Fig. 2. The objects of interest for the manipulation purpose.

In order for robot manipulators to perform their actions autonomously, 3D environment information is necessary. Various approaches to reconstructing 3D object information using cameras have been investigated over the past few decades. According to (Lange et al., 2006), 3D object information recovery techniques are broadly classified into passive and active sensing methods. Passive sensing methods require relatively low power compared to active methods and are similar to the biological vision process: the characteristic contrast-related features in images of the observed scene (i.e. cues such as shading and texture) are used to extract the 3D information. With active sensing methods, higher-accuracy measurements can be obtained than with passive methods. Active sensing relies on projecting energy into the environment and interpreting the modified sensory view. CMOS time-of-flight cameras can capture a complete 3D image in a single measurement cycle (Lange, 2000); however, such technology is limited in its timing resolution, and the possibility of perceiving 3D information depends on the object surface properties.

Considering passive sensing methods, though any number of cameras can be used in the 3D vision process, methods based on stereo vision play an optimal role considering functional, hardware, and computational issues. Visual servoing systems (Corke, 1996) (Hutchinson, 1996) come into this category. The two major classes of such systems are position-based and image-based visual servoing systems. Position-based visual servoing systems are further classified into closed-loop and open-loop systems. Depending on the requirements of the strategy, they can be realized using one or two cameras. Closed-loop systems have high accuracy since the overall system error is minimized using feedback information, whereas in an open-loop system the error depends on the performance of the system in a single step. In contrast to the closed-loop approaches, look-then-move is an open-loop approach for autonomous task execution: the vision system identifies the target location in 3D and the robot is driven to the location of interest. In the look-then-move approach, the accuracy of the overall robotic system depends on the camera calibration accuracy and the manipulator accuracy in reaching specified locations in the real world (Garric & Devy, 1995). In a system like FRIEND II the look-then-move approach is preferred, since object manipulation, collision avoidance, and path planning are planned using a standalone stereo camera unit.

Stereo triangulation is the core process in stereo-based 3D vision applications; stereo correspondence information is used to calculate each 3D point. Precisely identified stereo correspondences give precise 3D reconstruction results; any uncertainty in the stereo correspondences can potentially yield a virtually unbounded uncertainty in the 3D reconstruction (Lin, 2002). Precise 3D reconstruction additionally requires well-calibrated stereo cameras, as the calibrated parameters of the stereo view geometry are used in the stereo triangulation process (Hartley & Zisserman, 2000). The 3D reconstruction accuracy further depends on the length of the baseline (Gilbert et al., 2006): a shorter baseline facilitates the matching process, while a larger baseline yields higher 3D reconstruction accuracy. Approaches for finding stereo correspondences are broadly classified into two types (Klette et al., 1998): intensity-based and feature-based matching techniques. The state-of-the-art stereo correspondence search algorithms are mostly based on various scene and geometry constraints. These algorithms cannot be used for textureless objects, as the reconstructed object information is error-prone.
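To make the triangulation step concrete, here is a standard linear (DLT) two-view triangulation sketch, assuming calibrated 3×4 projection matrices for the stereo pair; this is one common formulation, not necessarily the exact variant used in the systems discussed here:

```python
import numpy as np

def triangulate(P1, P2, u1, u2):
    """Linear (DLT) triangulation of one stereo correspondence.

    P1, P2: 3x4 camera projection matrices of the calibrated stereo pair.
    u1, u2: matched pixel coordinates (x, y) in the left and right image.
    Returns the 3D point in the common world frame.
    """
    A = np.vstack([
        u1[0] * P1[2] - P1[0],
        u1[1] * P1[2] - P1[1],
        u2[0] * P2[2] - P2[0],
        u2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)      # least-squares solution of A X = 0
    X = Vt[-1]
    return X[:3] / X[3]              # dehomogenize
```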

The pose (position and orientation) of an object can be estimated using information from a single camera view or from multiple camera views. Assuming that the internal camera parameters are known, the problem of finding the object pose amounts to finding the orientation and position of the object with respect to the camera. The problem of finding the pose of an object using image and object point correspondences is described in the literature as the perspective n-point problem. The standard perspective n-point problem can be solved using systems of linear equations if correspondences between at least six image and scene points are known. Several researchers have provided solutions for this problem considering at least three object points.


In his publication on pose estimation using a single camera, Shakunaga (1991) described that an n-vector body with n ≥ 3 gives at most eight rotation candidates from object-image correspondences; if n < 3, recovery of the rotation of the n-vector body is not possible.
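For reference, the perspective n-point problem is solvable with off-the-shelf tools; the sketch below uses OpenCV's solvePnP with six synthetic correspondences (matching the "at least six points" remark above). All numeric values are hypothetical, chosen so the example is self-checking:

```python
import numpy as np
import cv2

# Six known object points (object frame, meters).
object_points = np.array([[0, 0, 0], [0.1, 0, 0], [0, 0.1, 0],
                          [0, 0, 0.1], [0.1, 0.1, 0], [0.1, 0, 0.1]], dtype=np.float64)

K = np.array([[880.0, 0, 320.0],   # assumed pinhole intrinsics (pixels)
              [0, 880.0, 240.0],
              [0, 0, 1.0]])
dist = np.zeros(5)                  # assume negligible lens distortion

# Synthesize image points from a known ground-truth pose, then recover it.
rvec_true = np.array([0.1, -0.2, 0.05])
tvec_true = np.array([0.05, -0.02, 1.0])   # object about 1 m in front of the camera
image_points, _ = cv2.projectPoints(object_points, rvec_true, tvec_true, K, dist)

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist)
print(ok, rvec.ravel(), tvec.ravel())      # recovers rvec_true, tvec_true
```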

The content of the manuscript discusses the selection of stereo feature correspondences and the determination of stereo correspondences for the selected features on textureless objects in sections 2 and 3, respectively; section 4 presents tracking of the object pose using two object points; section 5 presents 3D object reconstruction results; conclusions and references follow in sections 6 and 7, respectively.

2 Selection of Stereo Feature Correspondences

Scene-independent 3D object reconstruction requires information from at least two camera views. Considering stereo vision rather than multiple-camera systems for this purpose eases the computational complexity of the system. Stereo vision based 3D reconstruction methods require stereo correspondence information. Traditional problems in finding stereo correspondences are occlusion and regularity/repetitiveness. Traditionally these problems are solved using intensity- and feature-based stereo matching techniques. However, absolutely homogeneous surfaces without any sharp features (i.e. edges, corners, etc.) do not provide proper stereo correspondence information. The domestic objects to be manipulated in the FRIEND environment are examples of this kind. As the considered object (either each part or the whole body) has uniform color information, intensity-based correspondence search methods often provide improper stereo correspondence information. Therefore, the edges of the green bottle are instead examined for the stereo correspondence analysis. Out of all possible edges of the green bottle shown in figure 3, only the orientation edges of the bottle can be considered for the 3D bottle reconstruction; this is because the stereo correspondence information between the other kinds of edges is typically lost.

2.1 Resolutions between 3D and 2D

The 3D reconstruction accuracy depends on the accuracy of the calibration parameters, the subpixel accuracy of the stereo correspondences, and the length of the baseline. In addition to these factors, the 3D reconstruction accuracy further depends on the pixel size, the focal length of the camera, and the distance between the camera and the object. In the following, the feasible spatial resolution of the object surface in the projected scene is analyzed. Figure 4 illustrates the projection of the object surface onto a single image row. The feasible geometrical resolution of the object surface projected onto one row of the image plane is calculated using formula (1); the feasible resolution of the object projection onto a column of the image plane can be calculated similarly. The width and height of the pictured scene can also be approximated by multiplying the number of pixels along the rows and columns by the respective feasible resolutions.

Fig. 3. Various possible edges of the green bottle from the FRIEND environment.

The feasible resolution corresponding to one pixel is

$$ \text{resolution} = \frac{p_d \, d}{f} \qquad (1) $$

where $p_d$ is the pixel size, $f$ is the focal length, and $d$ is the distance between the camera and the object surface.

Fig. 4. Feasible spatial resolution on the object surface in the projected scene (Klette et al., 1998).

E.g.: the SONY CCD sensor ICX204AK used in the Bumblebee stereo vision system has a pixel size of 4.54 µm × 4.54 µm, the focal length of the camera is 4 mm, and the object is assumed to be about 1 m away from the camera. The feasible resolution of the camera along the row is then 1.125 mm/column, and the feasible resolution along the column is 1.125 mm/row. Similarly, with a higher focal length such as 6 mm, the feasible resolution becomes 0.75 mm/column. Implicitly, a decrease in the feasible resolution (a finer resolution) increases the 3D reconstruction accuracy.
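The worked example can be checked directly from equation (1); note that the quoted 1.125 mm/column figure corresponds to a pixel pitch of 4.5 µm, slightly smaller than the 4.54 µm stated above:

```python
def feasible_resolution(pixel_size_m, focal_length_m, distance_m):
    """Feasible spatial resolution on the object surface, equation (1)."""
    return pixel_size_m * distance_m / focal_length_m

# Values from the text (pixel pitch taken as 4.5e-6 m, object at 1 m):
print(feasible_resolution(4.5e-6, 4e-3, 1.0))   # -> 0.001125 m = 1.125 mm/column
print(feasible_resolution(4.5e-6, 6e-3, 1.0))   # -> 0.00075 m  = 0.75 mm/column
```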

From equation (1), the feasible resolution is proportional to the distance between the camera and the object; as the distance …
