Fig. 5.16. Processing time (averaged over 7-frame windows) vs. frames: for the original sequence (left), and for the sequence subsampled by 6 in time (right).
The polygonal tracker, with its ability to utilize various region-based descriptors, can be used for tracking textured objects on textured backgrounds.
Fig. 5.17. A flatworm in a textured sea terrain (15 frames are shown, left-right, top-bottom). The polygonal tracker successfully tracks the flatworm.
A specific choice based on an information-theoretic measure [41], whose approximation uses higher-order moments of the data distributions, leads the image-based integrand f in Eq. (17) to take the form

f = \sum_{i=1}^{2} (\bar{G}_i^u - \bar{G}_i^v) \left( G_i(I) - (\bar{G}_i^u + \bar{G}_i^v)/2 \right),

where \bar{G}_i^u and \bar{G}_i^v denote the means of G_i(I) over the object and background regions, and where, for instance, G_1(\xi) = \xi e^{-\xi^2/2} and G_2(\xi) = e^{-\xi^2/2}.
When the correction step of our method involves the descriptor f just given, with an adaptive number of vertices, a flatworm swimming at the bottom of the sea can be captured through the highly textured sequence by the polygonal tracker in Fig. 5.17. The speed plots in Fig. 5.18 depict the speeds of the tracker with and without prediction: the plot on the left is for the original sequence, and the plot on the right is for the same sequence temporally subsampled by two. Varying the number of vertices to account for shape variations of the worm slows down the tracking in general. However, the tracker with prediction still performs faster than the tracker without prediction, as expected, and the difference in speeds becomes more pronounced in the subsampled sequence on the right. Similarly, a clownfish on a host anemone, shown in Fig. 5.19, can be tracked in a highly textured scene. The continuous trackers we introduced in this study do not provide continuous tracking in either of these examples: they split, leak into background regions, and lose track of the target completely.
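To make the correction step concrete, the following sketch evaluates such a region-based descriptor over an image, assuming the mean-separation form given above; the boolean polygon mask is taken as given, and all names and array conventions are illustrative rather than the authors' implementation.

import numpy as np

def g1(xi):
    # G1(xi) = xi * exp(-xi^2 / 2)
    return xi * np.exp(-xi**2 / 2)

def g2(xi):
    # G2(xi) = exp(-xi^2 / 2)
    return np.exp(-xi**2 / 2)

def descriptor_f(image, inside):
    # 'inside' is a boolean mask marking pixels inside the polygon.
    # For each feature map G_i(I), every pixel's response is compared
    # against the object and background means; the sign of f indicates
    # which region's statistics better explain the pixel, driving the
    # vertices toward the boundary that separates the two.
    f = np.zeros_like(image, dtype=float)
    for g in (g1, g2):
        gi = g(image.astype(float))
        u = gi[inside].mean()    # mean response over the object region
        v = gi[~inside].mean()   # mean response over the background region
        f += (u - v) * (gi - 0.5 * (u + v))
    return f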
Fig. 5.18. Processing time (averaged over 7-frame windows) vs. frames: for the original sequence (left), and for the sequence subsampled by 2 in time (right).
Fig. 5.19. A clownfish with its textured body swims in its host anemone (frames 1, 13, 39, 59, 64, 67, 71, 74, 78, 81, 85, 95, 105, 120, 150, and 155 are shown, left-right, top-bottom). The polygonal tracker successfully tracks the fish.
5.5 Conclusions
In this chapter, we have presented a simple but efficient approach to object tracking that combines the active contour framework with optical-flow-based motion estimation. Both curve evolution and polygon evolution models are utilized to carry out the tracking. The ODE model obtained for the polygonal tracker acts on the vertices of a polygon for their intra-frame as well as inter-frame motion estimation, according to region-based characteristics and to the known properties of the optical-flow field. The latter is easily estimated from the well-known image brightness constraint.
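For reference, the image brightness constraint invoked here is the classical brightness-constancy assumption of Horn and Schunck [26]: the intensity of a point is conserved along its motion (u, v), so that

\frac{d}{dt} I(x(t), y(t), t) = I_x u + I_y v + I_t = 0,

and only the component of the flow along the image gradient, the normal flow -I_t \nabla I / |\nabla I|^2, is recoverable pointwise from this constraint (the aperture problem).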
We have demonstrated, by way of example and discussion, that our proposed tracking approach effectively and efficiently moves vertices using integrated local information, with superior resulting performance.
We note, moreover, that no prior shape-model assumptions are made about the targets, since any shape may be approximated by a polygon. While the topology-change property provided by continuous contours in the level-set framework is not attained, this limitation may be an advantage if the target region stays simply connected. We also make no assumption of a static camera, an assumption widely employed in the literature by other object tracking methods that also utilize a motion detection step. A motion detection step could nevertheless be added to this framework to make the algorithm more unsupervised in detecting motion, or the presence of multiple moving targets, in the scene.
References
1. C. Kim and J. N. Hwang, "Fast and automatic video object segmentation and tracking for content-based applications," IEEE Trans. Circuits and Systems for Video Technology, vol. 12, no. 2, pp. 122–129, 2002.
2. N. Paragios and R. Deriche, "Geodesic active contours and level sets for the detection and tracking of moving objects," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 3, pp. 266–280, 2000.
3. F. G. Meyer and P. Bouthemy, "Region-based tracking using affine motion models in long image sequences," Computer Vision, Graphics, and Image Processing, vol. 60, no. 2, pp. 119–140, 1994.
4. B. Bascle and R. Deriche, "Region tracking through image sequences," in Proc. Int. Conf. on Computer Vision, 1995, pp. 302–307.
5. J. Wang and E. Adelson, "Representing moving images with layers," IEEE Trans. Image Processing, vol. 3, no. 5, pp. 625–638, 1994.
6. T. J. Broida and R. Chellappa, "Estimation of object motion parameters from noisy images," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 8, no. 1, pp. 90–99, 1986.
7. D. Koller, K. Daniilidis, and H. H. Nagel, "Model-based object tracking in monocular image sequences of road traffic scenes," Int. J. Computer Vision, vol. 10, no. 3, pp. 257–281, 1993.
8. J. Rehg and T. Kanade, "Model-based tracking of self-occluding articulated objects," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1995, pp. 612–617.
9. D. Gavrila and L. Davis, "3-D model-based tracking of humans in action: A multi-view approach," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1996, pp. 73–80.
12. M. O. Berger, "How to track efficiently piecewise curved contours with a view to reconstructing 3D objects," in Proc. Int. Conf. on Pattern Recognition, 1994, pp. 32–36.
13. M. Isard and A. Blake, "Contour tracking by stochastic propagation of conditional density," in Proc. European Conf. on Computer Vision, 1996, pp. 343–356.
14. Y. Fu, A. T. Erdem, and A. M. Tekalp, "Tracking visible boundary of objects using occlusion adaptive motion snake," IEEE Trans. Image Processing, vol. 9, no. 12, pp. 2051–2060, 2000.
15. F. Leymarie and M. Levine, "Tracking deformable objects in the plane using an active contour model," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 6, pp. 617–634, 1993.
16. V. Caselles and B. Coll, "Snakes in movement," SIAM Journal on Numerical Analysis, vol. 33, no. 12, pp. 2445–2456, 1996.
17. J. Badenas, J. M. Sanchiz, and F. Pla, "Motion-based segmentation and region tracking in image sequences," Pattern Recognition, vol. 34, pp. 661–670, 2001.
18. F. Marques and V. Vilaplana, "Face segmentation and tracking based on connected operators and partition projection," Pattern Recognition, vol. 35, pp. 601–614, 2002.
19. J. Badenas, J. M. Sanchiz, and F. Pla, "Using temporal integration for tracking regions in traffic monitoring sequences," in Proc. Int. Conf.
25. S. Ullman, "Analysis of visual motion by biological and computer systems," IEEE Computer, vol. 14, no. 8, pp. 57–69, 1981.
26. B. K. P. Horn and B. G. Schunck, "Determining optical flow," Artificial Intelligence, vol. 17, pp. 185–203, 1981.
27. A. Kumar, A. R. Tannenbaum, and G. J. Balas, "Optical flow: A curve evolution approach," IEEE Trans. Image Processing, vol. 5, no. 4, pp. 598–610, 1996.
28. B. D. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," in Proc. Imaging Understanding Workshop, 1981, pp. 121–130.
29. H. H. Nagel and W. Enkelmann, "An investigation of smoothness constraints for the estimation of displacement vector fields from image sequences," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 8, no. 5, pp. 565–593, 1986.
30. S. V. Fogel, "The estimation of velocity vector fields from time-varying image sequences," CVGIP: Image Understanding, vol. 53, no.
34. A. M. Tekalp, Digital Video Processing, Prentice Hall, 1995.
35. M. I. Sezan and R. L. Lagendijk (Eds.), Motion Analysis and Image Sequence Processing, Norwell, MA: Kluwer, 1993.
36. W. E. Snyder (Ed.), "Computer analysis of time varying images, special issue," IEEE Computer, vol. 14, no. 8, pp. 7–69, 1981.
37. D. Terzopoulos and R. Szeliski, Active Vision, chapter "Tracking with Kalman Snakes," pp. 3–20, MIT Press, 1992.
38. N. Peterfreund, "Robust tracking of position and velocity with Kalman snakes," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 6, pp. 564–569, 1999.
39. D. G. Luenberger, "An introduction to observers," IEEE Transactions on Automatic Control, vol. 16, no. 6, pp. 596–602, 1971.
40. A. Gelb (Ed.), Applied Optimal Estimation, MIT Press, 1974.
41. G. Unal, A. Yezzi, and H. Krim, "Information-theoretic active polygons for unsupervised texture segmentation," Int. J. Computer Vision, May–June 2005.
42. S. Zhu and A. Yuille, "Region competition: Unifying snakes, region growing, and Bayes/MDL for multiband image segmentation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, no. 9, pp. 884–900, 1996.
43. B. B. Kimia, A. Tannenbaum, and S. Zucker, "Shapes, shocks, and deformations I," Int. J. Computer Vision, vol. 15, pp. 189–224, 1995.
44. S. Osher and J. A. Sethian, "Fronts propagating with curvature dependent speed: Algorithms based on Hamilton–Jacobi formulations," J. Computational Physics, vol. 79, pp. 12–49, 1988.
45. D. Peng, B. Merriman, S. Osher, H.-K. Zhao, and M. Kang, "A PDE-based fast local level set method," J. Computational Physics, vol. 155, pp. 410–438, 1999.
46. T. F. Chan and L. A. Vese, "An active contour model without edges," in Int. Conf. on Scale-Space Theories in Computer Vision, 1999, pp. 141–151.
47. A. Yezzi, A. Tsai, and A. Willsky, "A fully global approach to image segmentation via coupled curve evolution equations," J. Visual Communication and Image Representation, vol. 13, pp. 195–216, 2002.
48. M. Bertalmio, L. T. Cheng, S. Osher, and G. Sapiro, "Variational problems and partial differential equations on implicit surfaces," J. Computational Physics, vol. 174, no. 2, pp. 759–780, 2001.
6 3-D Modeling of Real-World Objects Using Range and Intensity Images

J. Park and G. N. DeSouza

School of Electrical and Computer Engineering, Purdue University, West Lafayette, Indiana, U.S.A.
an object surface by millions of polygons, which allows such representations to be visualized interactively in real time. Obviously, to take advantage of these technological advances, the 3D models constructed must capture to the maximum extent possible the shape and surface-texture information of real-world objects. By real-world objects, we mean objects that may present self-occlusion with respect to the sensory devices; objects with shiny surfaces that may create mirror-like (specular) effects; objects that may absorb light and therefore not be completely perceived by the vision system; and other types of optically uncooperative objects. Construction of such photo-realistic 3D models of real-world objects is the main focus of this chapter. In general, the construction of such 3D models entails four main steps:
1. Acquisition of geometric data:
First, a range sensor must be used to acquire the geometric shape of the exterior of the object. Objects of complex shape may require a large number of range images viewed from different directions so that all of
the surface detail is captured, although it is very difficult to capture the entire surface if the object contains significant protrusions.
4. Acquisition of reflection data:
In order to provide a photo-realistic visualization, the final step acquires the reflectance properties of the object surface, and this information is added to the geometric model.

Each of these steps will be described in separate sections of this chapter.
6.2 Acquisition of Geometric Data
The first step in 3D object modeling is to acquire the geometric shape of the exterior of the object. Since acquiring geometric data of an object is a very common problem in computer vision, various techniques have been developed over the years for different applications.
6.2.1 Techniques of Acquiring 3D Data
The techniques described in this section are not intended to be exhaustive; we will mention briefly only the prominent approaches. In general, methods of acquiring 3D data can be divided into passive sensing methods and active sensing methods.
Passive Sensing Methods
The passive sensing methods extract 3D positions of object points by using images taken under an ambient light source. Two of the well-known passive sensing methods are Shape-From-Shading (SFS) and stereo vision. The Shape-From-Shading method uses a single image of an object. The main idea of this method derives from the fact that one of the cues the human visual system uses to infer the shape of a 3D object is its shading information. Using the variation in brightness of an object, the SFS method recovers the 3D shape of the object. There are three major drawbacks to this method: First, the shadow areas of an object cannot be recovered reliably since they do not provide enough intensity information. Second, the method assumes that the entire surface of an object has a uniform reflectance property, so the method cannot be applied to general objects. Third, the method is very sensitive to noise, since it involves the computation of surface gradients.
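The shading cue underlying SFS can be stated compactly. Under the common Lambertian assumption, with albedo \rho and a single distant light source in direction \mathbf{l}, the image irradiance equation reads

I(x, y) = \rho \, \mathbf{n}(x, y) \cdot \mathbf{l},

and the method must invert this relation to recover the surface normal \mathbf{n}(x, y), and from it the shape. The drawbacks listed above follow directly: shadows leave the equation uninformative, and a spatially varying albedo violates the uniform-reflectance assumption.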
The stereo vision method uses two or more images of an object from different viewpoints. Given the image coordinates of the same object point in two or more images, the stereo vision method extracts the 3D coordinates of that object point. A fundamental limitation of this method is that finding the correspondence between images is extremely difficult.
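Once correspondence is solved, however, the geometry itself is simple. A minimal sketch of the standard depth-from-disparity relation for a rectified stereo pair follows; the focal length, baseline, and disparity values are illustrative, not taken from any particular system.

def depth_from_disparity(f_pixels: float, baseline_m: float, disparity_px: float) -> float:
    # Rectified stereo: z = f * B / d, where f is the focal length in
    # pixels, B the baseline in meters, and d the disparity in pixels
    # of the same object point between the two images.
    return f_pixels * baseline_m / disparity_px

# Example: f = 700 px, B = 0.10 m, d = 35 px  =>  z = 2.0 m
print(depth_from_disparity(700.0, 0.10, 35.0))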
The passive sensing methods require very simple hardware, but usually these methods do not generate dense and accurate 3D data compared to the active sensing methods.
Active Sensing Methods
The active sensing methods can be divided into two categories: contact and non-contact methods. The Coordinate Measuring Machine (CMM) is a prime example of the contact methods. CMMs consist of probe sensors which provide 3D measurements by touching the surface of an object. Although CMMs generate very accurate and fine measurements, they are very expensive and slow. Also, the types of objects that can be handled by CMMs are limited, since physical contact is required.
The non-contact methods project their own energy source onto an object, then observe either the transmitted or the reflected energy. Computed tomography (CT), also known as computed axial tomography (CAT), is one of the techniques that records the transmitted energy. It uses X-ray beams at various angles to create cross-sectional images of an object. Since computed tomography provides the internal structure of an object, the method is widely used in medical applications.
Active stereo uses the same idea as the passive stereo method, but a light pattern is projected onto the object to overcome the difficulty of finding corresponding points between two (or more) camera images.
The laser radar system, also known as LADAR, LIDAR, or optical radar, uses the relation between the emitted and received laser beams to compute depth. Two methods are widely used: (1) using an amplitude-modulated continuous-wave (AM-CW) laser, and (2) using laser pulses.
The first method emits an AM-CW laser onto a scene and receives the laser light reflected by a point in the scene. The system computes the phase difference between the emitted and the received laser beams; the depth of the point can then be computed, since the phase difference is directly proportional to depth. The second method emits a laser pulse and measures the interval between the emission time and the reception time of the pulse. This time interval, well known as time-of-flight, is then used to compute the depth, given by t = 2z/c, where t is the time-of-flight, z is the depth, and c is the speed of light. Laser radar systems are well suited for applications requiring medium-range sensing, from 10 to 200 meters.
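Both depth computations follow directly from the quantities just described. A minimal sketch under these standard relations; the timing and modulation-frequency values are illustrative, and the AM-CW formula assumes the usual round-trip phase model rather than any specific instrument.

import math

C = 299_792_458.0  # speed of light, m/s

def depth_from_time_of_flight(t_s: float) -> float:
    # Pulsed laser radar: the pulse travels to the target and back,
    # so t = 2z/c and therefore z = c * t / 2.
    return C * t_s / 2.0

def depth_from_phase(delta_phi_rad: float, f_mod_hz: float) -> float:
    # AM-CW laser radar: the measured phase difference is directly
    # proportional to depth, z = c * delta_phi / (4 * pi * f_mod),
    # unambiguous only up to the range c / (2 * f_mod).
    return C * delta_phi_rad / (4.0 * math.pi * f_mod_hz)

# A 1-microsecond round trip corresponds to roughly 150 m, within the
# 10-200 m medium-range regime mentioned above.
print(depth_from_time_of_flight(1e-6))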
The structured-light methods project a light pattern onto a scene, then use a camera to observe how the pattern illuminates the object surface. Broadly speaking, the structured-light methods can be divided into scanning and non-scanning methods. The scanning methods consist of a moving stage and a laser plane, so either the laser plane scans the object or the object moves through the laser plane. A sequence of images is taken while scanning. Then, by detecting the illuminated points in the images, the 3D positions of the corresponding object points are computed using the equations of camera calibration, as sketched below. The non-scanning methods project a spatially or temporally varying light pattern onto an object; an appropriate decoding of the reflected pattern is then used to compute the 3D coordinates of the object.
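For the scanning case, the computation reduces to intersecting the camera ray through each illuminated pixel with the calibrated laser plane. A minimal sketch follows; the plane parameters and the back-projected pixel ray are illustrative placeholders for actual calibration data.

import numpy as np

def triangulate(ray_dir, plane_normal, plane_d):
    # Camera at the origin: the illuminated point lies on the ray
    # X = s * ray_dir and on the laser plane n . X + d = 0, hence
    # s = -d / (n . ray_dir).
    s = -plane_d / np.dot(plane_normal, ray_dir)
    return s * np.asarray(ray_dir)

# Example: a laser plane x = 0.2 m and one back-projected pixel ray.
point = triangulate(
    ray_dir=np.array([0.1, 0.0, 1.0]),
    plane_normal=np.array([1.0, 0.0, 0.0]),
    plane_d=-0.2,
)
print(point)  # [0.2, 0.0, 2.0]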
The system that acquired all the 3D data presented in this chapter falls into the category of scanning structured-light methods using a single laser plane. From now on, such a system will be referred to as a structured-light scanner.
6.2.2 Structured-Light Scanner
Structured-light scanners have been used in manifold applications since the technique was introduced about two decades ago. They are especially suitable for applications in 3D object modeling for two main reasons: First, they acquire dense and accurate 3D data compared to passive sensing methods. Second, they require relatively simple hardware compared to laser radar systems.
In what follows, we will describe the basic concept of the structured-light scanner and all the data that can typically be acquired and derived from this kind of sensor.