Fig. 5.16. Processing time (averaged over 7-frame windows) vs. frames: for the original sequence (left), and for the sequence subsampled by 6 in time (right).
The polygonal tracker, with its ability to utilize various region-based descriptors, can be used for tracking textured objects on textured backgrounds.
Fig. 5.17. A flatworm in a textured sea terrain (15 frames are shown, left-right, top-bottom). The polygonal tracker successfully tracks the flatworm.
A specific choice based on an information-theoretic measure [41], whose approximation uses higher-order moments of the data distributions, leads the image-based integrand f in Eq. (17) to take the form

f = \sum_{i=1}^{2} (\bar{G}_i^u - \bar{G}_i^v) \left( G_i(I) - (\bar{G}_i^u + \bar{G}_i^v)/2 \right),

where \bar{G}_i^u and \bar{G}_i^v denote the means of G_i(I) over the object and background regions, and where, for instance, G_1(\xi) = \xi e^{-\xi^2/2} and G_2(\xi) = e^{-\xi^2/2}.
When the correction step of our method involves the descriptor f just given, with an adaptive number of vertices, a flatworm swimming at the bottom of the sea can be captured through the highly textured sequence by the polygonal tracker in Fig. 5.17. The speed plots in Fig. 5.18 depict the speeds of the tracker with and without prediction: the plot on the left is for the original sequence, and the plot on the right is for the same sequence temporally subsampled by two. Varying the number of vertices to account for shape variations of the worm slows down the tracking in general. However, the tracker with prediction still performs faster than the tracker without prediction, as expected, and the difference in speeds becomes more pronounced in the subsampled sequence on the right. Similarly, a clownfish on a host anemone, shown in Fig. 5.19, can be tracked in a highly textured scene. The continuous trackers we introduced in this study do not provide continuous tracking in either of these examples: they split, leak into background regions, and lose track of the target completely.
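To make the correction step concrete, the following sketch evaluates such a region-based descriptor over an image, assuming the mean-separation form given above; the boolean polygon mask is taken as given, and all names and array conventions are illustrative rather than the authors' implementation.

import numpy as np

def g1(xi):
    # G1(xi) = xi * exp(-xi^2 / 2)
    return xi * np.exp(-xi**2 / 2)

def g2(xi):
    # G2(xi) = exp(-xi^2 / 2)
    return np.exp(-xi**2 / 2)

def descriptor_f(image, inside):
    # 'inside' is a boolean mask marking pixels inside the polygon.
    # For each feature map G_i(I), every pixel's response is compared
    # against the object and background means; the sign of f indicates
    # which region's statistics better explain the pixel, driving the
    # vertices toward the boundary that separates the two.
    f = np.zeros_like(image, dtype=float)
    for g in (g1, g2):
        gi = g(image.astype(float))
        u = gi[inside].mean()    # mean response over the object region
        v = gi[~inside].mean()   # mean response over the background region
        f += (u - v) * (gi - 0.5 * (u + v))
    return f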
Fig. 5.18. Processing time (averaged over 7-frame windows) vs. frames: for the original sequence (left), and for the sequence subsampled by 2 in time (right).
Fig. 5.19. A clownfish with its textured body swims in its host anemone (frames 1, 13, 39, 59, 64, 67, 71, 74, 78, 81, 85, 95, 105, 120, 150, and 155 are shown, left-right, top-bottom). The polygonal tracker successfully tracks the fish.
5.5 Conclusions
In this chapter, we have presented a simple but efficient approach to object tracking that combines the active contour framework with optical-flow-based motion estimation. Both curve evolution and polygon evolution models are utilized to carry out the tracking. The ODE model obtained for the polygonal tracker acts on the vertices of a polygon for their intra-frame as well as inter-frame motion estimation, according to region-based characteristics and to the known properties of the optical-flow field. The latter is easily estimated from the well-known image brightness constraint.
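For reference, the image brightness constraint invoked here is the classical brightness-constancy assumption of Horn and Schunck [26]: the intensity of a point is conserved along its motion (u, v), so that

\frac{d}{dt} I(x(t), y(t), t) = I_x u + I_y v + I_t = 0,

and only the component of the flow along the image gradient, the normal flow -I_t \nabla I / |\nabla I|^2, is recoverable pointwise from this constraint (the aperture problem).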
We have demonstrated, by way of example and discussion, that our proposed tracking approach effectively and efficiently moves vertices using integrated local information, with superior resulting performance.
We note, moreover, that no prior shape-model assumptions are made about the targets, since any shape may be approximated by a polygon. While the topology-change property provided by continuous contours in the level-set framework is not attained, this limitation may be an advantage if the target region stays simply connected. We also make no assumption of a static camera, an assumption widely employed in the literature by other object tracking methods that also utilize a motion detection step. A motion detection step could nevertheless be added to this framework to make the algorithm more unsupervised in detecting motion, or the presence of multiple moving targets, in the scene.
References
1. C. Kim and J. N. Hwang, "Fast and automatic video object segmentation and tracking for content-based applications," IEEE Trans. Circuits and Systems for Video Technology, vol. 12, no. 2, pp. 122–129, 2002.
2. N. Paragios and R. Deriche, "Geodesic active contours and level sets for the detection and tracking of moving objects," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 3, pp. 266–280, 2000.
3. F. G. Meyer and P. Bouthemy, "Region-based tracking using affine motion models in long image sequences," Computer Vision, Graphics, and Image Processing, vol. 60, no. 2, pp. 119–140, 1994.
4. B. Bascle and R. Deriche, "Region tracking through image sequences," in Proc. Int. Conf. on Computer Vision, 1995, pp. 302–307.
5. J. Wang and E. Adelson, "Representing moving images with layers," IEEE Trans. Image Processing, vol. 3, no. 5, pp. 625–638, 1994.
6. T. J. Broida and R. Chellappa, "Estimation of object motion parameters from noisy images," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 8, no. 1, pp. 90–99, 1986.
7. D. Koller, K. Daniilidis, and H. H. Nagel, "Model-based object tracking in monocular image sequences of road traffic scenes," Int. J. Computer Vision, vol. 10, no. 3, pp. 257–281, 1993.
8. J. Rehg and T. Kanade, "Model-based tracking of self-occluding articulated objects," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1995, pp. 612–617.
9. D. Gavrila and L. Davis, "3-D model-based tracking of humans in action: A multi-view approach," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1996, pp. 73–80.
12. M. O. Berger, "How to track efficiently piecewise curved contours with a view to reconstructing 3D objects," in Proc. Int. Conf. on Pattern Recognition, 1994, pp. 32–36.
13. M. Isard and A. Blake, "Contour tracking by stochastic propagation of conditional density," in Proc. European Conf. on Computer Vision, 1996, pp. 343–356.
14. Y. Fu, A. T. Erdem, and A. M. Tekalp, "Tracking visible boundary of objects using occlusion adaptive motion snake," IEEE Trans. Image Processing, vol. 9, no. 12, pp. 2051–2060, 2000.
15. F. Leymarie and M. Levine, "Tracking deformable objects in the plane using an active contour model," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 6, pp. 617–634, 1993.
16. V. Caselles and B. Coll, "Snakes in movement," SIAM Journal on Numerical Analysis, vol. 33, no. 12, pp. 2445–2456, 1996.
17. J. Badenas, J. M. Sanchiz, and F. Pla, "Motion-based segmentation and region tracking in image sequences," Pattern Recognition, vol. 34, pp. 661–670, 2001.
18. F. Marques and V. Vilaplana, "Face segmentation and tracking based on connected operators and partition projection," Pattern Recognition, vol. 35, pp. 601–614, 2002.
19. J. Badenas, J. M. Sanchiz, and F. Pla, "Using temporal integration for tracking regions in traffic monitoring sequences," in Proc. Int. Conf.
25. S. Ullman, "Analysis of visual motion by biological and computer systems," IEEE Computer, vol. 14, no. 8, pp. 57–69, 1981.
26. B. K. P. Horn and B. G. Schunck, "Determining optical flow," Artificial Intelligence, vol. 17, pp. 185–203, 1981.
27. A. Kumar, A. R. Tannenbaum, and G. J. Balas, "Optical flow: A curve evolution approach," IEEE Trans. Image Processing, vol. 5, no. 4, pp. 598–610, 1996.
28. B. D. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," in Proc. Imaging Understanding Workshop, 1981, pp. 121–130.
29. H. H. Nagel and W. Enkelmann, "An investigation of smoothness constraints for the estimation of displacement vector fields from image sequences," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 8, no. 5, pp. 565–593, 1986.
30. S. V. Fogel, "The estimation of velocity vector fields from time-varying image sequences," CVGIP: Image Understanding, vol. 53, no.
34. A. M. Tekalp, Digital Video Processing, Prentice Hall, 1995.
35. M. I. Sezan and R. L. Lagendijk (Eds.), Motion Analysis and Image Sequence Processing, Norwell, MA: Kluwer, 1993.
36. W. E. Snyder (Ed.), "Computer analysis of time varying images, special issue," IEEE Computer, vol. 14, no. 8, pp. 7–69, 1981.
37. D. Terzopoulos and R. Szeliski, Active Vision, chapter "Tracking with Kalman Snakes," pp. 3–20, MIT Press, 1992.
38. N. Peterfreund, "Robust tracking of position and velocity with Kalman snakes," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 6, pp. 564–569, 1999.
39. D. G. Luenberger, "An introduction to observers," IEEE Transactions on Automatic Control, vol. 16, no. 6, pp. 596–602, 1971.
40. A. Gelb (Ed.), Applied Optimal Estimation, MIT Press, 1974.
41. G. Unal, A. Yezzi, and H. Krim, "Information-theoretic active polygons for unsupervised texture segmentation," Int. J. Computer Vision, May–June 2005.
42. S. Zhu and A. Yuille, "Region competition: Unifying snakes, region growing, and Bayes/MDL for multiband image segmentation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, no. 9, pp. 884–900, 1996.
43. B. B. Kimia, A. Tannenbaum, and S. Zucker, "Shapes, shocks, and deformations I," Int. J. Computer Vision, vol. 15, pp. 189–224, 1995.
44. S. Osher and J. A. Sethian, "Fronts propagating with curvature dependent speed: Algorithms based on Hamilton–Jacobi formulations," J. Computational Physics, vol. 79, pp. 12–49, 1988.
45. D. Peng, B. Merriman, S. Osher, H.-K. Zhao, and M. Kang, "A PDE-based fast local level set method," J. Computational Physics, vol. 155, pp. 410–438, 1999.
46. T. F. Chan and L. A. Vese, "An active contour model without edges," in Int. Conf. on Scale-Space Theories in Computer Vision, 1999, pp. 141–151.
47. A. Yezzi, A. Tsai, and A. Willsky, "A fully global approach to image segmentation via coupled curve evolution equations," J. Visual Communication and Image Representation, vol. 13, pp. 195–216, 2002.
48. M. Bertalmio, L. T. Cheng, S. Osher, and G. Sapiro, "Variational problems and partial differential equations on implicit surfaces," J. Computational Physics, vol. 174, no. 2, pp. 759–780, 2001.
6 3-D Modeling of Real-World Objects Using Range and Intensity Images

J. Park and G. N. DeSouza

School of Electrical and Computer Engineering, Purdue University, West Lafayette, Indiana, U.S.A.
an object surface by millions of polygons, which allows such representations to be visualized interactively in real time. Obviously, to take advantage of these technological advances, the 3D models constructed must capture to the maximum extent possible the shape and surface-texture information of real-world objects. By real-world objects, we mean objects that may present self-occlusion with respect to the sensory devices; objects with shiny surfaces that may create mirror-like (specular) effects; objects that may absorb light and therefore not be completely perceived by the vision system; and other types of optically uncooperative objects. Construction of such photo-realistic 3D models of real-world objects is the main focus of this chapter. In general, the construction of such 3D models entails four main steps:
1. Acquisition of geometric data:
First, a range sensor must be used to acquire the geometric shape of the exterior of the object. Objects of complex shape may require a large number of range images viewed from different directions so that all of
the surface detail is captured, although it is very difficult to capture the entire surface if the object contains significant protrusions.
4. Acquisition of reflection data:
In order to provide a photo-realistic visualization, the final step acquires the reflectance properties of the object surface, and this information is added to the geometric model.

Each of these steps will be described in separate sections of this chapter.
6.2 Acquisition of Geometric Data
The first step in 3D object modeling is to acquire the geometric shape of the exterior of the object. Since acquiring geometric data of an object is a very common problem in computer vision, various techniques have been developed over the years for different applications.
6.2.1 Techniques of Acquiring 3D Data
The techniques described in this section are not intended to be exhaustive; we will mention briefly only the prominent approaches. In general, methods of acquiring 3D data can be divided into passive sensing methods and active sensing methods.
Passive Sensing Methods
The passive sensing methods extract 3D positions of object points by using images taken under an ambient light source. Two of the well-known passive sensing methods are Shape-From-Shading (SFS) and stereo vision. The Shape-From-Shading method uses a single image of an object. The main idea of this method derives from the fact that one of the cues the human visual system uses to infer the shape of a 3D object is its shading information. Using the variation in brightness of an object, the SFS method recovers the 3D shape of the object. There are three major drawbacks to this method: First, the shadow areas of an object cannot be recovered reliably since they do not provide enough intensity information. Second, the method assumes that the entire surface of an object has a uniform reflectance property, so the method cannot be applied to general objects. Third, the method is very sensitive to noise, since it involves the computation of surface gradients.
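The shading cue underlying SFS can be stated compactly. Under the common Lambertian assumption, with albedo \rho and a single distant light source in direction \mathbf{l}, the image irradiance equation reads

I(x, y) = \rho \, \mathbf{n}(x, y) \cdot \mathbf{l},

and the method must invert this relation to recover the surface normal \mathbf{n}(x, y), and from it the shape. The drawbacks listed above follow directly: shadows leave the equation uninformative, and a spatially varying albedo violates the uniform-reflectance assumption.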
The stereo vision method uses two or more images of an object from different viewpoints. Given the image coordinates of the same object point in two or more images, the stereo vision method extracts the 3D coordinates of that object point. A fundamental limitation of this method is that finding the correspondence between images is extremely difficult.
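Once correspondence is solved, however, the geometry itself is simple. A minimal sketch of the standard depth-from-disparity relation for a rectified stereo pair follows; the focal length, baseline, and disparity values are illustrative, not taken from any particular system.

def depth_from_disparity(f_pixels: float, baseline_m: float, disparity_px: float) -> float:
    # Rectified stereo: z = f * B / d, where f is the focal length in
    # pixels, B the baseline in meters, and d the disparity in pixels
    # of the same object point between the two images.
    return f_pixels * baseline_m / disparity_px

# Example: f = 700 px, B = 0.10 m, d = 35 px  =>  z = 2.0 m
print(depth_from_disparity(700.0, 0.10, 35.0))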
The passive sensing methods require very simple hardware, but usually these methods do not generate dense and accurate 3D data compared to the active sensing methods.
Active Sensing Methods
The active sensing methods can be divided into two categories: contact and non-contact methods. The Coordinate Measuring Machine (CMM) is a prime example of the contact methods. CMMs consist of probe sensors which provide 3D measurements by touching the surface of an object. Although CMMs generate very accurate and fine measurements, they are very expensive and slow. Also, the types of objects that can be handled by CMMs are limited, since physical contact is required.
The non-contact methods project their own energy source onto an object, then observe either the transmitted or the reflected energy. Computed tomography (CT), also known as computed axial tomography (CAT), is one of the techniques that records the transmitted energy. It uses X-ray beams at various angles to create cross-sectional images of an object. Since computed tomography provides the internal structure of an object, the method is widely used in medical applications.
Active stereo uses the same idea as the passive stereo method, but a light pattern is projected onto the object to overcome the difficulty of finding corresponding points between two (or more) camera images.
The laser radar system, also known as LADAR, LIDAR, or optical radar, uses the relation between the emitted and received laser beams to compute depth. Two methods are widely used: (1) using an amplitude-modulated continuous-wave (AM-CW) laser, and (2) using laser pulses.
The first method emits an AM-CW laser onto a scene and receives the laser light reflected by a point in the scene. The system computes the phase difference between the emitted and the received laser beams; the depth of the point can then be computed, since the phase difference is directly proportional to depth. The second method emits a laser pulse and measures the interval between the emission time and the reception time of the pulse. This time interval, well known as time-of-flight, is then used to compute the depth, given by t = 2z/c, where t is the time-of-flight, z is the depth, and c is the speed of light. Laser radar systems are well suited for applications requiring medium-range sensing, from 10 to 200 meters.
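Both depth computations follow directly from the quantities just described. A minimal sketch under these standard relations; the timing and modulation-frequency values are illustrative, and the AM-CW formula assumes the usual round-trip phase model rather than any specific instrument.

import math

C = 299_792_458.0  # speed of light, m/s

def depth_from_time_of_flight(t_s: float) -> float:
    # Pulsed laser radar: the pulse travels to the target and back,
    # so t = 2z/c and therefore z = c * t / 2.
    return C * t_s / 2.0

def depth_from_phase(delta_phi_rad: float, f_mod_hz: float) -> float:
    # AM-CW laser radar: the measured phase difference is directly
    # proportional to depth, z = c * delta_phi / (4 * pi * f_mod),
    # unambiguous only up to the range c / (2 * f_mod).
    return C * delta_phi_rad / (4.0 * math.pi * f_mod_hz)

# A 1-microsecond round trip corresponds to roughly 150 m, within the
# 10-200 m medium-range regime mentioned above.
print(depth_from_time_of_flight(1e-6))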
The structured-light methods project a light pattern onto a scene, then use a camera to observe how the pattern illuminates the object surface. Broadly speaking, the structured-light methods can be divided into scanning and non-scanning methods. The scanning methods consist of a moving stage and a laser plane, so either the laser plane scans the object or the object moves through the laser plane. A sequence of images is taken while scanning. Then, by detecting the illuminated points in the images, the 3D positions of the corresponding object points are computed using the equations of camera calibration, as sketched below. The non-scanning methods project a spatially or temporally varying light pattern onto an object; an appropriate decoding of the reflected pattern is then used to compute the 3D coordinates of the object.
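For the scanning case, the computation reduces to intersecting the camera ray through each illuminated pixel with the calibrated laser plane. A minimal sketch follows; the plane parameters and the back-projected pixel ray are illustrative placeholders for actual calibration data.

import numpy as np

def triangulate(ray_dir, plane_normal, plane_d):
    # Camera at the origin: the illuminated point lies on the ray
    # X = s * ray_dir and on the laser plane n . X + d = 0, hence
    # s = -d / (n . ray_dir).
    s = -plane_d / np.dot(plane_normal, ray_dir)
    return s * np.asarray(ray_dir)

# Example: a laser plane x = 0.2 m and one back-projected pixel ray.
point = triangulate(
    ray_dir=np.array([0.1, 0.0, 1.0]),
    plane_normal=np.array([1.0, 0.0, 0.0]),
    plane_d=-0.2,
)
print(point)  # [0.2, 0.0, 2.0]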
The system that acquired all the 3D data presented in this chapter falls into the category of scanning structured-light methods using a single laser plane. From now on, such a system will be referred to as a structured-light scanner.
6.2.2 Structured-Light Scanner
Structured-light scanners have been used in manifold applications since the technique was introduced about two decades ago. They are especially suitable for applications in 3D object modeling for two main reasons: First, they acquire dense and accurate 3D data compared to passive sensing methods. Second, they require relatively simple hardware compared to laser radar systems.
In what follows, we will describe the basic concept of the structured-light scanner and all the data that can typically be acquired and derived from this kind of sensor.