AN EFFICIENT APPROACH
TO LAYERED-DEPTH IMAGE BASED RENDERING
RAVINDER NAMBOORI
NATIONAL UNIVERSITY OF SINGAPORE
2003
AN EFFICIENT APPROACH
TO LAYERED-DEPTH IMAGE BASED RENDERING
RAVINDER NAMBOORI (B.Comp (Hons.), NUS)
A THESIS SUBMITTED FOR THE DEGREE OF MASTER OF SCIENCE
SCHOOL OF COMPUTING NATIONAL UNIVERSITY OF SINGAPORE
2003
ACKNOWLEDGEMENTS
I would like to sincerely thank A/P Teh Hung Chuan and Dr Huang Zhiyong, my project advisors, for their continual support and guidance throughout my research. Their assistance, patience, warmth and constant encouragement have been invaluable to this research. My thanks to Dr Chang Ee Chien and Mr Low Kok Lim for their helpful suggestions.
I am extremely grateful to Mr Chong Peng Kong for his time and help with the lab apparatus. This project wouldn't have been possible without his willingness to help at any moment and his readiness to ensure that all is well with my work.
My special thanks to Mr Sushil Chauhan, for his help in better formulating the sampling arc functions.
Ravinder Namboori
Oct 2003
SUMMARY
There exist many computer graphics techniques to synthesize 3-D environments, of which Image Based Rendering (IBR) techniques are becoming increasingly popular. In this thesis we concentrate on improving one such IBR technique, viz. Layered Depth Images (LDI). This technique, like many other IBR techniques, works on a set of pre-acquired imagery to model the world, and problems have often been encountered in determining exactly how to decide on this pre-acquired set of sample images. As the quality of the synthetic view is governed by the initial stages of sampling, addressing this problem can enhance the result achieved by the eventual rendering engine.
This research presents a new approach to rendering an LDI, by adaptively sampling the raw data based on the determined set of sample parameters. This approach eliminates the redundancy caused by over-sampling, and removes the hole artefact caused by under-sampling. In addition, the rendering speed of the LDI is improved by the pre-computed visibility graph and patch lookup table.
LIST OF FIGURES
Figure 1.2 Framework of the Layered Depth Image based rendering System
Figure 3.3 Contour Formation, Sampling and Visibility Graphs
Figure 3.6 From a group of Rectangle Patches to a 2-D contour
Figure 5.1 (a) Synthetic Views generated by the improved system – Mannequin
Figure 5.1 (b) Synthetic Views generated by the improved system – Pooh Bear
Figure 5.2 Statistical information for the improved system
Figure 5.3 (a) Synthetic Views generated by the sparsely sampled LDI system (without splatting) – Mannequin
Figure 5.3 (a) Synthetic Views generated by the sparsely sampled LDI system (without splatting) – Pooh Bear
Figure 5.3 (b) Synthetic Views generated by the sparsely sampled LDI system (with splatting) – Mannequin
Figure 5.3 (b) Synthetic Views generated by the sparsely sampled LDI system (with splatting) – Pooh Bear
CHAPTER 1 INTRODUCTION
1.1 Documentation Layout
For the purpose of easy readability, the content has been divided into seven chapters. This chapter, Chapter 1, is an introduction to the research as a whole, an introduction to the various phases of the research, as well as the nature of this project. We shall highlight the problem statement and the overall system framework in this chapter.
Chapter 2 covers an overview of the related work in the area to date. Included in this chapter is a brief description of the various researches and techniques in the area of Image Based Rendering and Layered Depth Images in particular, sampling methods, and automatic camera placement techniques.
Chapter 3 highlights the proposed improvement to the Layered Depth Image system by adaptively sampling the reference images and pre-computing the patch lookup table. Also discussed in this chapter are the derivations and assumptions leading to the essential steps involved in the system framework.
Chapter 4 is an elaboration of the implementation of the system and the sampling issues involved in the research. This section takes a methodological approach to exemplify the steps involved in demonstrating the proposed method of improving the Layered Depth Image system.
Chapter 5 discusses the results achieved by the implementation of the proposed method. In this chapter, we go through the various examples used and the outputs obtained using our system, and contrast the results with those achieved by an earlier framework, which does not include the proposed improvements.
Chapter 6 concludes this thesis, discussing the lessons learnt from this research and restating the goals achieved and the solution proposed and implemented.
Chapter 7 addresses the future prospects of research in this area, and wraps up the report with a final word.
1.2 Image Based Rendering and the Sampling Problem
The traditional approach to synthesizing realistic images of virtual environments involves modeling the environments using a collection of 3-D geometrical entities with their associated material properties, and a set of light sources. Then, rendering techniques such as radiosity and ray tracing are used to generate the images at given viewpoints. The realism of such rendered images is limited by the accuracy of the description of the primitive material and illumination properties and by hand-coded or mathematically derived graphical models. Also, real-time rendering using this technique depends heavily on the complexity of the scene geometry and the hardware configuration.
Computer Vision, on the other hand, can be considered an inverse process of computer graphics, which recovers 3-D scene geometry from 2-D images. Extracting the 3-D geometry of a scene usually requires solving difficult problems such as stereo vision and depth from shading, or using expensive rangefinders. From the 3-D geometry recovered, approximate 3-D models are constructed, from which new images can be synthesized. However, these reconstruction techniques are usually computationally expensive, and the reconstructed models suffer from a lack of accuracy.
Image Based Rendering is an emerging field which counters these limitations. In this technique, new images and 3-D worlds can be modeled without knowledge of the geometry of the scene involved. Realism is achieved by the fact that the basic entities of the 3-D environment are no longer polygons or geometries, but pre-acquired images. Fig 1.1 depicts the process of Image Based Rendering. As can be seen, it has emerged from both the fields of Computer Graphics and Computer Vision, and yet bypasses the complicated and limiting stage of defining the scene's geometry. It also shows that tedious 3D shape modeling can be avoided and little or no knowledge of the 3D shape of the scene is required.
In addition, what is highlighted in Fig 1.1 is the new step of sampling, which dictates how different the Modeled Synthetic World is going to be in comparison to the Real World. As the quality of the synthetic view is now governed by the reference images at our disposal and not by any 3D geometry, the initial stage of sampling becomes of paramount importance. The sampling problem is to determine where the scene needs to be sampled from, and how many such samples are required to adequately sample the scene. It is important that a robust solution be formulated for the problem of sampling and determining the exact set of reference imagery required in rendering the 3D world.
Figure 1.1: Model of Image Based Rendering
Tackling the sampling problem isn't as straightforward as over-sampling, as that would result not only in redundancy of the sampled data, but also in an increased amount of time to re-render the synthetic view. On the contrary, an attempt to under-sample, even if followed by stages of splatting, compromises the realism of the modeled world, often leaving holes, visual artifacts, or portions of synthetically splatted patches.
A lot has been researched in the field of Image Based Rendering since its emergence a few years back. Essentially, Image Based Rendering is about creating new photo-realistic images of complex scenes through interpolation techniques or other computations based on input data from photographs, drawings and rendered virtual scenes. There are various techniques to model a 3-D world using pre-acquired imagery, viz. Layered Depth Images, Lumigraph/Light Field, Panorama, View Morphing etc. All these techniques differ slightly in striking a balance between the computation involved in generating new views and the size of the sampled database. Irrespective of the approach, the stage of sampling is indispensable to its framework. The availability of geometry, as in the case of methods like
Layered Depth Images, provides ample opportunity to precisely select and limit the reference imagery. As for the other approaches, splatting and other techniques to compensate for under-sampling have been seen as a possible alternative. We shall overview these techniques in Chapter 2, under Overview of Related Work.
1.3 Problem Statement and Research Scope
This research is focused on one area of the huge field of Image Based Rendering, viz. Layered Depth Images. The aim of this research work is to enhance the realism and hasten the generation of the views achieved by the standard way of Layered Depth Image based Rendering, by adaptively sampling the reference images and pre-computing a patch lookup table. The idea is to introduce a filtering stage after densely over-sampling the real world. The filtering stage, like the sampling stage, being a part of the pre-rendering phase, helps the rendering engine by pulling out additional computations and saving precious time while rendering. This method would not only improve the quality of the synthetic images generated, in terms of getting rid of holes and occlusion artefacts, but will also enable quick generation of images, owing to the tabulation of the required sampled imagery that is acquired.
The major challenge is to effectively compute the required reference viewpoints from the dense sample to eliminate possible loss or redundancy of data To create a compelling sense of virtual presence, the following goals must be achieved:
• Users can interactively navigate through the 3-D environment, without hardware acceleration
• The photo-realism of the environment ought not to be compromised in terms of holes or other synthetic occlusions
• The speed of the rendering pipeline should be unaffected by the fact that the sampled imagery is bigger than a sparsely sampled image set
The sampled images are assumed to be taken under white light and with an ideal pinhole camera (no lens distortion).
The contributions of this research include:
• Implementation of a Layered Depth Image framework that enables rendering of complex 3-D environments, catering for the absence of holes or visual artefacts in the modeled world
• An efficient approach to tabulate the pre-acquired set of imagery, to ensure fast reference view selection and rendering of the synthetic views
• A method which retains the realism of the 3-D environment, through dense samples of the real world, and yet achieves a rendering engine which is as fast as a sparsely sampled LDI system
1.4 The System Framework
The original Layered Depth Image system is essentially classified into 3 main phases: Scene Sampling, Scene Geometry and Photometry Extraction, and Scene Resampling. The system framework of this Layered Depth Image based rendering approach is depicted in Fig 1.2.

Figure 1.2: Framework of the Layered Depth Image based rendering System
We work on the first 3 stages of this framework, the so-called pre-computation phase, and modify the framework as depicted in Fig 1.3. We briefly go over these new stages of the framework in the sections to come.

Figure 1.3: Our System Framework
1.4.1 Patch Identification and Rectangularisation
The surface normals at every point are calculated based on the range information (for more information on the calculation of surface normals, please refer to Chapter 4, Section 4.3.2). The color and range maps, along with the normals, constitute our sampled point cloud. From this point cloud, this step attempts to identify the uniform patches: surfaces that are not uneven, that fit in the camera's field of view, and that are defined by certain patch constraints. Based on these constraints, the whole point cloud is divided into smaller uniform regions called patches. These patches, owing to the constraints thus applied, have a close-to-constant third dimension. The patches are then rectangularised, a recursive process which applies a greedy algorithm to extract the largest rectangle in the patch. At the end of this stage, we have categorized the point cloud into rectangular patches, ones that can be summarized as a line in two dimensions when viewed from the top. This process is explained and discussed in detail in Chapter 3.
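The details of the normal computation are deferred to Section 4.3.2. Purely as a rough illustration, one common way to estimate normals from a back-projected range map is the cross product of finite-difference tangents; the sketch below assumes an H x W x 3 array of back-projected points (the function name and data layout are assumptions, not the thesis's code).

import numpy as np

def normals_from_range(points):
    # points: H x W x 3 array of back-projected (x, y, z) positions,
    # one per pixel of the color/range map (assumed layout).
    du = np.gradient(points, axis=1)   # tangent along image columns
    dv = np.gradient(points, axis=0)   # tangent along image rows
    n = np.cross(du, dv)               # vector orthogonal to the local surface
    norm = np.linalg.norm(n, axis=2, keepdims=True)
    return n / np.maximum(norm, 1e-12)  # unit normals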
1.4.2 Contour Formation
The output of the previous stage, viz. the rectangle patches, is fed into this part of the pipeline, in an attempt to identify unique 2D contours along the vertical axis. The aim of this stage is to identify those parts in the vertical space which, when viewed from the top, look as if they were in a single plane. This stage subsequently summarizes these parts of the object as a 2D contour associated with a particular vertical range.
1.4.3 Identifying Visibility and Sampling Regions
In this step, we find the visibility and sampling regions for each of the contours thus found. The visibility region for a particular edge in a 2D contour is defined as that region from which the whole edge is visible if there is no occlusion. A sampling region for a particular edge is defined as the region from which it is appropriate to sample that particular edge, ensuring that all of the data visible on the edge is captured. A sampling region is determined by formulae dependent on the size of the edge, the camera calibration and the sampling camera trajectory.
1.4.4 Patch Lookup Table for Reference View Selection
This last step, despite being outside the scene-sampling phase and being a part of rendering, is worth mentioning at this point because of the organization of the data in the prior phases. Given the structured organization of the sampled points of the original data, the selection of reference views to render while generating a new synthetic view becomes straightforward.
During rendering, the need to look for the closest reference viewpoints, or for the reference views which cover a given occluded region, is overcome by the fact that these considerations have been addressed during the pre-computation phase of determining the required set of sampled imagery. Hence, reference view selection becomes a fast and straightforward procedure of looking up the hash table of patches thus created for the relevant reference data, as the viewer moves around the 3D space.
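Chapter 3 details the actual table; purely as a hypothetical sketch of the idea, the table can be pictured as reference patches binned by viewing angle on the sampling circle, so that view selection at render time is a constant-time lookup (the bin width, names and angular key scheme below are all assumptions, not the thesis's implementation).

BIN_DEG = 5  # angular bin width on the sampling circle (assumed value)
N_BINS = 360 // BIN_DEG

def bin_of(angle_deg):
    return int(angle_deg // BIN_DEG) % N_BINS

patch_table = {}  # bin index -> list of patch ids to warp

def register_patch(patch_id, arc_start_deg, arc_end_deg):
    # Insert a patch into every bin its sampling arc covers.
    b = bin_of(arc_start_deg)
    while True:
        patch_table.setdefault(b, []).append(patch_id)
        if b == bin_of(arc_end_deg):
            break
        b = (b + 1) % N_BINS

def patches_for_view(camera_angle_deg):
    # Constant-time reference selection as the viewer moves.
    return patch_table.get(bin_of(camera_angle_deg), [])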
CHAPTER 2 OVERVIEW OF RELATED WORK
Image based rendering techniques have been classified into four distinct categories: pixel based, block based, reconstruction based and mosaicing [Kang, 1997]. These categories are not necessarily mutually exclusive. Also, there exists a different categorization, i.e. Rendering from Interpolation of Dense Samples, Panorama based Rendering, Morphing, and Depth based Rendering. These techniques vary largely in the knowledge of the geometry of the scene and the number of samples of the scene. We shall restrict our domain to the Depth based Rendering model, to be more specific, Layered Depth Image Based Rendering.
The depth based rendering model exploits the additional data available in terms of the 2D image samples being images with depths. These so-called depth images, in addition to having the color values at a particular pixel, also contain the depth information at that location. Synthetic images for new viewpoints are created by a re-projection of the depth pixels in the reference depth images [Lee, 1998]. Layered Depth Image Based Rendering is an extension of the depth based rendering model, which performs warping from an intermediate representation called a Layered Depth Image (LDI) [Shade et al., 1998]. An LDI is a view of the scene from a single input camera view, but with multiple pixels along each line of sight. An LDI is constructed by warping n depth images into a common camera view.
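The layered structure can be pictured as a per-pixel, depth-ordered list of samples. The following is a minimal sketch of such a structure (field names are assumptions; the LDI of Shade et al. additionally stores attributes used for splat-size estimation).

from dataclasses import dataclass, field

@dataclass
class DepthPixel:
    color: tuple          # (r, g, b)
    z: float              # depth along this line of sight
    normal: tuple = None  # optional surface normal, used for splat sizing

@dataclass
class LayeredDepthImage:
    width: int
    height: int
    # one list of DepthPixels per (x, y), ordered front to back
    layers: list = field(default_factory=list)

    def __post_init__(self):
        self.layers = [[[] for _ in range(self.width)]
                       for _ in range(self.height)]

    def insert(self, x, y, pixel: DepthPixel):
        # keep each line of sight sorted by depth
        los = self.layers[y][x]
        los.append(pixel)
        los.sort(key=lambda p: p.z)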
This chapter surveys the various techniques employed to get around the sampling problem for the LDI based rendering method discussed in the previous chapter. While some of these techniques look at remedying the damage caused by the problem, like splatting the holes during rendering, others attempt to find the best next view to sample, assuming the first sample was ideal. All these techniques aim to exploit the geometrical knowledge to improve the photo-realism of the synthetically generated scene.
2.1 Splatting
Splatting is a technique which aims to remedy the effects of the sampling problem. The Layered Depth Image, which is created from uniformly sampled images, is splat into the output image by estimating the projected area of the warped pixels [Shade et al., 1998]. This estimate is computed differentially based on the distance between the sampled surface point and the LDI camera, the field of view of the camera, the dimensions of the LDI, and the angle between the surface normal at the sampled surface point and the line of sight to the LDI camera.
As splatting is a post-sampling step, care has to be taken that it doesn't slow down the rendering engine. To this end, a lookup table is generated. Before rendering each new image, the new output camera information is used to pre-compute the lookup table.
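As a rough sketch of the dependencies just listed, and not Shade et al.'s exact formula (which folds most of these factors into per-frame table lookups), the projected size of a warped pixel grows with source distance and output resolution and shrinks as the surface turns away from the source view; the factorization and names below are assumptions.

import math

def approx_splat_size(d_src, d_out, cos_src, cos_out,
                      res_src, res_out, fov_src, fov_out):
    # d_*   : distance from the surface point to the source/output camera
    # cos_* : cosine of the angle between surface normal and line of sight
    # res_* : image resolution (pixels along one axis)
    # fov_* : field of view in radians
    # Illustration only: an area ratio combining the quantities the text
    # lists, not the published splatting formula.
    solid_angle_ratio = (d_src / d_out) ** 2
    foreshortening = cos_out / max(cos_src, 1e-6)
    sampling_density = (res_out * math.tan(fov_src / 2)) / \
                       (res_src * math.tan(fov_out / 2))
    return solid_angle_ratio * foreshortening * sampling_density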
2.2 Multi-Resolution Sampling
Multi-Resolution sampling attempts to get around the problem of over-sampling, or of sampling for various camera distances, by sampling sets of images at different resolutions. While splatting and meshing are proposed to deal with the disocclusion artifacts, they are seemingly adequate only for post-rendering warping, in which the resolution of the current view does not deviate much from the resolution of the reference image.
In cases where an LDI is created from reference images not at similar distances from the object under consideration, an insufficient sampling rate of the LDI might cause the synthetic view to look blurrier than it does in the reference image closer to the object. On the contrary, an excessive sampling rate of the LDI might slow down the rendering pipeline.
The LDI Tree method [Chang et al., 1999] employed a hierarchical partitioning scheme with the concept of the LDI, which preserves the sampling rate of the reference images by adaptively selecting an LDI from the LDI cluster for each pixel. In another approach, an L-System was implemented which could store images of varying resolutions at different nodes of the L-System for effective tree modeling [Lluch et al., 2004].
2.3 Sampling all Visible Surfaces
As the name suggests, in this technique of sampling all visible surfaces, an attempt is made to record a series of images that, collectively, capture all visible surfaces of the object. This technique revolves around the selection of a good heuristic method to find a good set of viewpoints for a given geometric model. The goal is to have sampled images from the computed viewpoints such that every visible surface is shown at least once. One such heuristic is to segment the object to exemplify hierarchical visibility [Stuerzlinger, 1998]. The scene is assumed to be a set of surface polygons organized in a hierarchy. The hierarchical visibility method subdivides the scene hierarchy depending on the relative visibility of objects.
Yet another heuristic is to cover all possible surfaces, masking reference images as each surface is considered [Fleshman et al., 1999]. In this approach, the set of scene polygons visible from a viewing zone is approximated, and then a greedy algorithm is employed to select a small number of camera positions that together cover every polygon in the geometric model. Towards this goal, the boundary of the walking zone is first tessellated. Scene polygons are subsequently subdivided to reduce the likelihood of visibility problems. The visibility and quality of the subdivided sections of the polygons determine the worth of any reference image.
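The greedy selection step can be sketched as a standard set-cover loop (a minimal sketch: the visibility test and the quality weighting of [Fleshman et al., 1999] are abstracted into a precomputed visibility map, and all names are assumptions).

def greedy_camera_cover(visible, all_polygons):
    # visible: dict mapping candidate camera position -> set of polygon
    # ids it sees (assumed precomputed by a visibility test)
    uncovered = set(all_polygons)
    chosen = []
    while uncovered and visible:
        # take the candidate that covers the most still-uncovered polygons
        best = max(visible, key=lambda cam: len(visible[cam] & uncovered))
        gain = visible[best] & uncovered
        if not gain:
            break  # remaining polygons are not visible from any candidate
        chosen.append(best)
        uncovered -= gain
    return chosen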
2.4 Best Next View Sampling
The best next view problem is that of selecting the next view for the sampling system to acquire, given some already acquired views of the object. Two criteria are often considered in solving this problem. The visibility criterion attempts to maximize the number of surfaces not seen thus far by adding the next image to the sampled set, while the quality criterion aims to improve the quality of the surfaces sampled. The quality criterion prioritizes an image which samples a decent number of surfaces, covering most areas of these surfaces, over an image which samples many surfaces but obliquely.
Several methods are available in this form of sampling, most of them differing in the way they establish the two criteria mentioned. In one particular approach, a volumetric representation, termed the voxelmap, is generated at each cycle of best next view computation [Massios, Fisher, 1998]. The voxels thus scanned are marked empty, seen, unseen, or as an occlusion plane, depending on the visibility from the new view. The seen voxels carry a quality property, which is estimated from the aggregate normals of all the points sampled in a particular voxel.
In yet another approach, each range image sampled thus far is approximated by a triangular mesh [Garcia, 1998]. The resolution of the triangular mesh determines the minimum distance that can be distinguished during the exploration process. The edges in these triangular meshes are marked as exterior, occlusion or interior, depending on whether they bound a region, whether they are susceptible to occlude surfaces of the scene, or whether they are formed by an overlap of two exterior edges. The quality criterion is satisfied by a voting mechanism. Each occlusion edge has an associated normal histogram and a tangent histogram. Every cell of the histogram keeps the sum of all the associated normals. The cell which has received the maximum number of votes in either histogram is then looked up.
2.5 Sampling Issue for other Rendering Techniques
Though drifting slightly from the area of concentration, interesting techniques have been researched in two other forms of Image based rendering.
In Point based rendering [Grossman, Dally, 1998], an attempt is made to ignore the issue of adequate sampling during rendering. The problem is dealt with during the phase of sampling, by suggesting that, to minimize the number of samples that adequately sample an object, the distance between adjacent samples on the surface of the object should be as large as possible, but less than the pixel side length at the target resolution, assuming unit magnification. An equilateral triangle mesh is used for this purpose.
For Lumigraph/Light Field Image based rendering techniques [Gortler et al., 1996], a spectral analysis of light field signals combined with the sampling theorem is used to derive the analytical functions that determine the minimum sampling rate [Chai et al., 2000]. The minimum sampling rate is obtained by compacting the replicas of the spectral support of the sampled light field within the smallest interval. As it is known that the spectral support of a light field signal is bounded by the minimum and maximum depths only, no matter how complicated the spectral support might be because of the depth variations in the scene, a reconstruction filter with an optimal constant depth can be designed to achieve anti-aliased rendering.
CHAPTER 3 THE PROPOSED IMPROVEMENT TO THE LDI SYSTEM
It is established that a Layered Depth Image is a view of the scene from a single input camera view, but with multiple pixels along each line of sight. It is a comprehensive image data structure, built to take into account various artifacts like occlusions and holes, by storing not just the first layer of pixels but also a few layers along each line of sight. Unfortunately, this comprehensive LDI framework's ability to render photo-realistic views, devoid of holes and other visual artefacts, is highly dependent on the nature of the reference images sampled: to be precise, the number of reference images sampled and the positions from which they are sampled.
The sampling problem is to determine where the scene needs to be sampled from and how many such samples are required to adequately sample the scene. It is worthwhile to note that under-sampling results in visual artefacts. On the contrary, over-sampling helps get around the problem of visual artefacts, but at the cost of rendering speed. There is no one fixed scheme to adequately sample the LDI, as the occlusion artefacts and holes are largely dependent on the scene's geometry. Hence we adopt the approach of adaptively sampling the scene based on the scene's available geometrical data.
In this chapter we shall discuss a method to adaptively sample a layered depth image, based on the geometrical information at our disposal. We shall elaborate on how the LDI system is improved by this change in the sampling phase and by the patch lookup table generated during the pre-rendering phase. In the chapters to follow, we shall go over a sample implementation and the results observed using this model.

Figure 3.1: Adaptive Sampling Pipeline
3.1 Brief Overview
Uniform sampling is the simplest alternative, preferred by most LDI engines to date, and the inadequacy of the sampling is remedied by methods like splatting, discussed in the previous chapter. The density of uniform sampling affects the quality of the output. Sparse uniform sampling results in visual artefacts, while dense uniform sampling ends up with a slow re-rendering pipeline.
We approach the adaptive sampling method by starting with a highly dense uniform sample set and adaptively filtering out redundant data, retaining only the adequate information. Figure 3.1 depicts the adaptive sampling pipeline.
Figure 3.2: Patch Categorization

The step of patch categorization first categorizes the uniformly sampled images into uniquely defined patches. These patches help in guiding the sampling. The next step forms contours with these patches and identifies the sampling regions required. Eventually, the exact reference points to sample from are deciphered, and the unnecessary data is disposed of.

Figure 3.3: Contour Formation, Sampling and Visibility Graphs

Figure 3.4: Re-rendering Engine

Figure 3.2 elaborates on the step of Patch Categorization, and Figure 3.3 depicts the stage of formation of contours and the identification of sampling and visibility regions thereon. Figure 3.4 highlights the effect of the adaptive sampling method on the final re-rendering engine. In the sections to follow, we shall go through each of these steps in detail, defining and discussing the theories and considerations.
3.2 Patch Categorization
This step marks one of the most critical steps in the method of adaptive sampling, as it is in this step that we go from just a point cloud to a representation which, though not as detailed as a triangle mesh, is still informative enough for us to understand the geometry of the scene and proceed to adaptively sample the object. It is necessary to clarify at this point that the patches thus formed are only for guiding the stage of sampling. The eventual rendering is still from the originally captured data.
3.2.1 Patch
Before going any further with the procedure of patch categorization, we explain the concept of a patch in the context of data sampling. A patch is regarded as any uniform surface on the object (a surface without uneven bumps) which can wholly fit into the field of view of the camera under consideration.
The purpose of defining a patch is to be able to summarize the geometry of the object in a plane in 2 dimensions, so as to ensure that we get a rough sketch of the uniform sections of the object. It is worthwhile to note that a crude way of expressing adaptive sampling is to say that more samples are needed for areas of the object which are not too uniform (occluded by parts of the same surface or different surfaces), and fewer samples for those sections of the object which are fairly smooth. Hence the need to be able to clearly distinguish these various sections of the object.
3.2.2 Patch Constraints
Having gone through a layman's definition of a patch and its purpose, in this subsection we attempt to formally define a patch. A patch is formally defined as all neighboring points in the point cloud which satisfy the following constraints (for more information on the point cloud and the attributes of the points therein, please refer to the next chapter, Section 4.3, titled Other Issues):

a) The normals of any two neighboring points, in the spherical co-ordinate system, don't differ by more than a preset δnϕ and δnθ.

b) The normals of the two extreme points of the patch, in the spherical co-ordinate system, don't differ by more than a preset ∆nϕ and ∆nθ.

c) The Z values of any two neighboring points don't differ by more than a preset δz. This ensures that areas on two objects which have a smooth transition, but are placed apart in the viewing direction, don't end up being called a patch.

d) The size of a patch, both horizontally and vertically, never exceeds the maximum size that the field of view, θ, of the camera permits at that depth. This is calculated as follows.

Taking a top view and denoting the size of the patch in any scan line as an edge, let the first and last points of this edge be Smax and Smin. Suppose the orthogonal bisector of the edge intersects the sampling circle at a point, and let the distance from the midpoint of the edge to that point on the circle be denoted as D. (The sampling circle will be explicitly defined in Section 3.3.2.)

Figure 3.5: Patch Size Constraint (top view)

The size d of the patch must then satisfy:

d ≤ 2D tan(θ/2)   (3.1)

The same criterion applies for the vertical extent, with the corresponding angle, ϕ. Given the point cloud and these constraints, the patches more-or-less as defined in Section 3.2.1 are obtained and the whole point cloud is categorized.
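Purely as an illustration of how these constraints might be evaluated while growing a patch (the threshold values, point attributes and the helper methods on the patch object are all assumptions, not the thesis's implementation):

import math

# Assumed preset thresholds (radians / depth units); values are illustrative.
D_N_PHI, D_N_THETA = 0.05, 0.05     # δnϕ, δnθ: neighbor normal deltas
MAX_N_PHI, MAX_N_THETA = 0.3, 0.3   # ∆nϕ, ∆nθ: patch-wide normal deltas
D_Z = 0.01                          # δz: neighbor depth delta
FOV = math.radians(40)              # θ: camera field of view

def can_join(point, neighbor, patch):
    # Check constraints (a)-(d) before adding `point` next to `neighbor`.
    # Points are assumed to carry (n_phi, n_theta) spherical normal angles
    # and a z depth; `patch` is assumed to track its extreme normals, its
    # extent along the scan line, and its distance D to the sampling circle.
    ok_a = (abs(point.n_phi - neighbor.n_phi) < D_N_PHI and
            abs(point.n_theta - neighbor.n_theta) < D_N_THETA)
    ok_b = (patch.normal_span_phi(point) < MAX_N_PHI and
            patch.normal_span_theta(point) < MAX_N_THETA)
    ok_c = abs(point.z - neighbor.z) < D_Z
    ok_d = patch.extent_with(point) <= 2 * patch.D * math.tan(FOV / 2)
    return ok_a and ok_b and ok_c and ok_d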
3.2.3 Rectangularisation of Patches
We stated at the end of the previous subsection that we have categorized the point cloud into patches which are "more-or-less" as defined earlier. The reason why these patches are still not exactly as we defined them is that, though the patches are uniform and, if seen from the top or bottom, look like they occupy only 2 dimensions, they still have a non-uniform shape in the 2-dimensional plane. We hence attempt to break down these patches into patches of a shape that can be easily summarized in one dimension, as a line. We choose a rectangle for our convenience, and for the fact that it can be reduced to a line along a scan line.
In order to break the patches obtained so far into rectangles, we follow a step called the rectangularisation of patches. In this step, for each of the patches thus identified, we find the biggest rectangle that can fit into it, and cut that portion out. We follow the same procedure for the remaining area in the patch, until we are left with areas which are smaller than a preset area. All the portions cut out from the patch in the process are rectangularised components of the patch, hence the term rectangularised patches.
It can hence be seen that a single patch obtained by patch categorization may later end up as a few rectangularised patches.
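A minimal sketch of this recursive cut follows, assuming the patch is given as a boolean occupancy mask. The largest-rectangle search shown is the classic histogram-of-heights method, which the thesis does not name, so this is one plausible realization rather than the author's code.

import numpy as np

def largest_rectangle(mask):
    # Return (top, left, height, width) of the largest all-True rectangle
    # in a 2-D boolean mask, using per-row histograms of column heights.
    h = np.zeros(mask.shape[1], dtype=int)
    best = (0, 0, 0, 0)  # top, left, height, width
    for row in range(mask.shape[0]):
        h = np.where(mask[row], h + 1, 0)       # column heights ending here
        stack = []                              # (start index, height) bars
        for col, height in enumerate(list(h) + [0]):
            start = col
            while stack and stack[-1][1] >= height:
                start, hh = stack.pop()
                if hh * (col - start) > best[2] * best[3]:
                    best = (row - hh + 1, start, hh, col - start)
            stack.append((start, height))
    return best

def rectangularise(mask, min_area=16):
    # Greedily cut out the largest rectangle until leftovers are tiny.
    rects = []
    mask = mask.copy()
    while True:
        top, left, hh, ww = largest_rectangle(mask)
        if hh * ww < min_area:
            break
        rects.append((top, left, hh, ww))
        mask[top:top + hh, left:left + ww] = False  # remove the cut portion
    return rects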
3.2.4 Patch Merging
So far, we have defined and discussed how to theoretically get patches from the point cloud. Practically, storing the entire sampled point cloud in the program's heap may not be feasible, owing to the highly dense sampling. A simpler and more practical approach is discussed in this subsection.
We consider one reference image at a time, and find the patches from its point cloud. As the size of an individual image is much smaller, the heap constraint is no longer applicable.
Trang 33As patches are being formed, they are checked for overlaps with patches found from previous reference images
Merging the patches is done by isolating the various vertical cross-sections, and concentrating on the patches that overlap If we are to find two sections of a patch that need to be merged, we can be assured that they would have an overlap, owing to the high density of the samples Care should be taken to make sure that the patch, as it’s being merged, still satisfies the patch constraints, over the border and as a whole
Redundant patch areas can be found and dropped when patches are found which overlap by an extent greater than the approximation involved in the step of rectangularisation. The section of the patch that is not redundant (which is now definitely smaller than the maximum size a patch permits) can be merged with some other patch if need be.
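A sketch of the overlap test on two co-linear rectangularised patch edges, reduced to 1-D intervals along a scan line (the interval representation, tolerance and names are assumptions):

def merge_or_drop(existing, new, overlap_tol):
    # Resolve one pair of co-linear patch edges from consecutive reference
    # images, given as (start, end) intervals on a scan line. Overlap beyond
    # `overlap_tol` (the rectangularisation approximation) marks the
    # overlapping section of the new patch as redundant; the remainder is
    # kept for a later merge, subject to the patch constraints.
    overlap = min(existing[1], new[1]) - max(existing[0], new[0])
    if overlap <= 0:
        return new                      # disjoint: keep the new patch whole
    if overlap > overlap_tol:
        # drop the redundant section, keep only what extends past `existing`
        if new[1] > existing[1]:
            return (existing[1], new[1])
        if new[0] < existing[0]:
            return (new[0], existing[0])
        return None                     # fully covered: entirely redundant
    return new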
It is worthwhile to mention the effect of the sequence of the reference images on the eventual set of patches generated. The patch merging process is applied in a linear fashion, with the patches being created in the current reference image compared against the patches formed thus far, to check for an overlap. This makes the efficiency of the merging process reliant on which reference image is considered next. The reduction in data redundancy is independent of this decision; however, the number of patches generated is not. A recursive approach to make patch-merging independent of the sequence of the reference images is computationally expensive. Also, this is unnecessary in situations where there is a clearly defined sampling trajectory on which the reference images were captured. In our system implementation, the reference images are considered in the same order that they were captured during the sampling camera traversal on the sampling trajectory. This ensures maximum overlap between consecutive reference images, given that our sampling trajectory was a circle.

Figure 3.6: From a group of Rectangle Patches to a 2-D contour
3.3 Contour Formation, Sampling and Visibility Graphs
A contour literally stands for a 2-dimensional shape expressed as a line representation. Our definition isn't far from this meaning of contours. Having broken down the point cloud into rectangularised patches, we are now in a position to form a skeleton of the entire object.
3.3.1 Contour Formation
In the previous section, we discussed how we categorized the point cloud into a set of rectangle patches. In this section, we attempt to make a skeleton out of these uniform sections of the object.
Each patch obtained thus far is labeled with a number. A scan line traversal is now performed to see which patches are encountered at each scan line. At the end of this scan line traversal, we have a set of patches traversed for each scan line. We group all the scan lines with the same patch traversals together. Given these groups of patches, we try to make one contour to represent each group. The contour is basically the representation of what we would see when viewed from the top. Each patch, being close-to-constant in the 3rd dimension and a rectangle in 2 dimensions, can be represented as one line, as it would be when seen from the same plane in which it is present. Algorithm 3.1 depicts the pseudo code that summarizes the contour formation algorithm applied to the rectangularised patches.
Algorithm 3.1: Contour Formation

procedure FormContours(patch[])
    for k ← 0 to patch.size − 1   // find patch demarcations in the vertical direction
        demarcations.add(patch[k].min_y)
        demarcations.add(patch[k].max_y)
    end for
    sort(demarcations)
    for j ← 1 to demarcations.size − 1
        temp ← the patches spanning the band [demarcations[j−1], demarcations[j]]
        // form a contour with the patches in temp, applicable for y from
        // the previous demarcation to the current one
        createContour(temp, demarcations[j−1], demarcations[j])
    end for
end procedure

We hence have distinctly formed contours, representing various vertical segments of the object. Figure 3.6 depicts the formation of a contour from a simple group of rectangle patches.
3.3.2 Sampling Arc
Having broken down the point cloud into a few 2D contours, the initial problem of sampling now boils down to adequately sampling all the edges in each of these contours. In this context, we shall define the concept of a sampling arc. For any contour, we attempt to sample the edges from the circumference of a circle lying on the plane of the contour, with its center at the object's origin and a radius which defines how close we can get to the object during the camera walkthrough. We call this the sampling circle. We can have multiple concentric sampling circles for various resolutions.
For any edge, a sampling arc is defined as the arc of the sampling circle such that, from any point on that arc, the edge under consideration has maximum visibility. To understand the sampling arc better, we need to take a brief look at the concept of cameras and views.
Any camera has a view plane onto which any point seen from the camera is projected. When we see through the camera, or take images with the camera, what we see is a projection of the scene onto the camera's view plane. The number of pixels on the view plane does not necessarily have a one-to-one mapping with the number of actual points in the world coordinate system.
The sampling arc of any edge can now be defined as the arc of the sampling circle defined by all those points on the circle's circumference from which the geometric content of the pixels seen, as compared to what is seen from an orthogonal view of the edge, is unchanged. Figure 3.7 depicts a contour, the object's sampling circle, and the sampling arc of a particular edge of the contour, labeled "e1".

Figure 3.7: Sampling Arc

When an edge is sampled from a point on the sampling circle outside the sampling arc, we end up sampling fewer pixels than can be seen from points on the sampling arc. We call this phenomenon oblique sampling.
3.3.3 Determining the Sampling Arc
Having seen the definition and necessity of a sampling arc, we shall in this subsection see how to find the sampling arc, given an edge and the sampling circle.

Suppose we choose a Cartesian st-coordinate system for the sampling circle, with the center of the circle placed at its origin and the t-axis parallel to the edge. Before actually deriving the sampling arc, let us look at how the edge depth z varies with respect to the camera motion, as depicted in Figure 3.8. We define the edge depth as the distance between the midpoint of the edge and the camera placed on the sampling circle. The prime sample point (so, to) is defined as the point of intersection of the edge's orthogonal bisector and the sampling circle. For any given edge, its prime depth, zo, is defined as the edge depth when the camera is at the prime sample point. The edge depth function Fz, which is the edge depth as a function of the sampling point, can now be formulated as:
z′ = Fz(s′, t′) = ∆s / cos(tan⁻¹((zo − ∆t) / ∆s)),  if ∆s ≠ 0 or ∆t ≠ 0
z′ = Fz(s′, t′) = zo,  if ∆s = 0 and ∆t = 0
(where ∆s = s′ − so and ∆t = t′ − to)   (3.2)
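For ∆s > 0, the trigonometric form in (3.2) reduces to the Euclidean distance √(∆s² + (zo − ∆t)²) between the camera and the edge midpoint in the st-plane, which a short numeric check confirms (a sketch; the function and variable names are mine, and the expression as stated assumes ∆s ≠ 0):

import math

def edge_depth(s, t, s0, t0, z0):
    # Edge depth function Fz of equation (3.2): (s0, t0) is the prime
    # sample point, z0 the prime depth, (s, t) the camera position.
    ds, dt = s - s0, t - t0
    if ds == 0 and dt == 0:
        return z0
    return ds / math.cos(math.atan((z0 - dt) / ds))

# Quick numeric check with illustrative values: for ds > 0 the expression
# equals the Euclidean form sqrt(ds^2 + (z0 - dt)^2).
s0, t0, z0 = 0.0, 0.0, 5.0
s, t = 0.8, 0.3
assert math.isclose(edge_depth(s, t, s0, t0, z0),
                    math.hypot(s - s0, z0 - (t - t0)))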
Now we are well equipped to derive the sampling arc, given any edge and the sampling circle. We have discussed before, while defining the sampling arc, that there is no one-to-one mapping between the actual points on the object and the pixels seen on the image plane. It is clear that the maximum visibility for the surface associated with the edge is obtained when seen from the prime sample point. However, as we shall see further, this maximal visibility extends to a certain span on either side of the prime sample point on the circumference of the sampling circle, giving us the sampling arc, within which the best possible view of the edge can still be maintained.
Our aim is to find a sampling arc such that the same pixel resolution as seen from the prime sample point is maintained. It can be seen from Figure 3.7 that, as we move away from the prime sample point, the number of points on the edge projected to a pixel on the view plane will reduce. Suppose, when viewed from the prime sample point, we clearly establish the largest segments on the edge which map to at most two pixels on the view plane; we would end up dividing the edge into several overlapping segments. Given that the length on the image plane that these segments correspond to keeps reducing as we move away from the prime sample point, we reach a stage where at least one of these segments corresponds to less than two pixels on the image plane. It is clear that this is the point that defines the end point of the sampling arc. From Figure 3.8, we infer that the further right a segment lies on the edge, the greater the reduction of its size on the image plane upon moving left. Also, given that the rightmost and leftmost segments of the edge are the smallest, it is evident that one of these would be the first of the segments to correspond to less than two pixels on the image plane. We call these the critical segments of the edge.
Hence, the problem of finding the sampling arc boils down to finding the points on the sampling circle where the two critical segments occupy 2 pixels. Before going further, we need to acquaint ourselves with two camera-dependent parameters. The pixel size ∆p is the length of one pixel; its corresponding edge segment is ∆pw. The camera distance f is the distance between the camera's lens and the view plane.

We now term the pixel occupancy function, FW, as the number of pixels occupied by the critical segments. Given that an edge has a right and a left critical segment, we have a right and a left pixel occupancy function, FWR and FWL. The sampling arc is thus determined by the set of points (s′, t′) such that FWR(s′, t′) > 1 and FWL(s′, t′) > 1. Solving for this, the left end point of the sampling arc is determined by the equation: