AN EFFICIENT APPROACH
TO LAYERED-DEPTH IMAGE BASED RENDERING
RAVINDER NAMBOORI
NATIONAL UNIVERSITY OF SINGAPORE
2003
AN EFFICIENT APPROACH
TO LAYERED-DEPTH IMAGE BASED RENDERING
RAVINDER NAMBOORI (B.Comp (Hons.), NUS)
A THESIS SUBMITTED FOR THE DEGREE OF MASTER OF SCIENCE
SCHOOL OF COMPUTING NATIONAL UNIVERSITY OF SINGAPORE
2003
ACKNOWLEDGEMENTS
I would like to sincerely thank A/P Teh Hung Chuan and Dr Huang Zhiyong, my project advisors, for their continual support and guidance throughout my research. Their assistance, patience, warmth and constant encouragement have been invaluable to this research. My thanks to Dr Chang Ee Chien and Mr Low Kok Lim for their helpful suggestions.
I am extremely grateful to Mr Chong Peng Kong for his time and help with the lab apparatus. This project wouldn't have been possible without his willingness to help at any moment and his readiness to ensure that all is well with my work.
My special thanks to Mr Sushil Chauhan, for his help in better formulating the sampling arc functions.
Ravinder Namboori
Oct 2003
SUMMARY
There exist many computer graphics techniques to synthesize 3-D environments, of which Image Based Rendering (IBR) techniques are becoming increasingly popular. In this thesis we concentrate on improving one such IBR technique, viz. Layered Depth Images (LDI). This technique, like many other IBR techniques, works on a set of pre-acquired imagery to model the world, and problems have often been encountered in determining exactly how to decide on this pre-acquired set of sample images. As the quality of the synthetic view is governed by the initial stages of sampling, addressing this problem can enhance the result achieved by the eventual rendering engine.
This research presents a new approach to rendering an LDI, by adaptively sampling the raw data based on the determined set of sample parameters. This approach eliminates the redundancy caused by over-sampling, and removes the hole artefact caused by under-sampling. In addition, the rendering speed of the LDI is improved by the pre-computed visibility graph and patch lookup table.
LIST OF FIGURES
Figure 1.2 Framework of the Layered Depth Image based rendering System
Figure 3.3 Contour Formation, Sampling and Visibility Graphs
Figure 3.6 From a group of Rectangle Patches to a 2-D contour
Figure 5.1 (a) Synthetic Views generated by the improved system – Mannequin
Figure 5.1 (b) Synthetic Views generated by the improved system – Pooh Bear
Figure 5.2 Statistical information for the improved system
Figure 5.3 (a) Synthetic Views generated by the sparsely sampled LDI system (without splatting) – Mannequin
Figure 5.3 (a) Synthetic Views generated by the sparsely sampled LDI system (without splatting) – Pooh Bear
Figure 5.3 (b) Synthetic Views generated by the sparsely sampled LDI system (with splatting) – Mannequin
Figure 5.3 (b) Synthetic Views generated by the sparsely sampled LDI system (with splatting) – Pooh Bear
CHAPTER 1 INTRODUCTION
1.1 Documentation Layout
For the purpose of easy readability, the content has been divided into seven chapters. This chapter, Chapter 1, is an introduction to the research as a whole, an introduction to the various phases of the research, as well as the nature of this project. We shall highlight the problem statement and the overall system framework in this chapter.
Chapter 2 covers an overview of the related work in the area to date. Included in this chapter is a brief description of the various researches and techniques in the area of Image Based Rendering and Layered Depth Images in particular, sampling methods, and automatic camera placement techniques.
Chapter 3 highlights the proposed improvement to the Layered Depth Image system by adaptively sampling the reference images and pre-computing the patch lookup table. Also discussed in this chapter are the derivations and assumptions leading to the essential steps involved in the system framework.
Chapter 4 is an elaboration of the implementation of the system and the sampling issues involved in the research. This section takes a methodological approach to exemplify the steps involved in demonstrating the proposed method of improving the Layered Depth Image system.
Chapter 5 discusses the results achieved by the implementation of the proposed method. In this chapter, we go through the various examples used and the outputs obtained using our system, and contrast the results with those achieved by an earlier framework, which does not include the proposed improvements.
Chapter 6 concludes this thesis, discussing the lessons learnt from this research and restating the goals achieved and the solution proposed and implemented.
Chapter 7 addresses the future prospects of research in this area, and wraps up the report with a final word.
1.2 Image Based Rendering and the Sampling Problem
The traditional approach to synthesizing realistic images of virtual environments involves modeling the environments using a collection of 3-D geometrical entities with their associated material properties, and a set of light sources. Then, rendering techniques such as radiosity and ray tracing are used to generate the images at given viewpoints. The realism of such rendered images is limited by the accuracy of the description of the primitive material and illumination properties and by hand-coded or mathematically derived graphical models. Also, real-time rendering using this technique depends heavily on the complexity of the scene geometry and the hardware configuration.
Computer Vision, on the other hand, can be considered an inverse process of computer graphics, which recovers 3-D scene geometry from 2-D images. Extracting the 3-D geometry of a scene usually requires solving difficult problems such as stereo vision and depth from shading, or using expensive rangefinders. From the 3-D geometry recovered, approximate 3-D models are constructed, from which new images can be synthesized. However, these reconstruction techniques are usually computationally expensive, and the reconstructed models suffer from a lack of accuracy.
Image Based Rendering is an emerging field which counters these limitations. In this technique, new images and 3-D worlds can be modeled without knowledge of the geometry of the scene involved. Realism is achieved by the fact that the basic entities of the 3-D environment are no longer polygons or geometries, but pre-acquired images. Fig 1.1 depicts the process of Image Based Rendering. As can be seen, it has emerged from both the fields of Computer Graphics and Computer Vision, and yet bypasses the complicated and limiting stage of defining the scene's geometry. It also shows that tedious 3D shape modeling can be avoided and little or no knowledge of the 3D shape of the scene is required.
In addition, what is highlighted in Fig 1.1 is the new step of sampling, which dictates how different the Modeled Synthetic World is going to be in comparison to the Real World. As the quality of the synthetic view is now governed by the reference images at our disposal and not by any 3D geometry, the initial stage of sampling becomes of paramount importance. The sampling problem is to determine where the scene needs to be sampled from, and how many such samples are required to adequately sample the scene. It is important that a robust solution be formulated for the problem of sampling and determining the exact set of reference imagery required in rendering the 3D world.
Figure 1.1: Model of Image Based Rendering
Tackling the sampling problem isn't as straightforward as over-sampling, as that would result not only in redundancy of the sampled data, but also in an increased amount of time to re-render the synthetic view. On the contrary, an attempt to under-sample, even if followed by stages of splatting, compromises the realism of the modeled world, often leaving holes, visual artifacts, or portions of synthetically splatted patches.
A lot has been researched in the field of Image Based Rendering since its emergence a few years back. Essentially, Image Based Rendering is about creating new photo-realistic images of complex scenes through interpolation techniques or other computations based on input data from photographs, drawings and rendered virtual scenes. There are various techniques to model a 3-D world using pre-acquired imagery, viz. Layered Depth Images, Lumigraph/Light Field, Panorama, View Morphing etc. All these techniques differ slightly in striking a balance between the computation involved in generating new views and the size of the sampled database. Irrespective of the approach, the stage of sampling is indispensable to its framework. The availability of geometry, as in the case of methods like
Layered Depth Images, provides ample opportunity to precisely select and limit the reference imagery. As for the other approaches, splatting and other techniques to compensate for under-sampling have been seen as a possible alternative. We shall overview these techniques in Chapter 2, under Overview of Related Work.
1.3 Problem Statement and Research Scope
This research is focused on one area of the huge field of Image Based Rendering, viz. Layered Depth Images. The aim of this research work is to enhance the realism and hasten the generation of the views achieved by the standard way of Layered Depth Image based Rendering, by adaptively sampling the reference images and pre-computing a patch lookup table. The idea is to introduce a filtering stage after densely over-sampling the real world. The filtering stage, like the sampling stage, being a part of the pre-rendering phase, helps the rendering engine by pulling out additional computations and saving precious time while rendering. This method would not only improve the quality of the synthetic images generated, in terms of getting rid of holes and occlusion artefacts, but will also enable quick generation of images, owing to the tabulation of the required sampled imagery that is acquired.
The major challenge is to effectively compute the required reference viewpoints from the dense sample to eliminate possible loss or redundancy of data To create a compelling sense of virtual presence, the following goals must be achieved:
• Users can interactively navigate through the 3-D environment, without hardware acceleration
• The photo-realism of the environment ought not to be compromised in terms of holes or other synthetic occlusions
• The speed of the rendering pipeline should be unaffected by the fact that the sampled imagery is bigger than a sparsely sampled image set
The sampled images are assumed to be taken under white light and with an ideal pinhole camera (no lens distortion).
The contributions of this research include:
• Implementation of a Layered Depth Image framework that enables rendering of complex 3-D environments, catering for the absence of holes or visual artefacts in the modeled world
• An efficient approach to tabulate the pre-acquired set of imagery, to ensure fast reference view selection and rendering of the synthetic views
• A method which retains the realism of the 3-D environment, through dense samples of the real world, and yet achieves a rendering engine which is as fast as a sparsely sampled LDI system
1.4 The System Framework
The original Layered Depth Image system is essentially classified into 3 main phases: Scene Sampling, Scene Geometry and Photometry Extraction, and Scene Resampling. The system framework of this Layered Depth Image based rendering approach is depicted in Fig 1.2.

Figure 1.2: Framework of the Layered Depth Image based rendering System
We work on the first 3 stages of this framework, the so-called pre-computation phase, and modify the framework as depicted in Fig 1.3. We briefly go over these new stages of the framework in the sections to come.

Figure 1.3: Our System Framework
1.4.1 Patch Identification and Rectangularisation
The surface normals at every point are calculated based on the range information (for more information on the calculation of surface normals, please refer to Chapter 4, Section 4.3.2). The color and range maps, along with the normals, constitute our sampled point cloud. From this point cloud, this step attempts to identify the uniform patches: surfaces that are not uneven, that fit in the camera's field of view, and that are defined by certain patch constraints. Based on these constraints, the whole point cloud is divided into smaller uniform regions called patches. These patches, owing to the constraints thus applied, have a close-to-constant third dimension. The patches are then rectangularised, a recursive process which applies a greedy algorithm to extract the largest rectangle in the patch. At the end of this stage, we have categorized the point cloud into rectangular patches, ones that can be summarized as a line in two dimensions when viewed from the top. This process is explained and discussed in detail in Chapter 3.
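The details of the normal computation are deferred to Section 4.3.2. Purely as a rough illustration, one common way to estimate normals from a back-projected range map is the cross product of finite-difference tangents; the sketch below assumes an H x W x 3 array of back-projected points (the function name and data layout are assumptions, not the thesis's code).

import numpy as np

def normals_from_range(points):
    # points: H x W x 3 array of back-projected (x, y, z) positions,
    # one per pixel of the color/range map (assumed layout).
    du = np.gradient(points, axis=1)   # tangent along image columns
    dv = np.gradient(points, axis=0)   # tangent along image rows
    n = np.cross(du, dv)               # vector orthogonal to the local surface
    norm = np.linalg.norm(n, axis=2, keepdims=True)
    return n / np.maximum(norm, 1e-12)  # unit normals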
1.4.2 Contour Formation
The output of the previous stage, viz. the rectangle patches, is fed into this part of the pipeline, in an attempt to identify unique 2D contours along the vertical axis. The aim of this stage is to identify those parts in the vertical space which, when viewed from the top, look as if they were in a single plane. This stage subsequently summarizes these parts of the object as a 2D contour associated with a particular vertical range.
1.4.3 Identifying Visibility and Sampling Regions
In this step, we find the visibility and sampling regions for each of the contours thus found. The visibility region for a particular edge in a 2D contour is defined as that region from which the whole edge is visible if there is no occlusion. A sampling region for a particular edge is defined as the region from which it is appropriate to sample that particular edge, ensuring that all of the data visible on the edge is captured. A sampling region is determined by formulae dependent on the size of the edge, the camera calibration and the sampling camera trajectory.
1.4.4 Patch Lookup Table for Reference View Selection
This last step, despite being outside the scene-sampling phase and being a part of rendering, is worth mentioning at this point because of the organization of the data in the prior phases. Given the structured organization of the sampled points of the original data, the selection of reference views to render while generating a new synthetic view becomes straightforward.
During rendering, the need to look for the closest reference viewpoints, or for the reference views which cover a given occluded region, is overcome by the fact that these considerations have been addressed during the pre-computation phase of determining the required set of sampled imagery. Hence, reference view selection becomes a fast and straightforward procedure of looking up the hash table of patches thus created for the relevant reference data, as the viewer moves around the 3D space.
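Chapter 3 details the actual table; purely as a hypothetical sketch of the idea, the table can be pictured as reference patches binned by viewing angle on the sampling circle, so that view selection at render time is a constant-time lookup (the bin width, names and angular key scheme below are all assumptions, not the thesis's implementation).

BIN_DEG = 5  # angular bin width on the sampling circle (assumed value)
N_BINS = 360 // BIN_DEG

def bin_of(angle_deg):
    return int(angle_deg // BIN_DEG) % N_BINS

patch_table = {}  # bin index -> list of patch ids to warp

def register_patch(patch_id, arc_start_deg, arc_end_deg):
    # Insert a patch into every bin its sampling arc covers.
    b = bin_of(arc_start_deg)
    while True:
        patch_table.setdefault(b, []).append(patch_id)
        if b == bin_of(arc_end_deg):
            break
        b = (b + 1) % N_BINS

def patches_for_view(camera_angle_deg):
    # Constant-time reference selection as the viewer moves.
    return patch_table.get(bin_of(camera_angle_deg), [])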
CHAPTER 2 OVERVIEW OF RELATED WORK
Image based rendering techniques have been classified into four distinct categories: pixel based, block based, reconstruction based and mosaicing [Kang, 1997]. These categories are not necessarily mutually exclusive. Also, there exists a different categorization, i.e. Rendering from Interpolation of Dense Samples, Panorama based Rendering, Morphing, and Depth based Rendering. These techniques vary largely in the knowledge of the geometry of the scene and the number of samples of the scene. We shall restrict our domain to the Depth based Rendering model, to be more specific, Layered Depth Image Based Rendering.
The depth based rendering model exploits the additional data available in terms of the 2D image samples being images with depths. These so-called depth images, in addition to having the color values at a particular pixel, also contain the depth information at that location. Synthetic images for new viewpoints are created by a re-projection of the depth pixels in the reference depth images [Lee, 1998]. Layered Depth Image Based Rendering is an extension of the depth based rendering model, which performs warping from an intermediate representation called a Layered Depth Image (LDI) [Shade et al., 1998]. An LDI is a view of the scene from a single input camera view, but with multiple pixels along each line of sight. An LDI is constructed by warping n depth images into a common camera view.
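The layered structure can be pictured as a per-pixel, depth-ordered list of samples. The following is a minimal sketch of such a structure (field names are assumptions; the LDI of Shade et al. additionally stores attributes used for splat-size estimation).

from dataclasses import dataclass, field

@dataclass
class DepthPixel:
    color: tuple          # (r, g, b)
    z: float              # depth along this line of sight
    normal: tuple = None  # optional surface normal, used for splat sizing

@dataclass
class LayeredDepthImage:
    width: int
    height: int
    # one list of DepthPixels per (x, y), ordered front to back
    layers: list = field(default_factory=list)

    def __post_init__(self):
        self.layers = [[[] for _ in range(self.width)]
                       for _ in range(self.height)]

    def insert(self, x, y, pixel: DepthPixel):
        # keep each line of sight sorted by depth
        los = self.layers[y][x]
        los.append(pixel)
        los.sort(key=lambda p: p.z)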
This chapter surveys the various techniques employed to get around the sampling problem for the LDI based rendering method discussed in the previous chapter. While some of these techniques look at remedying the damage caused by the problem, like splatting the holes during rendering, others attempt to find the best next view to sample, assuming the first sample was ideal. All these techniques aim to exploit the geometrical knowledge to improve the photo-realism of the synthetically generated scene.
2.1 Splatting
Splatting is a technique which aims to remedy the effects of the sampling problem. The Layered Depth Image, which is created from uniformly sampled images, is splat into the output image by estimating the projected area of the warped pixels [Shade et al., 1998]. This estimate is computed differentially based on the distance between the sampled surface point and the LDI camera, the field of view of the camera, the dimensions of the LDI, and the angle between the surface normal at the sampled surface point and the line of sight to the LDI camera.
As splatting is a post-sampling step, care has to be taken that it doesn't slow down the rendering engine. To this end, a lookup table is generated. Before rendering each new image, the new output camera information is used to pre-compute the lookup table.
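As a rough sketch of the dependencies just listed, and not Shade et al.'s exact formula (which folds most of these factors into per-frame table lookups), the projected size of a warped pixel grows with source distance and output resolution and shrinks as the surface turns away from the source view; the factorization and names below are assumptions.

import math

def approx_splat_size(d_src, d_out, cos_src, cos_out,
                      res_src, res_out, fov_src, fov_out):
    # d_*   : distance from the surface point to the source/output camera
    # cos_* : cosine of the angle between surface normal and line of sight
    # res_* : image resolution (pixels along one axis)
    # fov_* : field of view in radians
    # Illustration only: an area ratio combining the quantities the text
    # lists, not the published splatting formula.
    solid_angle_ratio = (d_src / d_out) ** 2
    foreshortening = cos_out / max(cos_src, 1e-6)
    sampling_density = (res_out * math.tan(fov_src / 2)) / \
                       (res_src * math.tan(fov_out / 2))
    return solid_angle_ratio * foreshortening * sampling_density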
2.2 Multi-Resolution Sampling
Multi-Resolution sampling attempts to get around the problem of over-sampling, or of sampling for various camera distances, by sampling sets of images at different resolutions. While splatting and meshing are proposed to deal with the disocclusion artifacts, they are seemingly adequate only for post-rendering warping, in which the resolution of the current view does not deviate much from the resolution of the reference image.
In cases where an LDI is created from reference images not at similar distances from the object under consideration, an insufficient sampling rate of the LDI might cause the synthetic view to look blurrier than it does in the reference image closer to the object. On the contrary, an excessive sampling rate of the LDI might slow down the rendering pipeline.
The LDI Tree method [Chang et al., 1999] employed a hierarchical partitioning scheme with the concept of the LDI, which preserves the sampling rate of the reference images by adaptively selecting an LDI from the LDI cluster for each pixel. In another approach, an L-System was implemented which could store images of varying resolutions at different nodes of the L-System for effective tree modeling [Lluch et al., 2004].
2.3 Sampling all Visible Surfaces
As the name suggests, in this technique of sampling all visible surfaces, an attempt is made to record a series of images that, collectively, capture all visible surfaces of the object. This technique revolves around the selection of a good heuristic method to find a good set of viewpoints for a given geometric model. The goal is to have sampled images from the computed viewpoints such that every visible surface is shown at least once. One such heuristic is to segment the object to exemplify hierarchical visibility [Stuerzlinger, 1998]. The scene is assumed to be a set of surface polygons organized in a hierarchy. The hierarchical visibility method subdivides the scene hierarchy depending on the relative visibility of objects.
Yet another heuristic is to cover all possible surfaces, masking reference images as each surface is considered [Fleshman et al., 1999]. In this approach, the set of scene polygons visible from a viewing zone is approximated, and then a greedy algorithm is employed to select a small number of camera positions that together cover every polygon in the geometric model. Towards this goal, the boundary of the walking zone is first tessellated. Scene polygons are subsequently subdivided to reduce the likelihood of visibility problems. The visibility and quality of the subdivided sections of the polygons determine the worth of any reference image.
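The greedy selection step can be sketched as a standard set-cover loop (a minimal sketch: the visibility test and the quality weighting of [Fleshman et al., 1999] are abstracted into a precomputed visibility map, and all names are assumptions).

def greedy_camera_cover(visible, all_polygons):
    # visible: dict mapping candidate camera position -> set of polygon
    # ids it sees (assumed precomputed by a visibility test)
    uncovered = set(all_polygons)
    chosen = []
    while uncovered and visible:
        # take the candidate that covers the most still-uncovered polygons
        best = max(visible, key=lambda cam: len(visible[cam] & uncovered))
        gain = visible[best] & uncovered
        if not gain:
            break  # remaining polygons are not visible from any candidate
        chosen.append(best)
        uncovered -= gain
    return chosen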
2.4 Best Next View Sampling
The best next view problem is that of selecting the next view for the sampling system to acquire, given some already acquired views of the object. Two criteria are often considered in solving this problem. The visibility criterion attempts to maximize the number of surfaces not seen thus far by adding the next image to the sampled set, while the quality criterion aims to improve the quality of the surfaces sampled. The quality criterion prioritizes an image which samples a decent number of surfaces, covering most areas of these surfaces, over an image which samples many surfaces but obliquely.
Several methods are available in this form of sampling, most of them differing in the way they establish the two criteria mentioned. In one particular approach, a volumetric representation, termed the voxelmap, is generated at each cycle of best next view computation [Massios, Fisher, 1998]. The voxels thus scanned are marked empty, seen, unseen, or as an occlusion plane, depending on the visibility from the new view. The seen voxels carry a quality property, which is estimated from the aggregate normals of all the points sampled in a particular voxel.
In yet another approach, each range image sampled thus far is approximated by a triangular mesh [Garcia, 1998]. The resolution of the triangular mesh determines the minimum distance that can be distinguished during the exploration process. The edges in these triangular meshes are marked as exterior, occlusion or interior, depending on whether they bound a region, whether they are susceptible to occlude surfaces of the scene, or whether they are formed by an overlap of two exterior edges. The quality criterion is satisfied by a voting mechanism. Each occlusion edge has an associated normal histogram and a tangent histogram. Every cell of the histogram keeps the sum of all the associated normals. The cell which has received the maximum number of votes in either histogram is then looked up.
2.5 Sampling Issue for other Rendering Techniques
Though drifting slightly from the area of concentration, interesting techniques have been researched in two other forms of Image based rendering.
In Point based rendering [Grossman, Dally, 1998], an attempt is made to ignore the issue of adequate sampling during rendering. The problem is dealt with during the phase of sampling, by suggesting that, to minimize the number of samples that adequately sample an object, the distance between adjacent samples on the surface of the object should be as large as possible, but less than the pixel side length at the target resolution, assuming unit magnification. An equilateral triangle mesh is used for this purpose.
For Lumigraph/Light Field Image based rendering techniques [Gortler et al., 1996], a spectral analysis of light field signals combined with the sampling theorem is used to derive the analytical functions that determine the minimum sampling rate [Chai et al., 2000]. The minimum sampling rate is obtained by compacting the replicas of the spectral support of the sampled light field within the smallest interval. As it is known that the spectral support of a light field signal is bounded by the minimum and maximum depths only, no matter how complicated the spectral support might be because of the depth variations in the scene, a reconstruction filter with an optimal constant depth can be designed to achieve anti-aliased rendering.
CHAPTER 3 THE PROPOSED IMPROVEMENT TO THE LDI SYSTEM
It is established that a Layered Depth Image is a view of the scene from a single input camera view, but with multiple pixels along each line of sight. It is a comprehensive image data structure, built to take into account various artifacts like occlusions and holes, by storing not just the first layer of pixels but also a few layers along each line of sight. Unfortunately, this comprehensive LDI framework's ability to render photo-realistic views, devoid of holes and other visual artefacts, is highly dependent on the nature of the reference images sampled: to be precise, the number of reference images sampled and the positions from which they are sampled.
The sampling problem is to determine where the scene needs to be sampled from and how many such samples are required to adequately sample the scene. It is worthwhile to note that under-sampling results in visual artefacts. On the contrary, over-sampling helps get around the problem of visual artefacts, but at the cost of rendering speed. There is no one fixed scheme to adequately sample the LDI, as the occlusion artefacts and holes are largely dependent on the scene's geometry. Hence we adopt the approach of adaptively sampling the scene based on the scene's available geometrical data.
In this chapter we shall discuss a method to adaptively sample a layered depth image, based on the geometrical information at our disposal. We shall elaborate on how the LDI system is improved by this change in the sampling phase and by the patch lookup table generated during the pre-rendering phase. In the chapters to follow, we shall go over a sample implementation and the results observed using this model.

Figure 3.1: Adaptive Sampling Pipeline
3.1 Brief Overview
Uniform sampling is the simplest alternative, preferred by most LDI engines to date, and the inadequacy of the sampling is remedied by methods like splatting, discussed in the previous chapter. The density of uniform sampling affects the quality of the output. Sparse uniform sampling results in visual artefacts, while dense uniform sampling ends up with a slow re-rendering pipeline.
We approach the adaptive sampling method by starting with a highly dense uniform sample set and adaptively filtering out redundant data, retaining only the adequate information. Figure 3.1 depicts the adaptive sampling pipeline.
Figure 3.2: Patch Categorization

The step of patch categorization first categorizes the uniformly sampled images into uniquely defined patches. These patches help in guiding the sampling. The next step forms contours with these patches and identifies the sampling regions required. Eventually, the exact reference points to sample from are deciphered, and the unnecessary data is disposed of.

Figure 3.3: Contour Formation, Sampling and Visibility Graphs

Figure 3.4: Re-rendering Engine

Figure 3.2 elaborates on the step of Patch Categorization, and Figure 3.3 depicts the stage of formation of contours and the identification of sampling and visibility regions thereon. Figure 3.4 highlights the effect of the adaptive sampling method on the final re-rendering engine. In the sections to follow, we shall go through each of these steps in detail, defining and discussing the theories and considerations.
3.2 Patch Categorization
This step marks one of the most critical steps in the method of adaptive sampling, as it is in this step that we go from just a point cloud to a representation which, though not as detailed as a triangle mesh, is still informative enough for us to understand the geometry of the scene and proceed to adaptively sample the object. It is necessary to clarify at this point that the patches thus formed are only for guiding the stage of sampling. The eventual rendering is still from the originally captured data.
3.2.1 Patch
Before going any further with the procedure of patch categorization, we explain the concept of a patch in the context of data sampling. A patch is regarded as any uniform surface on the object (a surface without uneven bumps) which can wholly fit into the field of view of the camera under consideration.
The purpose of defining a patch is to be able to summarize the geometry of the object in a plane in 2 dimensions, so as to ensure that we get a rough sketch of the uniform sections of the object. It is worthwhile to note that a crude way of expressing adaptive sampling is to say that more samples are needed for areas of the object which are not too uniform (occluded by parts of the same surface or different surfaces), and fewer samples for those sections of the object which are fairly smooth. Hence the need to be able to clearly distinguish these various sections of the object.
3.2.2 Patch Constraints
Having gone through a layman's definition of a patch and its purpose, in this subsection we attempt to formally define a patch. A patch is formally defined as all neighboring points in the point cloud which satisfy the following constraints (for more information on the point cloud and the attributes of the points therein, please refer to the next chapter, Section 4.3, titled Other Issues):

a) The normals of any two neighboring points, in the spherical co-ordinate system, don't differ by more than a preset δnϕ and δnθ.

b) The normals of the two extreme points of the patch, in the spherical co-ordinate system, don't differ by more than a preset ∆nϕ and ∆nθ.

c) The Z values of any two neighboring points don't differ by more than a preset δz. This ensures that areas on two objects which have a smooth transition, but are placed apart in the viewing direction, don't end up being called a patch.

d) The size of a patch, both horizontally and vertically, never exceeds the maximum size that the field of view, θ, of the camera permits at that depth. This is calculated as follows.

Taking a top view and denoting the size of the patch in any scan line as an edge, let the first and last points of this edge be Smax and Smin. Suppose the orthogonal bisector of the edge intersects the sampling circle at a point, and let the distance from the midpoint of the edge to that point on the circle be denoted as D. (The sampling circle will be explicitly defined in Section 3.3.2.)

Figure 3.5: Patch Size Constraint (top view)

The size d of the patch must then satisfy:

d ≤ 2D tan(θ/2)   (3.1)

The same criterion applies for the vertical extent, with the corresponding angle, ϕ. Given the point cloud and these constraints, the patches more-or-less as defined in Section 3.2.1 are obtained and the whole point cloud is categorized.
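Purely as an illustration of how these constraints might be evaluated while growing a patch (the threshold values, point attributes and the helper methods on the patch object are all assumptions, not the thesis's implementation):

import math

# Assumed preset thresholds (radians / depth units); values are illustrative.
D_N_PHI, D_N_THETA = 0.05, 0.05     # δnϕ, δnθ: neighbor normal deltas
MAX_N_PHI, MAX_N_THETA = 0.3, 0.3   # ∆nϕ, ∆nθ: patch-wide normal deltas
D_Z = 0.01                          # δz: neighbor depth delta
FOV = math.radians(40)              # θ: camera field of view

def can_join(point, neighbor, patch):
    # Check constraints (a)-(d) before adding `point` next to `neighbor`.
    # Points are assumed to carry (n_phi, n_theta) spherical normal angles
    # and a z depth; `patch` is assumed to track its extreme normals, its
    # extent along the scan line, and its distance D to the sampling circle.
    ok_a = (abs(point.n_phi - neighbor.n_phi) < D_N_PHI and
            abs(point.n_theta - neighbor.n_theta) < D_N_THETA)
    ok_b = (patch.normal_span_phi(point) < MAX_N_PHI and
            patch.normal_span_theta(point) < MAX_N_THETA)
    ok_c = abs(point.z - neighbor.z) < D_Z
    ok_d = patch.extent_with(point) <= 2 * patch.D * math.tan(FOV / 2)
    return ok_a and ok_b and ok_c and ok_d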
3.2.3 Rectangularisation of Patches
We stated at the end of the previous subsection that we have categorized the point cloud into patches which are "more-or-less" as defined earlier. The reason why these patches are still not exactly as we defined them is that, though the patches are uniform and, if seen from the top or bottom, look like they occupy only 2 dimensions, they still have a non-uniform shape in the 2-dimensional plane. We hence attempt to break down these patches into patches of a shape that can be easily summarized in one dimension, as a line. We choose a rectangle for our convenience, and for the fact that it can be reduced to a line along a scan line.
In order to break the patches obtained so far into rectangles, we follow a step called the rectangularisation of patches. In this step, for each of the patches thus identified, we find the biggest rectangle that can fit into it, and cut that portion out. We follow the same procedure for the remaining area in the patch, until we are left with areas which are smaller than a preset area. All the portions cut out from the patch in the process are rectangularised components of the patch, hence the term rectangularised patches.
It can hence be seen that a single patch obtained by patch categorization may later end up as a few rectangularised patches.
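A minimal sketch of this recursive cut follows, assuming the patch is given as a boolean occupancy mask. The largest-rectangle search shown is the classic histogram-of-heights method, which the thesis does not name, so this is one plausible realization rather than the author's code.

import numpy as np

def largest_rectangle(mask):
    # Return (top, left, height, width) of the largest all-True rectangle
    # in a 2-D boolean mask, using per-row histograms of column heights.
    h = np.zeros(mask.shape[1], dtype=int)
    best = (0, 0, 0, 0)  # top, left, height, width
    for row in range(mask.shape[0]):
        h = np.where(mask[row], h + 1, 0)       # column heights ending here
        stack = []                              # (start index, height) bars
        for col, height in enumerate(list(h) + [0]):
            start = col
            while stack and stack[-1][1] >= height:
                start, hh = stack.pop()
                if hh * (col - start) > best[2] * best[3]:
                    best = (row - hh + 1, start, hh, col - start)
            stack.append((start, height))
    return best

def rectangularise(mask, min_area=16):
    # Greedily cut out the largest rectangle until leftovers are tiny.
    rects = []
    mask = mask.copy()
    while True:
        top, left, hh, ww = largest_rectangle(mask)
        if hh * ww < min_area:
            break
        rects.append((top, left, hh, ww))
        mask[top:top + hh, left:left + ww] = False  # remove the cut portion
    return rects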
3.2.4 Patch Merging
So far, we have defined and discussed how to theoretically get patches from the point cloud. Practically, storing the entire sampled point cloud in the program's heap may not be feasible, owing to the highly dense sampling. A simpler and more practical approach is discussed in this subsection.
We consider one reference image at a time, and find the patches from its point cloud. As the size of an individual image is much smaller, the heap constraint is no longer applicable.
Trang 33As patches are being formed, they are checked for overlaps with patches found from previous reference images
Merging the patches is done by isolating the various vertical cross-sections, and concentrating on the patches that overlap If we are to find two sections of a patch that need to be merged, we can be assured that they would have an overlap, owing to the high density of the samples Care should be taken to make sure that the patch, as it’s being merged, still satisfies the patch constraints, over the border and as a whole
Redundant patch areas can be found and dropped when patches are found which overlap by an extent greater than the approximation involved in the step of rectangularisation. The section of the patch that is not redundant (which is now definitely smaller than the maximum size a patch permits) can be merged with some other patch if need be.
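A sketch of the overlap test on two co-linear rectangularised patch edges, reduced to 1-D intervals along a scan line (the interval representation, tolerance and names are assumptions):

def merge_or_drop(existing, new, overlap_tol):
    # Resolve one pair of co-linear patch edges from consecutive reference
    # images, given as (start, end) intervals on a scan line. Overlap beyond
    # `overlap_tol` (the rectangularisation approximation) marks the
    # overlapping section of the new patch as redundant; the remainder is
    # kept for a later merge, subject to the patch constraints.
    overlap = min(existing[1], new[1]) - max(existing[0], new[0])
    if overlap <= 0:
        return new                      # disjoint: keep the new patch whole
    if overlap > overlap_tol:
        # drop the redundant section, keep only what extends past `existing`
        if new[1] > existing[1]:
            return (existing[1], new[1])
        if new[0] < existing[0]:
            return (new[0], existing[0])
        return None                     # fully covered: entirely redundant
    return new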
It is worthwhile to mention the effect of the sequence of the reference images on the eventual set of patches generated. The patch merging process is applied in a linear fashion, with the patches being created in the current reference image compared against the patches formed thus far, to check for an overlap. This makes the efficiency of the merging process reliant on which reference image is considered next. The reduction in data redundancy is independent of this decision; however, the number of patches generated is not. A recursive approach to make patch-merging independent of the sequence of the reference images is computationally expensive. Also, this is unnecessary in situations where there is a clearly defined sampling trajectory on which the reference images were captured. In our system implementation, the reference images are considered in the same order that they were captured during the sampling camera traversal on the sampling trajectory. This ensures maximum overlap between consecutive reference images, given that our sampling trajectory was a circle.

Figure 3.6: From a group of Rectangle Patches to a 2-D contour
3.3 Contour Formation, Sampling and Visibility Graphs
A contour literally stands for a 2-dimensional shape expressed as a line representation. Our definition isn't far from this meaning of contours. Having broken down the point cloud into rectangularised patches, we are now in a position to form a skeleton of the entire object.
3.3.1 Contour Formation
In the previous section, we discussed how we categorized the point cloud into a set of rectangle patches. In this section, we attempt to make a skeleton out of these uniform sections of the object.
Each patch obtained thus far is labeled with a number. A scan line traversal is now performed to see which patches are encountered at each scan line. At the end of this scan line traversal, we have a set of patches traversed for each scan line. We group all the scan lines with the same patch traversals together. Given these groups of patches, we try to make one contour to represent each group. The contour is basically the representation of what we would see when viewed from the top. Each patch, being close-to-constant in the 3rd dimension and a rectangle in 2 dimensions, can be represented as one line, as it would be when seen from the same plane in which it is present. Algorithm 3.1 depicts the pseudo code that summarizes the contour formation algorithm applied to the rectangularised patches.
Algorithm 3.1: Contour Formation

procedure FormContours(patch[])
    for k ← 0 to patch.size − 1   // find patch demarcations in the vertical direction
        demarcations.add(patch[k].min_y)
        demarcations.add(patch[k].max_y)
    end for
    sort(demarcations)
    for j ← 1 to demarcations.size − 1
        temp ← the patches spanning the band [demarcations[j−1], demarcations[j]]
        // form a contour with the patches in temp, applicable for y from
        // the previous demarcation to the current one
        createContour(temp, demarcations[j−1], demarcations[j])
    end for
end procedure

We hence have distinctly formed contours, representing various vertical segments of the object. Figure 3.6 depicts the formation of a contour from a simple group of rectangle patches.
3.3.2 Sampling Arc
Having broken down the point cloud into a few 2D contours, the initial problem of sampling now boils down to adequately sampling all the edges in each of these contours. In this context, we shall define the concept of a sampling arc. For any contour, we attempt to sample the edges from the circumference of a circle lying on the plane of the contour, with its center at the object's origin and a radius which defines how close we can get to the object during the camera walkthrough. We call this the sampling circle. We can have multiple concentric sampling circles for various resolutions.
For any edge, a sampling arc is defined as the arc of the sampling circle such that, from any point on that arc, the edge under consideration has maximum visibility. To understand the sampling arc better, we need to take a brief look at the concept of cameras and views.
Any camera has a view plane onto which any point seen from the camera is projected. When we see through the camera, or take images with the camera, what we see is a projection of the scene onto the camera's view plane. The number of pixels on the view plane does not necessarily have a one-to-one mapping with the number of actual points in the world coordinate system.
The sampling arc of any edge can now be defined as the arc of the sampling circle defined by all those points on the circle's circumference from which the geometric content of the pixels seen, as compared to what is seen from an orthogonal view of the edge, is unchanged. Figure 3.7 depicts a contour, the object's sampling circle, and the sampling arc of a particular edge of the contour, labeled "e1".

Figure 3.7: Sampling Arc

When an edge is sampled from a point on the sampling circle outside the sampling arc, we end up sampling fewer pixels than can be seen from points on the sampling arc. We call this phenomenon oblique sampling.
3.3.3 Determining the Sampling Arc
Having seen the definition and necessity of a sampling arc, we shall in this subsection see how to find the sampling arc, given an edge and the sampling circle.

Suppose we choose a Cartesian st-coordinate system for the sampling circle, with the center of the circle placed at its origin and the t-axis parallel to the edge. Before actually deriving the sampling arc, let us look at how the edge depth z varies with respect to the camera motion, as depicted in Figure 3.8. We define the edge depth as the distance between the midpoint of the edge and the camera placed on the sampling circle. The prime sample point (so, to) is defined as the point of intersection of the edge's orthogonal bisector and the sampling circle. For any given edge, its prime depth, zo, is defined as the edge depth when the camera is at the prime sample point. The edge depth function Fz, which is the edge depth as a function of the sampling point, can now be formulated as:
z′ = Fz(s′, t′) = ∆s / cos(tan⁻¹((zo − ∆t) / ∆s)),  if ∆s ≠ 0 or ∆t ≠ 0
z′ = Fz(s′, t′) = zo,  if ∆s = 0 and ∆t = 0
(where ∆s = s′ − so and ∆t = t′ − to)   (3.2)
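For ∆s > 0, the trigonometric form in (3.2) reduces to the Euclidean distance √(∆s² + (zo − ∆t)²) between the camera and the edge midpoint in the st-plane, which a short numeric check confirms (a sketch; the function and variable names are mine, and the expression as stated assumes ∆s ≠ 0):

import math

def edge_depth(s, t, s0, t0, z0):
    # Edge depth function Fz of equation (3.2): (s0, t0) is the prime
    # sample point, z0 the prime depth, (s, t) the camera position.
    ds, dt = s - s0, t - t0
    if ds == 0 and dt == 0:
        return z0
    return ds / math.cos(math.atan((z0 - dt) / ds))

# Quick numeric check with illustrative values: for ds > 0 the expression
# equals the Euclidean form sqrt(ds^2 + (z0 - dt)^2).
s0, t0, z0 = 0.0, 0.0, 5.0
s, t = 0.8, 0.3
assert math.isclose(edge_depth(s, t, s0, t0, z0),
                    math.hypot(s - s0, z0 - (t - t0)))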
Now we are well equipped to derive the sampling arc, given any edge and the sampling circle. We have discussed before, while defining the sampling arc, that there is no one-to-one mapping between the actual points on the object and the pixels seen on the image plane. It is clear that the maximum visibility for the surface associated with the edge is obtained when seen from the prime sample point. However, as we shall see further, this maximal visibility extends to a certain span on either side of the prime sample point on the circumference of the sampling circle, giving us the sampling arc, within which the best possible view of the edge can still be maintained.
Our aim is to find a sampling arc such that the same pixel resolution as seen from the prime sample point is maintained. It can be seen from Figure 3.7 that, as we move away from the prime sample point, the number of points on the edge projected to a pixel on the view plane will reduce. Suppose, when viewed from the prime sample point, we clearly establish the largest segments on the edge which map to at most two pixels on the view plane; we would end up dividing the edge into several overlapping segments. Given that the length on the image plane that these segments correspond to keeps reducing as we move away from the prime sample point, we reach a stage where at least one of these segments corresponds to less than two pixels on the image plane. It is clear that this is the point that defines the end point of the sampling arc. From Figure 3.8, we infer that the further right a segment lies on the edge, the greater the reduction of its size on the image plane upon moving left. Also, given that the rightmost and leftmost segments of the edge are the smallest, it is evident that one of these would be the first of the segments to correspond to less than two pixels on the image plane. We call these the critical segments of the edge.
Hence, the problem of finding the sampling arc boils down to finding the points on the sampling circle where the two critical segments occupy 2 pixels. Before going further, we need to acquaint ourselves with two camera-dependent parameters. The pixel size ∆p is the length of one pixel; its corresponding edge segment is ∆pw. The camera distance f is the distance between the camera's lens and the view plane.

We now term the pixel occupancy function, FW, as the number of pixels occupied by the critical segments. Given that an edge has a right and a left critical segment, we have a right and a left pixel occupancy function, FWR and FWL. The sampling arc is thus determined by the set of points (s′, t′) such that FWR(s′, t′) > 1 and FWL(s′, t′) > 1. Solving for this, the left end point of the sampling arc is determined by the equation: