DOI 10.1007/s11042-016-3392-4
A new depth image quality metric using a pair of color
and depth images
Thanh-Ha Le 1 · Seung-Won Jung 2 · Chee Sun Won 3
Received: 17 July 2015 / Revised: 17 February 2016 / Accepted: 23 February 2016
© Springer Science+Business Media New York 2016
Abstract Typical depth quality metrics require the ground truth depth image or a stereoscopic color image pair, which are not always available in practical applications. In this paper, we propose a new depth image quality metric that demands only a single pair of color and depth images. Our observations reveal that depth distortion is strongly related to the local image characteristics, which in turn leads us to formulate a new distortion assessment method for the edge and non-edge pixels of the depth image. The local depth distortion is adaptively weighted using the Gabor-filtered color image and aggregated into a global depth image quality metric. The experimental results show that the proposed metric closely approximates the depth quality metrics that use the ground truth depth or a stereo color image pair.
Keywords Depth image · Image quality assessment · Reduced reference · Quality metric
Seung-Won Jung
swjung83@dongguk.edu
Thanh-Ha Le
ltha@vnu.edu.vn
Chee Sun Won
cswon@dongguk.edu
1 University of Engineering and Technology, Vietnam National University, Hanoi, Vietnam
2 Department of Multimedia Engineering, Dongguk University, Pildong-ro 1gil, Jung-gu, Seoul 100-715, Korea
3 Department of Electronics and Electrical Engineering, Dongguk University, Pildong-ro 1gil,
Jung-gu, Seoul 100-715, Korea
1 Introduction
Depth images play a fundamental role in many 3-D applications [6, 17, 24, 25]. For example, depth images can be used to generate arbitrary novel viewpoint images by interpolating or extrapolating the images at the available viewpoints. In addition, high quality depth images open up opportunities to solve challenging problems in computer vision [13]. The depth image can be obtained either by matching a rectified color image pair, i.e., a stereoscopic image, or by using depth cameras. In stereo matching techniques [19, 29], inaccurate depth images are often produced because of occlusion, repeated patterns, and large homogeneous regions. Although the inherent difficulty of stereo matching can be avoided by using a depth camera [27], an inevitable sensor noise problem remains.
Owing to the widespread use of depth images, the quality assessment of depth images has become essential. One simple method of depth image quality assessment is to compare the depth image to be tested with its ground truth depth image [21]. This method corresponds to the full reference depth quality metric (FR-DQM), which can precisely measure the accuracy of the depth image. However, the ground truth depth image is not always attainable in practical applications. An alternative method is to evaluate the quality of the reconstructed color image obtained using the depth image. For example, the right-viewpoint image can be rendered by using the left-view image and the depth image, and the rendered image can be compared with the original right-viewpoint image [12]. However, such a color image pair is not always obtainable in depth-image-based rendering (DIBR) applications [1, 5].
In this paper, we introduce a new depth quality metric that requires only a pair of color and depth images and is thus a reduced reference DQM (RR-DQM). Here, we consider the color image as the reduced reference for the depth quality assessment. To formulate the depth quality metric, we investigate the effects of various sources of depth distortion and derive a local measurement using the Gabor filter [2] and the smallest univalue segment assimilating nucleus (SUSAN) detector [23]. The experimental results demonstrate that the proposed RR-DQM closely approximates the conventional DQMs that use the ground truth depth information or a stereo image pair.
This paper is an extended version of our conference paper [20]. Compared to [20], a more detailed description of the proposed metric is provided, along with extensive experimental verification. Moreover, the proposed metric is applied to a depth image post-processing technique to show its usefulness.
The rest of the paper is organized as follows. The proposed RR-DQM is described in Section 2. The experimental results and conclusions are given in Section 3 and Section 4, respectively.
2 Proposed depth quality metric
The new depth image quality metric is designed for the case when the depth image is not used in a stand-alone fashion but in combination with the color image. The combination of color and depth images is often required in multi-view and 3-D video applications, where the depth image is frequently used to render or synthesize color images at novel viewpoints. In such applications, since the same local distortion of the depth image does not equally affect the resultant color images, we need to consider the local distortion of the depth image jointly with the local characteristics of the color image.
Fig. 1 (a)-(b) Synthetic image pair and (c) ground truth depth image
For example, a pair of simple synthetic grayscale images of size 400×400 is shown in Fig. 1a and b. Here, the square region of Fig. 1a, including horizontal, vertical, and two diagonal edges, is left-shifted by 50 pixels, as shown in Fig. 1b. In other words, the pixels inside the square have the same horizontal disparity value, as shown in Fig. 1c. The black background is located at an infinite distance, i.e., its disparity value is zero. Note that the other directional disparities can be ignored when the two images are rectified [14]. To analyze the effect of depth distortion, we change the disparity values inside the square as shown in Fig. 2a and b. Precisely, one noisy row is generated using the zero-mean uniform random distribution with a variance of 10. The length of the row is the same as the width of the square, and the generated noisy row is added to all the rows in the square of the depth image, resulting in Fig. 2a. This simulates depth distortion along the horizontal direction; note that the depth values remain the same along the vertical direction. The depth image with distortion along the vertical direction can be produced in a similar manner, as shown in Fig. 2b.
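This distortion model is easy to reproduce. Below is a minimal NumPy sketch of the horizontal-direction case; the function name and the square-placement parameters are ours, while the zero-mean uniform noise with variance 10 and the row replication follow the text:

```python
import numpy as np

def distort_depth_rows(depth, x0, y0, size, variance=10.0, seed=0):
    """Add one zero-mean uniform noise row to every row of a square region,
    simulating depth distortion along the horizontal direction (Fig. 2a)."""
    rng = np.random.default_rng(seed)
    # A uniform distribution on [-a, a] has variance a^2 / 3.
    a = np.sqrt(3.0 * variance)
    noisy_row = rng.uniform(-a, a, size)  # one noisy row, reused for all rows
    out = depth.astype(np.float64).copy()
    # Broadcasting adds the same row everywhere, so each column keeps a
    # constant value along the vertical direction, as required by the text.
    out[y0:y0 + size, x0:x0 + size] += noisy_row
    return out
```

The vertical-direction counterpart (Fig. 2b) is obtained by generating a noisy column and broadcasting it across columns instead.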
Given a pair of color and depth images, the stereoscopic image can be obtained. Specifically, the pixels in one viewpoint image can be found from the pixels in the other viewpoint image, where the pixel positions are determined according to the disparity values in the depth image. From the synthesized color image, we can analyze the influence of depth distortions. From Fig. 2c and d, one can notice that the horizontal image edges are not seriously deteriorated. Since only horizontal disparity is assumed, distortions in different directions can change only the start and end positions of the horizontal edges. In other words, the local distortion of the depth image in the horizontal edge regions does not have a significant impact on the quality of the rendered image. It can also be found that the distortion in the rendered images is prominent when the depth value varies along the image edges. For example, the vertical image edges are severely damaged when the depth image has distortion along the vertical direction, as shown in Fig. 2b and d.
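A minimal forward-warping sketch of this rendering step is given below, assuming rectified views with purely horizontal, integer disparities; the sign convention (warping from left view to right view) and the hole handling are our simplifications, not the paper's rendering procedure:

```python
import numpy as np

def render_view(color, disparity):
    """Forward-warp a view by shifting each pixel horizontally by its
    disparity; disocclusion holes simply remain zero in this sketch."""
    h, w = color.shape[:2]
    out = np.zeros_like(color)
    for y in range(h):
        for x in range(w):
            xt = x - int(round(disparity[y, x]))  # left-to-right convention
            if 0 <= xt < w:
                out[y, xt] = color[y, x]
    return out
```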
From the above observations, it is found that the effect of local depth distortion is strongly dependent on the local image characteristics. Thus, the relation between the depth distortion and the image characteristics should be exploited to measure the quality of the depth image. Figure 3 shows the flowchart of the proposed RR-DQM, in which the Gabor filter [2] is used to weight the distortion differently according to the local image structures. In addition, the SUSAN edge detector [23] is employed to obtain the edge information of the image. In particular, the SUSAN detector is known to robustly estimate image edges and their directions. Of course, other edge detectors [15, 18] can be employed.
Fig. 2 Distorted depth images and rendered grayscale images: (a) depth image with distortion along the horizontal direction, (b) depth image with distortion along the vertical direction, (c) rendered left-view image obtained using Fig. 1b and (a), (d) rendered left-view image obtained using Fig. 1b and (b)
The Gabor filter is a close model of the receptive fields [7, 28] and is widely applied in image processing. Let g denote the kernel of the Gabor filter, defined as follows:
$$g(x, y) = \exp\left(-\frac{x_r^2 + \gamma^2 y_r^2}{2\sigma^2}\right)\cos\left(\frac{2\pi x_r}{\lambda} + \phi\right) \qquad (1)$$

$$x_r = x\cos\theta + y\sin\theta, \qquad y_r = -x\sin\theta + y\cos\theta \qquad (2)$$
In (1) and (2), γ, σ, λ, φ and θ represent the aspect ratio, standard deviation, preferred wavelength, phase offset, and orientation of the normal to the parallel stripes, respectively [4].

Fig. 3 Flowchart of the proposed depth image quality metric

Since the Gabor filter can be simply viewed as a sinusoidal plane wave multiplied by a Gaussian envelope, antisymmetric and symmetric versions of the Gabor filter [10], used for edge and bar detection respectively, can be defined as
$$g_{\text{edge}}(x, y) = \exp\left(-\frac{x_r^2 + \gamma^2 y_r^2}{2\sigma^2}\right)\sin\left(\frac{2\pi x_r}{\lambda}\right) \qquad (3)$$

$$g_{\text{bar}}(x, y) = \exp\left(-\frac{x_r^2 + \gamma^2 y_r^2}{2\sigma^2}\right)\cos\left(\frac{2\pi x_r}{\lambda}\right) \qquad (4)$$
The edge and bar responses of the image, I_edge and I_bar, are obtained by convolving the original image I with g_edge and g_bar, respectively. Here, the mean value of g_bar is subtracted to compensate for the DC component.
The filtered outputs are combined into a single quantity I_θ, called the Gabor energy, as follows:

$$I_\theta(x, y) = \sqrt{I_{\text{bar}}^2(x, y) + I_{\text{edge}}^2(x, y)} \qquad (5)$$
This Gabor energy approximates a specific type of orientation-selective neuron in the primary visual cortex [9]. Figure 4 shows the Gabor energy results on Fig. 1a with various θ values. In this example, γ and σ are set to 0.5 and 1.69 according to the default settings [4]. In addition, λ is adjusted to 3, and the Gabor energy outputs are scaled for visualization. As can be seen, the four directional components are successfully decomposed, and the perceptually sensitive regions are distinguished. Thus, the Gabor energy of the image can be exploited to adaptively weight the local distortion of the depth image.
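For concreteness, a sketch of this computation is given below. The kernels follow (1)-(4) with the parameter values quoted above (γ = 0.5, σ = 1.69, λ = 3); the 9×9 kernel support is our assumption, since the window size is not stated:

```python
import numpy as np
from scipy.ndimage import convolve

def gabor_pair(theta, sigma=1.69, gamma=0.5, lam=3.0, radius=4):
    """Antisymmetric (edge) and symmetric (bar) Gabor kernels of (3)-(4)."""
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    env = np.exp(-(xr ** 2 + gamma ** 2 * yr ** 2) / (2.0 * sigma ** 2))
    g_edge = env * np.sin(2.0 * np.pi * xr / lam)
    g_bar = env * np.cos(2.0 * np.pi * xr / lam)
    g_bar -= g_bar.mean()  # subtract the mean to remove the DC component
    return g_edge, g_bar

def gabor_energy(image, theta):
    """Gabor energy I_theta of (5) for one orientation (theta in radians)."""
    g_edge, g_bar = gabor_pair(theta)
    i_edge = convolve(image.astype(np.float64), g_edge)
    i_bar = convolve(image.astype(np.float64), g_bar)
    return np.sqrt(i_bar ** 2 + i_edge ** 2)
```

Evaluating gabor_energy at θ ∈ {0, π/4, π/2, 3π/4} yields the four directional maps of Fig. 4.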
In Fig. 2, we found that the influence of local depth distortion is strongly related to the edge direction. To this end, the SUSAN detector is used to extract edges and their directions. A detailed description and analysis of the SUSAN operator can be found in [23]. Let E_b^I and E_d^I denote the edge map and edge direction map of the image I obtained using the SUSAN detector, respectively. For simplicity, E_d^I is quantized to represent only the horizontal, vertical, left diagonal, and right diagonal directions. At the non-edge pixels, the local depth distortion is measured by the average difference of the depth values in the local neighborhood. On the other hand, at the edge pixels, the depth variation along the edge direction is measured to consider edge distortion or deformation.
Fig. 4 Gabor filtered results on Fig. 1a: (a) I_0°, (b) I_90°, (c) I_135°, (d) I_45°
Let Δ denote the depth distortion map obtained using the binary edge map E_b^I and the depth image D:

$$\Delta(x, y) = \begin{cases} \dfrac{1}{8}\displaystyle\sum_{(u,v)\in N_8} \left|D(x, y) - D(x+u, y+v)\right|, & \text{if } E_b^I(x, y) = 0 \\[2mm] \left|D(x, y) - \dfrac{1}{2}\big(D(x+x_1, y+y_1) + D(x+x_2, y+y_2)\big)\right|, & \text{otherwise,} \end{cases} \qquad (6)$$
where N_8 represents the 8-neighborhood. At the non-edge pixels, the mean absolute difference (MAD) is used to measure local depth distortion. Meanwhile, at the edge pixels, the average of the two adjacent depth values along the edge direction is subtracted from the center pixel's depth value. In other words, (x_i, y_i) is determined according to the edge direction. For example, (x_1, y_1) = (1, 0) and (x_2, y_2) = (−1, 0) for the horizontal edge. Note that the central difference can distinguish an abrupt change from a gradual change. Thus, a natural change of depth values along the edge direction caused by slanted surfaces is excluded in the computation of local depth distortion.
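A direct, unoptimized implementation of (6) could look as follows; the integer coding of the four quantized edge directions and the border handling are our own choices:

```python
import numpy as np

# Offsets (dx, dy) of the two neighbors along each quantized edge direction;
# this integer coding (0: horizontal, 1: vertical, 2/3: diagonals) is ours.
EDGE_OFFSETS = {
    0: ((1, 0), (-1, 0)),   # horizontal edge, as in the example above
    1: ((0, 1), (0, -1)),   # vertical edge
    2: ((1, 1), (-1, -1)),  # right-diagonal edge
    3: ((1, -1), (-1, 1)),  # left-diagonal edge
}

def local_depth_distortion(depth, edge_map, edge_dir):
    """Depth distortion map Delta of (6): MAD over the 8-neighborhood at
    non-edge pixels, central difference along the edge direction at edge
    pixels. Image borders are skipped for brevity."""
    h, w = depth.shape
    d = depth.astype(np.float64)
    delta = np.zeros((h, w))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if edge_map[y, x] == 0:  # non-edge: mean absolute difference
                patch = d[y - 1:y + 2, x - 1:x + 2]
                delta[y, x] = np.abs(patch - d[y, x]).sum() / 8.0
            else:  # edge: central difference along the edge direction
                (dx1, dy1), (dx2, dy2) = EDGE_OFFSETS[edge_dir[y, x]]
                avg = 0.5 * (d[y + dy1, x + dx1] + d[y + dy2, x + dx2])
                delta[y, x] = abs(d[y, x] - avg)
    return delta
```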
When the depth image is captured by a depth camera, saturated pixels often appear in highly reflective regions. Such saturated depth pixels have invalid depth values, and thus we consider those pixels as outliers. Many stereo matching algorithms [21, 26] also identify the outlier pixels without estimating their depth values. In the proposed method, if a neighboring pixel in (6) belongs to the outlier pixels, the corresponding position is excluded from N_8. In a similar manner, for the edge positions, only one neighboring depth value is used if one of the two neighbors is an outlier, and distortion estimation is not performed if both are outliers.
As the depth discontinuities along color image edges are the major source of local depth distortion, both the local image characteristics (I_θ) and the local depth distortion (Δ) are used to obtain the global distortion map Φ. To this end, Φ is defined by merging I_θ and Δ as follows:

$$\Phi(x, y) = \sum_{\theta\in\Theta} \alpha_\theta\, I_\theta(x, y)\, \Delta(x, y), \qquad (7)$$
where Θ = {0°, 45°, 90°, 135°} and α_θ is the weight of the direction θ. Figure 5 shows the resultant distortion maps obtained using Fig. 2a and b, where α_θ = {1, 0.5, 0, 0.5} for the four directions in Θ. By comparing Figs. 2 and 5, it can be seen that the distortion maps are highly correlated with the visual geometric degradation caused by the depth distortion.
Finally, the RR-DQM is defined by pooling all the distortion values in Φ except for the outlier regions:

$$\text{RR-DQM} = \frac{1}{n(\Upsilon_1)} \sum_{(x,y)\in\Upsilon_1} \Phi(x, y), \qquad (8)$$

where Υ_1 is the set of all pixels excluding the outlier pixels and n(Υ_1) is the cardinality of Υ_1. Note that the proposed RR-DQM requires only one pair of color and depth images.
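Put together, (7) and (8) amount to a weighted per-pixel product followed by a masked mean. The sketch below uses the α_θ values quoted for Fig. 5; the multiplicative merge follows our reconstruction of (7):

```python
import numpy as np

def rr_dqm(energies, delta, valid_mask, alphas=(1.0, 0.5, 0.0, 0.5)):
    """Merge the Gabor energies and the distortion map into Phi of (7),
    then pool over the non-outlier pixels as in (8).

    energies:   Gabor energy maps I_theta for theta = 0, 45, 90, 135 deg
    delta:      local depth distortion map of (6)
    valid_mask: True where the depth value is not an outlier (Upsilon_1)"""
    phi = np.zeros_like(delta)
    for alpha, i_theta in zip(alphas, energies):
        phi += alpha * i_theta * delta  # weight distortion by image structure
    return phi[valid_mask].mean()       # average over n(Upsilon_1) pixels
```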
Fig. 5 Global distortion maps corresponding to (a) Fig. 2a and (b) Fig. 2b
Fig. 6 Example on the Cones image: (a) original left-view image, (b) ground truth depth image, (c) ground truth occlusion map, (d) LPCD-distorted depth image with δ_max = 2, (e) compensated left-view image, (f) error image excluding the occluded region
Fig. 7 Scatter plot of the RR-DQM versus depth distortion for random noise: (a) Cones, (b) Teddy, (c) Tsukuba, (d) Venus
3 Experimental results
The proposed RR-DQM mainly consists of the Gabor filter and the SUSAN detector with some parameters. In the Gabor filter, γ and σ were set to 0.5 and 1.69, respectively (these are the default values in typical applications of Gabor filters [4]). In addition, λ was empirically determined as 3 by using the test color-depth image pairs available in the Middlebury database [21]. The brightness threshold and the kernel radius in the SUSAN operator were set to 15 and 3, respectively, according to [23].
In order to validate the proposed RR-DQM, it was compared with the conventional metrics. To this end, we used the Middlebury dataset, where the ground truth depth image and the stereo image pair are available. Two different types of depth distortion were simulated. First, uniformly distributed random noise was added to the ground truth depth image, since noisy depth images can approximate the depth images obtained by a depth camera.
Fig. 8 Scatter plot of the RR-DQM versus prediction error for random noise: (a) Cones, (b) Teddy, (c) Tsukuba, (d) Venus
Second, the geometric distortion, local permutation with cancelation and duplication (LPCD) [3], was applied to the depth image D by

$$D(x, y) = D_{gt}\big(x + \delta_h(x, y),\; y + \delta_w(x, y)\big), \qquad (9)$$

where D_gt denotes the ground truth depth image, δ_h and δ_w are i.i.d. integer random variables uniformly distributed in the interval [−δ_max, δ_max], and δ_max controls the amount of distortion. This local geometric distortion can simulate the inaccurate depth values at object boundaries, where stereo matching techniques usually have difficulty in estimating depth values.
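The LPCD degradation of (9) can be reproduced as follows; the function name and the border clamping are ours:

```python
import numpy as np

def lpcd_distort(depth_gt, delta_max, seed=0):
    """Local permutation with cancelation and duplication, (9): every pixel
    is resampled from a uniformly shifted position, so depth values near
    object boundaries may be duplicated or canceled."""
    rng = np.random.default_rng(seed)
    h, w = depth_gt.shape
    dx = rng.integers(-delta_max, delta_max + 1, size=(h, w))  # delta_h
    dy = rng.integers(-delta_max, delta_max + 1, size=(h, w))  # delta_w
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(ys + dy, 0, h - 1)  # clamp shifted coordinates at borders
    src_x = np.clip(xs + dx, 0, w - 1)
    return depth_gt[src_y, src_x]
```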
Given the degraded and ground truth depth images, the FR-DQM measures the difference between the two depth images as follows:

$$\text{FR-DQM} = \frac{1}{n(\Upsilon_2)} \sum_{(x,y)\in\Upsilon_2} \big(D(x, y) - D_{gt}(x, y)\big)^2, \qquad (10)$$

where Υ_2 is the set of valid pixels excluding the occluded and outlier positions.
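Under the reconstruction of (10) above, the FR-DQM is a masked mean of squared depth differences; a minimal sketch:

```python
import numpy as np

def fr_dqm(depth, depth_gt, valid_mask):
    """FR-DQM of (10): mean squared depth difference over the pixel set
    Upsilon_2 (the non-occluded, non-outlier pixels)."""
    diff = depth.astype(np.float64) - depth_gt.astype(np.float64)
    return (diff[valid_mask] ** 2).mean()
```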