DOI 10.1007/s11042-016-3392-4
A new depth image quality metric using a pair of color
and depth images
Thanh-Ha Le 1 · Seung-Won Jung 2 · Chee Sun Won 3
Received: 17 July 2015 / Revised: 17 February 2016 / Accepted: 23 February 2016
© Springer Science+Business Media New York 2016
Abstract Typical depth quality metrics require the ground truth depth image or a stereoscopic color image pair, which are not always available in practical applications. In this paper, we propose a new depth image quality metric that demands only a single pair of color and depth images. Our observations reveal that depth distortion is strongly related to the local image characteristics, which in turn leads us to formulate a new distortion assessment method for the edge and non-edge pixels of the depth image. The local depth distortion is adaptively weighted using the Gabor-filtered color image and aggregated into a global depth image quality metric. The experimental results show that the proposed metric closely approximates the depth quality metrics that use the ground truth depth or a stereo color image pair.
Keywords Depth image · Image quality assessment · Reduced reference · Quality metric
Seung-Won Jung
swjung83@dongguk.edu
Thanh-Ha Le
ltha@vnu.edu.vn
Chee Sun Won
cswon@dongguk.edu
1 University of Engineering and Technology, Vietnam National University, Hanoi, Vietnam
2 Department of Multimedia Engineering, Dongguk University, Pildong-ro 1gil, Jung-gu, Seoul 100-715, Korea
3 Department of Electronics and Electrical Engineering, Dongguk University, Pildong-ro 1gil,
Jung-gu, Seoul 100-715, Korea
1 Introduction
Depth images play a fundamental role in many 3-D applications [6, 17, 24, 25]. For example, depth images can be used to generate arbitrary novel viewpoint images by interpolating or extrapolating the images at the available viewpoints. In addition, high quality depth images open up opportunities to solve challenging problems in computer vision [13]. The depth image can be obtained either by matching a rectified color image pair, i.e., a stereoscopic image, or by using depth cameras. In stereo matching techniques [19, 29], inaccurate depth images are often produced because of occlusion, repeated patterns, and large homogeneous regions. Although the inherent difficulty of stereo matching can be avoided by using a depth camera [27], an inevitable sensor noise problem remains.
Owing to the widespread use of depth images, the quality assessment of depth images has become essential. One simple method of depth image quality assessment is to compare the depth image to be tested with its ground truth depth image [21]. This method corresponds to the full reference depth quality metric (FR-DQM), which can precisely measure the accuracy of the depth image. However, the ground truth depth image is not always attainable in practical applications. An alternative method is to evaluate the quality of the reconstructed color image obtained using the depth image. For example, the right-viewpoint image can be rendered by using the left-view image and the depth image, and the rendered image can be compared with the original right-viewpoint image [12]. However, such a color image pair is not always obtainable in depth-image-based rendering (DIBR) applications [1, 5].
In this paper, we introduce a new depth quality metric that requires only a pair of color and depth images and is thus a reduced reference DQM (RR-DQM). Here, we consider the color image as the reduced reference for the depth quality assessment. To formulate the depth quality metric, we investigate the effects of various sources of depth distortion and derive a local measurement using the Gabor filter [2] and the smallest univalue segment assimilating nucleus (SUSAN) detector [23]. The experimental results demonstrate that the proposed RR-DQM closely approximates the conventional DQMs that use the ground truth depth information or a stereo image pair.
This paper is an extended version of our conference paper [20]. Compared to [20], a more detailed description of the proposed metric is provided, along with extensive experimental verification. Moreover, the proposed metric is applied to a depth image post-processing technique to show its usefulness.
The rest of the paper is organized as follows. The proposed RR-DQM is described in Section 2. The experimental results and conclusions are given in Section 3 and Section 4, respectively.
2 Proposed depth quality metric
The new depth image quality metric is designed for the case when the depth image is not used in a stand-alone fashion but in combination with the color image. The combination of color and depth images is often required in multi-view and 3-D video applications, where the depth image is frequently used to render or synthesize color images at novel viewpoints. In such applications, since the same local distortion of the depth image does not equally affect the resultant color images, we need to consider the local distortion of the depth image jointly with the local characteristics of the color image.
Fig. 1 (a)-(b) Synthetic image pair and (c) ground truth depth image
For example, a pair of simple synthetic grayscale images of size 400×400 is shown in Fig. 1a and b. Here, the square region of Fig. 1a, including horizontal, vertical, and two diagonal edges, is left-shifted by 50 pixels, as shown in Fig. 1b. In other words, the pixels inside the square have the same horizontal disparity value, as shown in Fig. 1c. The black background is located at an infinite distance, i.e., its disparity value is zero. Note that the other directional disparities can be ignored when the two images are rectified [14]. To analyze the effect of depth distortion, we change the disparity values inside the square as shown in Fig. 2a and b. Precisely, one noisy row is generated using the zero-mean uniform random distribution with a variance of 10. The length of the row is the same as the width of the square, and the generated noisy row is added to all the rows in the square of the depth image, resulting in Fig. 2a. This simulates depth distortion along the horizontal direction; note that the depth values remain the same along the vertical direction. The depth image with distortion along the vertical direction can be produced in a similar manner, as shown in Fig. 2b.
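This distortion model is easy to reproduce. Below is a minimal NumPy sketch of the horizontal-direction case; the function name and the square-placement parameters are ours, while the zero-mean uniform noise with variance 10 and the row replication follow the text:

```python
import numpy as np

def distort_depth_rows(depth, x0, y0, size, variance=10.0, seed=0):
    """Add one zero-mean uniform noise row to every row of a square region,
    simulating depth distortion along the horizontal direction (Fig. 2a)."""
    rng = np.random.default_rng(seed)
    # A uniform distribution on [-a, a] has variance a^2 / 3.
    a = np.sqrt(3.0 * variance)
    noisy_row = rng.uniform(-a, a, size)  # one noisy row, reused for all rows
    out = depth.astype(np.float64).copy()
    # Broadcasting adds the same row everywhere, so each column keeps a
    # constant value along the vertical direction, as required by the text.
    out[y0:y0 + size, x0:x0 + size] += noisy_row
    return out
```

The vertical-direction counterpart (Fig. 2b) is obtained by generating a noisy column and broadcasting it across columns instead.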
Given a pair of color and depth images, the stereoscopic image can be obtained. Specifically, the pixels in one viewpoint image can be found from the pixels in the other viewpoint image, where the pixel positions are determined according to the disparity values in the depth image. From the synthesized color image, we can analyze the influence of depth distortions. From Fig. 2c and d, one can notice that the horizontal image edges are not seriously deteriorated. Since only horizontal disparity is assumed, distortions in different directions can change only the start and end positions of the horizontal edges. In other words, the local distortion of the depth image in the horizontal edge regions does not have a significant impact on the quality of the rendered image. It can also be found that the distortion in the rendered images is prominent when the depth value varies along the image edges. For example, the vertical image edges are severely damaged when the depth image has distortion along the vertical direction, as shown in Fig. 2b and d.
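A minimal forward-warping sketch of this rendering step is given below, assuming rectified views with purely horizontal, integer disparities; the sign convention (warping from left view to right view) and the hole handling are our simplifications, not the paper's rendering procedure:

```python
import numpy as np

def render_view(color, disparity):
    """Forward-warp a view by shifting each pixel horizontally by its
    disparity; disocclusion holes simply remain zero in this sketch."""
    h, w = color.shape[:2]
    out = np.zeros_like(color)
    for y in range(h):
        for x in range(w):
            xt = x - int(round(disparity[y, x]))  # left-to-right convention
            if 0 <= xt < w:
                out[y, xt] = color[y, x]
    return out
```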
From the above observations, it is found that the effect of local depth distortion is strongly dependent on the local image characteristics. Thus, the relation between the depth distortion and the image characteristics should be exploited to measure the quality of the depth image. Figure 3 shows the flowchart of the proposed RR-DQM, in which the Gabor filter [2] is used to weight the distortion differently according to the local image structures. In addition, the SUSAN edge detector [23] is employed to obtain the edge information of the image. In particular, the SUSAN detector is known to robustly estimate image edges and their directions. Of course, other edge detectors [15, 18] can be employed.
Fig. 2 Distorted depth images and rendered grayscale images: (a) depth image with distortion along the horizontal direction, (b) depth image with distortion along the vertical direction, (c) rendered left-view image obtained using Fig. 1b and (a), (d) rendered left-view image obtained using Fig. 1b and (b)
The Gabor filter is a close model of the receptive fields [7, 28] and is widely applied in image processing. Let g denote the kernel of the Gabor filter, defined as follows:
$$g(x, y) = \exp\left(-\frac{x_r^2 + \gamma^2 y_r^2}{2\sigma^2}\right)\cos\left(\frac{2\pi x_r}{\lambda} + \phi\right) \qquad (1)$$

$$x_r = x\cos\theta + y\sin\theta, \qquad y_r = -x\sin\theta + y\cos\theta \qquad (2)$$
In (1) and (2), γ, σ, λ, φ and θ represent the aspect ratio, standard deviation, preferred wavelength, phase offset, and orientation of the normal to the parallel stripes, respectively [4].

Fig. 3 Flowchart of the proposed depth image quality metric

Since the Gabor filter can be simply viewed as a sinusoidal plane wave multiplied by a Gaussian envelope, antisymmetric and symmetric versions of the Gabor filter [10], used for edge and bar detection respectively, can be defined as
$$g_{\text{edge}}(x, y) = \exp\left(-\frac{x_r^2 + \gamma^2 y_r^2}{2\sigma^2}\right)\sin\left(\frac{2\pi x_r}{\lambda}\right) \qquad (3)$$

$$g_{\text{bar}}(x, y) = \exp\left(-\frac{x_r^2 + \gamma^2 y_r^2}{2\sigma^2}\right)\cos\left(\frac{2\pi x_r}{\lambda}\right) \qquad (4)$$
The edge and bar responses of the image, I_edge and I_bar, are obtained by convolving the original image I with g_edge and g_bar, respectively. Here, the mean value of g_bar is subtracted to compensate for the DC component.
The filtered outputs are combined into a single quantity I_θ, called the Gabor energy, as follows:

$$I_\theta(x, y) = \sqrt{I_{\text{bar}}^2(x, y) + I_{\text{edge}}^2(x, y)} \qquad (5)$$
This Gabor energy approximates a specific type of orientation-selective neuron in the primary visual cortex [9]. Figure 4 shows the Gabor energy results on Fig. 1a with various θ values. In this example, γ and σ are set to 0.5 and 1.69 according to the default settings [4]. In addition, λ is adjusted to 3, and the Gabor energy outputs are scaled for visualization. As can be seen, the four directional components are successfully decomposed, and the perceptually sensitive regions are distinguished. Thus, the Gabor energy of the image can be exploited to adaptively weight the local distortion of the depth image.
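For concreteness, a sketch of this computation is given below. The kernels follow (1)-(4) with the parameter values quoted above (γ = 0.5, σ = 1.69, λ = 3); the 9×9 kernel support is our assumption, since the window size is not stated:

```python
import numpy as np
from scipy.ndimage import convolve

def gabor_pair(theta, sigma=1.69, gamma=0.5, lam=3.0, radius=4):
    """Antisymmetric (edge) and symmetric (bar) Gabor kernels of (3)-(4)."""
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    env = np.exp(-(xr ** 2 + gamma ** 2 * yr ** 2) / (2.0 * sigma ** 2))
    g_edge = env * np.sin(2.0 * np.pi * xr / lam)
    g_bar = env * np.cos(2.0 * np.pi * xr / lam)
    g_bar -= g_bar.mean()  # subtract the mean to remove the DC component
    return g_edge, g_bar

def gabor_energy(image, theta):
    """Gabor energy I_theta of (5) for one orientation (theta in radians)."""
    g_edge, g_bar = gabor_pair(theta)
    i_edge = convolve(image.astype(np.float64), g_edge)
    i_bar = convolve(image.astype(np.float64), g_bar)
    return np.sqrt(i_bar ** 2 + i_edge ** 2)
```

Evaluating gabor_energy at θ ∈ {0, π/4, π/2, 3π/4} yields the four directional maps of Fig. 4.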
In Fig. 2, we found that the influence of local depth distortion is strongly related to the edge direction. To this end, the SUSAN detector is used to extract edges and their directions. A detailed description and analysis of the SUSAN operator can be found in [23]. Let E_b^I and E_d^I denote the edge map and edge direction map of the image I obtained using the SUSAN detector, respectively. For simplicity, E_d^I is quantized to represent only the horizontal, vertical, left diagonal, and right diagonal directions. At the non-edge pixels, the local depth distortion is measured by the average difference of the depth values in the local neighborhood. On the other hand, at the edge pixels, the depth variation along the edge direction is measured to consider edge distortion or deformation.
Fig. 4 Gabor filtered results on Fig. 1a: (a) I_0°, (b) I_90°, (c) I_135°, (d) I_45°
Let Δ denote the depth distortion map obtained using the binary edge map E_b^I and the depth image D:

$$\Delta(x, y) = \begin{cases} \dfrac{1}{8}\displaystyle\sum_{(u,v)\in N_8} \left|D(x, y) - D(x+u, y+v)\right|, & \text{if } E_b^I(x, y) = 0 \\[2mm] \left|D(x, y) - \dfrac{1}{2}\big(D(x+x_1, y+y_1) + D(x+x_2, y+y_2)\big)\right|, & \text{otherwise,} \end{cases} \qquad (6)$$
where N_8 represents the 8-neighborhood. At the non-edge pixels, the mean absolute difference (MAD) is used to measure local depth distortion. Meanwhile, at the edge pixels, the average of the two adjacent depth values along the edge direction is subtracted from the center pixel's depth value. In other words, (x_i, y_i) is determined according to the edge direction. For example, (x_1, y_1) = (1, 0) and (x_2, y_2) = (−1, 0) for the horizontal edge. Note that the central difference can distinguish an abrupt change from a gradual change. Thus, a natural change of depth values along the edge direction caused by slanted surfaces is excluded in the computation of local depth distortion.
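A direct, unoptimized implementation of (6) could look as follows; the integer coding of the four quantized edge directions and the border handling are our own choices:

```python
import numpy as np

# Offsets (dx, dy) of the two neighbors along each quantized edge direction;
# this integer coding (0: horizontal, 1: vertical, 2/3: diagonals) is ours.
EDGE_OFFSETS = {
    0: ((1, 0), (-1, 0)),   # horizontal edge, as in the example above
    1: ((0, 1), (0, -1)),   # vertical edge
    2: ((1, 1), (-1, -1)),  # right-diagonal edge
    3: ((1, -1), (-1, 1)),  # left-diagonal edge
}

def local_depth_distortion(depth, edge_map, edge_dir):
    """Depth distortion map Delta of (6): MAD over the 8-neighborhood at
    non-edge pixels, central difference along the edge direction at edge
    pixels. Image borders are skipped for brevity."""
    h, w = depth.shape
    d = depth.astype(np.float64)
    delta = np.zeros((h, w))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if edge_map[y, x] == 0:  # non-edge: mean absolute difference
                patch = d[y - 1:y + 2, x - 1:x + 2]
                delta[y, x] = np.abs(patch - d[y, x]).sum() / 8.0
            else:  # edge: central difference along the edge direction
                (dx1, dy1), (dx2, dy2) = EDGE_OFFSETS[edge_dir[y, x]]
                avg = 0.5 * (d[y + dy1, x + dx1] + d[y + dy2, x + dx2])
                delta[y, x] = abs(d[y, x] - avg)
    return delta
```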
When the depth image is captured by a depth camera, saturated pixels often appear in highly reflective regions. Such saturated depth pixels have invalid depth values, and thus we consider those pixels as outliers. Many stereo matching algorithms [21, 26] also identify the outlier pixels without estimating their depth values. In the proposed method, if a neighboring pixel in (6) belongs to the outlier pixels, the corresponding position is excluded from N_8. In a similar manner, for the edge positions, only one neighboring depth value is used if one of the two neighbors is an outlier, and distortion estimation is not performed if both are outliers.
As the depth discontinuities along color image edges are the major source of local depth distortion, both the local image characteristics (I_θ) and the local depth distortion (Δ) are used to obtain the global distortion map Φ. To this end, Φ is defined by merging I_θ and Δ as follows:

$$\Phi(x, y) = \sum_{\theta\in\Theta} \alpha_\theta\, I_\theta(x, y)\, \Delta(x, y), \qquad (7)$$
where Θ = {0°, 45°, 90°, 135°} and α_θ is the weight of the direction θ. Figure 5 shows the resultant distortion maps obtained using Fig. 2a and b, where α_θ = {1, 0.5, 0, 0.5} for the four directions in Θ. By comparing Figs. 2 and 5, it can be seen that the distortion maps are highly correlated with the visual geometric degradation caused by the depth distortion.
Finally, the RR-DQM is defined by pooling all the distortion values in Φ except for the outlier regions:

$$\text{RR-DQM} = \frac{1}{n(\Upsilon_1)} \sum_{(x,y)\in\Upsilon_1} \Phi(x, y), \qquad (8)$$

where Υ_1 is the set of all pixels excluding the outlier pixels and n(Υ_1) is the cardinality of Υ_1. Note that the proposed RR-DQM requires only one pair of color and depth images.
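Put together, (7) and (8) amount to a weighted per-pixel product followed by a masked mean. The sketch below uses the α_θ values quoted for Fig. 5; the multiplicative merge follows our reconstruction of (7):

```python
import numpy as np

def rr_dqm(energies, delta, valid_mask, alphas=(1.0, 0.5, 0.0, 0.5)):
    """Merge the Gabor energies and the distortion map into Phi of (7),
    then pool over the non-outlier pixels as in (8).

    energies:   Gabor energy maps I_theta for theta = 0, 45, 90, 135 deg
    delta:      local depth distortion map of (6)
    valid_mask: True where the depth value is not an outlier (Upsilon_1)"""
    phi = np.zeros_like(delta)
    for alpha, i_theta in zip(alphas, energies):
        phi += alpha * i_theta * delta  # weight distortion by image structure
    return phi[valid_mask].mean()       # average over n(Upsilon_1) pixels
```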
Fig. 5 Global distortion maps corresponding to (a) Fig. 2a and (b) Fig. 2b
Fig. 6 Example on the Cones image: (a) original left-view image, (b) ground truth depth image, (c) ground truth occlusion map, (d) LPCD-distorted depth image with δ_max = 2, (e) compensated left-view image, (f) error image excluding the occluded region
Fig. 7 Scatter plot of the RR-DQM versus depth distortion for random noise: (a) Cones, (b) Teddy, (c) Tsukuba, (d) Venus
3 Experimental results
The proposed RR-DQM mainly consists of the Gabor filter and the SUSAN detector with some parameters. In the Gabor filter, γ and σ were set to 0.5 and 1.69, respectively (these are the default values in typical applications of Gabor filters [4]). In addition, λ was empirically determined as 3 by using the test color-depth image pairs available in the Middlebury database [21]. The brightness threshold and the kernel radius in the SUSAN operator were set to 15 and 3, respectively, according to [23].
In order to validate the proposed RR-DQM, it was compared with the conventional metrics. To this end, we used the Middlebury dataset, where the ground truth depth image and the stereo image pair are available. Two different types of depth distortion were simulated. First, uniformly distributed random noise was added to the ground truth depth image, since noisy depth images can approximate the depth images obtained by a depth camera.
Fig. 8 Scatter plot of the RR-DQM versus prediction error for random noise: (a) Cones, (b) Teddy, (c) Tsukuba, (d) Venus
Second, the geometric distortion, local permutation with cancelation and duplication (LPCD) [3], was applied to the depth image D by

$$D(x, y) = D_{gt}\big(x + \delta_h(x, y),\; y + \delta_w(x, y)\big), \qquad (9)$$

where D_gt denotes the ground truth depth image, δ_h and δ_w are i.i.d. integer random variables uniformly distributed in the interval [−δ_max, δ_max], and δ_max controls the amount of distortion. This local geometric distortion can simulate the inaccurate depth values at object boundaries, where stereo matching techniques usually have difficulty in estimating depth values.
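The LPCD degradation of (9) can be reproduced as follows; the function name and the border clamping are ours:

```python
import numpy as np

def lpcd_distort(depth_gt, delta_max, seed=0):
    """Local permutation with cancelation and duplication, (9): every pixel
    is resampled from a uniformly shifted position, so depth values near
    object boundaries may be duplicated or canceled."""
    rng = np.random.default_rng(seed)
    h, w = depth_gt.shape
    dx = rng.integers(-delta_max, delta_max + 1, size=(h, w))  # delta_h
    dy = rng.integers(-delta_max, delta_max + 1, size=(h, w))  # delta_w
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(ys + dy, 0, h - 1)  # clamp shifted coordinates at borders
    src_x = np.clip(xs + dx, 0, w - 1)
    return depth_gt[src_y, src_x]
```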
Given the degraded and ground truth depth images, the FR-DQM measures the difference between the two depth images as follows:

$$\text{FR-DQM} = \frac{1}{n(\Upsilon_2)} \sum_{(x,y)\in\Upsilon_2} \big(D(x, y) - D_{gt}(x, y)\big)^2, \qquad (10)$$

where Υ_2 is the set of valid pixels excluding the occluded and outlier positions.
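Under the reconstruction of (10) above, the FR-DQM is a masked mean of squared depth differences; a minimal sketch:

```python
import numpy as np

def fr_dqm(depth, depth_gt, valid_mask):
    """FR-DQM of (10): mean squared depth difference over the pixel set
    Upsilon_2 (the non-occluded, non-outlier pixels)."""
    diff = depth.astype(np.float64) - depth_gt.astype(np.float64)
    return (diff[valid_mask] ** 2).mean()
```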