
Shift error analysis in image based 3D skull feature reconstruction

Thi Chau Ma, The Duy Bui, Trung Kien Dang
Human Machine Interaction Laboratory
University of Engineering and Technology
Vietnam National University, Hanoi, Vietnam

chaumt@vnu.edu.vn

Abstract

The 3D skull is crucial in skull-based 3D facial reconstruction [1, 2, 3, 4, 5, 6, 7, 8]. In 3D reconstruction, and especially in skull-based 3D facial reconstruction, features usually play an important role, because the accuracy of feature detection strongly affects the accuracy of the final 3D model. In this paper, we concentrate on the accuracy of the reconstructed 3D skull, an important part of skull-based 3D facial reconstruction. We discuss a cause of errors, called shift errors, that arises when taking a sequence of skull images. In addition, we analyse the effect of shift error on 3D reconstruction and propose a solution to limit this effect.

Keywords: feature detection, accuracy, 3D facial reconstruction

1 Introduction

In skull-based 3D facial reconstruction, given 3D anthropometric landmarks on the 3D skull, it is possible to morph a 3D template to fit the skull according to the soft tissue thickness at the landmarks and obtain the final 3D face. The landmarks are mostly determined manually on a scanned skull when anthropometric information is supplied (Figure 1). Therefore, instead of the whole skull, only the 3D skull landmarks are used. Moreover, scanning a skull requires expensive equipment. To overcome this problem, we use skull images rather than a scanned 3D skull. The whole process is shown in Figure 2. We take a sequence of images by moving a camera around the skull to capture all important landmarks. The landmarks, included in the set of features, can be detected with any automatic feature detector. These features are matched to create correspondences between successive images, from which the 3D skull features are reconstructed.
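A minimal sketch of this detection-and-matching step is given below, assuming OpenCV's SIFT implementation; the file names, the ratio-test threshold and the other parameters are illustrative only, not the exact setup used in the paper.

```python
# Sketch of feature detection and matching between two successive skull views.
# File names and parameters are illustrative assumptions.
import cv2

img1 = cv2.imread("skull_view_00.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("skull_view_01.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()                      # any automatic detector could be used (e.g. Harris)
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Match descriptors between the two views and keep the best correspondences
# with Lowe's ratio test.
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

# Corresponding 2D points; these pairs feed the 3D reconstruction step.
pts1 = [kp1[m.queryIdx].pt for m in good]
pts2 = [kp2[m.trainIdx].pt for m in good]
```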

Figure 1. A scanned 3D skull and landmarks on the skull.

People are sensitive in face recognition. Biederman [9] concluded that face recognition and object recognition are different: contrast, illumination, size, and transformation (especially rotation) strongly affect face recognition, while almost all of those factors have little effect on the recognition of other objects. Moreover, while distinctions between objects are easily named, a small difference between two faces is easy to recognise but hard to name. Hence, face reconstruction requires high accuracy for people to recognise the result. Recently, the authors of [1] showed that "54%, 65%, and 77% of the three facial reconstruction surfaces had less than 2.5 mm of error when compared to the relevant target face" in their experiments. So, a 1-2 mm improvement in the accuracy of the 3D reconstructed skull or the 3D reconstructed face is significant.

In this paper, we assess errors in feature detection from 2D skull images that affect 3D facial reconstruction. We first analyse the effect of these errors on 3D skull features. This analysis is then used to modify the 3D skull features in order to increase the accuracy of the 3D skull features.


Figure 2. Skull-image-based 3D facial reconstruction.

In Section 4, we perform experiments to confirm these theoretical results.

2 Related works

For 3D facial reconstruction from the skull, most research makes use of 3D scanned skulls as inputs [1, 2, 3, 4, 5, 6, 7, 8]. In [2], Archer used the standard system of 32 dowels mounted on the skull at the locations of landmarks, with lengths taken from statistical data on the soft tissue thickness of African-American men. In addition, the author added 89 dowels whose lengths were interpolated from the 32 standard lengths. A top surface in the form of a hierarchical B-spline was transformed to match the skull. The transformation was divided into 3 levels. In this study, the interpolation of the new dowels was not very accurate in either position or length, so the final face was swollen or collapsed in unwanted places. The author made adjustments based on the B-spline surface tangent at the control points. The result got positive feedback. However, it also showed some limitations: in terms of anatomy, the mouth was wide, the eye position and size were not correct, and the nose shape was not appropriate because the author did not take advantage of the relationships between anthropometric information.

In [3, 4], the authors used a reference face and a corresponding skull. The authors then tried to find a transformation from the reference skull to the source skull. That transformation was then applied to the reference face to obtain the target face. This study exploited little anthropometric information. It is easy to see that, with such a transformation, the soft tissue thicknesses of the faces to be rebuilt show no difference, so the reconstructed face clearly carried no identity information.

Other studies [5, 8] also carried out a transformation of the face model to match the dowels on the target face. In [5], the soft tissue thickness was calculated from statistical data based on the soft tissue database of Rhine and Moore [6]. Initially, the landmarks were transformed using the Procrustes transformation method; then a combination of RBF and Procrustes was applied to all the remaining points of the face. The face template was obtained from a typical database scan; therefore, the generated face depended on the face database. Moreover, the resulting face was flawed because of the lack of information about soft tissue thickness. In [8], the authors used soft tissue thicknesses calculated from skull measures instead of the statistical ones.

Kahler et al. [7] built expressive faces from the skull. The method used in this study combined surgery and soft tissue thickness. The skull was scanned, and 40 dowels corresponding to the statistical standard soft tissue thickness [6] were attached. A mesh consisting of 8164 triangles was used to represent the skull. Radial basis functions were used to deform the face template to match the skull. The authors also interpolated the soft tissue thickness at the landmarks. Moreover, the authors used 5 additional anthropometric rules to support shaping the nose and mouth. Recently, in [1], the faces of the subjects were reconstructed according to facial soft tissue depth data for living Korean adults. The authors used 3D deformation tools to alter the shape. In addition, they used a number of guides to predict facial components, such as the eyes, nose, mouth and ears.

Skull-based 3D facial reconstruction is not new, and there are a number of accuracy studies using traditional manual 3D methods that demonstrate good levels of likeness to the target faces [10, 11, 12, 13, 14, 15]. However, for most computer-aided 3D facial reconstruction systems, the accuracy of the final 3D results is not quantified; the results are usually assessed through feedback from anthropologists or forensic experts. Recently, there was an accuracy study for a computer-aided facial reconstruction system [1]. The authors compared the reconstructed faces and the target faces using the Geomagic Qualify software and gave a quantitative comparison on 3 reconstructed faces.

For 3D reconstruction from images, especially 3D facial reconstruction from images, the reconstruction requires one, more than one, or a sequence of images. A 2D image is a 2D array of pixels, where each pixel holds the intensity at that location; usually the intensity value represents a mix of the three colors red, green and blue. The intensity values are used to calculate the depth information (z coordinates) of the objects in the images. With a single input image, the shape of the objects has to be extracted to conduct the modeling. The techniques to retrieve the shape are shape from shading [16], shape from texture [17], shape from specularity [18], shape from contour [19] and shape from 2D edge gradients [20]. However, with a single input image, the computational complexity is high; moreover, the final models cannot be observed at various angles. Furthermore, if direct intensity values are used to calculate the depth, the result is not good, because extra brightness depends on many factors such as the target surface color, geometry, material, orientation of the object, and light.

Approaches based on an input image sequence are divided into two types: (i) photometric stereo, where the images of the objects are taken at one angle but under different lighting [21, 22]; (ii) stereopsis, where the images are taken at different angles. For photometric stereo methods, the surface orientation of each object is calculated based on the light intensity of the corresponding points in different images; from the orientation of the surface patches, the depth of the objects is worked out. Photometric stereo methods require a good setup of the light sources and an understanding of the related laws of light. For stereopsis methods, the main problem is to find matches between pairs of features in the images to determine the objects' structures. With sparse corresponding feature points, 3D features are calculated. Researchers often use extra 3D templates of the objects and deform the templates to fit the 3D features to get the final 3D model of the objects. The objects are often in the form of a generic model [23, 24, 25] or a 3D morphable model [26, 27, 28, 29, 30].

3 Shift error: cause, effect and solution in 3D skull reconstruction

3.1 Cause of shift error

When taking pictures around an object (by moving the camera), images are obtained from different views in the horizontal ($x$) direction (Figure 3). In each pair of successive images, features in the second image are shifted by a distance $e$ in the $x$ direction, while they are almost unchanged in the $y$ direction. The experiment in Section 4 confirms this conclusion.

Figure 3. Camera setup for 3D reconstruction.

Features on the object appear in the camera images. However, under a perspective projection, the location of the features in the camera images is not the location of the projected features. This error in localisation is called shift error. When we take pictures from two views (Figure 4), they will in general be rotated and translated relative to each other. This can be modelled by a 2D rotation and a translation of the origin. The difference between the locations of the features in the camera images and the projected features is the shift error mentioned above.

3.2 Effect of shift error in 3D reconstruction

Figure 4 depicts how shift error affects 3D reconstruction. We assume that the image planes lie between the 3D object and the cameras. The small circles representing 2D points in the images $I_i$ and $I_{i+1}$ are projected from the big circle representing the 3D point $X$. In fact, when the picture is taken, the object is rotated in the $x$ direction, so the projected point in image $I_{i+1}$ is the small square point, not the small circle. As a result, the reconstructed 3D point (the big square point) does not coincide with the original 3D point; the reconstructed point is pushed far away from the object. Clearly, the shift error causes a wrong back-projection.

Figure 4 Shift error pushes reconstructed point far away from the object.

3.3 Accuracy improvement in 3D reconstruction

We define the notation as in Figure 4: $C_1$, $C_2$ are the locations of the camera at the two successive views, i.e. the centers of projection. Given a 3D point $X$, $x_1$ and $x_2$ are the theoretical images of $X$ through the projections centered at $C_1$ and $C_2$ respectively, while $x'_1$ and $x'_2$ are the images of $X$ detected by a certain feature detector. We assume $x_1 = x'_1$; nevertheless, $x_2 \neq x'_2$ because of shift error. $X'$ is the 3D point reconstructed from the corresponding pair $(x'_1, x'_2)$, and the angle $\widehat{C_1 X C_2} = \alpha$. If $XX'$ is estimated, $X$ instead of $X'$ can be reconstructed.

$XX'$ is estimated to pull $X'$ back to $X$. We have

$$\frac{XX'}{C_1X} = \frac{e}{C_1C_2} \quad (1)$$

so

$$XX' = e\,\frac{C_1X}{C_1C_2} \ \text{(pixels)}. \quad (2)$$

Figure 5 depicts the relation between a 3D point and its projection. Let us look at the 3D point $X(x_s, y_s, z_s)$ and its image $x(x_i, y_i, f)$. In the camera with center at $C$, $X$ is projected as $(x_i, y_i, f)$. The transformation is as follows:

$$\begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_s \\ y_s \\ z_s \\ 1 \end{bmatrix} \quad (3)$$

where $x_i = u/w$, $y_i = v/w$ and

$$x_i = \frac{f x_s}{z_s}, \qquad y_i = \frac{f y_s}{z_s}. \quad (4)$$
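As a quick sanity check, the pinhole relations (3) and (4) can be verified numerically; the focal length and the 3D point below are arbitrary illustrative values, not calibration data from the paper.

```python
# Minimal numerical check of equations (3) and (4); f and X are illustrative.
import numpy as np

f = 35.0                                   # focal length (mm), assumed
P = np.array([[f, 0, 0, 0],
              [0, f, 0, 0],
              [0, 0, 1, 0]], dtype=float)  # projection matrix of equation (3)

X = np.array([120.0, -40.0, 800.0, 1.0])   # homogeneous 3D point (x_s, y_s, z_s, 1)
u, v, w = P @ X
print(u / w, v / w)                        # x_i, y_i from equation (3)
print(f * X[0] / X[2], f * X[1] / X[2])    # same values via equation (4)
```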

Figure 5 3D point and 2D point relation.

In the image plane coordinates (Figure 6), the origin $(x_0, y_0)$ is at the center of the image, and the image of $X$ in pixel units is given by

$$\begin{bmatrix} u' \\ v' \\ w' \end{bmatrix} = \begin{bmatrix} \alpha_x & s & x_0 & 0 \\ 0 & \alpha_y & y_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_s \\ y_s \\ z_s \\ 1 \end{bmatrix}$$

where $\alpha_x = f k_x$, $\alpha_y = f k_y$ and

$$x_{pix} = \frac{u'}{w'}, \qquad y_{pix} = \frac{v'}{w'} \quad (5)$$

and

$$\begin{bmatrix} \alpha_x & s & x_0 & 0 \\ 0 & \alpha_y & y_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} = K\,[I_3 \mid 0_3].$$

$K$ is the calibration matrix with $s \approx 0$ and $\alpha_x = \alpha_y = f k_s$. From (4) and (5), we have

$$\frac{\text{pixel}}{\text{mm}} = \frac{x_{pix}}{x_i} = \frac{\alpha_x x_s + x_0 z_s}{z_s}\,\frac{z_s}{f x_s} = k_s + \frac{x_0}{x_i} \quad (6)$$

$$\frac{\text{pixel}}{\text{mm}} = \frac{y_{pix}}{y_i} = \frac{\alpha_y y_s + y_0 z_s}{z_s}\,\frac{z_s}{f y_s} = k_s + \frac{y_0}{y_i}. \quad (7)$$

From formulas (2) and (6), we have

$$XX' = e\,\frac{C_1X}{C_1C_2}\left(k_s + \frac{x_0}{x_i}\right)^{-1} \ \text{(mm)} \quad (8)$$

or

$$XX' = \frac{e}{2\sin\frac{\alpha}{2}}\left(k_s + \frac{x_0}{x_i}\right)^{-1} \ \text{(mm)}. \quad (9)$$

Figure 6 Image plane.

As shown in formula (9), we can easily determine the 3D point $X$ from the point $X'$ reconstructed under shift error.
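The correction can be evaluated directly from formula (9). The sketch below assumes Python, with illustrative values for the shift $e$, the angle $\alpha$ and the calibration terms $k_s$, $x_0$, $x_i$; these are not measurements from the experiments.

```python
# Minimal numeric sketch of formula (9); all values below are illustrative.
import math

def shift_correction_mm(e_pixels, alpha_rad, k_s, x0, x_i):
    """Estimate XX' (in mm): how far the reconstructed point X' should be
    pulled back towards the true point X, following formula (9)."""
    pixels_per_mm = k_s + x0 / x_i          # pixel-to-mm ratio of formula (6)
    return e_pixels / (2.0 * math.sin(alpha_rad / 2.0)) / pixels_per_mm

# Example: a 0.5-pixel shift, a 10-degree angle between camera centers,
# and assumed calibration values (k_s, x0, x_i are hypothetical).
print(shift_correction_mm(e_pixels=0.5,
                          alpha_rad=math.radians(10.0),
                          k_s=1.8, x0=585.0, x_i=5000.0))   # roughly 1.5 mm
```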


4 Experiments

We use 2 scanned skulls for the experiment. The skulls are rotated in small angle steps, here set at 10 degrees, using MeshLab. For each skull, images are captured over a range of 100 degrees at a resolution of 1170 × 864.

To compute the projected relative shift, we first extract the ground-truth homography between a pair of images. Features are detected by the Harris [31] and SIFT [32] detectors. These features are matched between each pair of successive images. Transferring the coordinates of a feature in the first image to the second image using the ground-truth homography gives us the coordinates of its ideal match. The difference between the coordinates of the ideal match and the coordinates of the detected match is, ideally, the projected relative shift error.
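A sketch of this measurement is given below, assuming the ground-truth homography H and the matched point lists pts1 and pts2 from the earlier detection step; the names are illustrative.

```python
# Measure the projected shift error by transferring features with the
# ground-truth homography H (3x3) and comparing against detected matches.
import numpy as np
import cv2

def projected_shift_errors(H, pts1, pts2):
    """Transfer pts1 into image 2 with H and return the per-axis
    differences (dx, dy) against the detected matches pts2."""
    p1 = np.asarray(pts1, dtype=np.float32).reshape(-1, 1, 2)
    ideal = cv2.perspectiveTransform(p1, H).reshape(-1, 2)   # ideal matches
    detected = np.asarray(pts2, dtype=np.float32)
    diff = detected - ideal
    return diff[:, 0], diff[:, 1]   # shift in x, shift in y

# dx is expected to dominate, dy to stay near zero (Figures 7 and 8).
```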

Figures 7 and 8 show that, for all data sets, the projected shift error is indeed mainly in the x direction and about zero in the y direction. This result fits the analysis of shift error in Section 3.

Figure 7 Shift error on Skull 1.

Given the ground-truth poses of the cameras, we triangulate to recover the 3D locations of the features (left of Figure 9). For each sample of scanned skull, the distance from a 3D reconstructed point to the 3D ideal point is 1.09 ∼ 1.93 mm. We therefore suggest pulling the 3D features closer by approximately 1.5 mm (right of Figure 9).
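A sketch of this step, assuming ground-truth projection matrices P1 and P2 and the matched points from earlier; pulling each point along the viewing ray towards the first camera center is one reading of the correction in Section 3.3, and the 1.5 mm value follows the suggestion above.

```python
# Triangulate matched features and pull them back by a fixed distance.
# P1, P2 are 3x4 projection matrices, C1 the first camera center (assumed given).
import numpy as np
import cv2

def triangulate_and_pull(P1, P2, pts1, pts2, C1, pull_mm=1.5):
    pts1 = np.asarray(pts1, dtype=np.float32).T       # shape (2, N)
    pts2 = np.asarray(pts2, dtype=np.float32).T
    X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)   # homogeneous 4xN result
    X = (X_h[:3] / X_h[3]).T                          # Nx3 reconstructed points

    # Pull each reconstructed point towards C1, since the shift error pushes
    # points away from the object (Figure 4). Direction choice is an assumption.
    directions = C1[None, :] - X
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    return X + pull_mm * directions
```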

To give an assessment, we compare the mean and max errors of the 3D recovered features before and after pulling. We compare the sets of 3D recovered features before and after pulling with the original scanned skulls, respectively, using the following formulas. Let $S_1$ be the set of 3D detected features, and $S_2$ be the set of vertices of the scanned skull.

The distance from a point $p$ to a surface $S$ is estimated as

$$e(p, S) = \min_{p' \in S} d(p, p') \quad (10)$$

Figure 8 Shift error on Skull 2.

where $d(\cdot)$ is the Euclidean distance.

The mean error is the average distance between $S_1$ and $S_2$:

$$E(S_1, S_2) = \frac{1}{|S_1|} \sum_{p \in S_1} e(p, S_2). \quad (11)$$

The max error is the maximum distance between $S_1$ and $S_2$:

$$E_{\max}(S_1, S_2) = \max_{p \in S_1} e(p, S_2). \quad (12)$$
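These measures translate directly into a few lines of code; the sketch below assumes $S_1$ and $S_2$ are given as NumPy arrays of 3D points.

```python
# Error measures (10)-(12) between detected 3D features S1 (Nx3)
# and scanned-skull vertices S2 (Mx3).
import numpy as np

def point_to_set_distance(p, S2):
    """e(p, S): distance from point p to the closest vertex of S2 (formula 10)."""
    return np.min(np.linalg.norm(S2 - p, axis=1))

def mean_and_max_error(S1, S2):
    """Mean error (formula 11) and max error (formula 12) between S1 and S2."""
    dists = np.array([point_to_set_distance(p, S2) for p in S1])
    return dists.mean(), dists.max()
```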

Figure 9 3D recovered features before (left) and after (right) pulling

Table 1 shows the errors of the recovered 3D features before and after pulling by 1.5 mm. The mean and max errors are very small. In the case where the 3D skull features are not pulled, the mean error is about 13% of the average soft tissue thickness of an adult Vietnamese skull (∼5.895 mm, the average soft tissue thickness at 22 anthropometric landmarks on the Vietnamese skull: a vertex, a trichion, two supraorbitals, a glabella, a nasion, two exocanthions, two endocanthions, a rhinion, two infraorbitals, two zygomatics, two alares, a subnasale, two molars, a stomion and a menton), and the max error is about 50%. After pulling the features, these errors reduce significantly: to about 10.3% for the mean error and 37.2% for the max error. Obviously, the 3D pulled features are better than the reconstructed ones.

Table 1. Mean and max errors of 3D features and 3D recovered features.

Skull   E_before   E_after   E_max,before   E_max,after
1       0.7271     0.6271    3.1314         2.0312
2       0.7903     0.5903    2.9004         2.3032

5 Conclusions

When taking pictures by moving the camera around an object, we have found that the error, which we call shift error, is in the same direction as the viewpoint movement.

We analysed the cause of shift error. We also showed mathematically the effect of the error on the 3D reconstructed landmarks of the skull. We then proposed an effective solution to the problem. The solution reduces the error of the reconstructed 3D landmarks relative to the original scanned skull. The experiments consolidate all these conclusions.

6 Acknowledgment

This work is supported by the project Towards a Model of an "Intelligent Office Environment", No. QGTD.10.23.

References

[1] L. Won-Joon, M. W. Caroline, and H. Hyeon-Shik, "An accuracy assessment of forensic computerized facial reconstruction employing cone-beam computed tomography from live subjects," Journal of Forensic Sciences, 2011.

[2] K. M. Archer, "Craniofacial reconstruction using hierarchical B-spline interpolation," Master's thesis, University of British Columbia, Department of Electrical and Computer Engineering, 1997.

[3] S. Michael and M. Chen, "The 3D reconstruction of facial features using volume distortion," in Proc. 14th Eurographics UK Conference, pp. 297–305, 1996.

[4] G. Quatrehomme, S. Cotin, G. Subsol, H. Delingette, Y. Garidel, G. Grevin, M. Fidrich, P. Bailet, and A. Ollier, "A fully three-dimensional method for facial reconstruction based on deformable models," Journal of Forensic Science, pp. 649–652, 1997.

[5] P. Vanezis, M. Vanezis, G. McCombe, and T. Niblett, "Facial reconstruction using 3-D computer graphics," Journal of Forensic Science, vol. 81, no. 2, pp. 81–95, 2000.

[6] J. S. Rhine and C. E. Moore, "Tables of facial tissue thickness of American Caucasoids in forensic anthropology," Maxwell Museum Technical Series 1, 1984.

[7] K. Kolja, H. Jorg, and S. Hans-Peter, "Reanimating the dead: Reconstruction of expressive faces from skull data," ACM TOG (SIGGRAPH conference proceedings), vol. 23, no. 3, July 2003.

[8] Q. H. Dinh, C. T. Ma, T. D. Bui, T. T. Nguyen, and D. T. Nguyen, "Facial soft tissue thicknesses prediction using anthropometric distances," in Proceedings of ACIIDS 2011, 2011.

[9] I. Biederman and P. Kalocsai, "Neural and psychophysical analysis of object and face recognition," in Face Recognition: From Theory to Applications, NATO ASI Series F, Springer Verlag, 1998.

[10] J. Prag and R. Neave, Making Faces. London, UK: British Museum Press, 1997.

[11] C. Snow, B. Gatliff, and K. McWilliams, "Reconstruction of facial features from the skull: an evaluation of its usefulness in forensic anthropology," Am. J. Phys. Anthropol., vol. 33, no. 2, 1970.

[12] M. Gerasimov, The Face Finder. New York, NY: Lippincott, 1971.

[13] R. Helmer, S. Röhricht, D. Petersen, and F. Möhr, "Assessment of the reliability of facial reconstruction," in Forensic Analysis of the Skull: Craniofacial Analysis, Reconstruction, and Identification. New York: Wiley-Liss Publishers, pp. 75–83, 1993.

[14] C. Wilkinson and D. Whittaker, "Juvenile forensic facial reconstruction: a detailed accuracy study," in Proceedings of the 10th Conference of the International Association of Craniofacial Identification, pp. 11–14, September 2002.

[15] G. Quatrehomme, T. Balaguer, P. Staccini, and V. Alunni-Perret, "Assessment of the accuracy of three-dimensional manual craniofacial reconstruction: a series of 25 controlled cases," Int. J. Legal Med., vol. 121, no. 6, pp. 469–475, 2007.

[16] B. Horn and M. Brooks, Shape from Shading. Cambridge, MA: MIT Press, 1989.

[17] J. Aloimonos, "Shape from texture," Biological Cybernetics, vol. 58, no. 5, pp. 345–360, 1988.

[18] G. Healey and T. O. Binford, "Local shape from specularity," Computer Vision, Graphics, and Image Processing, vol. 42, pp. 62–86, 1988.

[19] F. Ulupinar and R. Nevatia, "Shape from contour: Homogeneous generalized cylinders and constant cross section generalized cylinders," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 2, 1995.

[20] S. Winkelbach and F. M. Wahl, "Shape from 2D edge gradients," in Proceedings of the 23rd DAGM Symposium on Pattern Recognition, 2001.

[21] F. Solomon and K. Ikeuchi, "Extracting the shape and roughness of specular lobe objects using four light photometric stereo," IEEE, 1992.

[22] J. Meng and J. Zhu, "Recovering 3D face models by a USB camera and a lamp," CS682 Digital Image Processing Term Project Report, 2006.

[23] R. L. Hsu and A. K. Jain, "Face modeling for recognition," in Proc. Int'l Conf. Image Processing (ICIP), vol. 2, pp. 693–696, 2001.

[24] A. Ansari and M. Abdel-Mottaleb, "3-D face modeling using two views and a generic face model with application to 3-D face recognition," in IEEE Conf. on Advanced Video and Signal Based Surveillance, pp. 203–222, 2003.

[25] M. Z. Linna, M. Xiangyong, and Z. Yangsheng, "Image-based 3D face modeling," in Proc. of Int'l Conf. on Computer Graphics, Imaging and Visualization, pp. 165–168, July 2004.

[26] V. Blanz and T. Vetter, "A morphable model for the synthesis of 3D faces," in Proc. of SIGGRAPH '99, pp. 187–194, August 1999.

[27] H. Guo, J. Jiang, and L. Zhang, "Building a 3D morphable face model by using thin plate splines for face reconstruction," LNCS, vol. 3338, pp. 258–267, 2004.

[28] Y. Hu, D. Jiang, S. Yan, L. Zhang, and H. Zhang, "Automatic 3D reconstruction for face recognition," in Proc. 6th IEEE Int'l Conf. on Automatic Face and Gesture Recognition, pp. 843–848, 2004.

[29] Z. Zhang, Z. Liu, D. Adler, M. F. Cohen, E. Hanson, and Y. Shan, "Robust and rapid generation of animated faces from video images: A model-based modeling approach," International Journal of Computer Vision, vol. 58, no. 2, pp. 93–119, 2006.

[30] T. Russ, C. Boehnen, and T. Peters, "3D face recognition using 3D alignment for PCA," in IEEE Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 1391–1398, 2006.

[31] C. Harris and M. Stephens, "A combined corner and edge detector," in Alvey Vision Conference, pp. 147–152, 1988.

[32] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
