Shift error analysis in image based 3D skull feature reconstruction

Thi Chau Ma, The Duy Bui, Trung Kien Dang
Human Machine Interaction Laboratory
University of Engineering and Technology
Vietnam National University, Hanoi, Vietnam
chaumt@vnu.edu.vn
Abstract

A 3D skull is crucial in skull-based 3D facial reconstruction [1, 2, 3, 4, 5, 6, 7, 8]. In 3D reconstruction, and especially in skull-based 3D facial reconstruction, features usually play an important role, because the accuracy of feature detection strongly affects the accuracy of the final 3D model. In this paper, we concentrate on the accuracy of the 3D reconstructed skull, an important part of skull-based 3D facial reconstruction. We discuss a cause of errors, which we call shift errors, that arises when taking a sequence of skull images. In addition, we analyse the effect of shift errors on 3D reconstruction and propose a solution to limit this effect.

Keywords: feature detection, accuracy, 3D facial reconstruction
1 Introduction
In skull-based 3D facial reconstruction, given 3D anthropometric landmarks on the 3D skull, it is possible to morph a 3D template to fit the skull according to the soft tissue thickness at the landmarks and so obtain the final 3D face. The landmarks are mostly determined manually on a scanned skull when anthropometric information is supplied (Figure 1). Therefore, instead of the whole skull, only the 3D skull landmarks are used. Moreover, scanning a skull requires expensive equipment. To overcome this problem, we use skull images rather than a scanned 3D skull. The whole process is shown in Figure 2. We take a sequence of images by moving the camera around the skull to capture all important landmarks. The landmarks, included in the set of features, can be detected with any automatic feature detector. These features are matched to create correspondences between successive images, from which the 3D skull features are reconstructed.

Figure 1. A scanned 3D skull and landmarks on the skull.
People are sensitive in face recognition. Biederman [9] concluded that face recognition and object recognition are different: contrast, illumination, size, and transformation (especially rotation) strongly affect face recognition, while almost all of those factors have little effect on the recognition of other objects. Moreover, while distinctions between objects are easily named, a small difference between two faces is easy to recognise but hard to name. Hence, facial reconstruction requires high accuracy for people to recognise the result. Recently, the authors of [1] reported that "54%, 65%, and 77% of the three facial reconstruction surfaces had less than 2.5 mm of error when compared to the relevant target face" in their experiments. An improvement of 1 mm to 2 mm in the accuracy of the 3D reconstructed skull or the 3D reconstructed face is therefore significant.
In this paper, we carry out an error assessment of feature detection from 2D skull images as it affects 3D facial reconstruction. We first analyse the effect of these errors on 3D skull features. This analysis is then used to modify the 3D skull features in order to increase the accuracy of the 3D skull feature. In Section 4, we perform experiments to confirm those theoretical results.

Figure 2. Skull image based 3D facial reconstruction.
2 Related works
For 3D facial reconstruction from a skull, most researchers have used 3D scanned skulls as inputs [1, 2, 3, 4, 5, 6, 7, 8]. In [2], Archer used the standard system of 32 dowels mounted on the skull at the locations of landmarks, with lengths taken from statistical data on the soft tissue thickness of African-American men. In addition, the author added 89 dowels whose lengths were interpolated from the 32 standard lengths. A surface in the form of a hierarchical B-spline was transformed to match the skull, with the transformation divided into three levels. In this study, the interpolation of the new dowels was not really accurate in terms of positions or lengths, so the final face was swollen or collapsed in unwanted places. The author made adjustments based on the B-spline surface tangent at the control points. The result received positive feedback, but it also showed some limitations: in terms of anatomy, the mouth was wide, the eye position and size were not correct, and the nose shape was not appropriate, because the author did not take advantage of the relationships within the anthropometric information.
In [3, 4], the authors used a reference face and a corresponding skull. The authors then tried to find a transformation from the reference skull to the source skull. That transformation was then applied to the reference face to obtain the target face. This study exploited little anthropometric information. It is easy to see that, with such a transformation, the soft tissue thicknesses of the reconstructed faces show no individual differences, so the reconstructed face clearly did not carry identity information.
Other studies [5, 8] also carried out a transformation of the face model to match the dowels on the target face. In [5], the soft tissue thickness was calculated from statistical data based on the soft tissue database of Rhine and Moore [6]. Initially, the landmarks were transformed using the Procrustes transformation method; then a combination of RBF and Procrustes was applied to all the remaining points of the face. The face template was obtained from a scan in a typical database, so the generated face depended on the face database. Moreover, the resulting face was flawed because of the lack of information about soft tissue thickness. In [8], the authors used soft tissue thicknesses calculated from skull measurements instead of the statistical ones.
Kähler et al. [7] built expressive faces from the skull. The method used in this study combined surgical knowledge with soft tissue thickness data. The skull was scanned, and 40 dowels corresponding to the statistical standard thickness of soft tissue [6] were attached. A mesh consisting of 8164 triangles was used to represent the skull, and radial basis functions were used to deform the face template to match the skull. The authors also interpolated the soft tissue thickness between landmarks. Moreover, they used five additional anthropometric rules to support shaping the nose and mouth. Recently, in [1], the faces of the subjects were reconstructed according to facial soft tissue depth data for living Korean adults. The authors used 3D deformation tools to alter the shape. In addition, they used a number of guides to predict facial components, such as the eyes, nose, mouth and ears.
Skull-based 3D facial reconstruction is not new, and there are a number of accuracy studies of traditional manual 3D methods demonstrating good levels of likeness to the target faces [10, 11, 12, 13, 14, 15]. However, for most computer-aided 3D facial reconstruction systems, the accuracy of the final 3D results is not quantified; the results usually only received feedback from anthropologists or forensic experts. Recently, there was an accuracy study for a computer-aided facial reconstruction system [1]. The authors compared the reconstructed faces and the target faces using the Geomagic Qualify software and gave a quantitative comparison for three reconstructed faces.
For 3D reconstruction from images, and especially 3D facial reconstruction from images, the reconstruction requires one image, several images, or a sequence of images. A 2D image is a 2D array of pixels, where each pixel value holds the intensity at that location; usually the intensity value represents a mix of the three colours red, green and blue. The intensity values are used to calculate the depth information (z coordinates) of objects in images. With a single input image, the shape of the objects has to be extracted to conduct the modelling. Techniques to retrieve the shape include shape from shading [16], shape from texture [17], shape from specularity [18], shape from contour [19], and shape from 2D edge gradients [20]. However, with a single input image the computational complexity is high and, moreover, the final model cannot be observed from various angles. Furthermore, if direct intensity values are used to calculate the depth, the result is degraded by extraneous brightness, which depends on many factors such as the target surface colour, geometry, material, and the orientation of the object and the light.
Approaches based on an input image sequence are divided into two types: (i) photometric stereo, where the images of objects are taken at one angle but under different lighting [21, 22]; and (ii) stereopsis, where images are taken at different angles. In photometric stereo methods, the surface orientation of each object is calculated based on the light intensity of the corresponding points in different images, and the depth of the objects is worked out from the orientation of the surface patches. Photometric stereo methods require a good setup of the light sources and an understanding of the relevant laws of light. For stereopsis methods, the main problem is to find matches between pairs of features in the images in order to determine the objects' structures. With sparse corresponding feature points, 3D features are calculated. Researchers typically use an extra 3D template of the object and deform the template to fit the 3D features to get the final 3D model. The templates are often in the form of a generic model [23, 24, 25] or a 3D morphable model [26, 27, 28, 29, 30].
3 Shift error: cause, effect and solution in 3D skull reconstruction
3.1 Cause of shift error
When taking pictures around an object (by moving the camera), images are obtained from different views in the horizontal (x) direction (Figure 3). In each pair of successive images, features in the second image are shifted by a distance e in the x direction, while they are almost unchanged in the y direction. The experiments in Section 4 confirm this conclusion.

Figure 3. Camera setup for 3D reconstruction.

Features on the object appear in the camera images. However, under a perspective projection, the location of a feature in the camera image is not the location of the projected feature. This error in localisation is called shift error. When we take pictures from two views (Figure 4), they will in general be rotated and translated relative to each other. This can be modelled by a 2D rotation and a translation of the origin. The difference between the locations of features in the camera images and the projected features is the shift error mentioned above.
3.2 Effect of shift error in 3D reconstruction
Figure 4 depicts how shift error affects 3D reconstruction. We assume that the image planes lie between the 3D object and the cameras. The small circles representing 2D points in images $I_i$ and $I_{i+1}$ are projected from the big circle representing the 3D point $X$. In fact, when the pictures are taken, the object is rotated in the x direction, so the projected point in image $I_{i+1}$ is the small square point, not the small circle. As a result, the reconstructed 3D point (the big square point) does not coincide with the original 3D point; the reconstructed point is pushed far away from the object. Clearly, the shift error causes a wrong back projection.

Figure 4. Shift error pushes the reconstructed point far away from the object.
3.3 Accuracy improvement in 3D reconstruction
We define the notation as in Figure 4: $C_1$, $C_2$ are the locations of the camera at two successive positions, i.e. the centres of projection. Given a 3D point $X$, $x_1$ and $x_2$ are the theoretical images of $X$ under the projections with centres $C_1$ and $C_2$ respectively, while $x'_1$ and $x'_2$ are the images of $X$ detected by a certain feature detector. We assume $x_1 = x'_1$; nevertheless, $x_2 \neq x'_2$ because of the shift error. $X'$ is the 3D point reconstructed from the corresponding pair $(x_1, x'_2)$, and the angle $\angle C_1 X C_2 = \alpha$. If $XX'$ can be estimated, then $X$, instead of $X'$, can be completely reconstructed.

$XX'$ is estimated in order to pull $X'$ back to $X$. We have
$$\frac{XX'}{C_1 X} = \frac{e}{C_1 C_2} \qquad (1)$$
so
$$XX' = e\,\frac{C_1 X}{C_1 C_2} \ \text{(pixels)}. \qquad (2)$$

Figure 5 depicts the relation between a 3D point and its projection. Let us look at a 3D point $X(x_s, y_s, z_s)$ and its image $x(x_i, y_i, f)$. In the camera with centre at $C$, $X$ is projected to $(x_i, y_i, f)$. The transformation is as follows:
$$\begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_s \\ y_s \\ z_s \\ 1 \end{bmatrix} \qquad (3)$$
where $x_i = u/w$, $y_i = v/w$, and
$$x_i = f\,\frac{x_s}{z_s}, \qquad y_i = f\,\frac{y_s}{z_s}. \qquad (4)$$
Figure 5. 3D point and 2D point relation.
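As a quick illustration of the pinhole projection in equations (3) and (4), the following numpy sketch projects one 3D point onto the image plane; the focal length and the point are arbitrary illustrative values, not taken from our setup.

import numpy as np

f = 35.0  # illustrative focal length in mm (an assumption, not our calibration)

# Projection matrix of equation (3): diag(f, f, 1) with a zero fourth column.
P = np.array([[f, 0.0, 0.0, 0.0],
              [0.0, f, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

X = np.array([10.0, 20.0, 500.0, 1.0])  # homogeneous 3D point (x_s, y_s, z_s, 1)

u, v, w = P @ X
x_i, y_i = u / w, v / w  # equation (4): x_i = f*x_s/z_s, y_i = f*y_s/z_s
print(x_i, y_i)          # -> 0.7 1.4 (image-plane coordinates in mm)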
In image plane coordinates (Figure 6), the origin $(x_0, y_0)$ is at the centre of the image, and the image of $X$ in pixel units is given by
$$\begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} \alpha_x & s & x_0 & 0 \\ 0 & \alpha_y & y_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_s \\ y_s \\ z_s \\ 1 \end{bmatrix}$$
where $\alpha_x = f k_x$, $\alpha_y = f k_y$, and
$$x_{pix} = \frac{u}{w}, \qquad y_{pix} = \frac{v}{w}, \qquad (5)$$
and
$$\begin{bmatrix} \alpha_x & s & x_0 & 0 \\ 0 & \alpha_y & y_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} = K\,[I_3 \,|\, 0_3].$$
$K$ is the calibration matrix, with $s \approx 0$ and $\alpha_x = \alpha_y = f k_s$. From (4) and (5) we have
$$\frac{\text{pixel}}{\text{mm}} = \frac{x_{pix}}{x_i} = \frac{\alpha_x x_s + x_0 z_s}{z_s} \cdot \frac{z_s}{f x_s} = k_s + \frac{x_0}{x_i} \qquad (6)$$
and, similarly, in the $y$ direction,
$$\frac{\text{pixel}}{\text{mm}} = \frac{y_{pix}}{y_i} = k_s + \frac{y_0}{y_i}. \qquad (7)$$
From formulas (2) and (6), we have
$$XX' = e\,\frac{C_1 X}{C_1 C_2} \left(k_s + \frac{x_0}{x_i}\right)^{-1} \ \text{(mm)} \qquad (8)$$
or, since $C_1 C_2 = 2\,C_1 X \sin\frac{\alpha}{2}$ when $C_1 X \approx C_2 X$,
$$XX' = \frac{e}{2\sin\frac{\alpha}{2}} \left(k_s + \frac{x_0}{x_i}\right)^{-1} \ \text{(mm)}. \qquad (9)$$
Figure 6. Image plane.
As shown in formula (9), we can easily determine the 3D point $X$ from the point $X'$ reconstructed under shift error.
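To make the correction of formula (9) concrete, the following Python sketch computes the pull distance $XX'$; every numerical value in it (e, alpha, k_s, x_0, x_i) is an illustrative assumption, since the calibration numbers are not listed here.

import numpy as np

def shift_correction_mm(e_pix, alpha_rad, k_s, x0_pix, x_i_mm):
    # Pixel-to-mm factor of formulas (6)/(7): pixel/mm = k_s + x0/x_i.
    pix_per_mm = k_s + x0_pix / x_i_mm
    # Formula (9): XX' = e / (2*sin(alpha/2)) * (k_s + x0/x_i)^(-1), in mm.
    return e_pix / (2.0 * np.sin(alpha_rad / 2.0)) / pix_per_mm

# Illustrative values (assumptions, not measured data):
e = 2.0                    # shift error in pixels
alpha = np.radians(10.0)   # angle C1-X-C2 between successive views
k_s = 30.0                 # sensor resolution in pixels per mm
x0, x_i = 585.0, 20.0      # principal point offset (pixels), image coordinate (mm)

print(shift_correction_mm(e, alpha, k_s, x0, x_i))  # pull distance XX' in mm

The reconstructed point $X'$ is then moved back by this distance along the line towards $X$.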
4 Experiments
We use two scanned skulls for the experiment. The skulls are rotated in small angle steps, here set at 10 degrees, using MeshLab. For each skull, images are captured over a range of 100 degrees at a resolution of 1170 × 864.
To compute the projected relative shift, we first extract the ground truth homography between a pair of images. Features are detected by the Harris [31] and SIFT [32] detectors and matched between pairs of successive images. Transferring the coordinates of a feature in the first image to the second image using the ground truth homography gives us the coordinates of its ideal match. The difference between the coordinates of the ideal match and the coordinates of the actual match is ideally the projected relative shift error.
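A sketch of this measurement with OpenCV, assuming the ground-truth homography H between the pair is already available; we show the SIFT variant, with ratio-test matching as one reasonable matching choice.

import cv2
import numpy as np

def projected_shift_errors(img1, img2, H):
    # Detect and describe features in both images.
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # Match features between the successive images (Lowe's ratio test).
    good = [m for m, n in cv2.BFMatcher().knnMatch(des1, des2, k=2)
            if m.distance < 0.75 * n.distance]
    pts1 = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    # Transfer first-image features with the ground-truth homography H to get
    # the ideal matches, then return the per-feature (dx, dy) shift in pixels.
    ideal = cv2.perspectiveTransform(pts1, H)
    return (pts2 - ideal).reshape(-1, 2)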
Figure 7 and Figure 8 show that, for all data sets, the projected shift error is indeed mainly in the x direction and about zero in the y direction. This result fits the analysis of shift error in Section 3.
Figure 7. Shift error on Skull 1.
Given the ground truth poses of the cameras, we triangulate to recover the 3D locations of features (left of Figure 9). For each sample of the scanned skulls, the distance between a 3D reconstructed point and the 3D ideal point is 1.09 ∼ 1.93 mm. We therefore suggest that the 3D features be pulled closer by approximately 1.5 mm (right of Figure 9).
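A minimal sketch of the triangulate-and-pull step. The camera centres c1, c2 and the 3×4 projection matrices P1, P2 are assumed to be known ground truth; the pull direction, towards the midpoint of the two camera centres, is our reading of Figure 4 rather than an exact prescription.

import numpy as np
import cv2

def pull_points(X, c1, c2, d_mm=1.5):
    # Move each reconstructed point back by d_mm towards the midpoint of the
    # two camera centres (the direction in which shift error pushed it away).
    mid = 0.5 * (np.asarray(c1) + np.asarray(c2))
    vec = mid - X
    unit = vec / np.linalg.norm(vec, axis=1, keepdims=True)
    return X + d_mm * unit

# Usage with matched pixel coordinates pts1, pts2 of shape (N, 2):
# Xh = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)  # 4xN homogeneous points
# X = (Xh[:3] / Xh[3]).T                              # Nx3 Euclidean points
# X_pulled = pull_points(X, c1, c2)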
To give an assessment, we compare the mean and max errors of the 3D recovered features before and after pulling. We compare the sets of 3D recovered features before and after pulling against the original scanned skulls, using the following formulas. Let $S_1$ be the set of 3D detected features, and $S_2$ be the set of vertices of the scanned skull. The distance from a point $p$ to a surface $S$ is estimated as
$$e(p, S) = \min_{p' \in S} d(p, p') \qquad (10)$$
Figure 8. Shift error on Skull 2.
where $d(\cdot)$ is the Euclidean distance. The mean error is the average distance between $S_1$ and $S_2$:
$$E(S_1, S_2) = \frac{1}{\|S_1\|} \sum_{p \in S_1} e(p, S_2) \qquad (11)$$
The max error is the maximum distance between $S_1$ and $S_2$:
$$E_{max}(S_1, S_2) = \max_{p \in S_1} e(p, S_2) \qquad (12)$$
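Since $S_2$ is a set of vertices, $e(p, S_2)$ in (10) is just a nearest-neighbour distance, so (11) and (12) can be evaluated with a k-d tree; a minimal scipy sketch, assuming S1 and S2 are given as N×3 numpy arrays.

import numpy as np
from scipy.spatial import cKDTree

def mean_max_error(S1, S2):
    # e(p, S2) of formula (10): Euclidean distance to the nearest skull vertex.
    dists, _ = cKDTree(S2).query(S1)
    # Formulas (11) and (12): mean and max error between S1 and S2.
    return dists.mean(), dists.max()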
Figure 9. 3D recovered features before (left) and after (right) pulling.
Table 1 shows the errors of the recovered 3D features before and after pulling by 1.5 mm. The mean and max errors are very small. When the 3D skull features are not pulled, the mean error is about 13% of the average soft tissue thickness of an adult Vietnamese skull (about 5.895 mm, the average soft tissue thickness at 22 anthropometric landmarks on the Vietnamese skull: a vertex, a trichion, two supraorbitals, a glabella, a nasion, two exocanthions, two endocanthions, a rhinion, two infraorbitals, two zygomatics, two alares, a subnasale, two molars, a stomion and a menton), and the max error is about 50%. After pulling the features, these errors reduce significantly: to about 10.3% for the mean error and 37.2% for the max error. Obviously, the pulled 3D features are better than the unpulled reconstructed ones.

Skull | E_before | E_after | E_max,before | E_max,after
------|----------|---------|--------------|------------
1     | 0.7271   | 0.6271  | 3.1314       | 2.0312
2     | 0.7903   | 0.5903  | 2.9004       | 2.3032

Table 1. Mean and max errors (mm) of 3D features and 3D recovered features, before and after pulling.
5 Conclusions
When taking pictures by moving the camera around an object, we have found that the error, which we call shift error, lies in the same direction as the movement of the viewpoint.

We analysed the cause of shift error and showed mathematically the effect of the error on the 3D reconstructed landmarks of the skull. We then proposed an effective solution to the problem, which reduces the error of the reconstructed 3D landmarks relative to the original scanned skull. The experiments consolidate all these conclusions.
6 Acknowledgment
This work is supported by the project Towards a Model of an "Intelligent Office Environment", No. QGTD.10.23.
References
[1] W.-J. Lee, C. M. Wilkinson, and H.-S. Hwang, "An accuracy assessment of forensic computerized facial reconstruction employing cone-beam computed tomography from live subjects," Journal of Forensic Sciences, 2011.

[2] K. M. Archer, "Craniofacial reconstruction using hierarchical B-spline interpolation," Master's thesis, University of British Columbia, Department of Electrical and Computer Engineering, 1997.

[3] S. Michael and M. Chen, "The 3D reconstruction of facial features using volume distortion," in Proc. 14th Eurographics UK Conference, pp. 297-305, 1996.

[4] G. Quatrehomme, S. Cotin, G. Subsol, H. Delingette, Y. Garidel, G. Grevin, M. Fidrich, P. Bailet, and A. Ollier, "A fully three-dimensional method for facial reconstruction based on deformable models," Journal of Forensic Sciences, pp. 649-652, 1997.

[5] P. Vanezis, M. Vanezis, G. McCombe, and T. Niblett, "Facial reconstruction using 3-D computer graphics," Forensic Science International, vol. 108, no. 2, pp. 81-95, 2000.

[6] J. S. Rhine and C. E. Moore, "Tables of facial tissue thickness of American Caucasoids in forensic anthropology," Maxwell Museum Technical Series 1, 1984.

[7] K. Kähler, J. Haber, and H.-P. Seidel, "Reanimating the dead: Reconstruction of expressive faces from skull data," ACM Transactions on Graphics (SIGGRAPH conference proceedings), vol. 22, no. 3, July 2003.

[8] Q. H. Dinh, C. T. Ma, T. D. Bui, T. T. Nguyen, and D. T. Nguyen, "Facial soft tissue thicknesses prediction using anthropometric distances," in Proceedings of ACIIDS 2011, 2011.

[9] I. Biederman and P. Kalocsai, "Neural and psychophysical analysis of object and face recognition," in Face Recognition: From Theory to Applications, NATO ASI Series F, Springer Verlag, 1998.

[10] J. Prag and R. Neave, Making Faces. London, UK: British Museum Press, 1997.

[11] C. Snow, B. Gatliff, and K. McWilliams, "Reconstruction of facial features from the skull: an evaluation of its usefulness in forensic anthropology," American Journal of Physical Anthropology, vol. 33, no. 2, 1970.

[12] M. Gerasimov, The Face Finder. New York, NY: Lippincott, 1971.

[13] R. Helmer, S. Röhricht, D. Petersen, and F. Möhr, "Assessment of the reliability of facial reconstruction," in Forensic Analysis of the Skull: Craniofacial Analysis, Reconstruction, and Identification. New York: Wiley-Liss, pp. 75-83, 1993.

[14] C. Wilkinson and D. Whittaker, "Juvenile forensic facial reconstruction: a detailed accuracy study," in Proceedings of the 10th Conference of the International Association of Craniofacial Identification, pp. 11-14, September 2002.

[15] G. Quatrehomme, T. Balaguer, P. Staccini, and V. Alunni-Perret, "Assessment of the accuracy of three-dimensional manual craniofacial reconstruction: a series of 25 controlled cases," International Journal of Legal Medicine, vol. 121, no. 6, pp. 469-475, 2007.

[16] B. Horn and M. Brooks, Shape from Shading. Cambridge, MA: MIT Press, 1989.

[17] J. Aloimonos, "Shape from texture," Biological Cybernetics, vol. 58, no. 5, pp. 345-360, 1988.

[18] G. Healey and T. O. Binford, "Local shape from specularity," Computer Vision, Graphics, and Image Processing, vol. 42, pp. 62-86, 1988.

[19] F. Ulupinar and R. Nevatia, "Shape from contour: Homogeneous generalized cylinders and constant cross section generalized cylinders," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 2, 1995.

[20] S. Winkelbach and F. M. Wahl, "Shape from 2D edge gradients," in Proceedings of the 23rd DAGM Symposium on Pattern Recognition, 2001.

[21] F. Solomon and K. Ikeuchi, "Extracting the shape and roughness of specular lobe objects using four light photometric stereo," IEEE, 1992.

[22] J. Meng and J. Zhu, "Recovering 3D face models by a USB camera and a lamp," CS682 Digital Image Processing Term Project Report, 2006.

[23] R. L. Hsu and A. K. Jain, "Face modeling for recognition," in Proc. Int'l Conf. on Image Processing (ICIP), vol. 2, pp. 693-696, 2001.

[24] A. Ansari and M. Abdel-Mottaleb, "3-D face modeling using two views and a generic face model with application to 3-D face recognition," in IEEE Conf. on Advanced Video and Signal Based Surveillance, pp. 203-222, 2003.

[25] M. Z. Linna, M. Xiangyong, and Z. Yangsheng, "Image-based 3D face modeling," in Proc. of Int'l Conf. on Computer Graphics, Imaging and Visualization, pp. 165-168, July 2004.

[26] V. Blanz and T. Vetter, "A morphable model for the synthesis of 3D faces," in Proc. of SIGGRAPH '99, pp. 187-194, August 1999.

[27] H. Guo, J. Jiang, and L. Zhang, "Building a 3D morphable face model by using thin plate splines for face reconstruction," LNCS, vol. 3338, pp. 258-267, 2004.

[28] Y. Hu, D. Jiang, S. Yan, L. Zhang, and H. Zhang, "Automatic 3D reconstruction for face recognition," in Proc. 6th IEEE Int'l Conf. on Automatic Face and Gesture Recognition, pp. 843-848, 2004.

[29] Z. Zhang, Z. Liu, D. Adler, M. F. Cohen, E. Hanson, and Y. Shan, "Robust and rapid generation of animated faces from video images: A model-based modeling approach," International Journal of Computer Vision, vol. 58, no. 2, pp. 93-119, 2004.

[30] T. Russ, C. Boehnen, and T. Peters, "3D face recognition using 3D alignment for PCA," in IEEE Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 1391-1398, 2006.

[31] C. Harris and M. Stephens, "A combined corner and edge detector," in Alvey Vision Conference, pp. 147-152, 1988.

[32] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.