Journal of Computational Design and Engineering 00 2013 0000~0000 www.jcde.org A Method for Image-based Shadow Interaction with Virtual Objects Hyunwoo Ha and Kwanghee Ko* 1School of
Author's Accepted Manuscript
A Method for Image-based Shadow Interaction with
Virtual Objects
Hyunwoo Ha, Kwanghee Ko
To appear in: Journal of Computational Design and Engineering
Cite this article as: Hyunwoo Ha, Kwanghee Ko, A Method for Image-based Shadow Interaction with Virtual Objects, Journal of Computational Design and Engineering, http://dx.doi.org/10.1016/j.jcde.2014.11.003
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting galley proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
www.elsevier.com/locate/jcde
Journal of Computational Design and Engineering 00 (2013) 0000~0000
www.jcde.org
A Method for Image-based Shadow Interaction with Virtual Objects
Hyunwoo Ha and Kwanghee Ko*
1 School of Mechatronics, Gwangju Institute of Science and Technology, 123 Cheomdangwagiro, Bukgu, Gwangju, 500-712, Republic of Korea
2 Korea Culture Technology Institute, Gwangju Institute of Science and Technology, 123 Cheomdangwagiro, Bukgu, Gwangju, 500-712, Republic of Korea
(Manuscript Received 000 0, 2013; Revised 000 0, 2013; Accepted 000 0, 2013)
Abstract
Many researchers have been investigating interactive portable projection systems such as mini-projectors. In addition, in exhibition halls and museums, there is a trend toward using interactive projection systems to make viewing more exciting and impressive. They can also be applied in the field of art, for example, in creating shadow plays. The key idea of interactive portable projection systems is to recognize the user's gesture in real time. In this paper, a vision-based shadow gesture recognition method is proposed for interactive projection systems. The gesture recognition method is based on the screen image obtained by a single web camera. The method separates only the shadow area by combining the binary image with an input image using a learning algorithm that isolates the background from the input image. The region of interest is recognized by labeling the separated shadow regions, and then hand shadows are isolated using the defect, convex hull, and moment of each region. To distinguish hand gestures, Hu's invariant moment method is used. An optical flow algorithm is used for tracking the fingertip. Using this method, a few interactive applications are developed, which are presented in this paper.
Keywords: shadow interaction; Hu moment; gesture recognition; interactive UI; image processing
1 Introduction
There has been an increasing demand for a more active and interesting viewing experience, and interactive projection technology has been considered as a solution to this issue. For example, if you can flip pages with a gesture when you make a presentation, or write a sentence without any manual tools, then the presentation can be more immersive and attractive to the audience. An interactive projection system also helps people to produce more attractive artistic exhibits, such as interactive walls and floors. Lately, many attempts have been made to use human-computer interaction in plays and musical performances. Namely, if appropriate events occur when an actor performs on stage, a better reaction can be obtained from the audience because such events are well synchronized with the actor's performance. Using this concept, new applications with interesting interactions are possible, such as the magic drawing board or virtual combat simulation.
From a technical standpoint, research on gesture recognition is a topic of interest in the field of computer vision. In particular, recognizing gestures in real time is of paramount importance. Most research groups use the Kinect camera to recognize gestures precisely, because the Kinect camera can discern both depth and color information. On the other hand, the Kinect cannot obtain depth and color information for shadows generated by light from behind the screen. The detection range of the Kinect is also limited when applied to a large screen because the distance from the sensor to the screen is considerably large. Another method for gesture recognition is to recognize gestures from images. The image-based approach is less expensive than the Kinect-based method because it uses less hardware for gesture acquisition.
In this work, a vision-based interactive projection system is proposed, which recognizes shadow gestures with proper precision. The process consists of detection and recognition modules of shadow gestures in real time, which are the core parts of the proposed system. Next, several novel applications based on the proposed system are presented to demonstrate the potential of the proposed method for use in various applications.
Numerous studies have been conducted regarding interactive projection systems. In particular, researchers are interested in generating events using hand gestures because the hand gestures can represent diverse shapes appropriate for recognition. Mistry et al. [1] proposed a portable interactive projection system, SixthSense, based on natural hand gestures. It provides a wearable gestural interface that allows the user to interact with digital information augmented around the user. The system consists of a portable projector, a camera and a mobile wearable device, which shows digital information on physical objects in real time. Grønbæk et al. [2] introduced an interactive floor support system using a vision-based tracking method. The system consists of a 12 m² glass surface with a projector that projects the glass upward. Limbs of users (children) are tracked and recognized for various interactions, which provide learning environments for children. Wilson of Microsoft Research [3] reported the prototype of an interactive tabletop projection-vision system, called PlayAnywhere, which allows the user to interact with virtual objects projected on a flat surface. For this interaction, the shadow-based finger recognition, tracking, and various other image processing are incorporated to provide a convenient but flexible tabletop projection-vision system. It consists of off-the-shelf commodities such as a camera, projector, and a screen that do not require any detailed configurations or calibration. Berard [4] developed the “Magic Table” for meetings. It has a whiteboard on the surface. It was developed to overcome the limitation of the current whiteboard by providing various operations such as copy, paste, translation, and rotation of the drawn contents. It consists of a projector, two cameras and a white board. The pen stroke and the contents on the board are captured by the cameras. The captured images are then processed to extract the position and the contents using various image processing techniques. Practically, this system allows the user to interactively create and control the contents. In addition to the above-mentioned systems, various other projection systems have been developed worldwide [5]-[7].
This paper is structured as follows: Section 2 presents the overall process of the proposed algorithm. Section 3 explains the process of detection and separation of image data. In Section 4, the recognition process for distinguishing hand gestures is presented. In Section 5, the tracking process of shadows is presented. Section 6 shows the experimental results of the proposed algorithm. Finally, the conclusion of the paper is presented with future work in Section 7.
2 Overall Process
Figure 1 shows the entire system consisting of a beam projector, a screen, a web camera and a computer. If a user creates a gesture, the shadow is created on the screen, which is captured by the camera. The computer then performs calculations in order to recognize the gesture through image processing. Next, the computer controls the beam projector to create an event at the proper place in real-time.
Figure 1. Overview of the system
The overall workflow of the proposed system is illustrated in Figure 2. First, the computer receives an input image from the web camera. The image is processed to produce a binary image. Then, an AND operation is performed on the background and the binary image in order to remove the background. Shadows that are distinct from the background are detected using a labeling algorithm. The area of the hand can be obtained through curvature, a convex hull, and defect in each labeled area. The center of the hand can be recognized using the moment value. Invariant moments are used for gesture recognition. After the gesture is recognized, the shadow hand is traced by an optical flow algorithm. Finally, events corresponding to the gesture are generated and given to the user. In the subsequent sections, each module in the overall process is explained in detail.
3 Separation and Detection Process
In this section, the technical approaches for separation and detection are explained. Given an image, the shadow part is extracted using the background separation and shadow detection methods.
Figure 2. Shadow gesture recognition process
Figure 4. Problem with the averaging background algorithm
3.1 Background Separation Process
The background separation step segments the image into the background and objects. For this operation, an improved averaging background algorithm is employed.
3.1.1 Averaging background algorithm
The averaging background algorithm is used in order to distinguish between the background and the objects in an image. The algorithm is designed to generate a background model using the mean and variance of each pixel, and to delete the background based on the model. When a pixel of the current frame lies between the upper and lower thresholds obtained from the background model, we consider it as background. Otherwise, it is recognized as an object. First, to obtain the background model, the images of each frame are accumulated for some period. The formula is expressed as
\[ \mathrm{dst1}(x,y) \leftarrow \mathrm{dst1}(x,y) + \mathrm{frame}(x,y) \tag{1} \]
Here, dst1(x,y) is the pixel at position (x, y) in the image dst1, and frame(x,y) indicates the pixel value at (x, y) of the image in a frame. Next, the variance is needed to generate a background model. Accumulating the absolute value of the difference between the previous frame and the current frame is carried out as
\[ \mathrm{dst2}(x,y) \leftarrow \mathrm{dst2}(x,y) + \left| \mathrm{Pframe}(x,y) - \mathrm{frame}(x,y) \right| \tag{2} \]
Here, dst2(x,y) is the pixel at position (x, y) in the image dst2, and Pframe(x,y) is the pixel value at (x, y) of the image in the previous frame. We obtain the average values of dst1 and dst2 by dividing by the total number of frames as follows.
\[ \mathrm{dst1}(x,y) \leftarrow \frac{\mathrm{dst1}(x,y)}{\mathrm{total}}, \qquad \mathrm{dst2}(x,y) \leftarrow \frac{\mathrm{dst2}(x,y)}{\mathrm{total}} \tag{3} \]
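The accumulation in Eqs. (1) to (3) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: frames are nested lists of grayscale values, the function name `background_statistics` is hypothetical, and the difference accumulator is divided by the total number of frames as the text describes (even though only total − 1 differences exist).

```python
def background_statistics(frames):
    """Return (mean, mean_abs_diff) images for equally sized grayscale frames.

    frames: list of 2D lists (rows of pixel values). Hypothetical helper
    that follows Eqs. (1)-(3) of the text pixel by pixel.
    """
    height, width = len(frames[0]), len(frames[0][0])
    dst1 = [[0.0] * width for _ in range(height)]  # running sum of pixel values, Eq. (1)
    dst2 = [[0.0] * width for _ in range(height)]  # running sum of |Pframe - frame|, Eq. (2)
    previous = None
    for frame in frames:
        for y in range(height):
            for x in range(width):
                dst1[y][x] += frame[y][x]
                if previous is not None:
                    dst2[y][x] += abs(previous[y][x] - frame[y][x])
        previous = frame
    total = len(frames)
    # Eq. (3): divide both accumulators by the number of frames.
    mean = [[value / total for value in row] for row in dst1]
    diff = [[value / total for value in row] for row in dst2]
    return mean, diff
```

For a static pixel the averaged difference stays near zero, so the band around the mean is narrow; a flickering pixel yields a wider band and is therefore more tolerant to noise.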
The upper and lower threshold values are determined by adding and subtracting these values. If a user wants to adjust the range where the background is recognized, he/she can do this by adjusting the threshold. In this paper, the upper and lower thresholds are calculated through the following formulae.
\[ \mathrm{upper}(x,y) \leftarrow \mathrm{dst1}(x,y) + \mathrm{dst2}(x,y), \qquad \mathrm{lower}(x,y) \leftarrow \mathrm{dst1}(x,y) - \mathrm{dst2}(x,y) \tag{4} \]
To apply this method in real-time, we need to introduce the ratio α, which is the ratio between the accumulated values and the current frame values. The formula is expressed as follows.
\[ \mathrm{dst}(x,y) \leftarrow (1-\alpha)\,\mathrm{dst}(x,y) + \alpha\,\mathrm{frame}(x,y) \tag{5} \]
Figure 3 represents the result of the averaging background algorithm. It indicates that the shadow can be detected when the user's shadow appears after the background is recognized by accumulating the first 30 frames. The separated portions are divided with certainty by using a reverse binarization method provided through the OpenCV library [8]. In addition, stationary objects are recognized as belonging to the background, and only moving objects are detected because of the real-time updates.
Figure 3. Separation between background and shadow
3.1.2 Problems with the averaging background algorithm
It was determined that some problems occur because the background is updated in real-time. Shadows are recognized as background when they stay in the same place for longer than a certain period of time. Then, if the shadow moves, its former location is still regarded as a shadow. The right figure in Figure 4 shows the aforementioned problem: the previous shadow is still detected as a shadow in the current frame although it no longer exists.
In previous studies, depth and color information was used to solve the same problem; however, we have only the shadow's 2D image information. To solve this problem, a current binary image should be employed. The image of the averaging background algorithm and the current binary image are combined by the AND operation. This method contributes to the improved recognition of the shadow. Figure 5 shows the process of solving the problem.
Figure 5. AND operator: (a) image of the averaging background algorithm, (b) current binary image, (c) image after the AND operation
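The thresholding of Eq. (4), the running update of Eq. (5), and the AND step described in Section 3.1.2 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: images are nested lists of grayscale values, all function names are hypothetical, and the fixed darkness level `level` used for the reverse binarization is an assumed parameter that the paper does not specify.

```python
def thresholds(mean, diff):
    """Eq. (4): per-pixel upper and lower bounds of the background band."""
    upper = [[m + d for m, d in zip(mrow, drow)] for mrow, drow in zip(mean, diff)]
    lower = [[m - d for m, d in zip(mrow, drow)] for mrow, drow in zip(mean, diff)]
    return upper, lower


def update_running_average(dst, frame, alpha):
    """Eq. (5): blend the accumulated image with the current frame."""
    return [[(1 - alpha) * d + alpha * f for d, f in zip(drow, frow)]
            for drow, frow in zip(dst, frame)]


def foreground_mask(frame, upper, lower):
    """1 where a pixel falls outside the background band, else 0."""
    return [[0 if lo <= p <= up else 1 for p, up, lo in zip(frow, urow, lrow)]
            for frow, urow, lrow in zip(frame, upper, lower)]


def binarize_dark(frame, level):
    """Reverse binarization: dark (shadow-like) pixels become 1.

    `level` is an assumed fixed threshold, chosen only for illustration.
    """
    return [[1 if p < level else 0 for p in row] for row in frame]


def shadow_mask(frame, upper, lower, level):
    """AND of the background-subtraction mask and the current binary image."""
    moving = foreground_mask(frame, upper, lower)
    dark = binarize_dark(frame, level)
    return [[a & b for a, b in zip(arow, brow)] for arow, brow in zip(moving, dark)]
```

As an example of the ghosting problem, suppose the model has absorbed a shadow at the first pixel (mean 50) and the shadow has since moved to the third pixel. With means `[[50, 200, 200]]`, differences `[[5, 5, 5]]` and the frame `[[200, 200, 50]]`, `foreground_mask` flags both the old and the new location, while `shadow_mask` with `level=100` keeps only the third pixel, because the old location is now bright in the current binary image.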
3.2 Shadow Detection Process
Once the background is separated, the shadows are then processed for recognition. The isolated shadows are labeled for efficient access using the labeling algorithm.
3.2.1 Labeling algorithm
The principle of the labeling algorithm [9] is as follows. A binary image has only values of zero (0) and one (1). The algorithm begins at a pixel (the left-top pixel). If the value of one is not present in every direction, the algorithm continues searching. When the first pixel that has the value of one is detected, it is marked as the starting point. The endpoint is where the last value of one is detected. Figure 6 depicts the operation of the labeling algorithm.
Figure 6. Operation of the labeling algorithm
3.2.2 Region of interest (ROI)
To distinguish the hand and to increase the processing speed, we need to set a region of interest (ROI) within the previously labeled areas. Image processing can be made faster by using an ROI image instead of using the entire image. Examples of ROIs are shown in Figure 7.
Figure 7. Example of regions of interest
4 Recognition Process
This paper is focused on the recognition of hand gestures, which may find many applications in diverse areas. Detecting the hand region represented in shadows, however, can be limited in that the shadows do not have depth or color information. In order to overcome this limitation, this paper proposes a method of extracting the hand area using only convex hull and defect information.
4.1 Recognition of the Hand Region
Given the ROIs in the image, the hand region is detected for gesture recognition.
4.1.1 Convex hull & defect detection
This method consists of three steps as follows.
Step 1: In order to extract the convex hull and defects, we need to determine the contours of the regions. The contour information is obtained using the Canny edge detection algorithm [10]. Figure 8 shows the detected contours of the shadow regions.
Figure 8. Detected contours
Step 2: Many researchers have been studying and developing algorithms to search for convex hulls, such as Gift wrapping [11] and Quick hull [12]. The convex hull is the shortest closed path that includes all points of a given set of points. An example of a convex hull for points is given in Figure 9.
Figure 9. Convex hull principle
In this paper, the Graham scan algorithm [13][14] was chosen for convex hull computation. The algorithm relies upon the principle that a point cannot be part of the convex hull when a triangle consisting of three other points includes that point. That is, if there are points S, A, B, C, D as in Figure 10 (a), S is selected as the point with the smallest y-coordinate (if several points share that value, the one with the largest x-coordinate is selected). Then, the angles of all four points (A, B, C, D) from point S are obtained and sorted in order of size. As a result, S, A, and B are put on the stack, and the scan algorithm begins for all remaining points. We can determine whether a point is on the convex hull by checking the direction of the cross product around each result of the scan. If the cross product of the three stacked points is negative, it means that the point is located inside the triangle. In that case, the point is removed from the stack, and the next point is stacked. A convex hull can be obtained as shown in Figure 10 (d) once all points are scanned. We can obtain convex hulls separately for each ROI because the searching algorithm operates independently in each ROI.
Figure 10. Convex hull searching algorithm, panels (a) to (d)
Step 3: To distinguish hands and other objects from the shadow, a new method that uses only the shape of the shadow is necessary because there is no depth or color information that can be utilized. Thus, we propose a method for extracting the hand area. To classify the shadow of the hand, the best method is to plot the location of the wrist. In this step, defects of the shadow are used for this purpose. Defects are defined as the farthest points from the line segment made by two points of the convex hull. In other words, they are the points on contour lines that have the longest distance between the shadow and the convex hull line. We can draw the line perpendicular from the convex hull line to the shadow using a straight-line equation. The shadow edge point that has the longest straight line is then identified as the defect point. Figure 11 shows the defect points.
Figure 11. Defect points and the experimental result
4.1.2 Resetting the ROI
For faster image processing, the smaller shadow regions are considered. Original ROIs often change because the aspect ratio of the ROI image is modified according to the length of the arm, which makes it difficult to identify a hand gesture because we can only recognize gestures through moment values. However, if we crop the image at the wrist position, our ROI includes only the hand regardless of the length of the arm. The moment values are not changed because the aspect ratio of the new ROI is fixed. Therefore, resetting the ROI will contribute to recognizing hand gestures. We can reset the new ROI using the first and last defects. Figure 12 indicates the new ROI using defect points.
Figure 12. Resetting a ROI
4.2 Recognition of Hand Gestures
In this section, we describe the process used to distinguish the various hand gestures. Recognition algorithms are executed by calculating moments of ROIs. According to the uniqueness theorem [15], if the density distribution function $f(x,y)$ is assumed to be piecewise continuous (and therefore bounded) and to have nonzero values only in a finite part of the $xy$ plane, then the moments of all orders exist. The moment sequence $\{m_{ij}\}$ is uniquely determined by $f(x,y)$; and conversely, $f(x,y)$ is uniquely determined by $\{m_{ij}\}$. It should be noted that the restriction assumption is important; otherwise, the abovementioned uniqueness theorem may not hold.
4.2.1 Hu invariant moments
In digital images, the two-dimensional $(i+j)$th order moments of $f(x,y)$ are defined in terms of Riemann integrals, which reduce to sums over the pixels, as
\[ m_{ij} = \sum_{x}\sum_{y} f(x,y)\, x^{i} y^{j}, \qquad i, j = 0, 1, 2, \ldots \tag{6} \]
The centroid of gravity is determined by the moment values, and it is utilized as a reference point when an event occurs. The centroid of gravity $(\bar{x}, \bar{y})$ is expressed as
\[ \bar{x} = \frac{m_{10}}{m_{00}}, \qquad \bar{y} = \frac{m_{01}}{m_{00}} \tag{7} \]
Figure 13. Center of the hand
Figure 13 indicates the centroid of the hand and the convex hull point located farthest from the center. This point is very useful when users want to select a benchmark, such as the mouse pointer. The central moments do not change under translation [16]. The central moments shifted by the centroid are defined as
\[ \mu_{ij} = \sum_{x}\sum_{y} f(x,y)\, (x-\bar{x})^{i} (y-\bar{y})^{j} \tag{8} \]
Here, $i$ and $j$ refer to the horizontal x-axis and the vertical y-axis, respectively, in Eq. (8). The central moments have the following relationships:
\[ \mu_{00} = m_{00} \equiv \mu \tag{9} \]
\[ \mu_{10} = \mu_{01} = 0 \tag{10} \]
\[ \mu_{20} = m_{20} - \mu \bar{x}^{2} \tag{11} \]
\[ \mu_{11} = m_{11} - \mu \bar{x}\bar{y} \tag{12} \]
\[ \mu_{02} = m_{02} - \mu \bar{y}^{2} \tag{13} \]
\[ \mu_{03} = m_{03} - 3 m_{02}\bar{y} + 2\mu \bar{y}^{3} \tag{14} \]
\[ \mu_{30} = m_{30} - 3 m_{20}\bar{x} + 2\mu \bar{x}^{3} \tag{15} \]
\[ \mu_{21} = m_{21} - m_{20}\bar{y} - 2 m_{11}\bar{x} + 2\mu \bar{x}^{2}\bar{y} \tag{16} \]
\[ \mu_{12} = m_{12} - m_{02}\bar{x} - 2 m_{11}\bar{y} + 2\mu \bar{x}\bar{y}^{2} \tag{17} \]
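The definitions in Eqs. (6) to (8) can be checked numerically with a short sketch. The code below is an illustration only, assuming a binary image stored as a nested list with x as the column index and y as the row index; the function names are hypothetical and not taken from the paper.

```python
def raw_moment(image, i, j):
    """Eq. (6): m_ij as a sum over pixels (x = column index, y = row index)."""
    return sum(value * (x ** i) * (y ** j)
               for y, row in enumerate(image)
               for x, value in enumerate(row))


def centroid(image):
    """Eq. (7): centroid (x_bar, y_bar) from the zeroth and first raw moments."""
    m00 = raw_moment(image, 0, 0)
    return raw_moment(image, 1, 0) / m00, raw_moment(image, 0, 1) / m00


def central_moment(image, i, j):
    """Eq. (8): mu_ij, computed directly about the centroid."""
    cx, cy = centroid(image)
    return sum(value * ((x - cx) ** i) * ((y - cy) ** j)
               for y, row in enumerate(image)
               for x, value in enumerate(row))
```

Computing `central_moment(image, 2, 0)` directly and comparing it with `raw_moment(image, 2, 0) - raw_moment(image, 0, 0) * cx ** 2` is a quick way to confirm a relationship such as Eq. (11); padding the image with empty columns shifts the centroid but leaves the central moments unchanged.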
The mathematical interpretation of the moments is as follows:
$\mu_{20}$: The dispersion along the horizontal axis
$\mu_{02}$: The dispersion along the vertical axis
$\mu_{11}$: The covariance of the horizontal and vertical axes
$\mu_{12}$: The degree of dispersion of the left side compared to the right side in the horizontal axis
$\mu_{21}$: The degree of dispersion of the lower direction compared to the upper direction in the horizontal axis
$\mu_{30}$: The degree of asymmetry in the horizontal axis (skew)
$\mu_{03}$: The degree of asymmetry in the vertical axis (skew)
The normalized moments are obtained by dividing the central moments by a power of $\mu_{00}$, so that they remain unchanged when the size of the shape changes [17]. The normalized moments are defined as
\[ \eta_{ij} = \frac{\mu_{ij}}{\mu_{00}^{\gamma}}, \qquad \gamma = \frac{i+j}{2} + 1 \tag{18} \]
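As a brief check of why the exponent in Eq. (18) gives size invariance, consider the continuous case, in which the shape is scaled uniformly by a factor $\lambda > 0$. Both the continuous setting and the symbol $\lambda$ are assumptions introduced here for illustration; for pixel sums the relation holds only approximately.

```latex
\begin{aligned}
\mu'_{ij} &= \lambda^{\,i+j+2}\,\mu_{ij}, \qquad \mu'_{00} = \lambda^{2}\,\mu_{00},\\
\eta'_{ij} &= \frac{\mu'_{ij}}{(\mu'_{00})^{\gamma}}
            = \frac{\lambda^{\,i+j+2}\,\mu_{ij}}{\lambda^{2\gamma}\,\mu_{00}^{\gamma}}
            = \eta_{ij}
  \quad\text{when } \gamma = \frac{i+j}{2} + 1 .
\end{aligned}
```

The factor $\lambda^{i+j}$ comes from the scaled coordinates and the extra $\lambda^{2}$ from the scaled area element, which is exactly what the choice of $\gamma$ cancels.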
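In this work, we extract the Hu invariant moments [18] using Eqs. (18) to (25) and use them for the gesture recognition algorithm. The Hu invariant moments consist of 2nd and 3rd order normalized central moments, and are as follows:
\[ I_{1} = \eta_{20} + \eta_{02} \tag{19} \]
\[ I_{2} = (\eta_{20} - \eta_{02})^{2} + 4\eta_{11}^{2} \tag{20} \]
\[ I_{3} = (\eta_{30} - 3\eta_{12})^{2} + (3\eta_{21} - \eta_{03})^{2} \tag{21} \]
\[ I_{4} = (\eta_{30} + \eta_{12})^{2} + (\eta_{21} + \eta_{03})^{2} \tag{22} \]
\[ I_{5} = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^{2} - 3(\eta_{21} + \eta_{03})^{2}\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^{2} - (\eta_{21} + \eta_{03})^{2}\right] \tag{23} \]
\[ I_{6} = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^{2} - (\eta_{21} + \eta_{03})^{2}\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}) \tag{24} \]
\[ I_{7} = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^{2} - 3(\eta_{21} + \eta_{03})^{2}\right] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^{2} - (\eta_{21} + \eta_{03})^{2}\right] \tag{25} \]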
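The invariants can be evaluated directly from the normalized central moments. The sketch below uses the standard form of Hu's seven invariants and is meant as an illustration only, not the authors' code: the image is assumed to be a nested list with x as the column index, and the function names are hypothetical.

```python
def central_moment(image, i, j):
    """mu_ij of Eq. (8) for a nested-list image (x = column, y = row)."""
    m00 = sum(sum(row) for row in image)
    cx = sum(v * x for row in image for x, v in enumerate(row)) / m00
    cy = sum(v * y for y, row in enumerate(image) for v in row) / m00
    return sum(v * (x - cx) ** i * (y - cy) ** j
               for y, row in enumerate(image) for x, v in enumerate(row))


def normalized_moment(image, i, j):
    """eta_ij of Eq. (18) with gamma = (i + j) / 2 + 1."""
    gamma = (i + j) / 2 + 1
    return central_moment(image, i, j) / central_moment(image, 0, 0) ** gamma


def hu_moments(image):
    """Return [I1, ..., I7] in the standard form of Hu's invariants."""
    n20, n02, n11 = (normalized_moment(image, *ij) for ij in [(2, 0), (0, 2), (1, 1)])
    n30, n03, n21, n12 = (normalized_moment(image, *ij)
                          for ij in [(3, 0), (0, 3), (2, 1), (1, 2)])
    a = n30 + n12  # recurring sum eta30 + eta12
    b = n21 + n03  # recurring sum eta21 + eta03
    i1 = n20 + n02
    i2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    i3 = (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2
    i4 = a ** 2 + b ** 2
    i5 = ((n30 - 3 * n12) * a * (a ** 2 - 3 * b ** 2)
          + (3 * n21 - n03) * b * (3 * a ** 2 - b ** 2))
    i6 = (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b
    i7 = ((3 * n21 - n03) * a * (a ** 2 - 3 * b ** 2)
          - (n30 - 3 * n12) * b * (3 * a ** 2 - b ** 2))
    return [i1, i2, i3, i4, i5, i6, i7]
```

Rotating a small test image by 90 degrees or padding it with empty rows and columns should leave all seven values unchanged up to rounding, while mirroring it flips only the sign of the seventh, which is one way to tell a left hand shadow from a right one.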
Our analysis of the Hu invariant moments defined in the above equations is as follows:
$I_{1}$: The sum of the dispersion of the horizontal and vertical directions. The more the values are spread out along the horizontal and vertical directions, the greater the value is.
$I_{2}$: The covariance of the horizontal and vertical directions (if dispersions of the horizontal and vertical directions are similar).