Recognizing Postures in Vietnamese Sign Language With MEMS Accelerometers

The Duy Bui and Long Thang Nguyen, Member, IEEE
Abstract—In this paper, we discuss the application of microelectronic mechanical system (MEMS) accelerometers for recognizing postures in Vietnamese Sign Language (VSL). We develop a device similar to the Accele Glove [6] for the recognition of VSL. In addition to the five sensors of the Accele Glove, we place one more sensor on the back of the hand to improve the recognition process. We also use a completely different method for the classification process, leading to very promising results. This paper concentrates on signing with postures, in which the user spells each word with finger signs corresponding to the letters of the alphabet. Therefore, we focus on the recognition of postures that represent the 23 Vietnamese-based letters together with two postures for "space" and "punctuation." The data obtained from the sensing device are transformed into relative angles between the fingers and the palm. Each character is recognized by a fuzzy rule-based classification system, which allows the concept of vagueness in recognition. In addition, a set of Vietnamese spelling rules has been applied to improve the classification results. The recognition rate is high even when the postures are not performed perfectly, e.g., when a finger is not bent completely or the palm is not straight.
Index Terms—Human computer interaction, microelectronic
mechanical system (MEMS) sensors, sign language recognition,
Vietnamese sign language (VSL).
I. INTRODUCTION

GESTURE recognition has been a research area that has received much attention from many research communities, such as human computer interaction and image processing. Gesture recognition has contributed significantly to the improvement of interaction between humans and computers. Another application of gesture recognition is sign language translation. Among many types of gestures, sign languages seem to be the most structured ones. Each gesture in a sign language is usually associated with a predefined meaning. Moreover, the application of strong rules of context and grammar makes sign language easier to recognize [13]. There are two main approaches to sign language recognition. The first is vision-based, which uses color cameras to track the hand and understand sign language. The second uses expensive
Manuscript received June 15, 2006; revised August 28, 2006; accepted August 29, 2006. The associate editor coordinating the review of this paper and approving it for publication was Dr. Subhas Mukhopadhyay.
T. D. Bui is with the Faculty of Information Technology, College of Technology, Vietnam National University, Hanoi, 144 Xuan Thuy, Hanoi, Vietnam (e-mail: duybt@vnu.edu.vn).
T. L. Nguyen is with the Faculty of Electrical Engineering and Telecommunication, College of Technology, Vietnam National University, Hanoi, 144 Xuan Thuy, Hanoi, Vietnam (e-mail: longnt@vnu.edu.vn).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSEN.2007.894132
sensing gloves to extract parameters, such as joint angles, that describe the shape and position of the hand.
With the vision-based approach, Uras and Verri [17] obtained some success when trying to recognize shapes using the "size function" concept on a Sun Sparc Station. A recognition rate of 93% for the most recognizable letter and of 70% for the most difficult case was obtained by Lamart and Bhuiyant [9] with the use of colored gloves and neural networks. Starner and Pentland tracked hands with a color camera and then used hidden Markov models to interpret American Sign Language (ASL) [13]. The hands are tracked by their color. In this approach, instead of attempting a fine description of hand shape during the hand tracking stage, only a coarse description of hand shape, orientation, and trajectory is produced. This information is used as the input for the hidden Markov models to understand ASL. This approach was developed further by Starner et al. [14], with 98% accuracy. Clearly, they abandoned the idea of recognizing hand postures.
In sign languages, many signs may look similar to each other. For example, in the ASL alphabet, the letters "A," "M," "N," "S," and "T" are signed with a closed fist (see [1]). At first sight, the postures for these five letters appear to be the same. A vision-based system would encounter difficulties in recognizing these postures. One approach to overcome these difficulties is to use sensing gloves. In the literature, there are a number of works on gesture recognition based on sensing gloves. For example, in order to enter ASCII characters into a computer, Grimes [4] developed the Data Entry Glove using switches and other sensors sewn onto the glove. Kramer and Leifer [8] used a lookup table with their patented CyberGlove to recognize the 26 letters of the alphabet. Alternatively, Erenshteyn et al. [3] used a method involving coded output, such as Hamming, Golay, and other hybrid codes, together with the CyberGlove. Zimmerman invented the VPL Data Glove [23] in order to recognize postures in different sign languages. For example, a set of 51 basic postures of Taiwanese Sign Language was handled by Liang and Ouhyoung [11] with probability models, and 36 ASL postures were recognized with this glove in the work of Waldron and Kim [18] with a two-stage neural network. The gloves mentioned above, however, are very expensive. A more affordable option was proposed by Kadous [7]: a system for Australian Sign Language based on Mattel's Power Glove. However, because of a lack of sensors on the pinky finger, the glove could not be used to recognize the alphabet hand shapes. With accelerometers at the fingertips, Perng et al. [12] developed a text editor in which each hand gesture refers to a letter of the alphabet. For more detailed reviews of gesture recognition with sensing gloves, see [15] and [21].
1530-437X/$25.00 © 2007 IEEE
The Accele Glove [6] can be used to manipulate three different virtual objects: a virtual hand, icons on a virtual desktop, and a virtual keyboard using the 26 postures of the ASL alphabet. When this device is used as a finger spelling translator, a multiclass pattern recognition algorithm is applied [5]. First, the data are collected and analyzed "offline" on a PC. The obtained data are transformed into vectors in the posture space and then divided into subclasses. This way, it is possible to apply simple linear discrimination of the postures in 2-D space, and Bayes' rule in those cases where classes have features with overlapping distributions. This algorithm can be implemented as a sequence of "if-then-else" statements in the microcontroller, allowing real-time processing. The application of this device has much potential and can be developed further into a more comprehensive system and for other sign languages.
In this paper, we discuss the application of MEMS accelerometers for recognizing postures in Vietnamese Sign Language (VSL). We develop a device similar to the Accele Glove [6] for the recognition of VSL. In addition to the five sensors of the Accele Glove [6], we place one more sensor on the back of the hand to improve the recognition process. In addition, we use a completely different method for the classification process, leading to very promising results.
This paper concentrates on signing with postures, in which the user spells each word with finger signs corresponding to the letters of the alphabet. Therefore, we focus on the recognition of postures that represent the 23 Vietnamese-based letters together with two postures for "space" and "punctuation." The data obtained from the sensing device are transformed into relative angles between the fingers and the palm. Characters are recognized by a fuzzy rule-based classification system, which allows the concept of vagueness in recognition. In addition, a set of Vietnamese spelling rules has been applied to improve the classification results. The recognition rate is high even when the postures are not performed perfectly, e.g., when a finger is not bent completely or the palm is not straight.
II. VIETNAMESE ALPHABET SYSTEM

Vietnamese was originally written with a Chinese-like script. During the 17th century, a Latin-based orthography for Vietnamese was introduced by Roman Catholic missionaries. Until the early 20th century, both orthographies were used in parallel. Today, the Latin-based script is the only orthography used in Vietnam. The Latin-based Vietnamese alphabet is listed below:

A Ă Â B C D Đ E Ê G H I K L M N O Ô Ơ P Q R S T U Ư V X Y
Fig. 1. Alphabet system in VSL.
The letters J, W, and Z are also used, but only in foreign loan words. In addition, Vietnamese is a tonal language with six tones. These tones are marked as follows: level, high rising, low (falling), dipping rising, high rising glottalized, and low glottalized.
Since the Vietnamese alphabet system is more complicated than the English alphabet system, more signs are required for VSL in comparison with ASL. However, it is possible to implement finger spelling of Vietnamese words similar to the ASL system. In principle, VSL is based on the well-established ASL. According to the ASL dictionary [1], four components are used to describe a sign: hand shape, location in relation to the body, movement of the hands, and orientation of the palms. A popular concept in sign language, the "posture," is formed by the hand shape (position of the fingers with respect to the palm), the static component of the sign, and the orientation of the palm. The alphabet in ASL, which consists of 26 unique distinguishable postures, is used to spell names or uncommon words that are not well defined in the dictionary.
VSL consists of 23 base letters and some additional signs for the accents and the tones. The 23 base letters are:

A B C D Đ E G H I K L M N O P Q R S T U V X Y

In this paper, we concentrate on the recognition of postures for these base letters. These postures are shown in Fig. 1.
III. THE SENSING DEVICE

One of the most successful MEMS sensors on the market is the ADXL202 accelerometer from Analog Devices, Inc. (www.analog.com). The ADXL202 is a low-cost, low-power, complete two-axis accelerometer on a single IC chip with a measurement range of ±2 g. The ADXL202 can measure both dynamic acceleration (e.g., vibration) and static acceleration (e.g., gravity). The accelerometer is fabricated by surface micromachining technology. It is composed of a small mass suspended by springs. Capacitive sensors distributed along two orthogonal axes (X and Y) provide a measurement proportional to the displacement of the mass with respect to its rest position. Because the mass is displaced from the center, either due to acceleration or due to an inclination with respect to the gravitational vector g, the sensor can be used to measure absolute angular position. The outputs are digital signals whose duty cycles (ratio of pulsewidth to period) are proportional to the acceleration in each of the two sensitive axes. The output period is adjustable from 0.5 to 10 ms via a single resistor. If a voltage output is desired, a voltage output proportional to acceleration is available from the XFILT and YFILT pins, or may be reconstructed by filtering the duty cycle outputs. The bandwidth of the ADXL202 may be set from 0.01 Hz to 5 kHz via capacitors CX and CY. The typical noise floor is 500 μg/√Hz, allowing signals below 5 mg to be resolved for bandwidths below 60 Hz. The function block diagram of the ADXL202 is shown in Fig. 2.

Fig. 2. Function block diagram of the ADXL202 (from Analog Devices, Inc.).

Fig. 3. Sensing glove with six accelerometers and a BASIC Stamp microcontroller.
Our sensing device, which is shown in Fig. 3, consists of six ADXL202 accelerometers attached to a glove: five on the fingers and one on the back of the palm. The Y axis of the sensor on each finger points toward the fingertip, providing a measure of joint flexion (see Fig. 4). The Y axis of the sensor on the back of the palm measures the flexing angle of the palm. The X axis of the sensor on the back of the palm can be used to extract information about hand roll, while the X axis of the sensor on each finger can provide information about individual finger abduction. Data are collected by measuring the duty cycle of a train of pulses at 1 kHz. When a sensor is in its horizontal position, the duty cycle is 50%. When it is tilted from -90° to +90°, the duty cycle varies from 37.5% (0.375 ms) to 62.5% (0.625 ms), respectively (see Fig. 5). In our device, the duty cycle is measured using a BASIC Stamp microcontroller. The Parallax BASIC Stamp module is a small, low-cost, general-purpose I/O computer that is programmed in a simple form of BASIC (from Parallax, Inc., www.parallax.com). The pulsewidth-modulated output of the ADXL202 can be read directly by the BASIC Stamp module, so no ADC is necessary. Twelve pulsewidths are read sequentially by the microcontroller, beginning with the X axis followed by the Y axis, thumb first. The data are then sent through the serial port to a PC for further analysis.

Fig. 4. The X axis and Y axis of the sensor on the finger and of the sensor on the back of the palm.

Fig. 5. Dependence of accelerometer output on tilt angle.
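The duty-cycle-to-angle relation can be sketched in C++ (the language the system is implemented in, per Section VI). A static sensor measures the gravity component g·sin θ along its axis, so the duty cycle is linear in sin θ, not in θ itself, and the angle is recovered with an arcsine; the function name and the clamping step are ours:

```cpp
#include <algorithm>
#include <cmath>

// Recover a tilt angle (degrees) from an ADXL202 duty-cycle reading
// (as a fraction, e.g. 0.50 for 50%). At 0 deg tilt the duty cycle is
// 50%; a tilt of -90..+90 deg sweeps it from 37.5% to 62.5%.
double dutyCycleToTiltDeg(double duty) {
    const double kPi = std::acos(-1.0);
    // A 12.5% duty-cycle swing corresponds to the full +/-1 g range.
    double sinTheta = (duty - 0.50) / 0.125;
    // Clamp against sensor noise pushing the value outside [-1, 1].
    sinTheta = std::max(-1.0, std::min(1.0, sinTheta));
    return std::asin(sinTheta) * 180.0 / kPi;
}
```

A 56.25% duty cycle, for example, corresponds to sin θ = 0.5, i.e., a 30° tilt.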
IV. DATA PROCESSING

Our sensing glove produces raw data represented as a vector of 12 measurements: two axes per finger and the last two axes for the palm.
First, we convert the data to angles. We then subtract the X and Y values of the fingers from the X and Y values of the palm, respectively. Note that our sensing device has a sensor on the back of the palm, which measures the rolling and flexing angles of the palm. By processing the data this way, we convert the raw data into relative angles between the fingers and the palm. We perform the classification based on the X and Y values of the palm and the relative angles between the fingers and the palm.

Fig. 6. An overview of the classification system.
Our approach differs from the approach proposed in [6], which recognizes the postures by extracting features directly from the raw data. There are two reasons for this. The first reason is that the raw data are pulsewidths, which relate to the rolling or flexing angles through cosine functions. Since the cosine function itself is not linear, the sum of pulsewidths measured on the fingers does not represent the hand shape accurately. The second reason is that the sum is not a good function for extracting the hand-shape feature: many different hand shapes can result in the same feature extracted by the sum function.
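The preprocessing in Section IV can be sketched as follows; the type and function names are ours, chosen for illustration:

```cpp
#include <array>

// One tilt-angle pair (degrees) per sensor: X ~ roll/abduction, Y ~ flexion.
struct Tilt { double x, y; };

// Indices 0..4 are thumb..pinky; index 5 is the sensor on the back of the palm.
using GloveAngles = std::array<Tilt, 6>;

// Express each finger's angles relative to the palm, so that the same
// posture yields similar features regardless of overall hand orientation.
GloveAngles toRelativeAngles(const GloveAngles& absAngles) {
    GloveAngles rel = absAngles;
    const Tilt& palm = absAngles[5];
    for (int i = 0; i < 5; ++i) {
        rel[i].x = absAngles[i].x - palm.x;  // relative abduction/roll
        rel[i].y = absAngles[i].y - palm.y;  // relative flexion
    }
    // The palm entry (index 5) stays absolute: it drives the
    // Vertical/Sloping/Horizontal split before fuzzy classification.
    return rel;
}
```

Working in relative angles is what lets an imperfectly oriented hand (e.g., a palm that is not straight) still map to the intended posture.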
V. CLASSIFICATION

The classification system is shown in Fig. 6. First of all, we use the measured value of the palm to divide the postures into three categories: "Vertical," which consists of the postures of the letters "A," "B," "E," "H," "I," "K," "L," "O," "R," "T," "U," "V," and "Y"; "Sloping," which consists of the postures of the letters "C," "D," "Đ," and "S"; and "Horizontal," which consists of the postures of the letters "G," "M," "N," "P," "Q," and "X." After the postures are divided into three categories, we use three fuzzy rule-based systems to perform further classification.
Human beings often need to deal with input that is not in precise or numerical form. Inspired by this observation, Zadeh [22] developed fuzzy set theory, which allows concepts that do not have well-defined sharp boundaries. In contrast to classical set theory, in which an object can only be a full member or a full nonmember of a set, an object in fuzzy set theory can possess partial membership of a fuzzy set. A fuzzy proposition of the form "x is A" is partially satisfied if the object (usually a crisp value x) has partial membership of the fuzzy set A. Based on that, fuzzy logic was developed to deal with fuzzy "if-then" rules, where the "if" condition of a rule is a Boolean combination of fuzzy propositions. When the "if" condition is partially satisfied, the conclusion of a fuzzy rule is drawn based on the degree to which the condition is satisfied.
We have found that the concept of a fuzzy set is well suited to the problem of posture classification, because a posture is normally defined in a vague way, e.g., "the index finger bends a little bit." Moreover, with a fuzzy rule-based system, the classification can be solved by a set of rules in natural language, which look like:

if all fingers bend maximally then it is the posture of letter "A"
if all fingers do not bend then it is the posture of letter "B"

The fuzzy rule-based system allows immediate classification at a high recognition rate without having to collect training samples. Moreover, a new rule can be added easily for a new posture without changing the existing rules. We would miss out on these advantages when using other models such as neural networks and hidden Markov models.
We model the level of bending or flexing of the fingers by five fuzzy sets (Fig. 7): Very-Low, Low, Medium, High, and Very-High. The fuzzy classification rules look like:

if the thumb's bending is Low and the index finger's bending is Very-Low and the middle finger's bending is Very-Low and the ring finger's bending is Very-Low and the pinky finger's bending is Very-Low then the posture is recognized as letter "B"

We have created 22 fuzzy rules to classify the VSL postures. The posture of letter "G" is recognized directly with the use of the measured value of the palm. With these fuzzy rules, the classification process proceeds as follows. Every time we receive data from the sensing device, we first verify whether the hand is in a static position by comparing with the previous data. We wait until the hand stops moving to start the recognition process. The preprocessed data are used to calculate the "membership values," i.e., the degrees to which the data belong to the fuzzy sets. We then calculate the degree to which the current data set matches each of the 22 fuzzy rules. The matching degree is calculated as the product of the membership values of the fuzzy sets appearing in the rule. Finally, the data set is recognized by the rule with the highest matching degree.
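The matching step above can be sketched as follows. The paper does not specify the membership-function shapes or set centers, so the triangular memberships and the numeric centers below are illustrative assumptions; only the product-then-argmax structure comes from the text:

```cpp
#include <array>
#include <cmath>
#include <string>
#include <vector>

// Triangular membership function peaking at `center`, reaching zero at
// center +/- halfWidth (an assumed shape; the paper does not give one).
double triMembership(double x, double center, double halfWidth) {
    double d = std::fabs(x - center) / halfWidth;
    return d >= 1.0 ? 0.0 : 1.0 - d;
}

// Five fuzzy sets over finger bending (degrees), indexed
// 0..4 = Very-Low, Low, Medium, High, Very-High. Centers are illustrative.
double bendMembership(int setIdx, double bendDeg) {
    static const double centers[5] = {0.0, 22.5, 45.0, 67.5, 90.0};
    return triMembership(bendDeg, centers[setIdx], 22.5);
}

struct Rule {
    std::string letter;
    std::array<int, 5> sets;  // required fuzzy set per finger, thumb..pinky
};

// Matching degree of one rule: product of the per-finger membership values.
double matchDegree(const Rule& r, const std::array<double, 5>& bendDeg) {
    double m = 1.0;
    for (int i = 0; i < 5; ++i) m *= bendMembership(r.sets[i], bendDeg[i]);
    return m;
}

// The data set is recognized by the rule with the highest matching degree.
std::string classify(const std::vector<Rule>& rules,
                     const std::array<double, 5>& bendDeg) {
    std::string best = "?";
    double bestM = 0.0;
    for (const Rule& r : rules) {
        double m = matchDegree(r, bendDeg);
        if (m > bestM) { bestM = m; best = r.letter; }
    }
    return best;
}
```

Because memberships are graded rather than binary, a finger that is "almost" fully bent still contributes a high membership value, which is what tolerates imperfectly performed postures.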
The recognition process is enhanced by the use of Vietnamese spelling rules. Vietnamese has a very special characteristic: all words are monosyllabic. Moreover, Vietnamese spelling rules are very strict. The combination of consonants and vowels must follow a set of predefined rules. In most cases, a consonant cannot be followed by another consonant. Taking advantage of these rules, when recognizing words formed by postures of letters, we can eliminate many misclassifications.
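A minimal sketch of such a filter, restricted to the unaccented ASCII letters for brevity: a consonant may be followed by another consonant only in a small set of legal onset digraphs. The digraph list below is a partial illustration, not a complete grammar of Vietnamese orthography, and the function names are ours:

```cpp
#include <set>
#include <string>

bool isVowel(char c) {
    return std::string("AEIOUY").find(c) != std::string::npos;
}

// Reject a candidate letter that would create an implausible
// consonant-consonant pair in the current syllable.
bool plausibleNextLetter(const std::string& syllableSoFar, char candidate) {
    if (syllableSoFar.empty()) return true;
    char last = syllableSoFar.back();
    if (isVowel(last) || isVowel(candidate)) return true;
    // A few legal consonant digraphs (partial, illustrative list).
    static const std::set<std::string> digraphs =
        {"CH", "GH", "KH", "NG", "NH", "PH", "TH", "TR"};
    return digraphs.count(std::string(1, last) + candidate) > 0;
}
```

In the full system, a check of this kind can demote a rule-matching candidate (e.g., "V" after another consonant) in favor of the next-best matching letter.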
VI. RESULTS

The system was implemented in C++ and was tested using a total of 200 samples for each letter to measure the recognition rate. To collect the samples, we asked five different persons to perform the posture of each letter 40 times.
Twenty out of the 23 letters reached a 100% recognition rate. This is a very good recognition rate compared with that of vision-based approaches. This is because, in front of a single camera, many postures look very similar. This is not an issue with glove-based approaches, because the joint flexion and finger abduction of each individual finger can be measured. When compared with other glove-based approaches using expensive and comprehensive gloves, our recognition rate is also competitive.
In our approach, the problem arises from the three letters "R," "U," and "V." They produce ambiguity, as the data representing these letters are similar. Their recognition rates are 90%, 79%, and 93%, respectively. The current two-axis MEMS accelerometer cannot reliably detect the differences among the postures of these three letters. Vision-based approaches or other glove-based approaches might not suffer from this problem.
We have improved the situation by applying Vietnamese spelling rules to word spelling in our system. After this, the recognition rates for the three letters "R," "U," and "V" increased significantly, to 94%, 90%, and 96%, respectively.
The novelty of our system is that the postures can be recognized even when they are not performed perfectly, e.g., when a finger is not bent completely or the palm is not straight. This is because we carry out the classification on the relative angles between the fingers and the palm instead of on the raw data, as in [6]. This is also a result of the fuzzy rule-based system, which allows the concept of vagueness in recognition.
VII. CONCLUSION

In this paper, we presented work on understanding VSL through the use of MEMS accelerometers. The system consists of six ADXL202 accelerometers for sensing the hand posture, a BASIC Stamp microcontroller, and a PC for data acquisition and recognition of sign language. The classification is done by a fuzzy rule-based system on the preprocessed data. In addition, we have applied a set of Vietnamese spelling rules in order to improve the classification results. We have achieved very high recognition rates. Moreover, the postures can be recognized even when they are not performed perfectly.
One advantage of glove-based approaches is the potential for mobility. The glove can be used independently with an embedded processor or by connecting wirelessly to mobile devices such as mobile phones or PDAs. This requires future work on wireless communication for the system, of which the problem of energy consumption is the most challenging one. It also requires us to port the software part of the system to mobile devices. In the future, we also want to place more sensors of different types, such as three-axis MEMS accelerometers and strain gauges, into our sensing device in order to recognize more complex forms of VSL, as well as to recognize gestures for other human computer interaction applications.
The approach presented in this paper can easily be applied to other sign languages, mainly by modifying the rules in our fuzzy rule-based system. There are also other potential applications for our sensing glove: a wireless wearable mouse pointing device, a wireless wearable keyboard, hand motion and gesture recognition tools, virtual musical instruments, computer sporting games, and work training in a simulated environment.
REFERENCES

[1] N. Chaimanonart and D. J. Young, "Remote RF powering system for wireless MEMS strain sensors," IEEE Sensors J., vol. 6, no. 2, pp. 484–489, Apr. 2006.
[2] E. Costello, Random House Webster's Concise American Sign Language Dictionary. New York: Random House, 1999.
[3] R. Erenshteyn, D. Saxe, P. Laskov, and R. Foulds, "Distributed output encoding for multi-class pattern recognition," in Proc. Int. Conf. Image Anal. Process., 1999, pp. 229–234.
[4] G. Grimes, "Digital data entry glove interface device," U.S. Patent 4 414 537, 1983.
[5] J. Hernandez, N. Kyriakopoulos, and R. Lindeman, "The AcceleGlove: A whole hand input for virtual reality," in Proc. SIGGRAPH 2002, San Antonio, TX, 2002, p. 259.
[6] J. Hernandez, R. Lindeman, and N. Kyriakopoulos, "A multi-class pattern recognition system for practical finger spelling translation," in Proc. 4th IEEE Int. Conf. Multimodal Interfaces, Pittsburgh, PA, 2002, pp. 185–190.
[7] M. W. Kadous, "GRASP: Recognition of Australian Sign Language using instrumented gloves," M.S. thesis, Univ. New South Wales, Sydney, Australia, 1995.
[8] J. Kramer and L. Leifer, "The talking glove: An expressive and receptive 'verbal' communication aid for the deaf, deaf-blind, and nonvocal," in Proc. SIGCAPH 39, 1988, pp. 12–15.
[9] M. V. Lamart and M. S. Bhuiyant, "Hand alphabet recognition using morphological PCA and neural networks," in Proc. Int. Joint Conf. Neural Netw., Washington, DC, 1999, vol. 4, pp. 2839–2844.
[10] S. Lei, C. A. Zorman, and S. L. Garverick, "An oversampled capacitance-to-voltage converter IC with application to time-domain characterization of MEMS resonators," IEEE Sensors J., vol. 5, no. 6, pp. 1353–1361, Dec. 2005.
[11] R. Liang and M. Ouhyoung, "A real-time continuous gesture recognition system for sign language," in Proc. 3rd IEEE Int. Conf. Autom. Face Gesture Recogn., 1998, pp. 558–567.
[12] J. Perng, B. Fisher, S. Hollar, and K. S. J. Pister, "Acceleration sensing glove (ASG)," in Proc. ISWC Int. Symp. Wearable Comput., San Francisco, CA, 1999, pp. 178–180.
[13] T. Starner and A. Pentland, "Real-time American Sign Language recognition from video using hidden Markov models," MIT Media Lab, Perceptual Computing Group, Cambridge, MA, Tech. Rep. 375, 1995.
[14] T. Starner, J. Weaver, and A. Pentland, "A wearable computer based American Sign Language recognizer," MIT Media Lab, Cambridge, MA, Tech. Rep. 425, 1998.
400 °C MEMS sensing and data telemetry," IEEE Sensors J., vol. 5, no. 6, pp. 1389–1394, 2005.
[20] Y. Wang, X. Li, T. Li, H. Yang, and J. Jiao, "Nanofabrication based on MEMS technology," IEEE Sensors J., vol. 6, no. 3, pp. 686–690, Jun. 2006.
[21] R. Watson, "A survey of gesture recognition techniques," Dept. Comput. Sci., Trinity College, Dublin, Ireland, Tech. Rep. TCD-CS-93-11, 1993.
[22] L. A. Zadeh, "Fuzzy sets," Inf. Control, vol. 8, pp. 338–353, 1965.
[23] T. Zimmerman, "Optical flex sensor," U.S. Patent 4 542 291, 1987.
Long Thang Nguyen (S'03–M'04) received the M.S. degree from the International Institute of Materials Science, Hanoi University of Technology, Hanoi, Vietnam, in 1998, and the Doctor of Engineering degree from the University of Twente, Enschede, The Netherlands, in 2004.
He has worked as a Lecturer with the Faculty of Electronics and Telecommunications, College of Technology, Hanoi National University, since 2004. His main activities are related to the design and application of MEMS sensors. He has been involved in several projects, such as the design of a patient monitoring system and the integration of inertial MEMS sensors and GPS for navigation.