Contents lists available at ScienceDirect
Robotics and Autonomous Systems
journal homepage: www.elsevier.com/locate/robot
Planning of goal-oriented motion from stochastic motion primitives
and optimal controlling of joint torques in whole-body
Wataru Takano * , Yoshihiko Nakamura
Mechano-Informatics, University of Tokyo, 7-3-1, Hongo, Bunkyoku, Tokyo, 113-8656, Japan
Highlights

• This paper proposes the synthesis of whole-body motions from stochastic motion primitives.
• The joint torques are computed while preserving the profile of the synthesized motion and controlling the reaction forces.
• Simulation demonstrates the validity of the motion synthesis and force control.
Article history:
Received 15 September 2016
Received in revised form 1 January 2017
Accepted 28 January 2017
Available online 4 February 2017
Keywords:
Motion primitive
Motion synthesis
Stochastic model
Abstract

Humanoid robots are expected to be integrated into daily life. This requires the robots to perform human-like actions that are easily understandable by humans. Learning by imitation is an effective framework that enables the robots to generate the same motions that humans do. However, it is generally not useful for the robots to generate motions that are precisely the same as learned motions, because the environment is likely to be different from the environment where the motions were learned. The humanoid robot should synthesize motions that are adaptive to the current environment by modifying learned motions. Previous research encoded captured human whole-body motions into hidden Markov models, which are hereafter referred to as motion primitives, and generated human-like motions based on the acquired motion primitives. The contact between the body and the environment also needs to be controlled, so that the humanoid robot's whole-body motion can be realized in its current environment. This paper proposes a novel approach to synthesizing kinematic data using the motion primitive and controlling the torques of all the joints in the humanoid robot to achieve the desired whole-body motions and contact forces. The experiments demonstrate the validity of the proposed approach to synthesizing and controlling whole-body motions by humanoid robots.

© 2017 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
1. Introduction

Humanoid robots need to understand human actions and communicate with humans using human-like gestures to be integrated into daily life. Approaches to representing human motions have been developed as a typical framework for imitation learning [1–3]. Representations of motion can be grouped into two categories. One approach encodes motions into a set of parameters of a dynamical system, such as systems of differential equations or neural networks [4–7]. The other approach uses stochastic models, typified by hidden Markov models (HMMs) [8–11]. The dynamical systems or stochastic models represent motions in abstract forms as their parameters. These abstract representations are called motion primitives. Robots learn motions through observation or
* Corresponding author. Fax: +81 3 3818 0835.
E-mail addresses: takano@ynl.t.u-tokyo.ac.jp (W. Takano), nakamura@ynl.t.u-tokyo.ac.jp (Y. Nakamura).
demonstration as motion primitives, and subsequently use the motion primitives for motion recognition and motion generation. One popular approach is the Dynamic Movement Primitive (DMP), which encodes a demonstration into the parameters of a differential equation. Petric et al. created a virtual force field that attracts a trajectory to a target one [12]. This approach generates a trajectory that satisfies a desired motion while preserving the natural postures of the training motions. Wang et al. also extended the DMP to modify the training trajectories while preserving their properties, creating a new trajectory based on kernel functions [13]. These approaches improve the reusability of the DMP, but do not handle the contact force. Gams et al. developed a method to control force using the DMP [14]. The DMP generates the kinematic motion in a feedforward manner, and the contact was controlled to maintain a predefined force in a feedback manner. This concept of synthesizing a trajectory in a feedforward manner and controlling a reaction force from the environment in a feedback manner is very similar to that of this paper.
http://dx.doi.org/10.1016/j.robot.2017.01.013
0921-8890/© 2017 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Another approach uses stochastic models. Research based on stochastic models has focused on developing efficient algorithms for motion recognition, and these have been deployed in intelligent humanoid robots that can understand human actions [15–17]. In addition to motion recognition, several methods have been proposed to generate motions by using motion primitives [8,18–21]. Inamura et al. developed a sampling-based method to generate a sequence of joint angles of the entire robot body from HMMs. In the first step, a node is sampled according to the probabilities of node transitions in the HMM, and a sequence of nodes is created. In the second step, vectors of joint angles are sampled according to the distribution of joint angles at each node in the node sequence derived in the first step. These two steps result in a motion trajectory, represented by a sequence of joint angles [8]. Calinon et al. proposed an approach to encoding joint angles and joint angular velocities into HMMs. These HMMs can calculate the distribution of the joint angular velocities given the joint angles at the previous frame, sample the joint angular velocities based on the conditional distribution, and update the joint angles at the current frame [21]. This framework includes a feedback loop and is adaptive to the current state. Because the synthesized motions are likely to become the average of the training motions, they cannot be applied to environments considerably different from those where the motions were originally learned. Moreover, these methods do not handle targeted motions, and as a consequence, the synthesized motion does not necessarily achieve a specific task. For example, if a robot learns a motion for grasping an object on a desk, then the robot can reach an object located at the same position by the learned motion primitive. However, the robot is not able to reach an object located at a position on a shelf different from the learned position on the desk.
Robots should be able to modify the learned motions and perform flexible motions to adapt to their environment. Herzog et al. proposed an approach to recognizing and synthesizing motions of manipulating objects at different locations using a parametric hidden Markov model (PHMM) [18]. PHMMs represent typical motions and compensate for different motions of a particular type. Multiple motions are encoded into an HMM as a representative motion, and each motion variant is encoded into its own HMM. The parameters of the HMM of each motion variant are represented as a linear function of the parameters of the representative HMM and the position of the object. Consequently, individual motions are parameterized by the position of the object. Calinon et al. developed a method for synthesizing motions by using descriptors of both the motion and the manipulated object [22]. These approaches do not control the contact forces that are important for motions interacting with environments or for manipulation motions.
As described above, several studies have tackled motion generation from motion primitives and extended the generation to adapt to the environment for achieving reaching tasks. However, these approaches do not take into account the dynamics, more specifically, the contact force between the body and the environment or the equation of motion for the robot body. The motion of the robot is created by the joint torques and the interaction forces from the environment. Kinematics-based motion generation needs to be coupled with force control such that the robot can perform kinematically and dynamically suitable motions. This paper proposes a novel approach to synthesizing whole-body motion that satisfies configurations for goal or subgoal postures from HMMs of motion primitives, and to subsequently controlling the torques of all the joints in a humanoid robot's body such that the robot can perform human-like motions and maintain the desired contact with the environment.
2. Motion synthesizer and torque controller

We propose a novel approach to synthesizing motions by using the motion primitives and to subsequently controlling the joint torques in the humanoid robot's whole body, as shown in Fig. 1.
Fig. 1. A whole-body motion is encoded into an HMM that is referred to as a motion primitive. The motion synthesizer generates a sequence of joint angles that not only maintains the profile of motions learned as the motion primitive but also reaches the goal postures. The force controller computes the joint torques such that the humanoid robot can perform the synthesized motion and the reaction forces can be controlled as desired.
2.1. Representation of motion primitives

A configuration is defined as a complete specification of the pose of a human performer or a humanoid robot. We choose the angles of all the joints in the whole body as the configuration in this study. A whole-body motion is represented by a vector of the joint angles x_t at time t. A motion segment is expressed by a motion trajectory {x_1, x_2, ..., x_l}. The motion segment is encoded into a set of parameters of an HMM; this HMM is hereinafter referred to as a ''motion primitive''. An HMM is a stochastic model that is defined by the compact notation λ = {Π, A, B}, where Π = {π_1, π_2, ..., π_m} is the set of initial node probabilities, A = {a_ij} is the matrix whose element a_ij is the probability of transition from the ith node to the jth node, and B is the set of output probabilities at the nodes. Encoding of the motion segment optimizes these parameters of the HMM such that the likelihood of the motion segment being generated by the HMM is maximized. The HMM can be utilized for motion recognition and motion generation. Motion recognition finds the HMM, from among a set of HMMs, that is the most likely to generate the observed motion. Motion generation outputs a sequence of joint angles according to the distribution of the motion trajectories learned by the HMM. Thus, a humanoid robot learns motion segments in the form of motion primitives, recognizes observations, and generates its own motions by using the motion primitives.
2.2. Synthesis of joint angle sequence

This subsection describes a method to synthesize a sequence of joint angles from a motion primitive in a way that enables a humanoid robot not only to perform human-like motions but also to satisfy the (sub)goal postures.

This method introduces an objective function that consists of three functions. The first function calculates the likelihood that the HMM generates a motion trajectory. The second function calculates the distance between the motion trajectory and the goal postures. The third function calculates the smoothness of the motion trajectory. The motion synthesis finds the motion trajectory that maximizes the objective function.
The first function, the likelihood that the motion trajectory {x_1, x_2, ..., x_l} is generated by the motion primitive λ, is denoted by

$$\psi_p = \ln P(x_1, x_2, \ldots, x_l \mid \lambda). \tag{1}$$

This likelihood can be efficiently computed by the forward algorithm [23].
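As an illustration of Eq. (1), the forward algorithm can be run in log space to avoid numerical underflow. The following is a minimal NumPy sketch with Gaussian output probabilities; the two-node HMM parameters and the short trajectory are toy values, not those of a trained motion primitive:

```python
import numpy as np

def log_gauss(x, mu, var):
    # Log density of a diagonal-covariance Gaussian output probability b_i(x).
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def forward_loglik(X, log_pi, log_A, mus, vars_):
    """ln P(x_1, ..., x_l | lambda) computed by the forward algorithm."""
    m = len(log_pi)
    # alpha[i] = ln P(x_1..x_t, q_t = i | lambda)
    alpha = log_pi + np.array([log_gauss(X[0], mus[i], vars_[i]) for i in range(m)])
    for x in X[1:]:
        emit = np.array([log_gauss(x, mus[i], vars_[i]) for i in range(m)])
        # log-sum-exp over the predecessor node for each current node
        alpha = emit + np.logaddexp.reduce(alpha[:, None] + log_A, axis=0)
    return np.logaddexp.reduce(alpha)

# Toy two-node, one-dimensional example (illustrative values only)
log_pi = np.log([0.6, 0.4])
log_A = np.log([[0.7, 0.3], [0.2, 0.8]])
mus = np.array([[0.0], [1.0]])
vars_ = np.array([[0.5], [0.5]])
X = np.array([[0.1], [0.2], [0.9]])
print(forward_loglik(X, log_pi, log_A, mus, vars_))
```

The recursion marginalizes over all node sequences q in O(l·m²) time, whereas direct enumeration of the sequences is exponential in l.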
The variable δ_t is assumed to take a binary value. This variable is set to δ_t = 1 if a goal posture is given to the motion at time t, and δ_t = 0 otherwise. The second function in the objective function is derived from the difference between the configuration x_t and the goal posture x_{t,g} at time t. This function is written as a weighted quadratic form of these differences:

$$\psi_g = \frac{w_g}{2} \sum_{t=1}^{l} \delta_t \,\left(x_t - x_{t,g}\right)^T \left(x_t - x_{t,g}\right) \tag{2}$$

where the positive weight parameter w_g is manually specified.
The joint-angular jerks are chosen as the measure of smoothness of the motion trajectory [24]. The third function is written as the following weighted quadratic form:

$$\psi_d = \frac{w_d}{2} \sum_{t=4}^{l} \dddot{x}_t^T\, \dddot{x}_t \tag{3}$$

where \dddot{x}_t is the third-order difference (jerk) of the joint angle vector and the positive weight parameter w_d is manually specified. These three terms lead to the resultant objective function

$$\Psi = \psi_p - \psi_g - \psi_d. \tag{4}$$
The motion trajectory {x_1, x_2, ..., x_l} is rewritten as the following vector:

$$z = \left[ x_1^T, \ldots, x_l^T \right]^T \tag{5}$$

where the column vectors of the joint angles x_t at time t are concatenated into the vector z, and x^T denotes the transpose of x. The objective function in Eq. (4) can subsequently be rewritten in the simple form:
$$\Psi(z) = \ln P(z \mid \lambda) - \frac{w_g}{2}\left(z - z_g\right)^T \Delta_g \left(z - z_g\right) - \frac{w_d}{2}\, z^T L z \tag{6}$$

where Δ_g and z_g are given as follows:

$$\Delta_g = \begin{bmatrix} \delta_1 I_d & & O_d \\ & \ddots & \\ O_d & & \delta_l I_d \end{bmatrix} \tag{7}$$

$$z_g = \left[ x_{1,g}^T, \ldots, x_{l,g}^T \right]^T. \tag{8}$$

Here, I_d and O_d are the d × d identity matrix and zero matrix respectively, where d is the number of dimensions of x_t.
Additionally, the matrix L is the dl × dl Laplacian matrix that extracts the sum of squared third-order differences,

$$L = D^T D, \qquad
D = \begin{bmatrix}
O_{3d,dl} \\
\begin{matrix}
-I_d & 3I_d & -3I_d & I_d & & O \\
 & \ddots & \ddots & \ddots & \ddots & \\
O & & -I_d & 3I_d & -3I_d & I_d
\end{matrix}
\end{bmatrix} \tag{9}$$

so that z^T L z = Σ_{t=4}^{l} \dddot{x}_t^T \dddot{x}_t.
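The jerk penalty of Eq. (9) can be assembled numerically from third-order difference blocks. The sketch below is a minimal NumPy construction with toy sizes l and d; since the third difference of a trajectory that is linear in time vanishes, such a trajectory incurs zero penalty:

```python
import numpy as np

def jerk_laplacian(l, d):
    # D stacks the third-difference blocks [-I, 3I, -3I, I] so that
    # (D z)_t = x_t - 3 x_{t-1} + 3 x_{t-2} - x_{t-3} for t = 4..l.
    # Padding D with zero rows as in Eq. (9) would leave D^T D unchanged.
    D = np.zeros(((l - 3) * d, l * d))
    I = np.eye(d)
    for t in range(3, l):
        row = (t - 3) * d
        D[row:row + d, (t - 3) * d:(t - 2) * d] = -I
        D[row:row + d, (t - 2) * d:(t - 1) * d] = 3 * I
        D[row:row + d, (t - 1) * d:t * d] = -3 * I
        D[row:row + d, t * d:(t + 1) * d] = I
    return D.T @ D  # L is dl x dl, and z^T L z is the sum of squared jerks

l, d = 6, 2
L = jerk_laplacian(l, d)
z = np.arange(l * d, dtype=float)  # toy trajectory, linear in time
print(z @ L @ z)  # -> 0.0: zero jerk for a linear trajectory
```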
An optimal motion trajectory z that maximizes the objective function Ψ(z) has to be computed. In this study, the optimal motion trajectory is derived by using an iterative method. Let z^{(k)} denote the motion trajectory derived after k iterations. The difference between the objective functions of z^{(k+1)} and z^{(k)} is expressed by
$$\begin{aligned}
\Psi(z^{(k+1)}) - \Psi(z^{(k)}) ={}& \ln P(z^{(k+1)} \mid \lambda) - \ln P(z^{(k)} \mid \lambda) \\
& - \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
  + \frac{w_g}{2}\left(z^{(k)} - z_g\right)^T \Delta_g \left(z^{(k)} - z_g\right) \\
& - \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)}
  + \frac{w_d}{2}\, z^{(k)T} L\, z^{(k)}.
\end{aligned} \tag{11}$$
The likelihood P(z | λ) that the motion trajectory z is generated by the HMM λ can be rewritten as follows:

$$\ln P(z \mid \lambda) = \ln \sum_{\forall q} P(z, q \mid \lambda)
= \ln \sum_{\forall q} P(q \mid z)\, \frac{P(z, q \mid \lambda)}{P(q \mid z)}
= \ln E_{P(q \mid z)}\!\left[ \frac{P(z, q \mid \lambda)}{P(q \mid z)} \right]. \tag{12}$$
q = {q_1, q_2, ..., q_l} is a sequence of nodes in the HMM λ, where q_t is the node at time t. P(z, q | λ) represents the likelihood that the motion trajectory z is generated along the node sequence q by the HMM λ. Additionally, E_{P(q|z)}[R] denotes the expected value of R under P(q | z), which is the distribution of the sequence of nodes q. According to Jensen's inequality, Eq. (12) satisfies the following inequality:
$$\ln E_{P(q \mid z)}\!\left[ \frac{P(z, q \mid \lambda)}{P(q \mid z)} \right]
\ge E_{P(q \mid z)}\!\left[ \ln \frac{P(z, q \mid \lambda)}{P(q \mid z)} \right]. \tag{13}$$
By using Eq. (13), the following equations can be derived:

$$\begin{aligned}
\ln P(z \mid \lambda) - E_{P(q \mid z)}\!\left[ \ln \frac{P(z, q \mid \lambda)}{P(q \mid z)} \right]
&= \ln P(z \mid \lambda) - E_{P(q \mid z)}\!\left[ \ln P(z \mid \lambda) - \ln \frac{P(q \mid z)}{P(q \mid z, \lambda)} \right] \\
&= E_{P(q \mid z)}\!\left[ \ln \frac{P(q \mid z)}{P(q \mid z, \lambda)} \right].
\end{aligned} \tag{14}$$
Eq. (14) is the Kullback–Leibler information that measures the dissimilarity between the distribution P(q | z) and the distribution P(q | z, λ). The Kullback–Leibler information becomes zero only when these two distributions are the same, and is positive otherwise. Eq. (11) can be rewritten as the following equation by using the notation of the Kullback–Leibler information:
$$\begin{aligned}
\Psi(z^{(k+1)}) - \Psi(z^{(k)})
={}& E_{P(q \mid z^{(k+1)})}\!\left[ \ln \frac{P(z^{(k+1)}, q \mid \lambda)}{P(q \mid z^{(k+1)})} \right]
 + \mathrm{KL}\!\left( P(q \mid z^{(k+1)}) \,\|\, P(q \mid z^{(k+1)}, \lambda) \right) \\
& - E_{P(q \mid z^{(k)})}\!\left[ \ln \frac{P(z^{(k)}, q \mid \lambda)}{P(q \mid z^{(k)})} \right]
 - \mathrm{KL}\!\left( P(q \mid z^{(k)}) \,\|\, P(q \mid z^{(k)}, \lambda) \right) \\
& - \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
 + \frac{w_g}{2}\left(z^{(k)} - z_g\right)^T \Delta_g \left(z^{(k)} - z_g\right) \\
& - \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)}
 + \frac{w_d}{2}\, z^{(k)T} L\, z^{(k)}.
\end{aligned} \tag{15}$$
We approximate the distributions of the sequence of nodes q under the motion trajectory. More specifically, we set the distributions P(q | z^{(k+1)}) and P(q | z^{(k)}) to the distribution P(q | z^{(k)}, λ):

$$P(q \mid z^{(k+1)}) = P(q \mid z^{(k)}, \lambda), \qquad P(q \mid z^{(k)}) = P(q \mid z^{(k)}, \lambda). \tag{16}$$
These approximations result in the relation that the Kullback–Leibler information in the second term of Eq. (15) is greater than or equal to zero, and the relation that the Kullback–Leibler information in the fourth term is equal to zero. Therefore, the (k+1)th iteration only has to find a new motion trajectory z^{(k+1)} that gives a positive value of Eq. (15) with the second and fourth terms removed; then Ψ(z^{(k+1)}) becomes greater than Ψ(z^{(k)}), and the motion trajectory z^{(k+1)} increases the value of the objective function. This iteration leads to a gradual increase in the value of the objective function. Synthesis of the motion trajectory z given goal postures according to the motion primitive λ can thus be formulated as
$$\begin{aligned}
\arg\max_{z^{(k+1)}} \Bigg\{\,
& E_{P(q \mid z^{(k)}, \lambda)}\!\left[ \ln \frac{P(z^{(k+1)}, q \mid \lambda)}{P(q \mid z^{(k)}, \lambda)} \right]
- E_{P(q \mid z^{(k)}, \lambda)}\!\left[ \ln \frac{P(z^{(k)}, q \mid \lambda)}{P(q \mid z^{(k)}, \lambda)} \right] \\
& - \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
+ \frac{w_g}{2}\left(z^{(k)} - z_g\right)^T \Delta_g \left(z^{(k)} - z_g\right) \\
& - \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)}
+ \frac{w_d}{2}\, z^{(k)T} L\, z^{(k)} \Bigg\}.
\end{aligned} \tag{17}$$
By eliminating terms that do not include the motion trajectory z^{(k+1)}, Eq. (17) can be rewritten as

$$\begin{aligned}
& \arg\max_{z^{(k+1)}} \left\{ E_{P(q \mid z^{(k)}, \lambda)}\!\left[ \ln P(z^{(k+1)} \mid q, \lambda)\, P(q \mid \lambda) \right]
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\} \\
&= \arg\max_{z^{(k+1)}} \left\{ E_{P(q \mid z^{(k)}, \lambda)}\!\left[ \ln P(z^{(k+1)} \mid q, \lambda) \right]
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\}.
\end{aligned} \tag{18}$$
The first term in Eq. (18) is expressed as a summation of the probability P(q | z^{(k)}, λ) over all the node sequences q, and Eq. (18) is rewritten as

$$\arg\max_{z^{(k+1)}} \left\{ \sum_{\forall q} P(q \mid z^{(k)}, \lambda) \ln P(z^{(k+1)} \mid q, \lambda)
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\}. \tag{19}$$
P(z^{(k+1)} | q, λ) is the likelihood that the motion trajectory z^{(k+1)} is generated along the sequence of nodes q from the HMM λ, and it can be expressed in a closed form using the parameters of the HMM:

$$\ln P(z^{(k+1)} \mid q, \lambda) = \ln \pi_{q_1}
+ \sum_{t=1}^{l-1} \ln a_{q_t, q_{t+1}}
+ \sum_{t=1}^{l} \ln b_{q_t}(x_t^{(k+1)}) \tag{20}$$
where b_i(x) is the probability of the vector of joint angles x being generated from the ith node. By substituting Eq. (20) into Eq. (19), Eq. (19) can be rewritten as follows:

$$\begin{aligned}
& \arg\max_{z^{(k+1)}} \left\{ \sum_{\forall q} P(q \mid z^{(k)}, \lambda)
\left( \ln \pi_{q_1} + \sum_{t=1}^{l-1} \ln a_{q_t, q_{t+1}} + \sum_{t=1}^{l} \ln b_{q_t}(x_t^{(k+1)}) \right)
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\} \\
&= \arg\max_{z^{(k+1)}} \left\{ \sum_{\forall q} P(q \mid z^{(k)}, \lambda) \sum_{t=1}^{l} \ln b_{q_t}(x_t^{(k+1)})
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\} \\
&= \arg\max_{z^{(k+1)}} \left\{ \sum_{i=1}^{m} \sum_{t=1}^{l} \ln b_i(x_t^{(k+1)})\, P(q_t = i \mid z^{(k)}, \lambda)
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\}
\end{aligned} \tag{21}$$
where P(q_t = i | z^{(k)}, λ) is the probability of staying at the ith node at time t given the motion trajectory z^{(k)}. The output probability b_i(x) follows a Gaussian distribution, and the logarithm of this probability can be written as

$$\ln b_i(x) = -\frac{1}{2}\left[ d \ln(2\pi) + \ln |\Sigma_i| + (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i) \right] \tag{22}$$

where d is the number of dimensions of the vector x. By substituting Eq. (22) into Eq. (21) and removing terms that are independent of the motion trajectory z^{(k+1)}, Eq. (21) can be transformed into Eq. (23):
$$\arg\max_{z^{(k+1)}} \left\{
- \sum_{i=1}^{m} \sum_{t=1}^{l} \frac{1}{2} P(q_t = i \mid z^{(k)}, \lambda)
\left(x_t^{(k+1)} - \mu_i\right)^T \Sigma_i^{-1} \left(x_t^{(k+1)} - \mu_i\right)
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\}. \tag{23}$$
We define the dl × dl matrix V_i and the dl-dimensional vector μ̃_i as follows:

$$V_i = \begin{bmatrix}
\Sigma_i^{-1} P(q_1 = i \mid z^{(k)}, \lambda) & & O \\
& \ddots & \\
O & & \Sigma_i^{-1} P(q_l = i \mid z^{(k)}, \lambda)
\end{bmatrix} \tag{24}$$

$$\tilde{\mu}_i = \left[ \mu_i^T, \ldots, \mu_i^T \right]^T. \tag{25}$$
Therefore, Eq. (23) can be rewritten as

$$\arg\max_{z^{(k+1)}} \left\{
- \frac{1}{2} \sum_{i=1}^{m} \left[ z^{(k+1)} - \tilde{\mu}_i \right]^T V_i \left[ z^{(k+1)} - \tilde{\mu}_i \right]
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\}. \tag{26}$$
The optimal motion trajectory z^{(k+1)} at the (k+1)th iteration can be derived as Eq. (27):

$$z^{(k+1)} = \left( \sum_{i=1}^{m} V_i + w_d L + w_g \Delta_g \right)^{-1}
\left( \sum_{i=1}^{m} V_i \tilde{\mu}_i + w_g \Delta_g z_g \right). \tag{27}$$
Note that the initial motion trajectory z^{(0)} is randomly set and that the iterative optimization is terminated at a predefined number of iterations.
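One pass of the update in Eq. (27) reduces to a single linear solve. The NumPy sketch below uses toy dimensions and random stand-ins for the node-occupancy probabilities P(q_t = i | z^{(k)}, λ), which would normally come from a forward–backward pass over the HMM; the jerk Laplacian is set to zero here for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
m, l, d = 3, 5, 2          # nodes, frames, joint dimensions (toy sizes)
dl = l * d

# Placeholder output parameters and occupancies gamma[t, i] = P(q_t=i | z^(k), lambda)
mu = rng.normal(size=(m, d))
Sig_inv = np.stack([np.eye(d) for _ in range(m)])
gamma = rng.dirichlet(np.ones(m), size=l)

# Goal-posture weighting: a goal is given only at the final frame (delta_l = 1)
w_g, w_d = 100.0, 100.0
Delta_g = np.zeros((dl, dl)); Delta_g[-d:, -d:] = np.eye(d)
z_g = np.zeros(dl); z_g[-d:] = [0.5, -0.5]
L = np.zeros((dl, dl))     # jerk Laplacian of Eq. (9), omitted for brevity

A = w_d * L + w_g * Delta_g
b = w_g * Delta_g @ z_g
for i in range(m):
    V_i = np.kron(np.diag(gamma[:, i]), Sig_inv[i])   # Eq. (24)
    mu_tilde = np.tile(mu[i], l)                      # Eq. (25)
    A += V_i
    b += V_i @ mu_tilde
z_next = np.linalg.solve(A, b)                        # Eq. (27)
print(z_next.shape)
```

With the smoothness term zeroed, each frame without a goal becomes the occupancy-weighted mean of the node means, while the final frame is pulled toward the goal posture z_g in proportion to w_g.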
2.3. Control of reaction forces

This subsection describes a method to calculate the torques of all the joints in a humanoid robot's whole body in order to realize not only the synthesized motion trajectory but also the desired reaction forces from the contacted environment. To discuss this control, the equation of motion of the humanoid robot is formulated as follows [25,26]:
$$\begin{bmatrix} M_{bb} & M_{bj} \\ M_{jb} & M_{jj} \end{bmatrix}
\begin{bmatrix} \ddot{q}_b \\ \ddot{q}_j \end{bmatrix}
+ \begin{bmatrix} C_b \\ C_j \end{bmatrix}
+ \begin{bmatrix} g_b \\ g_j \end{bmatrix}
= \begin{bmatrix} 0 \\ \tau \end{bmatrix}
+ \sum_{i=1}^{N_c} \begin{bmatrix} J_{bi}^T \\ J_{ji}^T \end{bmatrix} F_i \tag{28}$$
where M_bb ∈ R^{6×6} is the inertia matrix of the base link, M_bj ∈ R^{6×n} is the coupling inertia matrix, M_jb is the transpose of M_bj, M_jj ∈ R^{n×n} is the inertia matrix of the links, n denotes the number of joints, q_b ∈ R^6 and q_j ∈ R^n are the generalized coordinates of the base link and the joint angle vector, C_b ∈ R^6 and C_j ∈ R^n are the centrifugal and Coriolis terms, g_b ∈ R^6 and g_j ∈ R^n are the gravitational terms, τ ∈ R^n is the vector of joint torques, N_c is the number of contact points, J_bi ∈ R^{6×6} is the Jacobian matrix that relates the base-link velocity to the velocity of the ith contact point, J_ji ∈ R^{6×n} is the Jacobian matrix that relates the joint angular velocity to the velocity of the ith contact point, and F_i ∈ R^6 is the vector of reaction forces and moments at the ith contact point.
We choose the desired contact points, each of which is an element of the set F^{(d)}, and set the desired contact forces and moments for each chosen contact point; F_i^{(d)} is the vector of desired contact forces and moments at the ith contact point of the set F^{(d)}. Additionally, we define the set of all the contact points except for the desired contact points as F. The objective function Γ of the vector of joint torques τ and the vectors of reaction forces and moments F_i (i = 1, 2, ..., N_c) is defined as follows:

$$\Gamma = \frac{w_{F^{(d)}}}{2} \sum_{i \in F^{(d)}} \left\| F_i - F_i^{(d)} \right\|^2
+ \frac{w_F}{2} \sum_{i \in F} \left\| F_i \right\|^2
+ \frac{w_\tau}{2} \left\| \tau \right\|^2 \tag{29}$$

where w_{F^{(d)}}, w_F and w_τ are positive weight parameters. The vector u is defined as

$$u = \left[ \tau^T\;\; F_1^T \cdots F_{|F^{(d)}|}^T\;\; F_{|F^{(d)}|+1}^T \cdots F_{|F^{(d)}|+|F|}^T \right]^T \tag{30}$$

where F_i (i = 1, 2, ..., |F^{(d)}|) are the vectors of reaction forces and moments at the desired contact points, and F_i (i = |F^{(d)}|+1, ..., |F^{(d)}|+|F|) are those at the other contact points. Subsequently, Eq. (29) can be rewritten, up to a constant term, as

$$\Gamma = \frac{1}{2} u^T \tilde{A} u + \tilde{b} u \tag{31}$$

where Ã and b̃ are given as follows:

$$\tilde{A} = \begin{bmatrix}
w_\tau I_n & & O \\
& w_{F^{(d)}} I_{6|F^{(d)}|} & \\
O & & w_F I_{6|F|}
\end{bmatrix} \tag{32}$$

$$\tilde{b} = \left[ 0_n^T\;\; -w_{F^{(d)}} F_1^{(d)T} \cdots -w_{F^{(d)}} F_{|F^{(d)}|}^{(d)T}\;\; 0_{6|F|}^T \right] \tag{33}$$

where 0_n denotes the n-dimensional zero vector.
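The completion of the square behind Eq. (31) can be checked numerically. In the sketch below, the numbers of joints and contact points and the desired wrench are toy placeholders; when u carries zero torque, exactly the desired force, and zero force elsewhere, Eq. (29) gives Γ = 0, and the quadratic form of Eq. (31) matches it up to the dropped constant:

```python
import numpy as np

n = 4                                    # number of joints (toy)
nd, nf = 1, 2                            # |F^(d)| and |F| (toy contact counts)
w_tau, w_Fd, w_F = 1.0, 10.0, 1.0        # weight parameters
F_des = np.array([10.0, 0, 0, 0, 0, 0])  # desired wrench at the chosen contact

# Eq. (32): block-diagonal weight matrix A~
A_tilde = np.diag(np.concatenate([
    w_tau * np.ones(n),
    w_Fd * np.ones(6 * nd),
    w_F * np.ones(6 * nf),
]))
# Eq. (33): linear term b~
b_tilde = np.concatenate([np.zeros(n), -w_Fd * F_des, np.zeros(6 * nf)])

# u = [tau, F at desired contact, F at other contacts], Eq. (30)
u = np.concatenate([np.zeros(n), F_des, np.zeros(6 * nf)])

gamma_quadratic = 0.5 * u @ A_tilde @ u + b_tilde @ u
constant = 0.5 * w_Fd * F_des @ F_des    # the constant dropped in Eq. (31)
print(np.isclose(gamma_quadratic + constant, 0.0))  # -> True: matches Eq. (29)
```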
The synthesized motion trajectory gives a sequence of configurations {q_1^{(s)}, q_2^{(s)}, ..., q_l^{(s)}}, where the configuration q^{(s)} = [q_b^{(s)T} q_j^{(s)T}]^T consists of the generalized coordinates of the base link and the joint angle vector. The velocity q̇_t^{(s)} and the acceleration q̈_t^{(s)} of q_t^{(s)} at time t are given as

$$\dot{q}_t^{(s)} = \frac{q_t^{(s)} - q_{t-1}^{(s)}}{\Delta t}, \qquad
\ddot{q}_t^{(s)} = \frac{\dot{q}_t^{(s)} - \dot{q}_{t-1}^{(s)}}{\Delta t}$$

respectively, where Δt is the sampling interval. The vectors q_t^{(s)} and q̇_t^{(s)} are set to the desired configuration q_t^{(d)} and its velocity q̇_t^{(d)} at time t for the synthesized motion trajectory. The acceleration q̈_t^{(s)} is modified into the following desired acceleration:

$$\ddot{q}_t^{(d)} = \ddot{q}_t^{(s)} + K_D\left( \dot{q}_t^{(d)} - \dot{q}_t \right) + K_P\left( q_t^{(d)} - q_t \right) \tag{34}$$

where q_t is the measured configuration and K_D and K_P are feedback gain matrices. The desired configuration q_t^{(d)}, velocity q̇_t^{(d)} and acceleration q̈_t^{(d)} are substituted into Eq. (28), and the following constraints are obtained:
$$M_b \ddot{q}^{(d)} + C_b + g_b = \sum_{i \in F^{(d)}} J_{bi}^T F_i + \sum_{i \in F} J_{bi}^T F_i \tag{35}$$

$$M_j \ddot{q}^{(d)} + C_j + g_j = \tau + \sum_{i \in F^{(d)}} J_{ji}^T F_i + \sum_{i \in F} J_{ji}^T F_i \tag{36}$$

where the matrices M_b and M_j are given as M_b = [M_bb  M_bj] and M_j = [M_jb  M_jj] respectively. By denoting the left-hand sides of Eqs. (35) and (36) by the vectors d_1 and d_2 respectively, and using Eq. (30), Eqs. (35) and (36) can be reduced as follows:
$$\begin{bmatrix}
O_{6,n} & J_b^{F^{(d)}T} & J_b^{FT} \\
I_n & J_j^{F^{(d)}T} & J_j^{FT}
\end{bmatrix}
\begin{bmatrix} \tau \\ F_{F^{(d)}} \\ F_F \end{bmatrix}
= \begin{bmatrix} d_1 \\ d_2 \end{bmatrix} \tag{37}$$

where

$$J_b^{F^{(d)}T} = \left[ J_{b1}^T \cdots J_{b|F^{(d)}|}^T \right] \tag{38}$$

$$J_b^{FT} = \left[ J_{b,|F^{(d)}|+1}^T \cdots J_{b,|F^{(d)}|+|F|}^T \right] \tag{39}$$

$$J_j^{F^{(d)}T} = \left[ J_{j1}^T \cdots J_{j|F^{(d)}|}^T \right] \tag{40}$$

$$J_j^{FT} = \left[ J_{j,|F^{(d)}|+1}^T \cdots J_{j,|F^{(d)}|+|F|}^T \right]. \tag{41}$$
Eq. (37) consequently results in the linear constraint

$$\tilde{J} u = \tilde{d} \tag{42}$$

where

$$\tilde{J} = \begin{bmatrix}
O_{6,n} & J_b^{F^{(d)}T} & J_b^{FT} \\
I_n & J_j^{F^{(d)}T} & J_j^{FT}
\end{bmatrix} \tag{43}$$

$$\tilde{d} = \left[ d_1^T\; d_2^T \right]^T. \tag{44}$$
The optimal u that minimizes the objective function represented by Eq. (31) subject to the constraint represented by Eq. (42) can be calculated using quadratic programming. The joint torques that are elements of the optimal u are utilized to control the humanoid robot's whole-body motion so that it not only maintains the profile of the motion trajectory but also controls the reaction forces from the contacted environments as desired.
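Because Eq. (31) is quadratic with the positive-definite Ã and Eq. (42) is a linear equality constraint, this quadratic program has a closed-form solution through its KKT system. The sketch below is a generic equality-constrained QP solver in NumPy; the random Ã, b̃, J̃, d̃ are small stand-ins, not quantities from a robot model:

```python
import numpy as np

def solve_eq_qp(A_t, b_t, J_t, d_t):
    """Minimize 0.5 u^T A u + b^T u subject to J u = d via the KKT system."""
    n_u = A_t.shape[0]
    n_c = J_t.shape[0]
    K = np.block([[A_t, J_t.T], [J_t, np.zeros((n_c, n_c))]])
    rhs = np.concatenate([-b_t, d_t])
    sol = np.linalg.solve(K, rhs)
    return sol[:n_u]  # u; the trailing entries are the Lagrange multipliers

rng = np.random.default_rng(1)
n_u, n_c = 8, 3
A_t = np.diag(rng.uniform(1.0, 2.0, n_u))  # positive-definite weight matrix
b_t = rng.normal(size=n_u)
J_t = rng.normal(size=(n_c, n_u))
d_t = rng.normal(size=n_c)

u = solve_eq_qp(A_t, b_t, J_t, d_t)
print(np.allclose(J_t @ u, d_t))  # -> True: the constraint of Eq. (42) holds
```

A dedicated QP solver would be used in practice once inequality constraints (e.g. friction cones or torque limits) are added; for pure equality constraints the KKT solve above is exact.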
3. Experimental verification

We tested the synthesis of whole-body motion of a humanoid robot and the control of reaction forces from the contact environment in simulation. To create motion primitives with HMMs, we measured human whole-body motions with an optical motion capture system. The positions of 34 markers attached to a human performer were recorded. The sequence of position data was converted to angles of all the joints in the humanoid robot by inverse kinematic computation, and a sequence of joint angles was consequently obtained. In this study, we recorded three motions of ''touching an object with the right hand'' and created the corresponding motion primitive using a 30-node left-to-right HMM. The first test was conducted on synthesizing the whole-body motion of the robot from the motion primitive ''touching an object
Fig. 2. The motions generated from the motion primitive ''touching an object with the right hand''. The left panel shows the motion generated given no goal posture. The middle and right panels show the motions generated given goal postures of touching an object at low and high positions respectively.
Fig. 3. Profiles of right and left shoulder joint angles. The red solid line shows the joint angle trajectory generated from the motion primitive given no goal posture. The red-shaded graph shows the distribution of joint trajectories for the motion primitive. The green and blue solid lines show the synthesized motions when different goal postures were given. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
with the right hand''. We set goal postures at the final frame such that the humanoid robot could touch an object at different positions with the right hand. More specifically, two different goal postures were set for reaching to touch objects at high and low positions. Fig. 2 shows the motions generated from the motion primitive. In the left panel, no goal posture is given. In the middle and right panels, the goal postures of touching an object at low and high positions are given respectively. Fig. 3 shows the trajectories of the right and left shoulder joint angles for the given goal postures. The distribution can be estimated by using the Gaussian distribution at the node that is the most likely to generate the joint angle at each frame. Here, we manually set both of the positive weight parameters w_g and w_d to 100. For the right shoulder, the synthesized motions for reaching with the right hand to high and low positions were similar to the motion primitive given no goal posture. The synthesized motions for reaching to low and high positions converged to −0.9 rad and −1.5 rad respectively at the final frame. The three corresponding motions for the left shoulder are exactly the same as one another. The synthesized motions are very smooth, without abrupt acceleration, and come close to the goal postures while maintaining the profile of joint angles learned as the motion primitive.
The second test was conducted on control of both the synthesized motion and the reaction forces from the contact environment. We used the synthesized motion of touching an object at a low position with the right hand. A wall was created in front of the humanoid robot in the simulation, and the robot touched the wall with the right hand. The desired reaction force on the right hand was set to (10, 0, 0) kN, where the first component is a force perpendicular to the wall surface, and the second and third components are forces horizontal to the wall. The desired vertical reaction forces acting on both feet and both hips of the robot were set to one-fourth of its total weight. Additionally, we set the positive weight parameters w_{F^{(d)}}, w_F and w_τ to 10, 1 and 1 respectively. The bottom panel in Fig. 4 shows the humanoid robot's whole-body motion controlled by the proposed method. The robot raised its right hand, touched the wall, and bent the right wrist outward to control the reaction force from the contacted wall. The bottom panel in Fig. 5 shows the profile of reaction forces on the right hand during application of our proposed controller. The reaction force perpendicular to the wall was maintained around 10 kN, and the reaction forces horizontal to the wall were maintained at a constant level of zero. Large reaction forces were measured when the right hand first came into contact with the wall, so a limitation of this controller is its performance at the transition to contact. However, as shown in Table 1, the average and standard deviation of the reaction force perpendicular to the wall were −10.7 kN and 14.1 kN, and the averages of the reaction forces horizontal to the wall were −1.12 kN and 0.02 kN, with standard deviations of 2.34 kN and 1.62 kN, respectively. We additionally tested the force controller on the motion of touching an object at a high position with the right hand. The bottom panel in Table 1 shows the statistics of the reaction force. The reaction force perpendicular to the wall surface could be controlled to be maintained at about 10 kN.
We compared our proposed method with a position controller (proportional–derivative control) for whole-body motion. This controller computed the torques of all the joints in the humanoid robot as follows:

$$\tau = K_p\left( q_j^{(d)} - q_j \right) + K_d\left( \dot{q}_j^{(d)} - \dot{q}_j \right) \tag{45}$$

where K_p and K_d are the control parameters. The top panel in Fig. 4 shows the whole-body motion where the joint angles were controlled according to Eq. (45) and the reaction forces were not controlled. The robot did not bend the right wrist even after the right hand contacted the wall, because only the learned kinematic profile was maintained. The top panel in Fig. 5 shows the profile of the reaction forces on the right hand when the position controller was used. The reaction force perpendicular to the wall did not converge. As shown in Table 1, the errors between the measured reaction forces and the desired reaction forces were larger during
Fig. 4. The motion ''touching an object with the right hand'' given the desired reaction force perpendicular to the wall. The top panel shows the whole-body motion controlled by the position controller, and the bottom panel shows the whole-body motion controlled by the force controller.
Fig. 5. Profiles of reaction forces on the right hand during the motion of touching the wall with the right hand. The red solid line shows the reaction force perpendicular to the wall. The green and blue solid lines show the reaction forces horizontal to the wall. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
application of the position controller than during application of our proposed force controller.
In the simulation experiments, the reaction force could be controlled while a kinematic profile similar to the training motion was maintained. However, several problems remain that become critical, especially when a real humanoid robot is controlled. The first is how to control the reaction force at the first contact with the environment. A large contact force was observed and was not suitably controlled when the robot's hand touched the environment. An additional feedback controller needs to be developed. Another problem is the scalability of our proposed method. The motions were limited to reaching a target object. After the reaching phase, the robot is required to grasp or manipulate the object. Our proposed method needs to be extended to controlling the robot's hands.
Table 1
Averages and standard deviations of the reaction forces. F̄_x is the average of the reaction force perpendicular to the wall, and σ̄_x is its standard deviation. F̄_y and F̄_z are the averages of the reaction forces horizontal to the wall, and σ̄_y and σ̄_z are their standard deviations. The top and bottom panels show the statistics during reaching for an object at low and high positions respectively, each comparing position control with force control.
4 Conclusion
In this paper we have described a method for synthesizing whole-body motion for a humanoid robot from a motion primitive and controlling reaction forces while maintaining the synthesized motion
The motions are represented by sequences of angles of all joints
in the robot, and these sequences are encoded into an HMM called
a motion primitive The objective function takes into consideration three evaluations: the likelihood that the synthesized motion is generated by the HMM, the distance between the synthesized motion and given goal postures, and the total jerk along the syn-thesized motion We proposed an algorithm to find an optimal sequence of joint angles that maximizes the objective function, and this optimal sequence results in the synthesized whole-body motion
We also proposed a method for computing the input torques of all the joints in the robot, so that it can perform the synthesized motion, and for controlling the reaction forces from the contact environment as desired. The objective function consists of the error between the actual reaction force and the desired one, the total joint torque, and the total reaction force. The constraint on the objective function is given as the equation of motion for the synthesized motion. The optimal joint torques can be found by maximizing the objective function subject to this constraint by using quadratic programming.
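Such an equality-constrained quadratic program can be solved in closed form through its KKT system. The sketch below assumes the problem has already been assembled into matrices: `Q` and `c` encode the quadratic cost over a decision vector stacking joint torques and contact forces, and `A x = b` encodes the (linearized) equation of motion along the synthesized trajectory. This is a generic QP solve, not the paper's specific matrices.

```python
import numpy as np

def solve_torque_qp(Q, c, A, b):
    """Solve  min 0.5 x'Qx + c'x  s.t.  Ax = b  via the KKT linear system
        [Q  A'] [x     ]   [-c]
        [A  0 ] [lambda] = [ b ]
    Returns the optimal x (stacked torques and contact forces)."""
    n, m = Q.shape[0], A.shape[0]
    K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-c, b])
    sol = np.linalg.solve(K, rhs)
    return sol[:n]  # sol[n:] holds the Lagrange multipliers
```

For instance, minimizing 0.5‖x‖² subject to x₁ + x₂ = 2 gives x = (1, 1): the solver distributes the effort evenly, which mirrors how the torque QP trades off total joint torque against satisfying the dynamics.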
Our proposed methods for motion synthesis and force control were tested using the motion of touching an object with the right hand. The experiments on motion synthesis showed that the proposed method could generate adaptive whole-body motions for touching an object at a high position and touching an object at a low position. Additionally, the experimental results on force control clearly showed the method's validity. The error of the reaction force perpendicular to the wall was 0.7 kN, and the reaction force was less than the value of 19.9 kN obtained by position control.
Acknowledgments

This research was partially supported by a Grant-in-Aid for Young Scientists (A) (No. 26700021) from the Japan Society for the Promotion of Science and by the Strategic Information and Communications R&D Promotion Program (No. 142103011) of the Ministry of Internal Affairs and Communications.
Wataru Takano is an Associate Professor at the Department of Mechano-Informatics, School of Information Science and Technology, University of Tokyo. He was born in Kyoto, Japan, in 1976. He received the B.S. and M.S. degrees in precision engineering from Kyoto University, Japan, in 1999 and 2001, and the Ph.D. degree in Mechano-Informatics from the University of Tokyo, Japan, in 2006. He was a Project Assistant Professor at the University of Tokyo from 2006 to 2007, and a Researcher on the Project of Information Environment and Humans, PRESTO, Japan Science and Technology Agency, from 2010. His fields of research include the kinematics, dynamics, and artificial intelligence of humanoid robots, and intelligent vehicles. He is a member of the IEEE, the Robotics Society of Japan, and the Information Processing Society of Japan. He has been the chair of the Technical Committee on Robot Learning, IEEE RAS.
Yoshihiko Nakamura is a Professor at the Department of Mechano-Informatics, School of Information Science and Technology, University of Tokyo. He was born in Osaka, Japan, in 1954. He received the B.S., M.S., and Ph.D. degrees in precision engineering from Kyoto University, Japan, in 1977, 1978, and 1985, respectively. He was an Assistant Professor at the Automation Research Laboratory, Kyoto University, from 1982 to 1987. He joined the Department of Mechanical and Environmental Engineering, University of California, Santa Barbara, in 1987 as an Assistant Professor, and became an Associate Professor in 1990. He was also a co-director of the Center for Robotic Systems and Manufacturing at UCSB. He moved to the University of Tokyo as an Associate Professor in the Department of Mechano-Informatics in 1991. His fields of research include the kinematics, dynamics, control, and intelligence of robots, particularly robots with non-holonomic constraints, computational brain information processing, humanoid robots, human-figure kinetics, and surgical robots. He is a member of the IEEE, ASME, SICE, the Robotics Society of Japan, the Institute of Systems, Control and Information Engineers, and the Japan Society of Computer Aided Surgery. He was honored with a fellowship from the Japan Society of Mechanical Engineers. Since 2005, he has been the president of the Japan IFToMM Congress. He is a foreign member of the Academy of Engineering in Serbia and Montenegro.