Contents lists available at ScienceDirect
Robotics and Autonomous Systems
journal homepage: www.elsevier.com/locate/robot
Planning of goal-oriented motion from stochastic motion primitives
and optimal controlling of joint torques in whole-body
Wataru Takano * , Yoshihiko Nakamura
Mechano-Informatics, University of Tokyo, 7-3-1, Hongo, Bunkyoku, Tokyo, 113-8656, Japan
Highlights

• This paper proposes the synthesis of whole-body motions from stochastic motion primitives.
• The joint torques are computed while preserving the profile of the synthesized motion and controlling the reaction forces.
• Simulation demonstrates the validity of the motion synthesis and force control.
Article history:
Received 15 September 2016
Received in revised form 1 January 2017
Accepted 28 January 2017
Available online 4 February 2017
Keywords:
Motion primitive
Motion synthesis
Stochastic model
Abstract

Humanoid robots are expected to be integrated into daily life. This requires the robots to perform human-like actions that are easily understandable by humans. Learning by imitation is an effective framework that enables the robots to generate the same motions that humans do. However, it is generally not useful for the robots to generate motions that are precisely the same as learned motions, because the environment is likely to be different from the environment where the motions were learned. The humanoid robot should synthesize motions that are adaptive to the current environment by modifying learned motions. Previous research encoded captured human whole-body motions into hidden Markov models, which are hereafter referred to as motion primitives, and generated human-like motions based on the acquired motion primitives. The contact between the body and the environment also needs to be controlled, so that the humanoid robot's whole-body motion can be realized in its current environment. This paper proposes a novel approach to synthesizing kinematic data using the motion primitive and controlling the torques of all the joints in the humanoid robot to achieve the desired whole-body motions and contact forces. The experiments demonstrate the validity of the proposed approach to synthesizing and controlling whole-body motions by humanoid robots.

© 2017 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
1. Introduction

Humanoid robots need to understand human actions and communicate with humans using human-like gestures to be integrated into daily life. Approaches to representing human motions have been developed as a typical framework for imitation learning [1–3]. Representations of motion can be grouped into two categories. One approach encodes motions into a set of parameters of a dynamical system, such as systems of differential equations or neural networks [4–7]. The other approach uses stochastic models, typified by hidden Markov models (HMMs) [8–11]. The dynamical systems or stochastic models represent motions in abstract forms as their parameters. These abstract representations are called motion primitives. Robots learn motions through observation or
* Corresponding author. Fax: +81 3 3818 0835.
E-mail addresses: takano@ynl.t.u-tokyo.ac.jp (W. Takano), nakamura@ynl.t.u-tokyo.ac.jp (Y. Nakamura).
demonstration as motion primitives, and subsequently use the motion primitives for motion recognition and motion generation. One popular approach is the Dynamic Movement Primitive (DMP), which encodes a demonstration into the parameters of a differential equation. Petric et al. created a virtual force field that attracts a trajectory to a target one [12]. This approach generates a trajectory that satisfies a desired motion while preserving the natural postures of the training motions. Wang et al. also extended the DMP to modify the training trajectories while preserving their properties, creating a new trajectory based on kernel functions [13]. These approaches improve the reusability of the DMP, but do not handle the contact force. Gams et al. developed a method to control force using the DMP [14]. The DMP generates the kinematic motion in a feedforward manner, and the contact was controlled to maintain a predefined force in a feedback manner. This concept of synthesizing a trajectory in a feedforward manner and controlling a reaction force from the environment in a feedback manner is very similar to that of this paper.
http://dx.doi.org/10.1016/j.robot.2017.01.013
0921-8890/© 2017 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Another approach uses stochastic models. Research based on stochastic models has focused on developing efficient algorithms for motion recognition, and these have been deployed in intelligent humanoid robots that can understand human actions [15–17]. In addition to motion recognition, several methods have been proposed to generate motions by using motion primitives [8,18–21]. Inamura et al. developed a sampling-based method to generate a sequence of joint angles of the entire robot body from HMMs. In the first step, a node is sampled according to the probabilities of node transitions in the HMM, and a sequence of nodes is created. In the second step, vectors of joint angles are sampled according to the distribution of joint angles at each node in the node sequence derived in the first step. These two steps result in a motion trajectory, represented by a sequence of joint angles [8]. Calinon et al. proposed an approach to encoding joint angles and joint angular velocities into HMMs. These HMMs can calculate the distribution of the joint angular velocities given the joint angles at the previous frame, sample the joint angular velocities based on the conditional distribution, and update the joint angles at the current frame [21]. This framework includes a feedback loop and is adaptive to the current state. Because the synthesized motions are likely to become the average of the training motions, they cannot be applied to environments considerably different from those where the motions were originally learned. Moreover, these methods do not handle targeted motions, and as a consequence, the synthesized motion does not necessarily achieve a specific task. For example, if a robot learns a motion for grasping an object on a desk, then the robot can reach an object located at the same position by the learned motion primitive. However, the robot is not able to reach an object located at a position on a shelf different from the learned position on the desk.
Robots should be able to modify the learned motions and perform flexible motions to adapt to their environment. Herzog et al. proposed an approach to recognizing and synthesizing motions of manipulating objects at different locations using a parametric hidden Markov model (PHMM) [18]. PHMMs represent typical motions and compensate for different motions of a particular type. Multiple motions are encoded into an HMM as a representative motion, and each motion variant is encoded into its own HMM. The parameters of the HMM of each motion variant are represented as a linear function of the parameters of the representative HMM and the position of the object. Consequently, individual motions are parameterized by the position of the object. Calinon et al. developed a method for synthesizing motions by using descriptors of both the motion and the manipulated object [22]. These approaches do not control the contact forces that are important for motions interacting with environments or for manipulation motions.
As described above, several studies have tackled motion generation from motion primitives and extended the generation to adapt to the environment for achieving reaching tasks. However, these approaches do not take into account the dynamics, more specifically, the contact force between the body and the environment or the equation of motion for the robot body. The motion of the robot is created by the joint torques and the interaction forces from the environment. Kinematics-based motion generation needs to be coupled with force control such that the robot can perform kinematically and dynamically suitable motions. This paper proposes a novel approach to synthesizing whole-body motion that satisfies configurations for goal or subgoal postures from HMMs of motion primitives, and to subsequently controlling the torques of all the joints in a humanoid robot's body such that the robot can perform human-like motions and maintain the desired contact with the environment.
2. Motion synthesizer and torque controller

We propose a novel approach to synthesizing motions by using the motion primitives and to subsequently controlling the joint torques in the humanoid robot's whole body, as shown in Fig. 1.
Fig. 1. A whole-body motion is encoded into an HMM that is referred to as a motion primitive. The motion synthesizer generates a sequence of joint angles that not only maintains the profile of motions learned as the motion primitive but also reaches the goal postures. The force controller computes the joint torques such that the humanoid robot can perform the synthesized motion and the reaction forces can be controlled as desired.
2.1. Representation of motion primitives

A configuration is defined as a complete specification of the pose of a human performer or a humanoid robot. We choose the angles of all the joints in the whole body as the configuration in this study. A whole-body motion is represented by a vector of the joint angles x_t at time t. A motion segment is expressed by a motion trajectory {x_1, x_2, ..., x_l}. The motion segment is encoded into a set of parameters of an HMM; this HMM is hereinafter referred to as a ''motion primitive''. An HMM is a stochastic model that is defined by the compact notation λ = {Π, A, B}, where Π = {π_1, π_2, ..., π_m} is the set of initial node probabilities, A = {a_ij} is the matrix whose element a_ij is the probability of transition from the ith node to the jth node, and B is the set of output probabilities at the nodes. Encoding of the motion segment optimizes these parameters of the HMM such that the likelihood of the motion segment being generated by the HMM is maximized. The HMM can be utilized for motion recognition and motion generation. Motion recognition finds the HMM, from among a set of HMMs, that is the most likely to generate the observed motion. Motion generation outputs a sequence of joint angles according to the distribution of the motion trajectories learned by the HMM. Thus, a humanoid robot learns motion segments in the form of motion primitives, recognizes observations, and generates its own motions by using the motion primitives.
2.2. Synthesis of joint angle sequence

This subsection describes a method to synthesize a sequence of joint angles from a motion primitive in a way that enables a humanoid robot not only to perform human-like motions but also to satisfy the (sub)goal postures.

This method introduces an objective function that consists of three functions. The first function calculates the likelihood that the HMM generates a motion trajectory. The second function calculates the distance between the motion trajectory and the goal postures. The third function calculates the smoothness of the motion trajectory. The motion synthesis finds the motion trajectory that maximizes the objective function.
The first function, the likelihood that the motion trajectory {x_1, x_2, ..., x_l} is generated by the motion primitive λ, is denoted by

$$\psi_p = \ln P(x_1, x_2, \ldots, x_l \mid \lambda). \tag{1}$$

This likelihood can be efficiently computed by the forward algorithm [23].
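As an illustration of Eq. (1), the forward algorithm can be run in log space to avoid numerical underflow. The following is a minimal NumPy sketch with Gaussian output probabilities; the two-node HMM parameters and the short trajectory are toy values, not those of a trained motion primitive:

```python
import numpy as np

def log_gauss(x, mu, var):
    # Log density of a diagonal-covariance Gaussian output probability b_i(x).
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def forward_loglik(X, log_pi, log_A, mus, vars_):
    """ln P(x_1, ..., x_l | lambda) computed by the forward algorithm."""
    m = len(log_pi)
    # alpha[i] = ln P(x_1..x_t, q_t = i | lambda)
    alpha = log_pi + np.array([log_gauss(X[0], mus[i], vars_[i]) for i in range(m)])
    for x in X[1:]:
        emit = np.array([log_gauss(x, mus[i], vars_[i]) for i in range(m)])
        # log-sum-exp over the predecessor node for each current node
        alpha = emit + np.logaddexp.reduce(alpha[:, None] + log_A, axis=0)
    return np.logaddexp.reduce(alpha)

# Toy two-node, one-dimensional example (illustrative values only)
log_pi = np.log([0.6, 0.4])
log_A = np.log([[0.7, 0.3], [0.2, 0.8]])
mus = np.array([[0.0], [1.0]])
vars_ = np.array([[0.5], [0.5]])
X = np.array([[0.1], [0.2], [0.9]])
print(forward_loglik(X, log_pi, log_A, mus, vars_))
```

The recursion marginalizes over all node sequences q in O(l·m²) time, whereas direct enumeration of the sequences is exponential in l.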
The variable δ_t is assumed to take a binary value. This variable is set to δ_t = 1 if a goal posture is given to the motion at time t, and δ_t = 0 otherwise. The second function in the objective function is derived from the difference between the configuration x_t and the goal posture x_{t,g} at time t. This function is written as a weighted quadratic form of these differences:

$$\psi_g = \frac{w_g}{2} \sum_{t=1}^{l} \delta_t \,\left(x_t - x_{t,g}\right)^T \left(x_t - x_{t,g}\right) \tag{2}$$

where the positive weight parameter w_g is manually specified.
The joint-angular jerks are chosen as the measure of smoothness of the motion trajectory [24]. The third function is written as the following weighted quadratic form:

$$\psi_d = \frac{w_d}{2} \sum_{t=4}^{l} \dddot{x}_t^T\, \dddot{x}_t \tag{3}$$

where \dddot{x}_t is the third-order difference (jerk) of the joint angle vector and the positive weight parameter w_d is manually specified. These three terms lead to the resultant objective function

$$\Psi = \psi_p - \psi_g - \psi_d. \tag{4}$$
The motion trajectory {x_1, x_2, ..., x_l} is rewritten as the following vector:

$$z = \left[ x_1^T, \ldots, x_l^T \right]^T \tag{5}$$

where the column vectors of the joint angles x_t at time t are concatenated into the vector z, and x^T denotes the transpose of x. The objective function in Eq. (4) can subsequently be rewritten in the simple form:
$$\Psi(z) = \ln P(z \mid \lambda) - \frac{w_g}{2}\left(z - z_g\right)^T \Delta_g \left(z - z_g\right) - \frac{w_d}{2}\, z^T L z \tag{6}$$

where Δ_g and z_g are given as follows:

$$\Delta_g = \begin{bmatrix} \delta_1 I_d & & O_d \\ & \ddots & \\ O_d & & \delta_l I_d \end{bmatrix} \tag{7}$$

$$z_g = \left[ x_{1,g}^T, \ldots, x_{l,g}^T \right]^T. \tag{8}$$

Here, I_d and O_d are the d × d identity matrix and zero matrix respectively, where d is the number of dimensions of x_t.
Additionally, the matrix L is the dl × dl Laplacian matrix that extracts the sum of squared third-order differences,

$$L = D^T D, \qquad
D = \begin{bmatrix}
O_{3d,dl} \\
\begin{matrix}
-I_d & 3I_d & -3I_d & I_d & & O \\
 & \ddots & \ddots & \ddots & \ddots & \\
O & & -I_d & 3I_d & -3I_d & I_d
\end{matrix}
\end{bmatrix} \tag{9}$$

so that z^T L z = Σ_{t=4}^{l} \dddot{x}_t^T \dddot{x}_t.
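The jerk penalty of Eq. (9) can be assembled numerically from third-order difference blocks. The sketch below is a minimal NumPy construction with toy sizes l and d; since the third difference of a trajectory that is linear in time vanishes, such a trajectory incurs zero penalty:

```python
import numpy as np

def jerk_laplacian(l, d):
    # D stacks the third-difference blocks [-I, 3I, -3I, I] so that
    # (D z)_t = x_t - 3 x_{t-1} + 3 x_{t-2} - x_{t-3} for t = 4..l.
    # Padding D with zero rows as in Eq. (9) would leave D^T D unchanged.
    D = np.zeros(((l - 3) * d, l * d))
    I = np.eye(d)
    for t in range(3, l):
        row = (t - 3) * d
        D[row:row + d, (t - 3) * d:(t - 2) * d] = -I
        D[row:row + d, (t - 2) * d:(t - 1) * d] = 3 * I
        D[row:row + d, (t - 1) * d:t * d] = -3 * I
        D[row:row + d, t * d:(t + 1) * d] = I
    return D.T @ D  # L is dl x dl, and z^T L z is the sum of squared jerks

l, d = 6, 2
L = jerk_laplacian(l, d)
z = np.arange(l * d, dtype=float)  # toy trajectory, linear in time
print(z @ L @ z)  # -> 0.0: zero jerk for a linear trajectory
```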
An optimal motion trajectory z that maximizes the objective function Ψ(z) has to be computed. In this study, the optimal motion trajectory is derived by using an iterative method. Let z^{(k)} denote the motion trajectory derived after k iterations. The difference between the objective functions of z^{(k+1)} and z^{(k)} is expressed by
$$\begin{aligned}
\Psi(z^{(k+1)}) - \Psi(z^{(k)}) ={}& \ln P(z^{(k+1)} \mid \lambda) - \ln P(z^{(k)} \mid \lambda) \\
& - \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
  + \frac{w_g}{2}\left(z^{(k)} - z_g\right)^T \Delta_g \left(z^{(k)} - z_g\right) \\
& - \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)}
  + \frac{w_d}{2}\, z^{(k)T} L\, z^{(k)}.
\end{aligned} \tag{11}$$
The likelihood P(z | λ) that the motion trajectory z is generated by the HMM λ can be rewritten as follows:

$$\ln P(z \mid \lambda) = \ln \sum_{\forall q} P(z, q \mid \lambda)
= \ln \sum_{\forall q} P(q \mid z)\, \frac{P(z, q \mid \lambda)}{P(q \mid z)}
= \ln E_{P(q \mid z)}\!\left[ \frac{P(z, q \mid \lambda)}{P(q \mid z)} \right]. \tag{12}$$
q = {q_1, q_2, ..., q_l} is a sequence of nodes in the HMM λ, where q_t is the node at time t. P(z, q | λ) represents the likelihood that the motion trajectory z is generated along the node sequence q by the HMM λ. Additionally, E_{P(q|z)}[R] denotes the expected value of R under P(q | z), which is the distribution of the sequence of nodes q. According to Jensen's inequality, Eq. (12) satisfies the following inequality:
$$\ln E_{P(q \mid z)}\!\left[ \frac{P(z, q \mid \lambda)}{P(q \mid z)} \right]
\ge E_{P(q \mid z)}\!\left[ \ln \frac{P(z, q \mid \lambda)}{P(q \mid z)} \right]. \tag{13}$$
By using Eq. (13), the following equations can be derived:

$$\begin{aligned}
\ln P(z \mid \lambda) - E_{P(q \mid z)}\!\left[ \ln \frac{P(z, q \mid \lambda)}{P(q \mid z)} \right]
&= \ln P(z \mid \lambda) - E_{P(q \mid z)}\!\left[ \ln P(z \mid \lambda) - \ln \frac{P(q \mid z)}{P(q \mid z, \lambda)} \right] \\
&= E_{P(q \mid z)}\!\left[ \ln \frac{P(q \mid z)}{P(q \mid z, \lambda)} \right].
\end{aligned} \tag{14}$$
Eq. (14) is the Kullback–Leibler information that measures the dissimilarity between the distribution P(q | z) and the distribution P(q | z, λ). The Kullback–Leibler information becomes zero only when these two distributions are the same, and is positive otherwise. Eq. (11) can be rewritten as the following equation by using the notation of the Kullback–Leibler information:
$$\begin{aligned}
\Psi(z^{(k+1)}) - \Psi(z^{(k)})
={}& E_{P(q \mid z^{(k+1)})}\!\left[ \ln \frac{P(z^{(k+1)}, q \mid \lambda)}{P(q \mid z^{(k+1)})} \right]
 + \mathrm{KL}\!\left( P(q \mid z^{(k+1)}) \,\|\, P(q \mid z^{(k+1)}, \lambda) \right) \\
& - E_{P(q \mid z^{(k)})}\!\left[ \ln \frac{P(z^{(k)}, q \mid \lambda)}{P(q \mid z^{(k)})} \right]
 - \mathrm{KL}\!\left( P(q \mid z^{(k)}) \,\|\, P(q \mid z^{(k)}, \lambda) \right) \\
& - \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
 + \frac{w_g}{2}\left(z^{(k)} - z_g\right)^T \Delta_g \left(z^{(k)} - z_g\right) \\
& - \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)}
 + \frac{w_d}{2}\, z^{(k)T} L\, z^{(k)}.
\end{aligned} \tag{15}$$
We approximate the distributions of the sequence of nodes q under the motion trajectory. More specifically, we set the distributions P(q | z^{(k+1)}) and P(q | z^{(k)}) to the distribution P(q | z^{(k)}, λ):

$$P(q \mid z^{(k+1)}) = P(q \mid z^{(k)}, \lambda), \qquad P(q \mid z^{(k)}) = P(q \mid z^{(k)}, \lambda). \tag{16}$$
These approximations result in the relation that the Kullback–Leibler information in the second term of Eq. (15) is greater than or equal to zero, and the relation that the Kullback–Leibler information in the fourth term is equal to zero. Therefore, the (k+1)th iteration only has to find a new motion trajectory z^{(k+1)} that gives a positive value of Eq. (15) with the second and fourth terms removed; then Ψ(z^{(k+1)}) becomes greater than Ψ(z^{(k)}), and the motion trajectory z^{(k+1)} increases the value of the objective function. This iteration leads to a gradual increase in the value of the objective function. Synthesis of the motion trajectory z given goal postures according to the motion primitive λ can thus be formulated as
$$\begin{aligned}
\arg\max_{z^{(k+1)}} \Bigg\{\,
& E_{P(q \mid z^{(k)}, \lambda)}\!\left[ \ln \frac{P(z^{(k+1)}, q \mid \lambda)}{P(q \mid z^{(k)}, \lambda)} \right]
- E_{P(q \mid z^{(k)}, \lambda)}\!\left[ \ln \frac{P(z^{(k)}, q \mid \lambda)}{P(q \mid z^{(k)}, \lambda)} \right] \\
& - \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
+ \frac{w_g}{2}\left(z^{(k)} - z_g\right)^T \Delta_g \left(z^{(k)} - z_g\right) \\
& - \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)}
+ \frac{w_d}{2}\, z^{(k)T} L\, z^{(k)} \Bigg\}.
\end{aligned} \tag{17}$$
By eliminating terms that do not include the motion trajectory z^{(k+1)}, Eq. (17) can be rewritten as

$$\begin{aligned}
& \arg\max_{z^{(k+1)}} \left\{ E_{P(q \mid z^{(k)}, \lambda)}\!\left[ \ln P(z^{(k+1)} \mid q, \lambda)\, P(q \mid \lambda) \right]
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\} \\
&= \arg\max_{z^{(k+1)}} \left\{ E_{P(q \mid z^{(k)}, \lambda)}\!\left[ \ln P(z^{(k+1)} \mid q, \lambda) \right]
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\}.
\end{aligned} \tag{18}$$
The first term in Eq. (18) is expressed as a summation of the probability P(q | z^{(k)}, λ) over all the node sequences q, and Eq. (18) is rewritten as

$$\arg\max_{z^{(k+1)}} \left\{ \sum_{\forall q} P(q \mid z^{(k)}, \lambda) \ln P(z^{(k+1)} \mid q, \lambda)
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\}. \tag{19}$$
P(z^{(k+1)} | q, λ) is the likelihood that the motion trajectory z^{(k+1)} is generated along the sequence of nodes q from the HMM λ, and it can be expressed in a closed form using the parameters of the HMM:

$$\ln P(z^{(k+1)} \mid q, \lambda) = \ln \pi_{q_1}
+ \sum_{t=1}^{l-1} \ln a_{q_t, q_{t+1}}
+ \sum_{t=1}^{l} \ln b_{q_t}(x_t^{(k+1)}) \tag{20}$$
where b_i(x) is the probability of the vector of joint angles x being generated from the ith node. By substituting Eq. (20) into Eq. (19), Eq. (19) can be rewritten as follows:

$$\begin{aligned}
& \arg\max_{z^{(k+1)}} \left\{ \sum_{\forall q} P(q \mid z^{(k)}, \lambda)
\left( \ln \pi_{q_1} + \sum_{t=1}^{l-1} \ln a_{q_t, q_{t+1}} + \sum_{t=1}^{l} \ln b_{q_t}(x_t^{(k+1)}) \right)
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\} \\
&= \arg\max_{z^{(k+1)}} \left\{ \sum_{\forall q} P(q \mid z^{(k)}, \lambda) \sum_{t=1}^{l} \ln b_{q_t}(x_t^{(k+1)})
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\} \\
&= \arg\max_{z^{(k+1)}} \left\{ \sum_{i=1}^{m} \sum_{t=1}^{l} \ln b_i(x_t^{(k+1)})\, P(q_t = i \mid z^{(k)}, \lambda)
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\}
\end{aligned} \tag{21}$$
where P(q_t = i | z^{(k)}, λ) is the probability of staying at the ith node at time t given the motion trajectory z^{(k)}. The output probability b_i(x) follows a Gaussian distribution, and the logarithm of this probability can be written as

$$\ln b_i(x) = -\frac{1}{2}\left[ d \ln(2\pi) + \ln |\Sigma_i| + (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i) \right] \tag{22}$$

where d is the number of dimensions of the vector x. By substituting Eq. (22) into Eq. (21) and removing terms that are independent of the motion trajectory z^{(k+1)}, Eq. (21) can be transformed into Eq. (23):
$$\arg\max_{z^{(k+1)}} \left\{
- \sum_{i=1}^{m} \sum_{t=1}^{l} \frac{1}{2} P(q_t = i \mid z^{(k)}, \lambda)
\left(x_t^{(k+1)} - \mu_i\right)^T \Sigma_i^{-1} \left(x_t^{(k+1)} - \mu_i\right)
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\}. \tag{23}$$
We define the dl × dl matrix V_i and the dl-dimensional vector μ̃_i as follows:

$$V_i = \begin{bmatrix}
\Sigma_i^{-1} P(q_1 = i \mid z^{(k)}, \lambda) & & O \\
& \ddots & \\
O & & \Sigma_i^{-1} P(q_l = i \mid z^{(k)}, \lambda)
\end{bmatrix} \tag{24}$$

$$\tilde{\mu}_i = \left[ \mu_i^T, \ldots, \mu_i^T \right]^T. \tag{25}$$
Therefore, Eq. (23) can be rewritten as

$$\arg\max_{z^{(k+1)}} \left\{
- \frac{1}{2} \sum_{i=1}^{m} \left[ z^{(k+1)} - \tilde{\mu}_i \right]^T V_i \left[ z^{(k+1)} - \tilde{\mu}_i \right]
- \frac{w_g}{2}\left(z^{(k+1)} - z_g\right)^T \Delta_g \left(z^{(k+1)} - z_g\right)
- \frac{w_d}{2}\, z^{(k+1)T} L\, z^{(k+1)} \right\}. \tag{26}$$
The optimal motion trajectory z^{(k+1)} at the (k+1)th iteration can be derived as Eq. (27):

$$z^{(k+1)} = \left( \sum_{i=1}^{m} V_i + w_d L + w_g \Delta_g \right)^{-1}
\left( \sum_{i=1}^{m} V_i \tilde{\mu}_i + w_g \Delta_g z_g \right). \tag{27}$$
Note that the initial motion trajectory z^{(0)} is randomly set and that the iterative optimization is terminated at a predefined number of iterations.
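One pass of the update in Eq. (27) reduces to a single linear solve. The NumPy sketch below uses toy dimensions and random stand-ins for the node-occupancy probabilities P(q_t = i | z^{(k)}, λ), which would normally come from a forward–backward pass over the HMM; the jerk Laplacian is set to zero here for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
m, l, d = 3, 5, 2          # nodes, frames, joint dimensions (toy sizes)
dl = l * d

# Placeholder output parameters and occupancies gamma[t, i] = P(q_t=i | z^(k), lambda)
mu = rng.normal(size=(m, d))
Sig_inv = np.stack([np.eye(d) for _ in range(m)])
gamma = rng.dirichlet(np.ones(m), size=l)

# Goal-posture weighting: a goal is given only at the final frame (delta_l = 1)
w_g, w_d = 100.0, 100.0
Delta_g = np.zeros((dl, dl)); Delta_g[-d:, -d:] = np.eye(d)
z_g = np.zeros(dl); z_g[-d:] = [0.5, -0.5]
L = np.zeros((dl, dl))     # jerk Laplacian of Eq. (9), omitted for brevity

A = w_d * L + w_g * Delta_g
b = w_g * Delta_g @ z_g
for i in range(m):
    V_i = np.kron(np.diag(gamma[:, i]), Sig_inv[i])   # Eq. (24)
    mu_tilde = np.tile(mu[i], l)                      # Eq. (25)
    A += V_i
    b += V_i @ mu_tilde
z_next = np.linalg.solve(A, b)                        # Eq. (27)
print(z_next.shape)
```

With the smoothness term zeroed, each frame without a goal becomes the occupancy-weighted mean of the node means, while the final frame is pulled toward the goal posture z_g in proportion to w_g.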
2.3. Control of reaction forces

This subsection describes a method to calculate the torques of all the joints in a humanoid robot's whole body in order to realize not only the synthesized motion trajectory but also the desired reaction forces from the contacted environment. To discuss this control, the equation of motion of the humanoid robot is formulated as follows [25,26]:
$$\begin{bmatrix} M_{bb} & M_{bj} \\ M_{jb} & M_{jj} \end{bmatrix}
\begin{bmatrix} \ddot{q}_b \\ \ddot{q}_j \end{bmatrix}
+ \begin{bmatrix} C_b \\ C_j \end{bmatrix}
+ \begin{bmatrix} g_b \\ g_j \end{bmatrix}
= \begin{bmatrix} 0 \\ \tau \end{bmatrix}
+ \sum_{i=1}^{N_c} \begin{bmatrix} J_{bi}^T \\ J_{ji}^T \end{bmatrix} F_i \tag{28}$$
where M_bb ∈ R^{6×6} is the inertia matrix of the base link, M_bj ∈ R^{6×n} is the coupling inertia matrix, M_jb is the transpose of M_bj, M_jj ∈ R^{n×n} is the inertia matrix of the links, n denotes the number of joints, q_b ∈ R^6 and q_j ∈ R^n are the generalized coordinates of the base link and the joint angle vector, C_b ∈ R^6 and C_j ∈ R^n are the centrifugal and Coriolis terms, g_b ∈ R^6 and g_j ∈ R^n are the gravitational terms, τ ∈ R^n is the vector of joint torques, N_c is the number of contact points, J_bi ∈ R^{6×6} is the Jacobian matrix that relates the base-link velocity to the velocity of the ith contact point, J_ji ∈ R^{6×n} is the Jacobian matrix that relates the joint angular velocity to the velocity of the ith contact point, and F_i ∈ R^6 is the vector of reaction forces and moments at the ith contact point.
We choose the desired contact points, each of which is an element of the set F^{(d)}, and set the desired contact forces and moments for each chosen contact point; F_i^{(d)} is the vector of desired contact forces and moments at the ith contact point of the set F^{(d)}. Additionally, we define the set of all the contact points except for the desired contact points as F. The objective function Γ of the vector of joint torques τ and the vectors of reaction forces and moments F_i (i = 1, 2, ..., N_c) is defined as follows:

$$\Gamma = \frac{w_{F^{(d)}}}{2} \sum_{i \in F^{(d)}} \left\| F_i - F_i^{(d)} \right\|^2
+ \frac{w_F}{2} \sum_{i \in F} \left\| F_i \right\|^2
+ \frac{w_\tau}{2} \left\| \tau \right\|^2 \tag{29}$$

where w_{F^{(d)}}, w_F and w_τ are positive weight parameters. The vector u is defined as

$$u = \left[ \tau^T\;\; F_1^T \cdots F_{|F^{(d)}|}^T\;\; F_{|F^{(d)}|+1}^T \cdots F_{|F^{(d)}|+|F|}^T \right]^T \tag{30}$$

where F_i (i = 1, 2, ..., |F^{(d)}|) are the vectors of reaction forces and moments at the desired contact points, and F_i (i = |F^{(d)}|+1, ..., |F^{(d)}|+|F|) are those at the other contact points. Subsequently, Eq. (29) can be rewritten, up to a constant term, as

$$\Gamma = \frac{1}{2} u^T \tilde{A} u + \tilde{b} u \tag{31}$$

where Ã and b̃ are given as follows:

$$\tilde{A} = \begin{bmatrix}
w_\tau I_n & & O \\
& w_{F^{(d)}} I_{6|F^{(d)}|} & \\
O & & w_F I_{6|F|}
\end{bmatrix} \tag{32}$$

$$\tilde{b} = \left[ 0_n^T\;\; -w_{F^{(d)}} F_1^{(d)T} \cdots -w_{F^{(d)}} F_{|F^{(d)}|}^{(d)T}\;\; 0_{6|F|}^T \right] \tag{33}$$

where 0_n denotes the n-dimensional zero vector.
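The completion of the square behind Eq. (31) can be checked numerically. In the sketch below, the numbers of joints and contact points and the desired wrench are toy placeholders; when u carries zero torque, exactly the desired force, and zero force elsewhere, Eq. (29) gives Γ = 0, and the quadratic form of Eq. (31) matches it up to the dropped constant:

```python
import numpy as np

n = 4                                    # number of joints (toy)
nd, nf = 1, 2                            # |F^(d)| and |F| (toy contact counts)
w_tau, w_Fd, w_F = 1.0, 10.0, 1.0        # weight parameters
F_des = np.array([10.0, 0, 0, 0, 0, 0])  # desired wrench at the chosen contact

# Eq. (32): block-diagonal weight matrix A~
A_tilde = np.diag(np.concatenate([
    w_tau * np.ones(n),
    w_Fd * np.ones(6 * nd),
    w_F * np.ones(6 * nf),
]))
# Eq. (33): linear term b~
b_tilde = np.concatenate([np.zeros(n), -w_Fd * F_des, np.zeros(6 * nf)])

# u = [tau, F at desired contact, F at other contacts], Eq. (30)
u = np.concatenate([np.zeros(n), F_des, np.zeros(6 * nf)])

gamma_quadratic = 0.5 * u @ A_tilde @ u + b_tilde @ u
constant = 0.5 * w_Fd * F_des @ F_des    # the constant dropped in Eq. (31)
print(np.isclose(gamma_quadratic + constant, 0.0))  # -> True: matches Eq. (29)
```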
The synthesized motion trajectory gives a sequence of configurations {q_1^{(s)}, q_2^{(s)}, ..., q_l^{(s)}}, where the configuration q^{(s)} = [q_b^{(s)T} q_j^{(s)T}]^T consists of the generalized coordinates of the base link and the joint angle vector. The velocity q̇_t^{(s)} and the acceleration q̈_t^{(s)} of q_t^{(s)} at time t are given as

$$\dot{q}_t^{(s)} = \frac{q_t^{(s)} - q_{t-1}^{(s)}}{\Delta t}, \qquad
\ddot{q}_t^{(s)} = \frac{\dot{q}_t^{(s)} - \dot{q}_{t-1}^{(s)}}{\Delta t}$$

respectively, where Δt is the sampling interval. The vectors q_t^{(s)} and q̇_t^{(s)} are set to the desired configuration q_t^{(d)} and its velocity q̇_t^{(d)} at time t for the synthesized motion trajectory. The acceleration q̈_t^{(s)} is modified into the following desired acceleration:

$$\ddot{q}_t^{(d)} = \ddot{q}_t^{(s)} + K_D\left( \dot{q}_t^{(d)} - \dot{q}_t \right) + K_P\left( q_t^{(d)} - q_t \right) \tag{34}$$

where q_t is the measured configuration and K_D and K_P are feedback gain matrices. The desired configuration q_t^{(d)}, velocity q̇_t^{(d)} and acceleration q̈_t^{(d)} are substituted into Eq. (28), and the following constraints are obtained:
$$M_b \ddot{q}^{(d)} + C_b + g_b = \sum_{i \in F^{(d)}} J_{bi}^T F_i + \sum_{i \in F} J_{bi}^T F_i \tag{35}$$

$$M_j \ddot{q}^{(d)} + C_j + g_j = \tau + \sum_{i \in F^{(d)}} J_{ji}^T F_i + \sum_{i \in F} J_{ji}^T F_i \tag{36}$$

where the matrices M_b and M_j are given as M_b = [M_bb  M_bj] and M_j = [M_jb  M_jj] respectively. By denoting the left-hand sides of Eqs. (35) and (36) by the vectors d_1 and d_2 respectively, and using Eq. (30), Eqs. (35) and (36) can be reduced as follows:
$$\begin{bmatrix}
O_{6,n} & J_b^{F^{(d)}T} & J_b^{FT} \\
I_n & J_j^{F^{(d)}T} & J_j^{FT}
\end{bmatrix}
\begin{bmatrix} \tau \\ F_{F^{(d)}} \\ F_F \end{bmatrix}
= \begin{bmatrix} d_1 \\ d_2 \end{bmatrix} \tag{37}$$

where

$$J_b^{F^{(d)}T} = \left[ J_{b1}^T \cdots J_{b|F^{(d)}|}^T \right] \tag{38}$$

$$J_b^{FT} = \left[ J_{b,|F^{(d)}|+1}^T \cdots J_{b,|F^{(d)}|+|F|}^T \right] \tag{39}$$

$$J_j^{F^{(d)}T} = \left[ J_{j1}^T \cdots J_{j|F^{(d)}|}^T \right] \tag{40}$$

$$J_j^{FT} = \left[ J_{j,|F^{(d)}|+1}^T \cdots J_{j,|F^{(d)}|+|F|}^T \right]. \tag{41}$$
Eq. (37) consequently results in the linear constraint

$$\tilde{J} u = \tilde{d} \tag{42}$$

where

$$\tilde{J} = \begin{bmatrix}
O_{6,n} & J_b^{F^{(d)}T} & J_b^{FT} \\
I_n & J_j^{F^{(d)}T} & J_j^{FT}
\end{bmatrix} \tag{43}$$

$$\tilde{d} = \left[ d_1^T\; d_2^T \right]^T. \tag{44}$$
The optimal u that minimizes the objective function represented by Eq. (31) subject to the constraint represented by Eq. (42) can be calculated using quadratic programming. The joint torques that are elements of the optimal u are utilized to control the humanoid robot's whole-body motion so that it not only maintains the profile of the motion trajectory but also controls the reaction forces from the contacted environments as desired.
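Because Eq. (31) is quadratic with the positive-definite Ã and Eq. (42) is a linear equality constraint, this quadratic program has a closed-form solution through its KKT system. The sketch below is a generic equality-constrained QP solver in NumPy; the random Ã, b̃, J̃, d̃ are small stand-ins, not quantities from a robot model:

```python
import numpy as np

def solve_eq_qp(A_t, b_t, J_t, d_t):
    """Minimize 0.5 u^T A u + b^T u subject to J u = d via the KKT system."""
    n_u = A_t.shape[0]
    n_c = J_t.shape[0]
    K = np.block([[A_t, J_t.T], [J_t, np.zeros((n_c, n_c))]])
    rhs = np.concatenate([-b_t, d_t])
    sol = np.linalg.solve(K, rhs)
    return sol[:n_u]  # u; the trailing entries are the Lagrange multipliers

rng = np.random.default_rng(1)
n_u, n_c = 8, 3
A_t = np.diag(rng.uniform(1.0, 2.0, n_u))  # positive-definite weight matrix
b_t = rng.normal(size=n_u)
J_t = rng.normal(size=(n_c, n_u))
d_t = rng.normal(size=n_c)

u = solve_eq_qp(A_t, b_t, J_t, d_t)
print(np.allclose(J_t @ u, d_t))  # -> True: the constraint of Eq. (42) holds
```

A dedicated QP solver would be used in practice once inequality constraints (e.g. friction cones or torque limits) are added; for pure equality constraints the KKT solve above is exact.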
3. Experimental verification

We tested the synthesis of whole-body motion of a humanoid robot and the control of reaction forces from the contact environment in simulation. To create motion primitives with HMMs, we measured human whole-body motions with an optical motion capture system. The positions of 34 markers attached to a human performer were recorded. The sequence of position data was converted to angles of all the joints in the humanoid robot by inverse kinematic computation, and a sequence of joint angles was consequently obtained. In this study, we recorded three motions of ''touching an object with the right hand'' and created the corresponding motion primitive using a 30-node left-to-right HMM. The first test was conducted on synthesizing the whole-body motion of the robot from the motion primitive ''touching an object
Fig. 2. The motions generated from the motion primitive ''touching an object with the right hand''. The left panel shows the motion generated given no goal posture. The middle and right panels show the motions generated given goal postures of touching an object at low and high positions respectively.
Fig. 3. Profiles of right and left shoulder joint angles. The red solid line shows the joint angle trajectory generated from the motion primitive given no goal posture. The red-shaded graph shows the distribution of joint trajectories for the motion primitive. The green and blue solid lines show the synthesized motions when different goal postures were given. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
with the right hand''. We set goal postures at the final frame such that the humanoid robot could touch an object at different positions with the right hand. More specifically, two different goal postures were set for reaching to touch objects at high and low positions. Fig. 2 shows the motions generated from the motion primitive. In the left panel, no goal posture is given. In the middle and right panels, the goal postures of touching an object at low and high positions are given respectively. Fig. 3 shows the trajectories of the right and left shoulder joint angles for the given goal postures. The distribution can be estimated by using the Gaussian distribution at the node that is the most likely to generate the joint angle at each frame. Here, we manually set both of the positive weight parameters w_g and w_d to 100. For the right shoulder, the synthesized motions for reaching with the right hand to high and low positions were similar to the motion primitive given no goal posture. The synthesized motions for reaching to low and high positions converged to −0.9 rad and −1.5 rad respectively at the final frame. The three corresponding motions for the left shoulder are exactly the same as one another. The synthesized motions are very smooth, without abrupt acceleration, and come close to the goal postures while maintaining the profile of joint angles learned as the motion primitive.
The second test was conducted on control of both the synthesized motion and the reaction forces from the contact environment. We used the synthesized motion of touching an object at a low position with the right hand. A wall was created in front of the humanoid robot in the simulation, and the robot touched the wall with the right hand. The desired reaction force on the right hand was set to (10, 0, 0) kN, where the first component is a force perpendicular to the wall surface, and the second and third components are forces horizontal to the wall. The desired vertical reaction forces acting on both feet and both hips of the robot were set to one-fourth of its total weight. Additionally, we set the positive weight parameters w_{F^{(d)}}, w_F and w_τ to 10, 1 and 1 respectively. The bottom panel in Fig. 4 shows the humanoid robot's whole-body motion controlled by the proposed method. The robot raised its right hand, touched the wall, and bent the right wrist outward to control the reaction force from the contacted wall. The bottom panel in Fig. 5 shows the profile of reaction forces on the right hand during application of our proposed controller. The reaction force perpendicular to the wall was maintained around 10 kN, and the reaction forces horizontal to the wall were maintained at a constant level of zero. Large reaction forces were measured when the right hand first came into contact with the wall, so a limitation of this controller is its performance at the transition to contact. However, as shown in Table 1, the average and standard deviation of the reaction force perpendicular to the wall were −10.7 kN and 14.1 kN, and the averages of the reaction forces horizontal to the wall were −1.12 kN and 0.02 kN, with standard deviations of 2.34 kN and 1.62 kN, respectively. We additionally tested the force controller on the motion of touching an object at a high position with the right hand. The bottom panel in Table 1 shows the statistics of the reaction force. The reaction force perpendicular to the wall surface could be controlled to be maintained at about 10 kN.
We compared our proposed method with a position controller (proportional–derivative control) for whole-body motion. This controller computed the torques of all the joints in the humanoid robot as follows:

$$\tau = K_p\left( q_j^{(d)} - q_j \right) + K_d\left( \dot{q}_j^{(d)} - \dot{q}_j \right) \tag{45}$$

where K_p and K_d are the control parameters. The top panel in Fig. 4 shows the whole-body motion where the joint angles were controlled according to Eq. (45) and the reaction forces were not controlled. The robot did not bend the right wrist even after the right hand contacted the wall, because only the learned kinematic profile was maintained. The top panel in Fig. 5 shows the profile of the reaction forces on the right hand when the position controller was used. The reaction force perpendicular to the wall did not converge. As shown in Table 1, the errors between the measured reaction forces and the desired reaction forces were larger during
Fig. 4. The motion ''touching an object with the right hand'' given the desired reaction force perpendicular to the wall. The top panel shows the whole-body motion controlled by the position controller, and the bottom panel shows the whole-body motion controlled by the force controller.
Fig. 5. Profiles of reaction forces on the right hand during the motion of touching the wall with the right hand. The red solid line shows the reaction force perpendicular to the wall. The green and blue solid lines show the reaction forces horizontal to the wall. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
application of the position controller than during application of our proposed force controller.
In the simulation experiments, the reaction force could be controlled while a kinematic profile similar to the training motion was maintained. However, several problems remain that become critical, especially when a real humanoid robot is controlled. The first is how to control the reaction force at the first contact with the environment. A large contact force was observed and was not suitably controlled when the robot's hand touched the environment. An additional feedback controller needs to be developed. Another problem is the scalability of our proposed method. The motions were limited to reaching a target object. After the reaching phase, the robot is required to grasp or manipulate the object. Our proposed method needs to be extended to controlling the robot's hands.
Table 1
Averages and standard deviations of the reaction forces. F̄_x is the average of the reaction force perpendicular to the wall, and σ̄_x is its standard deviation. F̄_y and F̄_z are the averages of the reaction forces horizontal to the wall, and σ̄_y and σ̄_z are their standard deviations. The top and bottom panels show the statistics during reaching for an object at low and high positions respectively, each comparing position control with force control.
4 Conclusion
In this paper we have described a method for synthesizing whole-body motion for a humanoid robot from a motion primitive and controlling reaction forces while maintaining the synthesized motion
The motions are represented by sequences of angles of all joints
in the robot, and these sequences are encoded into an HMM called
a motion primitive The objective function takes into consideration three evaluations: the likelihood that the synthesized motion is generated by the HMM, the distance between the synthesized motion and given goal postures, and the total jerk along the syn-thesized motion We proposed an algorithm to find an optimal sequence of joint angles that maximizes the objective function, and this optimal sequence results in the synthesized whole-body motion
We also proposed a method for computing the input torques of all the joints in the robot, so that it can perform the synthesized motion, and for controlling the reaction forces from the contact environment as desired. The objective function consists of the error between the actual reaction force and the desired one, the total joint torque, and the total reaction force. The constraint on the objective function is given as the equation of motion for the synthesized motion. The optimal joint torques can be found by maximizing the objective function subject to this constraint by using quadratic programming.
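Such an equality-constrained quadratic program can be solved in closed form through its KKT system. The sketch below assumes the problem has already been assembled into matrices: `Q` and `c` encode the quadratic cost over a decision vector stacking joint torques and contact forces, and `A x = b` encodes the (linearized) equation of motion along the synthesized trajectory. This is a generic QP solve, not the paper's specific matrices.

```python
import numpy as np

def solve_torque_qp(Q, c, A, b):
    """Solve  min 0.5 x'Qx + c'x  s.t.  Ax = b  via the KKT linear system
        [Q  A'] [x     ]   [-c]
        [A  0 ] [lambda] = [ b ]
    Returns the optimal x (stacked torques and contact forces)."""
    n, m = Q.shape[0], A.shape[0]
    K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-c, b])
    sol = np.linalg.solve(K, rhs)
    return sol[:n]  # sol[n:] holds the Lagrange multipliers
```

For instance, minimizing 0.5‖x‖² subject to x₁ + x₂ = 2 gives x = (1, 1): the solver distributes the effort evenly, which mirrors how the torque QP trades off total joint torque against satisfying the dynamics.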
Our proposed methods for motion synthesis and force control were tested using the motion of touching an object with the right hand. The experiments on motion synthesis showed that the proposed method could generate adaptive whole-body motions for touching an object at a high position and touching an object at a low position. Additionally, the experimental results on force control clearly showed the method's validity. The error of the reaction force perpendicular to the wall was 0.7 kN, and the reaction force was less than the value of 19.9 kN obtained by position control.
Acknowledgments

This research was partially supported by a Grant-in-Aid for Young Scientists (A) (No. 26700021) from the Japan Society for the Promotion of Science and by the Strategic Information and Communications R&D Promotion Program (No. 142103011) of the Ministry of Internal Affairs and Communications.
Wataru Takano is an Associate Professor at the Department of Mechano-Informatics, School of Information Science and Technology, University of Tokyo. He was born in Kyoto, Japan, in 1976. He received the B.S. and M.S. degrees in precision engineering from Kyoto University, Japan, in 1999 and 2001, and the Ph.D. degree in Mechano-Informatics from the University of Tokyo, Japan, in 2006. He was a Project Assistant Professor at the University of Tokyo from 2006 to 2007, and a Researcher on the Project of Information Environment and Humans, PRESTO, Japan Science and Technology Agency, from 2010. His fields of research include the kinematics, dynamics, and artificial intelligence of humanoid robots, and intelligent vehicles. He is a member of the IEEE, the Robotics Society of Japan, and the Information Processing Society of Japan. He has been the chair of the Technical Committee on Robot Learning, IEEE RAS.
Yoshihiko Nakamura is a Professor at the Department of Mechano-Informatics, School of Information Science and Technology, University of Tokyo. He was born in Osaka, Japan, in 1954. He received the B.S., M.S., and Ph.D. degrees in precision engineering from Kyoto University, Japan, in 1977, 1978, and 1985, respectively. He was an Assistant Professor at the Automation Research Laboratory, Kyoto University, from 1982 to 1987. He joined the Department of Mechanical and Environmental Engineering, University of California, Santa Barbara, in 1987 as an Assistant Professor, and became an Associate Professor in 1990. He was also a co-director of the Center for Robotic Systems and Manufacturing at UCSB. He moved to the University of Tokyo as an Associate Professor in the Department of Mechano-Informatics in 1991. His fields of research include the kinematics, dynamics, control, and intelligence of robots, particularly robots with non-holonomic constraints, computational brain information processing, humanoid robots, human-figure kinetics, and surgical robots. He is a member of the IEEE, ASME, SICE, the Robotics Society of Japan, the Institute of Systems, Control and Information Engineers, and the Japan Society of Computer Aided Surgery. He was honored with a fellowship from the Japan Society of Mechanical Engineers. Since 2005, he has been the president of the Japan IFToMM Congress. He is a foreign member of the Academy of Engineering in Serbia and Montenegro.