Hardcore AI for Computer Games and Animation
Siggraph Course Notes Part I
John Funge and Xiaoyuan Tu
Microcomputer Research Lab, Intel Corporation
{john_funge|xiaoyuan_tu}@ccm.sc.intel.com
Hardcore AI for Computer Games and Animation
SIGGRAPH 98 Course Notes
by
John David Funge
Copyright © 1998 by John David Funge
John David Funge, 1998

Welcome to this tutorial on AI for Computer Games and Animation. These course notes consist of two parts:

Part I is a short overview that omits many of the details.
Part II goes into all these details in great depth.
John Funge is a member of Intel's graphics research group. He received a BS in Mathematics from King's College London in 1990, an MS in Computer Science from Oxford University in 1991, and a PhD in Computer Science from the University of Toronto in 1998. It was during his time at Oxford that John became interested in computer graphics. He was commissioned by Channel 4 television to perform a preliminary study on a proposed computer game show. This made him acutely aware of the difficulties associated with developing intelligent characters. Therefore, for his PhD at the University of Toronto he successfully developed a new approach to high-level control of characters in games and animation. John is the author of several papers and has given numerous talks on his work, including a technical sketch at SIGGRAPH 97. His current research interests include computer animation, computer games, interval arithmetic and knowledge representation.
Hardcore AI for Computer Games and Animation
Siggraph Course Notes (Part I)
John Funge and Xiaoyuan Tu
Microcomputer Research Lab, Intel Corporation
{john_funge|xiaoyuan_tu}@ccm.sc.intel.com
Abstract
Recent work in behavioral animation has taken impressive steps toward autonomous, self-animating characters for use in production animation and computer games. It remains difficult, however, to direct autonomous characters to perform specific tasks. To address this problem, we explore the use of cognitive models. Cognitive models go beyond behavioral models in that they govern what a character knows, how that knowledge is acquired, and how it can be used to plan actions. To help build cognitive models, we have developed a cognitive modeling language (CML). Using CML, we can decompose cognitive modeling into first giving the character domain knowledge, and then specifying the required behavior. The character's domain knowledge is specified intuitively in terms of actions, their preconditions and their effects. To direct the character's behavior, the animator need only specify a behavior outline, or "sketch plan", and the character will automatically work out a sequence of actions that meets the specification. A distinguishing feature of CML is how we can use familiar control structures to focus the power of the reasoning engine onto tractable sub-tasks. This forms an important middle ground between regular logic programming and traditional imperative programming. Moreover, this middle ground allows many behaviors to be specified more naturally, more simply, more succinctly and at a much higher level than would otherwise be possible. In addition, by using interval arithmetic to integrate sensing into our underlying theoretical framework, we enable characters to generate plans of action even when they find themselves in highly complex, dynamic virtual worlds. We demonstrate applications of our work to "intelligent" camera control, and behavioral animation for characters situated in a prehistoric world and in a physics-based undersea world.
Keywords: Computer Animation, Knowledge, Sensing, Action, Reasoning, Behavioral Animation, Cognitive Modeling
1 Introduction

Modeling for computer animation addresses the challenge of automating a variety of difficult animation tasks. An early milestone was the combination of geometric models and inverse kinematics to simplify keyframing. Physical models for animating particles, rigid bodies, deformable solids, and fluids offer copious quantities of realistic motion through dynamic simulation. Biomechanical modeling employs simulated physics to automate the realistic animation of living things motivated by internal muscle actuators. Research in behavioral modeling is making progress towards self-animating characters that react appropriately to perceived environmental stimuli.
Cognitive models go beyond behavioral models in that they govern what a character knows, how that knowledge is acquired, and how it can be used to plan actions. Cognitive models are applicable to directing the new breed of highly autonomous, quasi-intelligent characters that are beginning to find use in animation and game production. Moreover, cognitive models can play subsidiary roles in controlling cinematography and lighting.
We decompose cognitive modeling into two related sub-tasks: domain specification and behavior specification. Domain specification involves giving a character knowledge about its world and how it can change. Behavior specification involves directing the character to behave in a desired way within its world. Like other advanced modeling tasks, both of these steps can be fraught with difficulty unless animators are given the right tools for the job. To this end, we develop a cognitive modeling language, CML.
CML rests on solid theoretical foundations laid by artificial intelligence (AI) researchers. This high-level language provides an intuitive way to give characters, and also cameras and lights, knowledge about their domain in terms of actions, their preconditions and their effects. We can also endow characters with a certain amount of "commonsense" within their domain and we can even leave out tiresome details from the specification of their behavior. The missing details are automatically filled in at run-time by a reasoning engine integral to the character which decides what must be done to achieve the specified behavior.
Traditional AI style planning certainly falls under the broad umbrella of this description, but the distinguishing features of CML are the intuitive way domain knowledge can be specified and how it affords an animator familiar control structures to focus the power of the reasoning engine. This forms an important middle ground between regular logic programming (as represented by Prolog) and traditional imperative programming (as typified by C). Moreover, this middle ground turns out to be crucial for cognitive modeling in animation and computer games. In one-off animation production, reducing development time is, within reason, more important than fast execution. The animator may therefore choose to rely more heavily on the reasoning engine. When run-time efficiency is also important, our approach lends itself to an incremental style of development. We can quickly create a working prototype. If this prototype is too slow, it may be refined by including more and more detailed knowledge to narrow the focus of the reasoning engine.
2 Related Work

Badler [3] and the Thalmanns [13] have applied AI techniques [1] to produce inspiring results with animated humans. Tu and Terzopoulos [18] have taken impressive strides towards creating realistic, self-animating graphical characters through biomechanical modeling and the principles of behavioral animation introduced in the seminal work of Reynolds [16]. A criticism sometimes levelled at behavioral animation methods is that, robustness and efficiency notwithstanding, the behavior controllers are hard-wired into the code. Blumberg and Galyean [6] begin to address such concerns by introducing mechanisms that give the animator greater control over behavior, and Blumberg's superb thesis considers interesting issues such as behavior learning [5]. While we share similar motivations, our work takes a different route. One of the features of our approach is that we investigate important higher-level cognitive abilities such as knowledge representation and planning.
The theoretical basis of our work is new to the graphics community and we consider some novel applications. We employ a formalism known as the situation calculus. The version we use is a recent product of the cognitive robotics community [12]. A noteworthy point of departure from existing work in cognitive robotics is that we render the situation calculus amenable to animation within highly dynamic virtual worlds by introducing interval valued fluents [9] to deal with sensing.
High-level camera control is particularly well suited to an approach like ours because there already exists a large body of widely accepted rules that we can draw upon [2]. This fact has also been exploited by two recent papers [10, 8] on the subject. This previous work uses a simple scripting language to implement camera control.
3 Theoretical Background
The situation calculus is a well known formalism for describing changing worlds using sorted first-order logic. Mathematical logic is somewhat of a departure from the mathematical tools that have been used in previous work in computer graphics. In this section, we shall therefore go over some of the more salient points. Since the mathematical background is well-documented elsewhere (for example, [9, 12]), we only provide a cursory overview. We emphasize that from the user's point of view the underlying theory is completely hidden. In particular, a user is not required to type in axioms written in first-order mathematical logic. Instead, we have developed an intuitive interaction language that resembles natural language, but has a clear and precise mapping to the underlying formalism. In section 4, we give a complete example of how to use CML to build a cognitive model from the user's point of view.
3.1 Domain modeling
A situation is a "snapshot" of the state of the world. A domain-independent constant S0 denotes the initial situation. Any property of the world that can change over time is known as a fluent. A fluent is a function, or relation, with a situation term as (by convention) its last argument. For example, Broken(x, s) is a fluent that keeps track of whether an object x is broken in a situation s.
Primitive actions are the fundamental instrument of change in our ontology. The term "primitive" can sometimes be counter-intuitive and only serves to distinguish certain atomic actions from the "complex", compound actions that we will define in section 3.2. The situation s' resulting from doing action a in situation s is given by the distinguished function do, such that s' = do(a, s). The possibility of performing action a in situation s is denoted by a distinguished predicate Poss(a, s). Sentences that specify what the state of the world must be before performing some action are known as precondition axioms. For example, it is possible to drop an object x in a situation s if and only if a character is holding it: Poss(drop(x), s) ⇔ Holding(x, s). In CML, this axiom can be expressed more intuitively without the need for logical connectives and the explicit situation argument:1
action drop(x) possible when Holding(x)
The convention in CML is that fluents to the left of the when keyword refer to the current situation. The effects of an action are given by effect axioms. They give necessary conditions for a fluent to take on a given value after performing an action. For example, the effect of dropping an object x is that the character is no longer holding the object in the resulting situation, and vice versa for picking up an object. This is stated in CML as follows:
occurrence drop(x) results in !Holding(x)
occurrence pickup(x) results in Holding(x)
What comes as a surprise is that a naive translation of the above statements into the situation calculus does not give the expected results. In particular, there is a problem stating what does not change when an action is performed. That is, a character has to worry whether dropping a cup, for instance, results in a vase turning into a bird and flying about the room. For mindless animated characters, this can all be taken care of implicitly by the programmer's common sense. We need to give our thinking characters this same common sense. They have to be told that, unless they know better, they should assume things stay the same. In AI this is called the "frame problem" [14]. If characters in virtual worlds start thinking for themselves, then
1
To promote readability all CML keywords will appear in bold type, actions (complex and primitive) will be italicized, and
they too will have to tackle the frame problem. Until recently, it was one of the main reasons why we had not seen approaches like ours used in computer animation or robotics.
Fortunately, the frame problem can be solved provided characters represent their knowledge in a certain way [15]. The idea is to assume that our effect axioms enumerate all the possible ways that the world can change. This closed world assumption provides the justification for replacing the effect axioms with successor state axioms. For example, the CML statements given above can now be effectively translated into the following successor state axiom that CML uses internally to represent how the character's world can change. It states that, provided the action is possible, then a character is holding an object x if and only if it just picked it up, or it was holding it before and it did not just drop it:

Poss(a, s) ⇒ [Holding(x, do(a, s)) ⇔ a = pickup(x) ∨ (a ≠ drop(x) ∧ Holding(x, s))]
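To make the mechanics concrete, here is a minimal Python sketch of this successor state axiom. It assumes situations are represented as tuples of the actions performed so far; that representation, and the assumption that nothing is held in S0, are our own choices for illustration, not CML's internals:

```python
def do(action, s):
    """do(a, s): the situation resulting from performing action a in situation s."""
    return s + (action,)

def holding(x, s):
    """Holding(x, s), computed by the successor state axiom:
    Holding(x, do(a, s)) <=> a = pickup(x) or (a != drop(x) and Holding(x, s)).
    The frame problem is handled implicitly: unrelated actions leave Holding alone."""
    if not s:                      # the initial situation S0
        return False               # assumption: nothing is held initially
    a, prev = s[-1], s[:-1]
    return a == ('pickup', x) or (a != ('drop', x) and holding(x, prev))

S0 = ()
s1 = do(('pickup', 'cup'), S0)
s2 = do(('walk', 'door'), s1)      # an unrelated action: Holding persists
s3 = do(('drop', 'cup'), s2)
```

Note how the axiom only ever mentions the two actions that affect Holding; every other action falls through to the "it was holding it before" case.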
3.1.1 Sensing
One of the limitations of the situation calculus, as we have presented it so far, is that we must always write down things that are true about the world. This works out fine for simple worlds, as it is easy to place all the rules by which the world changes into the successor state axioms. Even in more complex worlds, fluents that represent the internal state of the character we are controlling are, by definition, always true. Now imagine we have a simulated world that includes an elaborate thermodynamics model involving advection-diffusion equations. We would like to have a fluent temp that gives the temperature in the current situation for the character's immediate surroundings. What are we to do? Perhaps the initial situation could specify the correct temperature at the start? However, what about the temperature after a setFireToHouse action, or a spillLiquidHelium action, or even just twenty clock tick actions? We could write a successor state axiom that contains all the equations by which the simulated world's temperature changes. The character can then perform multiple forward simulations to know the precise effect of all its possible actions. This, however, is expensive, and even more so when we add other characters to the scene. With multiple characters, each character must perform a forward simulation for each of its possible actions, and then for each of the other characters' possible actions and reactions, and then for each of its own subsequent actions and reactions, etc. Ignoring these concerns, imagine that we could have a character that can precisely know the ultimate effect of all its actions arbitrarily far off into the future. Such a character can see much further into its future than a human observer, so it will not appear as "intelligent", but rather as "super-intelligent". We can think of an example of a falling tower of bricks where the character precomputes all the brick trajectories and realizes it is in no danger. To the human observer, who has no clue what path the bricks will follow, a character who happily stands around while bricks rain around it looks peculiar. Rather, the character should run for cover, or to some safe distance, based on its qualitative knowledge that nearby falling bricks are dangerous. In summary, we would like our characters to represent their uncertainty about some properties of the world until they sense them.
Half of the solution to the problem is to introduce exogenous actions (or events) that are generated by the environment and not the character. For example, we can introduce an action setTemp that is generated by the underlying simulator and simply sets the temperature to its current value. It is straightforward to modify the definition of complex actions, that we give in the next section, to include a check for any exogenous actions and, if necessary, include them in the sequence of actions that occur (see [9] for more details). The other half of the problem is representing what the character knows about the temperature. Just because the temperature in the environment has changed does not mean the character should know about it until it performs a sensing action. In [17] sensing actions are referred to as knowledge producing actions. This is because they do not affect the world but only a character's knowledge of its world. The authors were able to represent a character's knowledge of its current situation by defining an epistemic fluent K to keep track of all the worlds a character thinks it might possibly be in. Unfortunately, the approach does not lend itself to a practical implementation because of the enormous number of possible worlds that we potentially have to list out. Once we start using functional fluents, however, things get even worse: we cannot, by definition, list out the uncountably many possible worlds associated with not knowing the value of a fluent that takes on values in ℝ.
3.1.2 Interval-valued epistemic (IVE) fluents
The epistemic K-fluent allows us to express an agent's uncertainty about the value of a fluent in its world. Interval arithmetic can also be used to express uncertainty about a quantity. Moreover, intervals allow us to do so in a way that circumvents the problem of how to use a finite representation for infinite quantities. It is, therefore, natural to ask whether we can also use intervals to replace the troublesome epistemic K-fluent. The answer, as we show in [9], is a resounding "yes". In particular, for each sensory fluent f, we introduce an interval-valued epistemic (IVE) fluent I_f. The IVE fluent I_f is used to represent an agent's uncertainty about the value of f. Sensing now corresponds to making intervals narrower.

In our temperature example, we can introduce an IVE fluent, I_temp, that takes on interval values: pairs ⟨u, v⟩ with u, v ∈ ℝ ∪ {−∞, +∞}, where I_temp(s) is the interval that is guaranteed to bound the temperature. When the interval is less than a certain width we say that the character "knows" the property in question. We can then write precondition axioms based not only upon the state of the world, but also on the state of the character's knowledge of its world. For example, we can state that it is only possible to turn the heating up if the character knows it is too cold. If the character does not "know" the temperature (i.e. the interval I_temp(s) is too wide) then the character can work out that it needs to perform a sensing action. In [9] we prove many important equivalences and theorems that allow us to justify using our IVE fluents to completely replace the troublesome K-fluent.
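A minimal sketch of the idea in Python, assuming a simple closed-interval representation and a fixed width threshold for "knowing"; both the class and the threshold value are assumptions for illustration:

```python
class IveFluent:
    """An interval-valued epistemic fluent: [lo, hi] is assumed to be a
    guaranteed bound on the true value of the underlying sensory fluent."""

    def __init__(self, lo=float('-inf'), hi=float('inf')):
        self.lo, self.hi = lo, hi        # total uncertainty by default

    def sense(self, lo, hi):
        """A sensing action narrows the interval by intersecting the old
        bound with the newly sensed one."""
        self.lo, self.hi = max(self.lo, lo), min(self.hi, hi)

    def knows(self, width=1.0):
        """The character 'knows' the value once the interval is narrow enough."""
        return self.hi - self.lo <= width

temp = IveFluent()         # the character starts out knowing nothing
temp.sense(15.0, 25.0)     # a coarse reading: still too wide to 'know'
temp.sense(19.5, 20.5)     # a precise reading narrows the interval
```

After the second sensing action the interval is one degree wide, so a precondition such as "only adjust the heating if the temperature is known" would now be satisfied.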
3.2 Behavior specification

Specifying behavior in CML capitalizes on our way of representing knowledge to include a novel approach to high-level control. It is based on the theory of complex actions from the situation calculus [12]. Any primitive action is also a complex action, and other complex actions can be built up using various control structures. Although the syntax may be similar to a conventional programming language, in terms of functionality CML is a strict superset. In particular, a behavior outline can be nondeterministic. By this, we do not mean that the behavior is random, but that we can cover multiple possibilities in one instruction. As we shall explain, this added freedom allows many behaviors to be specified more naturally, more simply, more succinctly and at a much higher level than would otherwise be possible. The user can design characters based on behavior outlines, or "sketch plans". Using its background knowledge, the character can decide for itself how to fill in the necessary missing details.
The complete list of operators for defining complex actions is defined recursively and is given below. Together, they define the behavior specification language used for issuing advice to characters. The mathematical definitions for these operators are given in [12]. After each definition the equivalent CML syntax is given in square brackets.
(Primitive action) If α is a primitive action then, provided the precondition axiom states it is possible, do the action [same, except when the action is a variable, in which case we need to use an explicit do];

(Sequence) α ∘ β means do action α, followed by action β [same, except that in order to mimic C, statements must end with a semicolon];

(Test) p? succeeds if p is true, otherwise it fails [test(<EXPRESSION>)];

(Nondeterministic choice of actions) α | β means do action α or action β [choose <ACTION> or <ACTION>];

(Conditionals) if p then α else β fi is just shorthand for p? ∘ α | (¬p)? ∘ β [if (<EXPRESSION>) <ACTION> else <ACTION>];

(Nondeterministic iteration) α* means do α zero or more times [star <ACTION>];

(Iteration) while p do α od is just shorthand for (p? ∘ α)* ∘ (¬p)? [while (<EXPRESSION>) <ACTION>];

(Nondeterministic choice of arguments) (π x) α(x) means pick some argument x and perform the action α(x) [pick(<EXPRESSION>) <ACTION>];
proc planner(n)
    goal? |
    [ (n > 0)? ∘ (π a)(primitiveAction(a)? ∘ a) ∘ planner(n - 1) ];
Assuming we define what the primitive actions are, and the goal, then this procedure will perform a depth-first search for plans of length less than n. We have written a Java applet, complete with documentation, that is available on the World Wide Web to further assist the interested reader in mastering this novel language [11].
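Operationally, the planner procedure above amounts to a bounded depth-first search with backtracking. The following Python sketch makes that reading explicit; the domain is supplied as callbacks (a goal test, an applicable-actions function, and a successor function), whose names are our own and not part of CML:

```python
def planner(n, s, goal, actions, do):
    """Return a plan of at most n primitive actions that reaches a goal
    situation, found depth-first, or None if no such plan exists.
    Mirrors: goal? | [(n > 0)? ; (pi a)(primitiveAction(a)? ; a) ; planner(n-1)]."""
    if goal(s):
        return []                      # goal? succeeds: the empty plan works
    if n <= 0:
        return None                    # depth bound exhausted
    for a in actions(s):               # (pi a): nondeterministic choice of action
        plan = planner(n - 1, do(a, s), goal, actions, do)
        if plan is not None:
            return [a] + plan          # success; otherwise backtrack to next a
    return None

# Toy domain (purely illustrative): a counter we can only increment;
# the goal is to reach 3 from 0 in at most 5 steps.
plan = planner(5, 0,
               goal=lambda s: s == 3,
               actions=lambda s: ['inc'],
               do=lambda a, s: s + 1)
```

The nondeterminism of (π a) becomes an ordinary loop over candidate actions, with failure triggering backtracking, which is exactly how a depth-first reasoning engine resolves the specification.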
4 Tutorial example

The following maze example is not meant to be a serious application. It is a simple, short tutorial designed to explain how an animator would use CML.
A maze is defined as a finite grid with some occupied cells. We say that a cell is Free if it is in the grid and not occupied. A function adjacent returns the cell that is adjacent to another in a particular direction. Figure 2 shows a simple maze, some examples of the associated definitions, and the values of the fluents in the current situation. There are two fluents: position denotes which cell contains the character in the current situation, and visited denotes the cells the character has previously been to.
Figure 2: A simple maze. Occupied(1,1); size = 3; exit = (2,2); start = (0,0); position = (2,1); visited = [(2,0),(1,0),(0,0)]; adjacent((1,1),n) = (1,2).

The single action in this example is a move action that takes one of four compass directions as a parameter. It is possible to move in some direction d, provided the cell we are moving to is free, and has not been visited before:
action move(d) possible when c = adjacent(position, d) && Free(c) && !member(c, visited);
Figure 3 shows the possible directions a character can move when in two different situations
Figure 3: Possible directions to move
A fluent is completely specified by its initial value, and its successor state axiom. For example, the initial position is given as the start point of the maze, and the effect of moving to a new cell is to update the position accordingly:
initially position = start;
occurrence move(d) results in position = adjacent(p_old, d) when position = p_old;
The fluent visited is called a defined fluent because it is defined (recursively) in terms of the previous situation, indicated with a ":=". Just as with regular fluents, anything to the left of a when refers to the previous situation.3

initially visited := [];
visited := [p_old | v_old] when position = p_old && visited = v_old;
The behavior we are interested in specifying in this example is that of navigating a maze. The power of CML allows us to express this fact directly as follows:
while (position!=exit)
pick(d) move(d);
Just like a regular while loop, the above program expands out into a sequence of actions. Unlike a regular while loop, it expands out not into one particular sequence of actions, but into all possible sequences of actions. A possible sequence of actions is defined by the precondition axioms that we previously stated, and the exit condition of the loop. Therefore, any free path through the maze that does not backtrack and ends at the exit position meets the behavior specification. This is what we mean by a nondeterministic behavior specification language. Nothing "random" is happening; we can simply specify a large number of possibilities all at once. Searching the specification for valid action sequences is the job of an underlying reasoning engine. Figure 4 depicts all the behaviors that meet the above specification for the simple maze we defined earlier.4
Figure 4: Valid behaviors

Although we disallow backtracking in the final path through the maze, the reasoning engine may use backtracking when it is searching for valid paths. In the majority of cases, the reasoning engine can use depth-first search to find a path through a given maze in a few seconds. To speed things up, we can easily start to reduce some of the nondeterminism by specifying a "best-first" search strategy. In this approach,
we will not leave it up to the character to decide how to search the possible paths, but constrain it to first investigate paths that head toward the exit. This requires extra lines of code but could result in faster execution.
For example, suppose we add an actiongoodMove, such that it is possible to move in a directiondif it ispossible to “move” to the cell in that direction, and the cell is closer to the goal than we are now:
action goodMove(d) possible when possible(move(d)) && Closer(exit, d, position);
Now we can rewrite our high-level controller as one that prefers to move toward the exit position whenever possible:
while (position != exit)
    choose pick(d) goodMove(d);
    or pick(d) move(d);
At the extreme, there is nothing to prevent us from coding in a simple deterministic strategy such as the "left-hand" rule. The important point is that our approach does not rule out any of the algorithms one might consider when writing the same program in C. Rather, it opens up new possibilities for very high-level specifications of behavior.
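The way a depth-first reasoning engine resolves the while/pick outline can be mimicked directly. The following Python sketch solves the maze of Figure 2 under exactly the stated preconditions (free cell, not previously visited); the helper names are ours, not CML's:

```python
# Direction vectors; an assumption mirroring the four compass directions.
DIRS = {'north': (0, 1), 'south': (0, -1), 'east': (1, 0), 'west': (-1, 0)}

def solve(size, occupied, start, exit_):
    """Find a move sequence satisfying
        while (position != exit) pick(d) move(d);
    where move(d) is possible only into a free, unvisited cell."""
    def search(pos, visited, plan):
        if pos == exit_:                       # loop exit condition
            return plan
        for d, (dx, dy) in DIRS.items():       # pick(d): try each direction
            c = (pos[0] + dx, pos[1] + dy)
            free = (0 <= c[0] < size and 0 <= c[1] < size
                    and c not in occupied)
            if free and c not in visited:      # move(d) possible when ...
                found = search(c, visited | {c}, plan + [d])
                if found is not None:
                    return found               # engine backtracks on failure
        return None
    return search(start, {start}, [])

# The maze of Figure 2: 3x3 grid, cell (1,1) occupied.
path = solve(3, {(1, 1)}, (0, 0), (2, 2))
```

Because the final path may not revisit cells, every valid plan for this maze runs along the border and takes exactly four moves; the engine's internal backtracking never appears in the answer.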
5 Camera control

"actors". Moreover, CML is ideally suited to realizing this approach.
Figure 5: Common camera placements
To appreciate what follows, the reader may benefit from a rudimentary knowledge of cinematography (see Figure 5). The exposition given in section 2.3, "Principles of cinematography", of [10] is an excellent starting point. In [10], the authors discuss one particular formula for filming two characters talking to one another. The idea is to flip between "external" shots of each character, focusing on the character doing the talking. To break up the monotony, the shots are interspersed with reaction shots of the other character. In [10], the formula is encoded as a finite state machine. We will show how elegantly we can capture the formula using the behavior specification facilities of CML. Firstly, however, we need to specify the domain. In order to be as concise as possible, we shall concentrate on explaining the important aspects of the specification; any missing details can be found in [9].
5.1 Camera domain
We shall be assuming that the motion of all other objects in the scene has been computed. Our task is to decide, for each frame, the vantage point from which it is to be rendered. The fluent frame keeps track of the current frame number, and a tick action causes it to be incremented by one. The precomputed scene is represented as a lookup function, scene, which for each object, and each frame, completely specifies the position, orientation, and shape.
The most common camera placements used in cinematography will be modeled in our formalization as primitive actions. In [10], these actions are referred to as "camera modules". This is a good example where the term "primitive" is misleading. As described in [4], low-level camera placement is a complex and challenging task in its own right. For the purposes of our exposition here we shall make some simplifications. We also make the simplifying assumption that the viewing frustum is fixed. Despite our simplifications, we still have a great deal of flexibility in our specifications. We will now give examples of effect axioms for some of the primitive actions in our ontology.
The fixed action is used to explicitly specify a particular camera configuration. We can, for example, use it to provide an overview shot of the scene:

occurrence fixed(e,c) results in lookFrom = e && lookAt = c;
A more complicated action is external. It takes two arguments, character A and character B, and places the camera so that A is seen over the shoulder of B. One effect of this action, therefore, is that the camera is looking at character A:

occurrence external(A,B) results in lookAt = p when scene(A(upperbody,centroid)) = p;
The other effect is that the camera is located above character B's shoulder. This might be accomplished with an effect axiom such as:

occurrence external(A,B) results in lookFrom = p + k2 * up + k3 * normalize(p,c) when scene(B(shoulder,centroid)) = p && scene(A(upperbody,centroid)) = c;

where k2 and k3 are some suitable constants.
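Assuming normalize(p, c) denotes the unit vector pointing from c toward p, the lookFrom computation can be sketched in Python as follows; the vector helpers and the particular constant values are our own assumptions:

```python
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(k, a): return tuple(k * x for x in a)

def normalize(p, c):
    """Unit vector from c toward p (our reading of normalize(p, c))."""
    v = sub(p, c)
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

UP = (0.0, 1.0, 0.0)
K2, K3 = 0.5, 1.0      # hypothetical values for the suitable constants

def external_look_from(shoulder_B, upperbody_A):
    """lookFrom = p + k2*up + k3*normalize(p, c), per the effect axiom,
    with p = B's shoulder centroid and c = A's upperbody centroid."""
    p, c = shoulder_B, upperbody_A
    return add(p, add(scale(K2, UP), scale(K3, normalize(p, c))))
```

Raising the camera by k2 along up and pushing it by k3 away from A places it just behind B's shoulder, looking over it toward A.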
There are many other possible camera placement actions. Some of them are listed in [10]; others may be found in [2].
The remaining fluents are concerned with more esoteric aspects of the scene, but some of their effect axioms are mundane and so we shall only explain them in English. For example, the fluent Talking(A,B) (meaning A is talking to B) becomes true after a startTalk(A,B) action, and false after a stopTalk(A,B) action. Since we are currently only concerning ourselves with camera placement, it is the responsibility of the application that is generating the scene descriptions to produce the start and stop talking actions. A more interesting fluent is silenceCount; it keeps count of how long it has been since a character spoke:
occurrence tick results in silenceCount = n - 1 when silenceCount = n && !exists(A,B) Talking(A,B);
occurrence stopTalk(A,B) results in silenceCount = ka;
occurrence setCount results in silenceCount = ka;
Note that ka is a constant (ka = 10 in [10]), such that after ka ticks of no one speaking the counter will be negative. A similar fluent filmCount keeps track of how long the camera has been pointing at the same character:
occurrence setCount || external(A,B) results in filmCount = kb when Talking(A,B);
occurrence setCount || external(A,B) results in filmCount = kc when !Talking(A,B);
occurrence tick results in filmCount = n - 1 when filmCount = n;
kb and kc are constants (kb = 30 and kc = 15 in [10]) that state how long we can stay with the same shot before the counter becomes negative. Note that the constant for the case of looking at a non-speaking character is lower. We will keep track of which constant we are using with the fluent tooLong.
For convenience, we now introduce two defined fluents that express when a shot has become boring because it has gone on too long, and when a shot has not gone on long enough. We need the notion of a minimum time for each shot to avoid instability that would result in flitting between one shot and another too quickly.
Finally, we introduce a fluent Filming to keep track of who the camera is pointing at.
Until now, we have not mentioned any preconditions for our actions. The reader may assume that, unless stated otherwise, all actions are always possible. In contrast, the precondition axiom for the external camera action states that we only want to be able to point the camera at character A if we are already filming A and it has not yet become boring; or we are not filming A, and A is talking, and we have stayed with the current shot long enough:

action external(A,B) possible when (!Boring && Filming(A)) || (Talking(A,B) && !Filming(A) && !TooFast);
We are now in a position to define the controller that will move the camera to look at the character doingthe talking, with occasional respites to focus on the other character’s reactions:
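As a rough illustration of the shot-selection logic just described (keep filming the speaker, cut away when a shot becomes boring, never cut before a minimum shot length), here is a Python sketch. The event encoding, state variables, and threshold values are all our own assumptions, not the CML controller itself:

```python
def film(speakers, kb=30, kc=15, min_shot=5):
    """speakers: per-frame speaker id ('A', 'B', or None for silence).
    Returns which character the camera films each frame."""
    filming, count, age = None, 0, 0
    shots = []
    for talking in speakers:
        if filming is None:
            # First frame: point the camera at whoever is speaking.
            filming = talking or 'A'
            count = kb if talking == filming else kc
            age = 0
        elif age >= min_shot and ((talking and talking != filming)
                                  or count <= 0):
            # Cut only after the minimum shot length, when the speaker
            # changed or the current shot has become boring (count <= 0);
            # a silent cut is a reaction shot of the other character.
            filming = talking or ('A' if filming == 'B' else 'B')
            count = kb if talking == filming else kc
            age = 0
        shots.append(filming)
        count -= 1       # filmCount ticking down toward "boring"
        age += 1         # guards against flitting between shots
    return shots
```

The two counters play the roles of filmCount (kb for a speaker, kc for a reaction shot) and the minimum-shot-length test, mirroring the Boring and TooFast fluents above.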
6 Behavioral animation

We now turn our attention to behavioral animation, which is the other main application that we have discovered for our work. The first example we consider is a prehistoric world, and the second is an undersea world. The undersea world is differentiated by the complexity of the underlying model.
6.1 Prehistoric world
In our prehistoric world we have a Tyrannosaurus Rex (T-Rex) and some Velociraptors (Raptors). The motion is generated by some simplified physics and a lot of inverse kinematics. The main non-aesthetic advantage the system has is that it runs in real time on a Pentium II with an Evans and Sutherland RealImage 3D graphics card. Our CML specifications were compiled into Prolog using our online applet [11]. The resulting Prolog code was then compiled into the underlying system using Quintus Prolog's ability to link with Visual C++. Unfortunately, performance was too adversely affected, so we wrote our own reasoning engine, from scratch, in Visual C++. Performance is real-time on average, but can be slightly jittery when the reasoning engine takes longer than usual to decide on some suitable behavior.
So far we have made two animations using the dinosaurs. The first shows our approach applied to camera control, and the second has to do with territorial behavior. Although some of the camera angles are slightly different, the camera animation uses essentially the same CML code as the example given in section 5. The action consists of a T-Rex and a Raptor having a "conversation". Some frames from an animation called "Cinemasauras" are shown in the color plates at the end of the paper.
The territorial T-Rex animation was inspired by the work described in [7], in which a human tries using a virtual reality simulator to herd reactive characters. Our challenge was to have the T-Rex eject some Raptors from its territory. The Raptors were defined to be afraid of the T-Rex (especially if it roared) and so would try to run in the opposite direction if it got too close. The T-Rex, therefore, had to try to get in behind the Raptors and frighten them toward a pass, while being careful not to frighten the Raptors that were already heading in the right direction. Specifying this as a reactive system would be non-trivial. The CML code we used was similar to the planner listed in section 3.2, except that it was written to search breadth-first. To generate the animation, essentially all we had to do was define the fluent goal in terms of the fluent that tracks the number of Raptors heading in the right direction, numRightDir. In particular, goal is true if there are k more Raptors heading in the right direction than there were initially:
goal := numRightDir = n && n0 + k <= n when initially numRightDir = n0
To speed things up we also defined a fluent badSituation, which we can use to prune the search space. For example, if the T-Rex just roared and no Raptors changed direction, then we are in a bad situation:
badSituation after roar := noInWrongDir = n && n = n0 when noInWrongDir = n0
If the T-Rex cannot find a sequence of actions that it believes will get k Raptors heading in the right direction, then as long as it makes some partial progress, it will settle for the best plan it could come up with. If it cannot find a sequence of actions that results in even partial progress (for example, when the errant Raptors are too far away), it looks for a simple alternative plan that just moves it closer to nearby Raptors that are heading in the wrong direction. The information computed in finding a primary plan can still be used to avoid frightening any Raptors unnecessarily as it plans a path toward the errant ones.
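To make the search concrete, here is a hedged Python sketch of this style of breadth-first planning. The toy state space, the action names get_behind and roar, and the constants N and K are our assumptions purely for illustration (the real planner is the CML code referred to in the text); what carries over is the goal test against the initial count, the badSituation-style pruning of a roar that changed nothing, and settling for the best partial progress when the goal is unreachable.

```python
from collections import deque

K = 2  # goal: K more raptors heading the right way than initially
N = 4  # toy herd size

# Toy state: (trex_behind, num_heading_right).  Roaring turns one raptor
# only if the T-Rex has first moved in behind the group.
ACTIONS = ("get_behind", "roar")

def step(state, action):
    behind, right = state
    if action == "get_behind":
        return (True, right)
    if action == "roar" and behind and right < N:
        return (False, right + 1)   # one more raptor heads the right way
    return (behind, right)          # roaring from the front achieves nothing

def plan(start):
    """Breadth-first search for a shortest plan reaching the goal, pruning
    bad situations; falls back on the best partial-progress plan found."""
    goal = start[1] + K
    frontier, seen = deque([([], start)]), {start}
    best = ([], start)
    while frontier:
        acts, st = frontier.popleft()
        if st[1] >= goal:
            return acts
        if st[1] > best[1][1]:
            best = (acts, st)       # remember best partial progress
        for a in ACTIONS:
            nxt = step(st, a)
            # badSituation pruning: a roar that changed nothing is a dead end
            if a == "roar" and nxt[1] == st[1]:
                continue
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((acts + [a], nxt))
    return best[0]                  # settle for what we could get
```

From a cold start the shortest plan alternates getting behind the herd and roaring; when the goal is out of reach (not enough raptors left to turn), the planner returns the partial-progress plan instead of failing.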
6.2 Undersea world
In our undersea world we bring to life some mythical creatures, namely "merpeople". The undersea world is physics-based. The high-level intentions of a merperson get filtered down into detailed muscle actions which cause reaction forces in the virtual water. This makes it hard for a merperson to reason about its world, as it is difficult to predict the ultimate effect of its actions. A low-level reactive behavior system helps to some extent by providing a buffer between the reasoning engine and the environment. Thus at the higher level we need only consider actions such as "go left", "go to a specific position", etc., and the reactive system will take care of translating these commands down into the required detailed muscle actions. Even so, without the ability to perform precise multiple forward simulations, the exact position a merperson will end up in after executing a plan of action is hard for the reasoning engine to predict. A typical solution would be to re-initialize the reasoning engine every time it is called, but this makes it difficult to pursue long-term goals, as we would be throwing out all the character's knowledge instead of just the knowledge that is out of date.
The solution is for the characters to use the IVE fluents that we described in section 3.1.2 to represent positions. After sensing, the positions of all visible characters are known. The merperson can then use this knowledge to replan its course of action, possibly according to some long-term strategy. Regular fluents are used to model the merperson's internal state, such as its goal position, fear level, etc.
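The IVE fluents themselves are defined in section 3.1.2; the following Python is only our minimal interpretation of the idea (the class name, the MAX_SPEED bound, and the method names are assumptions, not CML): knowledge of a position is an interval that widens as unsensed time passes and collapses when the position is sensed, so stale knowledge degrades gracefully instead of being thrown away wholesale.

```python
# Sketch (ours) of an interval-valued epistemic fluent: what the merperson
# knows about another character's x-position.

MAX_SPEED = 2.0  # assumed bound on displacement per tick

class IntervalFluent:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def tick(self):
        # Without sensing, uncertainty grows by the maximum displacement.
        self.lo -= MAX_SPEED
        self.hi += MAX_SPEED

    def sense(self, value):
        # A sensing action collapses the interval to the observed value.
        self.lo = self.hi = value

    def known_within(self, eps):
        return self.hi - self.lo <= eps

shark_x = IntervalFluent(0.0, 0.0)
for _ in range(3):
    shark_x.tick()
# After 3 unsensed ticks the interval is [-6, 6]: too vague to plan with,
# but still sound.  Re-sensing restores precision without re-initializing
# the rest of the character's knowledge.
shark_x.sense(4.5)
```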
6.2.1 Reasoning and Reactive system
The relationship between the user, the reasoning system and the reactive system is depicted in figure 6. The reactive system provides us with virtual creatures that are fully functional autonomous agents. On its own it provides an operational behavior system. The system provides: a graphical display model that captures the form and appearance of our characters; a biomechanical model that captures the physical and anatomical structure of the character's body, including its muscle actuators, and simulates its deformation and physical dynamics; and a behavioral control model that is responsible for motor control, perception control and behavior control of the character. Although this control model is capable of generating some high-level behaviors, we need only the low-level behavior capabilities.
Figure 6: Interaction between cognitive model, user and low-level reactive behavior system. (The user supplies a behavior specification and a domain specification, the latter comprising: 1) preconditions for performing an action; 2) the effect that performing an action would have on the virtual world; and 3) the initial state of the virtual world. The cognitive model sends low-level commands to the reactive system and receives sensory information about the virtual world in return.)
The reactive system also provides default behaviors to avoid the character doing anything "stupid" in the event that it cannot decide on anything "intelligent". Behaviors such as "continue in the same direction" and "avoid collisions" are examples of typical default reactive system behaviors.
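This layering can be sketched as follows (hypothetical Python; the function names choose_action and reactive_default and the state keys are our inventions): the deliberated plan is preferred whenever one is available, and the reactive defaults otherwise keep the character sensible.

```python
# Sketch (ours) of deliberative/reactive layering: prefer the plan from the
# reasoning engine; fall back on simple reactive defaults when there is none.

def reactive_default(state):
    if state.get("obstacle_ahead"):
        return "turn"          # default: avoid collisions
    return "keep_heading"      # default: continue in the same direction

def choose_action(plan, state):
    """Pop the next planned action, or fall back on reactive behavior."""
    if plan:
        return plan.pop(0)
    return reactive_default(state)
```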
The complete listing of the code (using an older version of CML) that we used to generate the animations is available in appendix F of [9]. The reasoning system runs on a Sun UltraSPARC, and the reactive system runs simultaneously on an SGI Indigo 2 Extreme. On its own the reactive system manages about 3 frames per second, and this slows to about 1 frame per second with reasoning. A large part of the extra overhead is accounted for by reading and writing the files that the reactive and reasoning systems use to communicate.

6.2.2 Undersea Animations
The undersea animations revolve around pursuit and evasion behaviors. The sharks try to eat the merpeople, and the merpeople try to use the superior reasoning abilities we give them to avoid such a fate. For the most part, the sharks are instructed to chase any merpeople they see. If they cannot see any, they go to where they last saw one. If all else fails, they start to search systematically. The "Undersea Animation" color plates at the end show selected frames from two particular animations.
The first animations we produced were to verify that the shark could easily catch a merman swimming in open water. The shark is larger and swims faster, so it has no trouble catching its prey. Next, we introduced some obstacles. Now when the merman is in trouble it can come up with short-term plans that take advantage of undersea rocks to frequently evade capture. It can hide behind the rocks and hug them closely so that the shark has difficulty seeing or reaching it. We were able to use the control structures of CML to encode a great deal of heuristic knowledge. For example, consider the problem of trying to come up with a plan to hide from a predator. A traditional planning approach will be able to perform a search of various paths according to criteria such as whether a path uses hidden positions, whether it stays far from the predator, etc. Unfortunately, this kind of planning is expensive and therefore cannot be done over long distances. By using the control structures of CML, we can encode various pieces of heuristic knowledge to help overcome this limitation. For example, we can specify a procedure that encodes the following heuristic: if the current position is good enough then stay where you are; otherwise search the area around you (the expensive planning part); otherwise check out the obstacles (hidden positions are more likely near obstacles); if all else fails, panic and go in a random direction. With a suitable precondition for pickGoal, which prevents the merperson selecting a goal until it meets a certain minimum criterion, the following CML procedure implements the above heuristic for character i.
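The original is a CML procedure; as a rough illustration only, the cascade reads like the following Python sketch, in which the one-dimensional world, the SAFE threshold and every helper name are our assumptions, not the notes' code.

```python
# Sketch (ours) of the hiding heuristic: stay put if the current spot is
# good enough; otherwise try progressively cheaper fallbacks; panic last.
import random

PREDATOR = 0.0            # assumed predator position in a 1-D toy world
OBSTACLES = [12.0, -15.0] # assumed rock positions
SAFE = 10.0               # far enough from the predator is "good enough"

def good_enough(pos):
    return abs(pos - PREDATOR) >= SAFE

def search_nearby(pos, radius=3.0):
    # the expensive planning part: scan candidate positions around us
    for dx in (radius, -radius):
        if good_enough(pos + dx):
            return pos + dx
    return None

def check_obstacles(pos):
    # hidden positions are more likely near obstacles
    safe = [o for o in OBSTACLES if good_enough(o)]
    return min(safe, key=lambda o: abs(o - pos)) if safe else None

def pick_goal(pos):
    """Heuristic cascade: stay / search locally / head for cover / panic."""
    if good_enough(pos):
        return pos
    g = search_nearby(pos)
    if g is None:
        g = check_obstacles(pos)
    if g is None:
        g = pos + random.choice((-1.0, 1.0))  # panic: random direction
    return g
```

The point of the cascade is the same as in the CML version: the expensive search is only attempted when the cheap test fails, and the planner always returns some goal, however desperate.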
We used these control structures to make the animation "The Great Escape". This was done by simply instructing the merman to avoid being eaten and, whenever it appears reasonably safe to do so, to make a break for a particular rock in the scene. The rock that we want the merman to go to has the property that it contains a narrow crack through which the merman can pass but the shark cannot. What we wanted was an animation in which the merman eventually gets to the special rock with the shark in hot pursuit. The merman's evade procedure should then swing into action, hopefully causing it to evade capture by slipping through the crack. Although we do not know exactly when, or how, we have a mechanism that heavily stacks the deck toward getting what we want. In our case, we got what we wanted the first time, but had it remained elusive we could have carried on using CML, just like a regular programming language, to constrain what happens, all the way down to scripting the entire sequence if we had to.
Finally, as an extension of behavioral animation, our approach inherits the ability to scale linearly from a single character. That is, once we have developed a cognitive model for one character, we can reuse the model to create multiple characters. Each character will behave autonomously, according to its own unique perspective on its virtual world.
There is large scope for future work. We could integrate a mechanism to learn reactive rules that mimic the behavior observed from the reasoning engine. Other issues arise in the user interface. As it stands, CML is a good choice as the underlying representation a developer might want to use to build a cognitive model. An animator, or other users, might prefer a graphical user interface as a front-end. To be easy to use, such an interface might limit interaction to supplying parameters to predefined models, or perhaps use a visual programming metaphor to specify the complex actions.
In summary, CML always gives us an intuitive way to give a character knowledge about its world in terms of actions, their preconditions and their effects. When we have a high-level description of the ultimate effect of the behavior we want from a character, CML gives us a way to automatically search for suitable action sequences. When we have a specific action sequence in mind, there may be no point in having CML search for one; in this case, we can use CML more like a regular programming language, to express precisely how we want the character to behave. We can even use a combination of these two extremes, and the whole gamut in between, to build different parts of one cognitive model. It is this combination of convenience and automation that makes CML such a potentially important tool in the arsenal of tomorrow's animators and game developers.
References

[1] J. Allen, J. Hendler, and A. Tate, editors. Readings in Planning. Morgan Kaufmann, 1990.
[2] D. Arijon. Grammar of the Film Language. Communication Arts Books, Hastings House, Publishers, New York, 1976.
[3] N. I. Badler, C. Phillips, and D. Zeltzer. Simulating Humans. Oxford University Press, 1993.
[4] J. Blinn. Where am I? What am I looking at? IEEE Computer Graphics and Applications, pages 75–81, 1988.
[5] B. Blumberg. Old Tricks, New Dogs: Ethology and Interactive Creatures. PhD thesis, MIT Media Lab, MIT, Boston, USA, 1996.
[6] B. M. Blumberg and T. A. Galyean. Multi-level direction of autonomous creatures for real-time environments. In R. Cook, editor, Proceedings of SIGGRAPH '95, pages 47–54. ACM SIGGRAPH, ACM Press, Aug. 1995.
[7] D. Brogan, R. A. Metoyer, and J. K. Hodgins. Dynamically simulated characters in virtual environments. In Animation Sketch, SIGGRAPH '97, 1997.
[8] D. B. Christianson, S. E. Anderson, L. He, D. H. Salesin, D. Weld, and M. F. Cohen. Declarative camera control for automatic cinematography. In Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96), Menlo Park, CA, 1996. AAAI Press.
[9] —. PhD thesis, 1997.
[10] L. He, M. F. Cohen, and D. Salesin. The virtual cinematographer: A paradigm for automatic real-time camera control and directing. In H. Rushmeier, editor, Proceedings of SIGGRAPH '96, Aug. 1996.
[11] —. CML Compiler Applet, 1997.
[12] H. Levesque, R. Reiter, Y. Lespérance, F. Lin, and R. Scherl. Golog: A logic programming language for dynamic domains. Journal of Logic Programming, 31:59–84, 1997.
[13] N. Magnenat-Thalmann. Computer Animation: Theory and Practice. Springer-Verlag, second edition, 1990.
[14] J. McCarthy and P. Hayes. Some philosophical problems from the standpoint of artificial intelligence. In B. Meltzer and D. Michie, editors, Machine Intelligence 4, pages 463–502. Edinburgh University Press, Edinburgh, 1969.
[15] R. Reiter. The frame problem in the situation calculus. In V. Lifschitz, editor, Artificial Intelligence and Mathematical Theory of Computation, pages 359–380, 418–420. Academic Press, 1991.
[16] C. W. Reynolds. Flocks, herds, and schools: A distributed behavioral model. In M. C. Stone, editor, Computer Graphics (SIGGRAPH '87 Proceedings), volume 21, pages 25–34, July 1987.
[17] R. Scherl and H. Levesque. The frame problem and knowledge-producing actions. In Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI-93), 1993. AAAI Press.
A Camera Code from [10]
Hardcore AI for Computer Games and Animation
SIGGRAPH 98 Course Notes (Part II)
by
John David Funge

Copyright © 1998 by John David Funge
John David Funge
1998

For applications in computer game development and character animation, recent work in behavioral animation has taken impressive steps toward autonomous, self-animating characters. It remains difficult, however, to direct autonomous characters to perform specific tasks. We propose a new approach to high-level control in which the user gives the character a behavior outline, or "sketch plan". The behavior outline specification language has a syntax deliberately chosen to resemble that of a conventional imperative programming language. In terms of functionality, however, it is a strict superset. In particular, a behavior outline need not be deterministic. This added freedom allows many behaviors to be specified more naturally, more simply, more succinctly and at a much higher level than would otherwise be possible. The character has complete autonomy to decide how to fill in the necessary missing details.
The success of our approach rests heavily on our use of a rigorous logical language known as the situation calculus. The situation calculus is well-known, simple, and intuitive to understand. The basic idea is that a character views its world as a sequence of "snapshots" known as situations. An understanding of how the world can change from one situation to another can then be given to the character by describing what the effect of performing each given action would be. The character can use this knowledge to keep track of its world and to work out which actions to perform next in order to attain its goals. The version of the situation calculus we use incorporates a new approach to representing epistemic fluents. The approach is based on interval arithmetic and addresses a number of difficulties in implementing previous approaches.
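The "sequence of snapshots" view can be sketched in a few lines of Python. This is our minimal interpretation, not the formal notation of the notes; the light-switch fluent is a standard toy example, and the names do and holds_light_on are ours.

```python
# Minimal sketch of the situation calculus: a situation is the history of
# actions performed from the initial situation S0, and a fluent's value is
# computed by replaying effect axioms over that history.

S0 = ()  # the initial, empty situation

def do(action, s):
    """do(a, s): the situation resulting from performing a in s."""
    return s + (action,)

def holds_light_on(s):
    """Effect axioms for a light fluent: 'switch_on' makes it true,
    'switch_off' makes it false; any other action leaves it unchanged
    (the frame axiom)."""
    on = False  # value of the fluent in S0
    for a in s:
        if a == "switch_on":
            on = True
        elif a == "switch_off":
            on = False
    return on

s = do("switch_off", do("wander", do("switch_on", S0)))
```

The character never stores "the state of the world" directly; it stores what happened, and queries fluents relative to a situation, which is exactly what lets it reason about hypothetical futures before acting.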
Contents

1.1 Previous Models
1.2 Cognitive Models
1.3 Aims
1.4 Challenges
1.5 Methodology
1.6 Overview

2.1 Kinematics
  2.1.1 Geometric Constraints
  2.1.2 Rigid Body Motion
  2.1.3 Separating Out Rigid Body Motion
  2.1.4 Articulated Figures
    Forward Kinematics
    Inverse Kinematics
2.2 Kinematic Control
  2.2.1 Key-framing
  2.2.2 Procedural Control
2.3 Noninterpenetration
  2.3.1 Collision Detection
  2.3.2 Collision Resolution and Resting Contact
2.4 Dynamics
  2.4.1 Physics for Deformable Bodies
  2.4.2 Physics for Articulated Rigid Bodies
    Lagrange's Equation
    Newton-Euler Formulation
  2.4.3 Forward Dynamics
  2.4.4 Inverse Dynamics
  2.4.5 Additional Geometric Constraints
2.5 Realistic Control
  2.5.1 State Space
  2.5.2 Output Vector
  2.5.3 Input Vector
  2.5.4 Control Function
    Hand-crafted Controllers
    Control Through Optimization
    Objective Based Control
  2.5.5 Synthesizing a Control Function
2.6 High-Level Requirements
2.7 Our Work

3.1 Sorts
3.2 Fluents
3.3 The Qualification Problem
3.4 Effect Axioms
3.5 The Frame Problem
  3.5.1 The Ramification Problem
3.6 Complex Actions
3.7 Exogenous Actions
3.8 Knowledge-producing Actions
  3.8.1 An Epistemic Fluent
  3.8.2 Sensing
  3.8.3 Discussion
    Implementation
    Real Numbers
3.9 Interval Arithmetic
3.10 Interval-valued Fluents
3.11 Correctness
3.12 Operators for Interval Arithmetic
3.13 Knowledge of Terms
3.14 Usefulness
3.15 Inaccurate Sensors
3.16 Sensing Changing Values
3.17 Extensions

4.1 Methodology
4.2 Example
4.3 Utilizing Non-determinism
4.4 Another Example
  4.4.1 Implementation
  4.4.2 Intelligent Flocks
4.5 Camera Control
  4.5.1 Axioms
  4.5.2 Complex Actions

5 Physics-based Applications
5.1 Reactive System
5.2 Reasoning System
5.3 Background Domain Knowledge
5.4 Phenomenology
  5.4.1 Incorporating Perception
    Rolling Forward
    Sensing
  5.4.2 Exogenous Actions
5.5 Advice through "Sketch Plans"
5.6 Implementation
5.7 Correctness
  5.7.1 Visibility Testing
5.8 Reactive System Implementation
  5.8.1 Appearance
    3D Geometric Models
    Texture Mapping
  5.8.2 Locomotion
    Deformable Models
  5.8.3 Articulated Figures
  5.8.4 Locomotion Learning
  5.8.5 Perception
  5.8.6 Behavior
    Collision Avoidance
5.9 Animation Results
  5.9.1 Nowhere to Hide
  5.9.2 The Great Escape
  5.9.3 Pet Protection
  5.9.4 General Mêlée

6.1 Summary
6.2 Future Work
6.3 Conclusion

E.2 Sequences
E.3 Tests
E.4 Conditionals
E.5 Nondeterministic Iteration
E.6 While Loops
E.7 Nondeterministic Choice of Action
E.8 Nondeterministic Choice of Arguments
E.9 Procedures
E.10 Miscellaneous Features

F.1 Procedures
F.2 Pre-condition Axioms
F.3 Successor-state Axioms
List of Figures

1.1 Shifting the burden of the work
1.2 Many possible worlds
1.3 Interaction between CDW, the animator and the low-level reactive behavior system
2.1 Kinematics
2.2 Joint and Link Parameters
2.3 "Muscle" represented as a spring and damper
2.4 Design Space
3.1 After sensing, only worlds where the light is on are possible
4.1 Some frames from a simple airplane animation
4.2 A simple maze
4.3 Visited cells
4.4 Choice of possibilities for a next cell to move to
4.5 Just one possibility for a next cell to move to
4.6 Updating maze fluents
4.7 A path through a maze
4.8 Camera placement is specified relative to "the Line" (adapted from figure 1 of [He96])
5.1 Cells A and B are "completely visible" from one another
5.2 Cells A and B are "completely occluded" from one another
5.3 Cells A and B are "partially occluded" from one another
5.4 Visibility testing near an obstacle
5.5 The geometric model
5.6 Coupling the geometric and dynamic model
5.7 Texture mapped face
5.8 A merperson swimming
5.9 The dynamic model
5.10 The repulsive potential
5.11 The attractive potential
5.12 The repulsive and attractive potential fields
5.13 Nowhere to Hide (part I)
5.14 Nowhere to Hide (part II)
5.15 The Great Escape (part I)
5.16 The Great Escape (part II)
5.17 The Great Escape (part III)
5.18 The Great Escape (part IV)
5.19 Pet Protection (part I)
5.20 Pet Protection (part II)
5.21 Pet Protection (part III)
5.22 General Mêlée (part I)
5.23 General Mêlée (part II)
Chapter 1
Introduction
Computer animation is concerned with producing sequences of images (or frames) that, when displayed in order at sufficiently high speed, give the illusion of recognizable components of the image moving in recognizable ways. It is possible to place requirements on computer animations such as "objects should look realistic" or "objects should move realistically". The traditional approach to meeting these requirements was to employ skilled artists and animators. The talents of the most highly skilled human animators may still equal or surpass what might be attainable by computers. However, not everyone who wants to, or needs to, produce good quality animations has the time, patience, ability or money to do so. Moreover, for certain types of applications, such as computer games, human involvement in run-time satisfaction of requirements may not be possible. Therefore, in computer animation we try to come up with techniques whereby we can automate parts of the process of creating animations that meet the given requirements.
Generating images that are required to look realistic is normally considered the purview of computer graphics, so computer animation has focused on the low-level realistic locomotion problem. For example, "determine the internal torques that expend the least energy necessary to move a limb from one configuration to another" is an example of a low-level control problem. While there are still many open problems in low-level control, researchers are increasingly starting to focus on other requirements, such as "characters should behave realistically". By this we mean that we want the character to perform certain recognizable sequences of gross movement. This is commonly referred to as the high-level control problem. With new applications, such as video games and virtual reality, it seems that this trend will continue.
In character animation and in computer game development, exerting high-level control over a character's behavior is difficult. A key reason for this is that it can be hard to communicate our instructions. This is especially so if the character does not maintain an explicit model of its view of the world. Maintaining a suitable representation allows high-level intuitive commands and queries to be formulated. As we shall show, this can result in a superior method of control.
The simplest solution to the high-level control problem is to ignore it and rely entirely on the hard work and ingenuity of the animator to coax the computer into creating the correct behavior. This is the approach most widely used in commercial animation production. In contrast, the underlying theme of this document is to continue the trend in computer animation of building computational models. The idea is that the model will make the animator's life easier by providing the right level of abstraction for interacting with the computer characters. The computational aspect stems from the fact that, in general, using such models will involve shifting more of the burden of the work from the animator to the computer. Figure 1.1 gives a graphical depiction of this process.
1.1 Previous Models
In the past there has been much research in computer graphics toward building computational models to assist an animator. The first models used by animators were geometric models. Forward and inverse kinematics are now widely used tools in animation packages. The computer maintains a representation of how parts of the model are linked together, and these constraints are enforced as the animator pulls the object around. This frees the animator from necessarily having to move every part of an articulated figure individually.

Figure 1.1: Shifting the burden of the work
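As a toy illustration of what a kinematic model computes (ours, not from the notes; the two-link planar arm and link lengths L1 and L2 are assumptions), forward kinematics chains the link transforms to map joint angles to an end-effector position:

```python
# Sketch (ours): forward kinematics for a two-link planar arm.
import math

L1, L2 = 1.0, 1.0  # assumed link lengths

def forward_kinematics(theta1, theta2):
    """End-effector (x, y) for joint angles theta1, theta2 (radians)."""
    x = L1 * math.cos(theta1) + L2 * math.cos(theta1 + theta2)
    y = L1 * math.sin(theta1) + L2 * math.sin(theta1 + theta2)
    return x, y
```

Inverse kinematics is the reverse problem the animator actually cares about: given a desired (x, y), solve for joint angles, which in general has zero, one or many solutions.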
Similarly, using the laws of physics can free the animator from implicitly trying to emulate them when generating motion. Physical models are now being incorporated into animation packages. One reasonable way to do this is to build a computer model that explicitly represents intuitive physical concepts, such as mass, gravity, moments of inertia, etc.
Physical models have allowed the automation of animating passive objects, such as falling chains and colliding objects. For animate objects, an active area of research is how to build biomechanical models. So far, it has been possible to use simplified biomechanical models to automate the process of locomotion learning in a variety of virtual creatures, such as fish, snakes, and some articulated figures.
Our work comes out of the attempt to further automate the process of generating animations by building behavior models. Within computer animation, the seminal work in this area was that of Reynolds [96]. His "boids" have found extensive application throughout the games and animation industry. Recently, the work of Tu and Terzopoulos [114], and Blumberg and Galyean [23], has extended this approach to deal with some complex behaviors for more sophisticated creatures. The idea is that the animator's role may become more akin to that of a wildlife photographer. This works well for background animations. For animations of specific high-level behaviors, things are more complicated.
We refer to behaviors that are common to many creatures and situations as low-level behaviors. Examples of low-level behaviors include obstacle avoidance and flocking. Behaviors that are specific to a particular animal or situation we refer to as high-level behaviors. Examples include creature-specific mating behaviors and "intelligent" behavior such as planning. Many of the high-level behaviors exhibited by the creatures in previous systems suffer from the problem of being hard-wired into the code. This makes it difficult to reconfigure or extend behaviors.
Some of the work done by the logic programming community overlaps with our own. In section 2.6, we shall discuss some of the achievements and limitations of that field. The main problem, however, is the lack of any satisfactory model of a character's cognitive process. Consequently, it might be hard to extend that work to deal with important issues such as sensing and multiple agents.
1.2 Cognitive models
Cognitive models are the next logical step in the hierarchy of models that have been used for computer animation. By introducing such models we make it easier to produce animations by raising the level of abstraction at which the user can direct animated characters. This level of functionality is obtained by enabling the characters themselves to do more of the work.

1.3 Aims
It is important to point out that we do not doubt the ability of skillful programmers to put together a program that will generate specific high-level behavior. Our aim is to build models so that skillful programmers may work faster and less skilled programmers might be afforded success incommensurate with their ability. Thus, cognitive models should play a role analogous to that of physical or geometric models. That is, they are meant to provide a more suitable level of abstraction for the task in hand; they are not, per se, designed to replace the animator.
Building cognitive models is very much a research area at the forefront of artificial intelligence. It was thus to cognitive robotics that we turned for inspiration [68]. The original application area was robotics, but we have adapted that community's theory of action to related cognitive modeling problems in computer animation. One of the key ideas we have adopted is that knowledge representation can play a fundamental role in attempting to build computational models of cognition. We believe that the way a character represents its knowledge is important precisely because cognitive modeling is (currently) such a poorly defined task. If a grand unifying theory of cognition is one day invented, then the solution can be hard-coded into computer chips and our work will no longer be necessary. Until that day, general-purpose cognitive models will be contentious or non-existent. It therefore seems wise to represent knowledge simply, explicitly and clearly. If this is not the case, then it may be hard to understand, explain or modify the character's behavior. We therefore choose to use regular mathematical logic to state behaviors. Of course, in the future it may turn out to be useful, or even necessary, to resort to more avant-garde logics. We believe, however, that it makes sense to push the simplest approach as far as it can go.
Admittedly, real animals do not appear to use logical reasoning for many of their decision-making processes. However, we are only interested in whether the resulting behavior appears realistic at some level of abstraction. For animation at least, faithfulness to the underlying representations and mechanisms we believe to exist in the real world is not what is important. By way of analogy, physics-based animation is a good example of how the real world need not impinge on our research too heavily. To the best of our current knowledge, the universe consists of subatomic particles affected by four fundamental forces. For physics-based animation, however, it is often far more convenient to pretend that the world is made of solid objects with a variety of forces acting on them. For the most part, this results in motion that appears highly realistic. There are numerous other examples (the interested reader is referred to [35]), but we do not wish to wallow any further in esoteric points of philosophy. We merely wish to quell, at an early stage, lines of inquiry that are fruitless to the better understanding of this document.
By choosing a representation with clear semantics, we can clearly convey our ideas to machines and to people. Equally important, however, is the ease with which we are able to express our ideas. Unfortunately, convenience and clarity are often conflicting concerns. For example, a computer will have no problem understanding us if we write in its machine code, but this is hardly convenient. At the other extreme, natural language is obviously convenient, but it is full of ambiguities. The aim of our research is to explore how we can express ourselves as conveniently as possible, without introducing any unresolvable ambiguity in how the computer should interpret what we have written.

1.4 Challenges
The major challenge that faced us in achieving our aims was that we wanted our characters to display elaborate behavior while, possibly, situated in unpredictable and complex environments. This can make the problem of reasoning about the effect of actions much harder. This is a possible stumbling block in the understanding of our work, so we want to make our point as clear as possible.
A computer simulated world is driven by a mathematical model consisting of rules and equations. A forward simulation consists of applying these rules and equations to the current state to obtain the new state. If we re-run a simulation with exactly the same starting conditions we expect to obtain exactly the same sequence of events as the last time we ran it. Therefore, it might not be clear to the reader in what sense the character's world is "unpredictable". To explain, let us imagine a falling stack of bricks, the top one of which is desired by some character. In the real world, it is almost impossible to predict, with any accuracy, where the bricks will come to rest. It makes more sense to have the character wait (preferably at a safe distance) and see where the desired brick ends up. In a simulated world, we can, in principle, run the forward simulation once to see where the bricks end up. Then we can re-run the simulation and tell the character where the brick will end up. We can even give the characters themselves the ability to run forward simulations. Such a character would essentially be clairvoyant: it could pre-compute the final brick positions and go and quietly wait for its desired brick. The point we wish to make is that this approach is complicated, inefficient and (for non-clairvoyant characters) will result in unnatural behavior. Thus, the representation used for simulating the falling bricks is not necessarily the appropriate one for our character to have. Indeed, it is highly unlikely that the representation we use for simulating the character's world will be appropriate for the character's internal representation of that world used for deciding how to behave. A key consequence of this approach, however, is that events the character did not expect will occur in the character's world. Hence, when we refer to the character's world as complex and unpredictable, we mean that it is so from the character's point of view.
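To make the determinism point concrete, here is a minimal sketch (our illustration, not code from the course notes) of a forward simulation for the falling-brick example. The `step` function and its parameters are our own hypothetical choices; the point is only that re-running from identical starting conditions reproduces the identical sequence of events.

```python
# Hypothetical sketch: a forward simulation applies fixed rules and
# equations to the current state to obtain the next state.
def step(state, dt=0.01, g=-9.8):
    """One simulation step for a falling brick: state = (height, velocity)."""
    h, v = state
    v += g * dt
    h = max(0.0, h + v * dt)      # the brick stops at the ground
    if h == 0.0:
        v = 0.0
    return (h, v)

def simulate(state, n):
    """Run n forward-simulation steps, recording every intermediate state."""
    trace = [state]
    for _ in range(n):
        state = step(state)
        trace.append(state)
    return trace

# Re-running with exactly the same starting conditions gives exactly the
# same sequence of events: the world is entirely predictable from the
# simulator's viewpoint, even if not from the character's.
run1 = simulate((10.0, 0.0), 500)
run2 = simulate((10.0, 0.0), 500)
assert run1 == run2
```

A character with access to `simulate` would be the clairvoyant character described above; the text's argument is precisely that we usually do not want to grant it that access.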
It is worth pointing out that there should be nothing shocking in having more than one representation for the same thing. The practice is commonplace, and is central to computer graphics. For example, consider the process of rendering a geometric model. For building the model we may choose to represent it as a parametric surface. To take advantage of commonly available graphics hardware accelerators we may then move to a representation in terms of strips of triangles in three dimensions. At the final stage of rendering the objects will be represented as rows of pixels of differing intensities. The point is that at each stage a different representation of the same thing is appropriate.
Having multiple characters in a scene opens up possibilities for cooperative and competitive behaviors. However, even in purely kinematic worlds, this greatly increases the difficulty of predicting future events. That is, if one character is going to know what all the other characters are going to do in advance, then it needs to have a complete and accurate representation of all the other characters' internal decision-making processes. Even worse, it must be able to predict how they all react with each other, with the environment, and with itself. If the reader remains unconvinced, then we need only consider the addition of user interaction to completely dispel all hopes of a character being able to predict all elements of its environment with complete accuracy. Perhaps even more damning is the observation that super-intelligent characters that can peer into each other's minds and predict every event that occurs in their world are not desirable. That is, we suppose that the majority of animations will want to reflect the fact that real-world creatures are not able to exactly predict the future.
To summarize, the key point is that in one animation the representation of the world used by each of the animated characters may vary, and it will not necessarily coincide with the representation for other purposes. Thus, when we say the world is "unpredictable", we mean that it is hard to predict using the representation the character has of that world. It had, of course, better be entirely predictable from the simulation viewpoint!
Moreover, even if all characters represent the same type of things, what each of them actually "knows" about the world may be quite different. That is to say, each character will be autonomous. By making them independent, self-contained entities, we replicate the situation in the real world and thus ensure the level of realism normally required in animations. We also simplify the task of instructing them, since we need only concern ourselves with them one at a time. As we pointed out above, the correlation between the character's representation of its world and the representation for simulation should not solely be maintained by reasoning. The solution is that sensing must play a crucial role.
Figure 1.2 depicts a scene in which a character is facing away from a light bulb. The character does not know if the light bulb is on or not. It imagines many possible worlds, in some of which the light is on, in some of which it is off. Of course, the worlds are also distinguished by the other things the character believes about its world. Perhaps, in some of the worlds, the character imagines it has selected a winning lottery number, in some of them not. Regardless, from the point of view of graphical rendering the light bulb is indeed switched on. The correspondence between the character's view of the world and what is actually the case (which is presumably what we want to render) can be established by the character turning around and
Figure 1.2: Many possible worlds
looking. The act of sensing forces the character to discard, as impossible, the worlds in which it imagined the light was off. We shall return to this topic in more detail in section 3.8.
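The possible-worlds picture can be sketched very simply in code. The following is our own hypothetical illustration (the fluent names and the set-of-dictionaries encoding are our assumptions, not the book's representation): a character's uncertainty is a set of possible worlds, and a sensing action discards the worlds that contradict the observation.

```python
# Hypothetical sketch: belief as a set of possible worlds, pruned by sensing.
from itertools import product

# Each possible world assigns a truth value to every fluent the character
# is uncertain about (here: the light, and the lottery digression above).
fluents = ["light_on", "won_lottery"]
possible_worlds = [dict(zip(fluents, vals))
                   for vals in product([True, False], repeat=len(fluents))]

def sense(worlds, fluent, observed_value):
    """Sensing discards, as impossible, the worlds that contradict the
    observation; beliefs about unrelated fluents are left untouched."""
    return [w for w in worlds if w[fluent] == observed_value]

# Before turning around: four worlds. After looking and seeing the light
# is on: only the two worlds where the light is on remain; the lottery
# uncertainty survives.
after = sense(possible_worlds, "light_on", True)
assert len(possible_worlds) == 4 and len(after) == 2
assert all(w["light_on"] for w in after)
```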
We have achieved our aims for high-level control of animated characters by adopting an unambiguous semantics for a character's representation of its dynamic world. In particular, we propose an approach in which the user gives the character a behavior outline, or "sketch plan". The behavior outline specification language has syntax deliberately chosen to resemble that of a conventional imperative programming language. In terms of functionality, however, it is a strict superset. In particular, a behavior outline need not be deterministic. This added freedom allows many behaviors to be specified more naturally, more simply, more succinctly and at a much higher level than would otherwise be possible. The character has complete autonomy to decide on how to fill in the necessary missing details. For example, with some basic background information we can ask a character to search for any path through a maze. That is, we do not initially have to give an explicit algorithm for how to find a particular path. Later we may want to speed up the character's decision-making process by giving more detailed advice.
Although the underlying theory is completely hidden from the user, the success of our approach rests heavily on our use of a rigorous logical language, known as the situation calculus. Aside from being powerful and expressive, the situation calculus is well known, simple and intuitive to understand. The basic idea is that a character views its world as a sequence of "snapshots" known as situations. An understanding of how the world can change from one situation to another can then be given to the character by describing what the effect of performing each given action would be. The character can use this knowledge to keep track of its world and to work out which actions to do next in order to attain its goals. The version of the situation calculus we use is inspired by new work in cognitive robotics [68]. By solving the well-known "frame problem", it allows us to avoid any combinatorial explosion in the number of action-effect rules. In addition, this new work incorporates knowledge-producing actions (like sensing), and allows regular programming constructs to be used to specify sketch plans. All this has enabled us to propel our creatures out of the realm of background animation and into the world of character animation. Finally, there is active research into extending the expressiveness of the situation calculus, and this makes it an exciting choice for the future.
We have developed a character design workbench (CDW) that is both convenient to use, and results in executable behavioral specifications with clear semantics. We can use the system to control multiple characters in realistic, hard-to-predict, physics-based, dynamic worlds. Interaction takes place at a level
that has many of the advantages of natural language but avoids the associated ambiguities. Meanwhile, the ability to omit details from our specifications makes it straightforward to build, reconfigure or extend the behavior control system of the characters. By using logical reasoning to shift more of the burden for generating behavior from the animator to the animated characters themselves, our system is ideal for rapid prototyping and producing one-off animations. Naturally, our system allows for fast replay of previously generated control decisions. However, when the speed of the initial decision-making process is crucial, the user can easily assume more responsibility for efficiency. In particular, the behavioral controller can be gradually fine-tuned to remove the non-determinism by adding in more and more algorithmic details.
[Figure: System architecture. A reasoning engine takes as input a behavior outline, sensory information, and information about the virtual world, namely (1) preconditions for performing an action, (2) the effect that performing an action would have on the virtual world, and (3) the initial state of the virtual world. From these it builds up a database of facts about the virtual world, and its control decisions drive a reactive system.]
In chapter 2 we will provide a categorization of some previous work that is mainly concerned with modeling the virtual world for simulation purposes. This will be important for readers unfamiliar with computer animation. It will provide the necessary background information to understand the ideas that follow. We will also show how behavioral animation came out as a natural consequence of continuing animation research. We will talk about some previous work and show how it fits into computer animation research as a whole.

Chapter 3 will discuss the theoretical foundations of our work. We shall discuss previous work from which we obtain the basic concepts. These concepts will be defined and explained with examples. Particular attention will be paid to our approach to sensing to deal with the correspondence between the agent's representation of its world and the representation the computer maintains for simulation purposes.
The predictability of kinematic worlds means that our character-oriented view of the world and our simulation-oriented view of the world may coincide. In chapter 4 we will show how the situation calculus can be used to control a character's behavior in a very direct way. That is, it can be conveniently used right down to the locomotion level. We shall demonstrate how the nondeterminism in our specification language can be used to succinctly specify behaviors. An interpreter for our language can then automatically search for behaviors that conform to the specification. We conclude the chapter with a discussion of an exciting application of our work to cinematography.
Naively applying the situation calculus to physics-based applications leads to problems. In particular, it is unlikely that we would want to produce a complete axiomatization of the virtual world's complicated causal laws as embodied in the state equations. Therefore we use the situation calculus to model the agent's high-level knowledge of some of the relevant causal relationships in its domain. For the sake of realism, efficiency, or both, we may choose to leave some of those relationships unspecified. For example, we probably don't want to axiomatize the laws of physics by which a ball in our domain moves. It thus becomes imperative that we incorporate sensing to obtain information about those aspects we choose to omit. Chapter 5 therefore seeks to exemplify the use of sensing to allow reasoning in unpredictable worlds. We give an example application of a physics-based underwater simulated world that is populated by, among other things, merpeople. The underlying physical model is that of [113]. The underlying model gives us the required level of unpredictability and allows us to produce visually appealing animations. Currently the merpeople can hide from predators (e.g. a large shark) behind obstacles that they "reason" will obscure them from the predator's view.
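To give a flavor of the kind of geometric reasoning the hiding behavior involves, here is a simple 2D sketch of an occlusion test. This is our own illustration under our own assumptions (circular obstacles, straight-line visibility), not the merpeople implementation: an obstacle hides the prey if the predator-to-prey line of sight passes through the obstacle.

```python
# Hypothetical sketch: a 2D line-of-sight occlusion test for hiding behavior.
import math

def hidden_by(predator, prey, obstacle_center, obstacle_radius):
    """True if a circular obstacle blocks the segment from predator to prey."""
    (px, py), (qx, qy), (cx, cy) = predator, prey, obstacle_center
    dx, dy = qx - px, qy - py
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return False                      # predator and prey coincide
    # Closest point on the sight segment to the obstacle's center.
    t = max(0.0, min(1.0, ((cx - px) * dx + (cy - py) * dy) / seg_len2))
    nx, ny = px + t * dx, py + t * dy
    # Blocked iff that point lies inside the obstacle.
    return math.hypot(cx - nx, cy - ny) <= obstacle_radius

shark, merperson = (0.0, 0.0), (10.0, 0.0)
assert hidden_by(shark, merperson, (5.0, 0.5), 1.0)      # rock blocks the view
assert not hidden_by(shark, merperson, (5.0, 3.0), 1.0)  # rock is off to the side
```

A character could evaluate such a test against each nearby obstacle to choose a position it "reasons" will obscure it from the predator, without simulating the predator's full sensory apparatus.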
Chapter 6 concludes the document with a retrospective look at what has been accomplished, and points out some promising directions for future work.
Hardcore AI for Computer Games and Animation
SIGGRAPH 98 Course Notes (Part II)
John David Funge, 1998

For applications in computer... [10]

... the character does not maintain an explicit model of its view of the world. Maintaining a suitable representation allows high-level intuitive commands and queries to be formulated. As we shall show, this