Morphing is a good technique for producing complex deformations, such as facial expressions, that are not easily reproduced by simple transformations. The downside of morphing is that it takes a lot of memory per keyframe, so the number of base shapes should be kept relatively small for complex meshes. Alternatively, morphing can be used to define lots of keyframes for very simple meshes to get fairly complex animation that is still computationally cheap to render in real time.
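In M3G, this kind of animation maps directly to the MorphingMesh class, which blends a base VertexBuffer with a set of morph targets. The sketch below is illustrative; 'base', 'smileTarget', 'blinkTarget', 'indices', and 'appearance' are assumed to have been created elsewhere.

    // blend a base shape with two morph targets (javax.microedition.m3g)
    VertexBuffer[] targets = { smileTarget, blinkTarget };
    MorphingMesh face = new MorphingMesh(base, targets, indices, appearance);
    // the result is the base mesh plus the weighted delta of each target
    face.setWeights(new float[] { 0.75f, 0.0f });  // 75% smile, eyes open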
4.2.2 SKINNING
For complex characters with lots of vertices and more or less arbitrary numbers of possible poses, morphing quickly becomes inefficient. Another way to deform meshes is to assign vertices to the joints of an articulated skeleton, animate them, and connect the vertices with a skin of polygons [Cat72]. However, that still leads to sharp changes at joints. During the 1990s the gaming and 3D modeling industry generalized this approach and started calling it skinning [Lan98]. The idea is that each vertex can be associated with several joints or bones, weighted by linear weights. This technique is sometimes referred to as subspace surface deformation, or linear blend skinning; we simply call it skinning. It is so commonly used today that we can call it the de facto standard of character animation.
The general idea behind skinning is that instead of transforming the whole mesh with a single transformation matrix, each vertex is individually transformed by a weighted blend of several matrices, as shown in Figure 4.7. By assigning different weights to different vertices, we can simulate articulated characters with soft flesh around rigid bones.
The skeleton used in skinning stands for a hierarchy of transformations. An example hierarchy can be seen in Figure 4.7. The pelvis is the root node, and the rest of the body parts are connected to each other so that the limbs extend deeper into the hierarchy. Each bone has a transformation relative to the parent node—usually at least translation and rotation, but scaling can be used, for example, for cartoon-like animation. The hierarchy also has a rest pose (also known as bind pose) in which the bone transformations are such that the skeleton is aligned with the untransformed mesh.
Having the skeleton hierarchy, we can compute transformations from the bones to the common root node. This gives us a transformation matrix Tᵢ for each bone i. The matrices for the rest pose are important, and we denote those Bᵢ. The relative transformation that takes a rest pose B to a target pose T is TB⁻¹. From this, and allowing a vertex v to have a weighted influence wᵢ from several bones, we get the skinning equation for a transformed vertex v′:

    v′ = Σᵢ wᵢ Tᵢ Bᵢ⁻¹ v
Note that we can either transform the vertex with each matrix, then compute a blend of the transformed vertices, or compute a blend of the matrices and transform the vertex just once using the blended matrix. The latter can in some cases be more efficient if the inverse transpose matrix is needed for transforming vertex normals. Also, the modelview matrix can be premultiplied into each matrix Tᵢ to avoid doing the camera transformation as a separate step after the vertex blending.

Figure 4.7: Left: a skeletally animated, or skinned, character. Each arrow ends in a joint, and each joint has a bone transformation, usually involving at least translation and rotation. Right: a close-up of one animated joint, demonstrating vertex blending. The vertices around the joint are conceptually transformed with both bone transformations, resulting in the positions denoted by the thin lines and black dots. The transformed results are then interpolated (dotted line, white dots) to obtain the final skin (thick lines).
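The blended-matrix variant is easy to express in code. Below is a minimal software-skinning sketch in Java; it is not M3G API code (M3G hides skinning inside its SkinnedMesh class), and the data layout, the names, and the transformPoint() helper are all assumptions made for illustration.

    // skinMatrix[i] is assumed to hold the 4x4 product Ti * inverse(Bi) as a
    // row-major float[16]; each vertex has bonesPerVertex (index, weight) pairs.
    float[] blended = new float[16];
    for (int v = 0; v < numVertices; v++) {
        java.util.Arrays.fill(blended, 0.0f);
        for (int b = 0; b < bonesPerVertex; b++) {
            float w = weight[v][b];              // weights sum to 1 per vertex
            int bone = boneIndex[v][b];
            for (int k = 0; k < 16; k++)         // M = sum over i of wi Ti Bi^-1
                blended[k] += w * skinMatrix[bone][k];
        }
        // blend the matrices first, then transform the vertex only once
        transformPoint(blended, restPosition[v], skinnedPosition[v]);
    }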
With hardware-accelerated skinning, using either vertex shaders or the OpenGL matrix palette extension (Section 10.4.3), the vertices will be transformed each time the mesh is rendered. With multi-pass rendering in particular, the mesh will therefore be transformed multiple times. A software implementation can easily perform the calculations only when necessary and cache the results, but this will still place a considerable burden on the CPU. As an animated mesh typically changes for each frame, there is usually no gain from using software skinning if hardware acceleration is available, but it is worth keeping the option in mind for special cases.
The animation for skinning can come from a number of sources. It is possible to use keyframe animation to animate the bones of the skeleton, with the keyframes modeled by hand or extracted from motion capture data. Another possibility is to use physics-based animation. Rigid body dynamics [Len04] are often used to produce "ragdoll" effects, for example when a foe is gunned down and falls spectacularly from a height in a shooter game. Inverse kinematics (IK) [FvFH90, WW92] can also be used to make hands touch scene objects, align feet with the ground, and so forth. Often a combination of these techniques is used, with keyframe animation driving the normal motion, rigid body dynamics stepping in for falling and other special effects, and IK making small corrections to avoid penetrating scene geometry.
4.2.3 OTHER DYNAMIC DEFORMATIONS
Naturally, dynamic deformation of meshes need not be limited to morphing and skinning. As we can apply arbitrary processing to the vertices, either in the application code or, more commonly, in graphics hardware, almost unlimited effects are possible.
One common example of per-vertex animation is water simulation. By applying displacements to each vertex based on a fluid simulation model, a convincing effect can be created.
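As a concrete illustration, the sketch below displaces the height of each vertex in a water mesh; a simple sum of sine waves stands in for a real fluid model, and all the names (positionArray, positions, baseX, baseZ, time) are assumptions.

    // Displace the y component of each vertex, then write the results back
    // into the M3G position VertexArray for the next frame.
    for (int i = 0; i < numVerts; i++) {
        float h = 0.2f * (float) Math.sin(0.8f * baseX[i] + time)
                + 0.1f * (float) Math.sin(1.3f * baseZ[i] + 0.7f * time);
        positions[3 * i + 1] = (short) (h * 256.0f);  // quantized height
    }
    positionArray.set(0, numVerts, positions);        // re-upload the vertices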
Different kinds of physics-based deformation effects include soft body modeling, whereby the mesh deforms upon contact based on, for example, a mass-and-spring simulation. A variation of this is cloth modeling, where air density plays a more important role. The details of creating these and other effects are beyond the scope of this book. For further information, refer to the bibliography ([WW92, EMP+02]).
Once the vertex data is dynamically modified by the application, it needs to be fed to the rendering stage. Most graphics engines prefer static vertex data, which allows for optimizations such as precomputing bounding volumes or optimizing the storage format and location of the data. Vertex data that is dynamically uploaded from the application prohibits most such optimizations, and it also requires additional memory bandwidth to transfer the data between application and graphics memory. Therefore, there is almost always some performance reduction associated with dynamically modifying vertices. The magnitude of this performance hit can vary greatly by system and application—for example, vertex shaders in modern GPUs can perform vertex computations more efficiently than application code because there is no need to move the data around in memory, and the GPU has an instruction set optimized for that particular task. This is also the reason that modern rendering APIs, including both OpenGL ES and M3G, have built-in support for the basic vertex deformation cases—to enable the most efficient implementation for the underlying hardware.
SCENE MANAGEMENT
By dealing with individual triangles, matrices, and disparate pieces of rendering state, you are in full control of the rendering engine and will get exactly what you ask for. However, creating and managing 3D content at that level of detail quickly becomes a burden; this typically happens when cubes and spheres no longer cut it, and graphic artists need to get involved. Getting their animated object hierarchies and fancy materials out of 3ds Max or Maya and into your real-time application can be a big challenge. The task is not made any easier if your runtime API cannot handle complete objects, materials, characters, and scenes, together with their associated animations. The artists and their tools deal with higher-level concepts than triangle strips and blending functions, and your runtime engine should accommodate that.

Raising the abstraction level of the runtime API closer to that of the modeling tools facilitates a content-driven approach to development, where designers can work independently of programmers, but it has other benefits as well. It flattens the learning curve, reduces the amount of boilerplate code, eliminates many common sources of error, and in general increases the productivity of both novice and expert programmers.

A high-level API can also result in better performance, particularly if you are not already a 3D guru with in-depth knowledge of all the software and hardware configurations that your application is supposed to be running on.
In this chapter, we take a look at how 3D objects are composed, how the objects can be organized into a scene graph, and how the scene graph can be efficiently rendered and updated. Our focus is on how these concepts are expressed in M3G, so we do not cover the whole spectrum of data structures that have been used in other systems or that you could use in your own game engine. For the most part, we will use terminology from M3G.
5.1 TRIANGLE MESHES
A 3D object combines geometric primitives and rendering state into a self-contained visual entity that is easier to animate and interact with than the low-level bits and pieces are. 3D objects can be defined in many ways, e.g., with polygons, lines, points, Bézier patches, NURBS, subdivision surfaces, implicit surfaces, or voxels, but in this chapter we concentrate on simple triangle meshes, as they are the only type of geometric primitive supported by M3G.
A triangle mesh consists of vertices in 3D space, connected into triangles to define a surface, plus associated rendering state to specify how the surface is to be shaded. The structure of a triangle mesh in M3G is as shown in Figure 5.1: vertex coordinates, other per-vertex attributes, and triangle indices are stored in their respective buffers, while rendering state is aggregated into what we call the appearance of the mesh. Although this exact organization is specific to M3G, other scene graphs are usually similar. We will explain the function of each of the mesh components below.
VertexBuffers are used to store per-vertex attributes, which, in the case of M3G, include vertex coordinates (x, y, z), texture coordinates (s, t, r, q), normal vectors (nx, ny, nz), and colors (R, G, B, A). Note that the first two texture coordinates (s, t) are enough for typical use cases, but three or four can be used for projective texture mapping and other tricks.

Figure 5.1: The components of a triangle mesh in M3G: a Mesh references a VertexBuffer, which groups VertexArrays of coordinates, normals, colors, and texture coordinates, along with one or more IndexBuffers and their associated Appearances.
The coordinates and normals of a triangle mesh are given in its local coordinate system—object coordinates—and are transformed into eye coordinates by the modelview matrix. The mesh can be animated and instantiated by changing the modelview matrix between frames (for animation) or between draw calls (for instantiation). Texture coordinates are also subject to a 4 × 4 projective transformation. This allows you to scroll or otherwise animate the texture, or to project it onto the mesh; see Section 3.4.1 for details.
IndexBuffers define the surface of the mesh by connecting vertices into triangles, as shown in Figure 5.2. OpenGL ES defines three ways to form triangles from consecutive indices—triangle strips, lists, and fans—but M3G only supports triangle strips. There may be multiple index buffers per mesh; each buffer then defines a submesh, which is the basic unit of rendering in M3G. Splitting a mesh into submeshes is necessary if different parts of the mesh have different rendering state; for example, if one part is translucent while others are opaque, or if the parts have different texture maps.
Figure 5.2: Triangle meshes are formed by indexing a set of vertex arrays. Here the triangles are organized into a triangle list, i.e., every three indices define a new triangle. For example, triangle T2 is formed by vertices 2, 4, and 3.

The Appearance defines how a mesh or submesh is to be shaded, textured, blended, and so on. The appearance is typically divided into components that encapsulate coherent subsets of the low-level rendering state; Figure 5.3 shows how this was done for M3G. The appearance components have fairly self-explanatory names: the Texture2D object, for instance, contains the texture blending, filtering, and wrapping modes, as well as the 4 × 4 texture coordinate transformation matrix. The texture image is included by reference, and stored in an Image2D object. Appearances and their component objects can be shared between an arbitrary number of meshes and submeshes in the scene graph. The appearance components of M3G are discussed in detail in Chapter 14.

Figure 5.3: The appearance components in M3G. Implementations may support an arbitrary number of texturing units, but the most common choice (two units) is shown in this diagram.
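To make the structure concrete, here is a minimal sketch of assembling these components into a renderable M3G Mesh; the quad geometry and the default Appearance are illustrative.

    // javax.microedition.m3g classes; a unit quad as a single triangle strip
    short[] quad = { 0,0,0,  1,0,0,  0,1,0,  1,1,0 };
    VertexArray positions = new VertexArray(4, 3, 2);  // 4 vertices, xyz, 16-bit
    positions.set(0, 4, quad);

    VertexBuffer vertices = new VertexBuffer();
    vertices.setPositions(positions, 1.0f, null);      // scale 1.0, no bias

    // one strip of four indices; strips are the only primitive M3G supports
    IndexBuffer submesh = new TriangleStripArray(new int[] { 0, 1, 2, 3 },
                                                 new int[] { 4 });

    Appearance appearance = new Appearance();          // default rendering state
    Mesh mesh = new Mesh(vertices, submesh, appearance);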
5.2 SCENE GRAPHS
Rendering a single 3D object may be useful in a demo or a tutorial, but to create something more exciting you will need a number of 3D objects in a particular spatial and logical arrangement—a 3D scene.

3D scenes can be organized into many different data structures that are collectively referred to as scene graphs. The term is decidedly vague, covering everything from simple lists of objects up to very sophisticated spatial databases. In this section we aim to characterize the design space of scene graphs, progressively narrowing down our scope to the small subset of that space that is relevant for M3G.
5.2.1 APPLICATION AREA
When setting out to design a scene graph system, the first thing to decide is what it is for. Is it for graphics, physics, artificial intelligence, spatial audio, or a combination of these? Is it designed for real-time or offline use, or both? Is it for a specific game genre, such as first-person shooters or flight simulators, or maybe just one title? A unified scene representation serving all conceivable applications would certainly be ideal, but in practice we have to specialize to avoid creating a huge monolithic system that runs slowly and is difficult to use.
Typical scene graphs strike a balance by specializing in real-time animation and rendering, but not in any particular application or game genre. This is also the case with M3G. Physics, artificial intelligence, audio, user interaction, and everything else is left for the user, although facilitated to some extent by the ability to store metadata and invisible objects into the main scene graph. Adjunct features such as collision detection are included in some systems to serve as building blocks for physics simulation, path finding, and so on. M3G does not support collision detection, but it does provide for simple picking—that is, shooting a ray into the scene to see which object and triangle it first intersects. This can be used as a replacement for proper collision detection in some cases.
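In M3G, picking is exposed through the pick() methods of Group nodes. The sketch below assumes a scene root 'world' and a 'camera' that have been set up elsewhere; it uses the pick() variant that casts a ray through a point on the viewport, with coordinates in [0, 1].

    RayIntersection ri = new RayIntersection();
    if (world.pick(-1, 0.5f, 0.5f, camera, ri)) {  // ray through screen center
        Node hit = ri.getIntersected();            // the Mesh or Sprite3D hit
        float distance = ri.getDistance();         // distance along the ray
    }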
5.2.2 SPATIAL DATA STRUCTURE
Having decided to go for a rendering-oriented scene graph, the next step is to pick the right spatial data structure for our system. The application areas or game genres that we have in mind play a big role in that decision, because there is no single data structure that would be a perfect fit for all types of 3D scenes.

The main purpose of a spatial data structure in this context is visibility processing, that is, quickly determining which parts of the scene will not contribute to the final rendered image. Objects may be too far away from the viewer, occluded by a wall, or outside the field of view, and can thus be eliminated from further processing. This is called visibility culling. In large scenes that do not fit into memory at once, visibility processing includes paging, i.e., figuring out when to load each part of the scene from the mass storage device, and which parts to remove to make room for the new things.
Depending on the type of scene, the data structure of choice may be a hierarchical space partitioning scheme such as a quadtree, octree, BSP tree, or kd-tree. Quadtrees, for example, are a good match with terrain rendering. Some scenes might be best handled with portals or precomputed potentially visible sets (PVS). Specialized data structures are available for massive terrain scenes, such as those in Google Earth. See Chapter 9 of Real-Time Rendering [AMH02] for an overview of these and other visibility processing techniques.
Even though this is only scratching the surface, it becomes clear that having built-in support for all potentially useful data structures in the runtime engine is impossible. Their sheer number is overwhelming, not to mention the complexity of implementing them. Besides, researchers around the world are constantly coming up with new and improved data structures.
The easy way out, taken by M3G and most other scene graphs, is to not incorporate any spatial data structures beyond a transformation hierarchy, in which scene graph nodes are positioned, oriented, and otherwise transformed with respect to their scene graph parents. This is a convenient way to organize a 3D scene, as it mirrors the way that things are often laid out in the real world—and more important, in 3D modeling tools.
The solar system is a classic example of hierarchical transformations: the moons orbit the planets, the planets orbit the sun, and everything revolves around its own axis. The solar system is almost trivial to set up and animate with hierarchical transformations, but extremely difficult without them. The human skeleton is another typical example.
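As a sketch of how such a hierarchy might be set up with M3G Group nodes, consider a sun and an orbiting earth; the node names, the distance, and the angle variables are illustrative.

    Group sun = new Group();
    Group earthPivot = new Group();            // spins to carry the earth around
    Group earth = new Group();
    earth.setTranslation(50.0f, 0.0f, 0.0f);   // offset from the sun
    earthPivot.addChild(earth);
    sun.addChild(earthPivot);

    // per frame, orbiting and spinning are just local rotations
    earthPivot.setOrientation(orbitAngle, 0, 1, 0);  // orbit around the sun
    earth.setOrientation(spinAngle, 0, 1, 0);        // spin around its own axis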
Visibility processing in M3G is limited to view frustum culling that is based on a bounding volume hierarchy; see Figure 5.4. While the standard does not actually say anything about bounding volumes or visibility processing, it appears that all widely deployed implementations have independently adopted similar means of hierarchical view frustum culling. We will discuss this in more detail in Section 5.3.
Implementing more specialized or more advanced visibility processing is left for the user. Luckily, this does not mean that you would have to ditch the whole scene graph and start from scratch if you wanted to use a quadtree, for instance. You can leverage the built-in scene tree as a basis for any of the tree structures mentioned above. Also, the same triangle meshes and materials can often be used regardless of the higher-level data structure.
The fact that typical scene graphs are geared toward hierarchical view frustum culling and transformations is also their weakness. There is an underlying assumption that the scene graph structure is a close match to the spatial layout of the scene. To put it another way, nodes are assumed to lie close to their siblings, parents, and descendants in world space. Violating this assumption may degrade performance. If this were not the case, you might want to arrange your scene such that all nonplayer characters are in the same branch of the graph, for instance.
Figure 5.4: A bounding volume hierarchy (BVH) consisting of axis-aligned bounding boxes, illustrated in two dimensions for clarity. The bounding volume of node A encloses the bounding volumes of its children.

The implicit assumption of physical proximity may also cause you trouble when nodes need to be moved with respect to each other. For instance, characters in a game world may be wandering freely from one area to another. The seemingly obvious solution is to relocate the moving objects to the branches that most closely match their physical locations. However, sometimes it may be difficult to determine where each object should go. Structural changes to the scene graph may not come for free, either.
5.2.3 CONTENT CREATION
Creating any nontrivial scene by manually typing in vertices, indices, and rendering state bits is doomed to failure. Ideally, objects and entire scenes would be authored in commercial or proprietary tools, and exported into a format that can be imported by the runtime engine. M3G defines its own file format to bridge the gap between the runtime engine and DCC tools such as 3ds Max, Maya, or Softimage; see Figure 5.5. The file format is a precise match with the capabilities of the runtime API, and supports a reasonable subset of popular modeling tool features.

From the runtime engine's point of view, the main problem with DCC tools is that they are so flexible. The scene graph designer is faced with an abundance of animation and rendering techniques that the graphics artists would love to use, but only a fraction of which can be realistically supported in the runtime engine. See Figure 5.6 to get an idea of the variety of features that are available in a modern authoring tool.
Figure 5.5: A typical M3G content production pipeline: content is exported from a DCC tool into an intermediate format (e.g., COLLADA), run through an optimizer and converter into the delivery format (M3G), and finally brought into the runtime scene graph by the M3G loader. None of the publicly available exporters that we are aware of actually use COLLADA as their intermediate format, but we expect that to change in the future.