
Contents

1 Introduction
4 Game Technology Evolution
6 Architecture Overview
7 Architecture Details
8 Low-level Subsystem Specifications
10 The ITER Interface
12 The Particle System
13 Networking
14 The Command Console

PART III: Conclusion

15 Concluding Remarks
16 Bibliography

Design and Architecture

of a Portable and Extensible Multiplayer 3D Game Engine

by Markus Hadwiger

Institute of Computer Graphics Vienna University of Technology


I would like to dedicate this thesis and the corresponding work to my parents, Dr. Alois and Ingrid Hadwiger, without whom none of this would have been possible.


This thesis is about the design and architecture of a 3D game engine – the technology behind a computer game featuring three-dimensional graphics.

Over the last couple of years, the development of 3D computer games and graphics research have come ever closer to each other and have already merged in many respects. Game developers are increasingly utilizing the scientific output of the research community, and graphics researchers have discovered that computer games have become an important application area of their scientific work.

As the technology employed by computer games becomes more and more involved, the design and architecture of the underlying engines attains a crucial role. Increasingly, extremely modular designs encapsulating system dependencies are chosen in order to allow for portability from one target platform to another. Extensibility has also become important, since many game engines are used for more than one game. It also helps foster a community of game players that develops extensions for their favorite game and keeps the game alive for a much longer period of time.

The topic of this thesis is a particular game engine – the Parsec Engine – focusing on architectural and design aspects. This engine emphasizes and allows for portability by employing a very modular architecture. It is also a multiplayer game engine, i.e., it contains a powerful networking component through which multiple players can play together.

Extensibility is achieved in many respects, although the Parsec Engine is not a completely general game engine aiming to be a framework for every kind of game imaginable; rather, it attempts to achieve high performance for a specific type of game. Since it has been developed in close conjunction with an actual computer game – Parsec, a multiplayer 3D space combat game – it favors outer space settings with spacecraft.

While the focus is on architecture and design, another goal has been to describe crucial aspects at a level of detail that allows for using the engine to develop new game play features and special effects, as well as extending the engine itself.


This diploma thesis deals with the design and architecture of a 3D computer game engine – the technology behind a computer game that uses three-dimensional graphics.

In recent years, the development of 3D computer games and computer graphics research have moved ever closer together and have already merged in many respects. Game developers make increasing use of the scientific results of the research community, and researchers in computer graphics have discovered that computer games have meanwhile become an important application area of their work.

As the technology used in computer games becomes ever more complex, the design and architecture of the underlying engines take on an ever more important role. Increasingly, extremely modular designs are chosen that encapsulate dependencies on the actual computer system and thus enable portability from one target platform to another. Extensibility has become just as important, since many game engines are used for more than a single game. This property also helps build player communities in which players develop extensions for their favorite game and thus keep it alive for a much longer period of time.

The subject of this diploma thesis is one particular game engine, the “Parsec Engine”, with the focus on architecture and design. This engine emphasizes and enables portability by employing a very modular architecture. Furthermore, it is a multiplayer engine, i.e., it contains a powerful networking component through which several players can play together.

Extensibility has been achieved in many respects, although the Parsec Engine is not a completely general engine aiming to be usable for all possible kinds of games; rather, it attempts to achieve high performance for a particular type of game. Since it was developed closely together with a complete computer game – Parsec, a multiplayer 3D space combat game – it favors outer-space scenes with spaceships. Although the main emphasis is on architecture and design, it was also a goal to describe the most essential aspects at a level of detail that allows the engine to be used for developing new game features and special effects, and also to extend the engine itself.


I would like to express a very special thank you to my “coach” Dieter Schmalstieg for supervising this thesis, supporting my vision and approach to it, many helpful comments and remarks, and putting up with me as a student in general. And I will not forget my promise of an extended session of playing Parsec.

Since Parsec has become a real team effort over the years, I would like to express my thanks and admiration to all the talented members of the Parsec team for their great work and dedication:

Andreas Varga, Clemens Beer, and Michael Wögerbauer, who are all great programmers and have contributed a lot to the spirit of the Parsec project.

Alex Mastny, who has truly astounded me with his ability to create ever more impressive artwork and designs.

Stefan Poiss, whose musical abilities are still a miracle to me after all those years; the resulting music always managed to cheer me up and motivate me.

Karin Baier, for giving voice to an entire universe.

Thanks to Thomas Bauer for support in numerous and invaluable ways, and Uli Haböck for many insightful discussions on mathematics late at night.

Thanks to Thomas Theußl for tolerating the presence of gametech-talking nerds invading his office, occupying his machine, and for always being there to share a cup of hot chocolate.

Thanks to Karin Kosina for a fine sense of humor, her spirit, and her outstanding taste in music – Kaos Keraunos Kybernetos!

Thanks to Helwig Hauser for prodding me to give numerous talks, CESCG, and the Computer Graphics Group – la tante est morte, vive la tante!

Thanks to Gerd Hesina for a lot of help on numerous occasions.

Thanks to Anna Vilanova i Bartrolí and Zsolt Szalavári for being a lot of fun.

Last, but not least, I would like to thank all the Parsec fans out there, who are eagerly awaiting the final release – we are working on it!


1 Introduction

1.1 3D computer game engines and graphics research

About ten years ago, computer graphics research and computer games development were two totally separate areas with not much common ground. Computer games have always been almost exclusively targeted at the consumer market, and the graphics capabilities of consumer hardware at that time were practically non-existent from the point of view of computer graphics researchers. Thus, researchers were using expensive high-end graphics workstations, and game developers were targeting entirely different platforms. This had the implication that algorithms and techniques developed and used by the computer graphics research community could not really be employed by game developers. Graphics researchers, on the other hand, also did not view computer games as an application area of their scientific work.

However, the last decade, and the last couple of years in particular, have seen low-cost consumer graphics hardware reach – and in many areas even surpass – the capabilities of extremely expensive high-end graphics hardware from just a short time ago. This has led to computer game developers increasingly utilizing the scientific output of the research community, with an ever diminishing delay between the introduction of a new technique and its actual use in a consumer product. Computer games research and development has thus become a very important application area for graphics researchers.

The game engines driving today’s 3D computer games are indeed an incredibly interesting venue where computer graphics research and engineering meet. However, although the graphics component of current game engines is probably the most visible one, a game engine is not only graphics. One could say that the term game engine subsumes all the technology behind a game – the framework which allows an actual game to be created, but viewed separately from the game’s content. This also includes networking, AI, scripting languages, sound, and other technologies.

Figure 1.1: Two highly successful game engines. Left: Quake. Right: Unreal.

In addition to driving highly successful games in their own right, the most prominent game engines of the last several years, like the Quake [Quake] and Unreal [Unreal] engines, have been remarkable licensing successes. Although both were originally created for a particular game, these engines are ideal for the creation of other games with entirely new content like game play, graphics, sound, and so forth. In fact, the effort necessary to create a top-notch game engine has become so tremendous that it is also very sensible from a business point of view to make one’s engine available for licensing to other companies, allowing them to focus more on actual content creation than on the technology itself. This is even more important for companies that are simply not able to develop an entire engine and a game from scratch at the same time than for the company licensing out its engine.

In contrast to game engines that have been developed for a specific game or series of games, a small number of companies focus on developing only an engine, without also creating a game at the same time. An example of this approach of creating flexible game engine technology exclusively for licensing is Numerical Design Ltd.’s NetImmerse [NetImm]. Due to its flexibility, such an engine could be called a generic game engine, whereas the Quake engine, for instance, is definitely more bound to an actual type of game, or at least a specific gaming genre. On the other hand, this flexibility is also one of the major problems inherent in generic engines. Since they are not tightly focused on a rather narrowly defined type of game, they are not easily able to achieve the levels of performance of engines specifically tailored to a certain genre. Another problem of generic engines is that most game development houses prefer licensing an engine with already proven technology, i.e., a highly successful game that has sold at least on the order of one million units.

1.2 What this thesis is about

This thesis contains a lot of material pertaining to the architecture of computer game engines in general, but most of all it is about the design and architecture of a specific engine. This engine – the Parsec engine – is a highly portable and extensible multiplayer game engine. The design of this engine was guided by certain key design criteria like modularity, flexibility, extensibility, and – most of all – portability.

As opposed to generic game engines, the Parsec engine belongs to the class of game- or genre-bound engines. This is also to say that it has been developed in parallel with an actual computer game, not coincidentally called Parsec – there is no safe distance [Parsec].

Figure 1.2: Parsec – there is no safe distance

Parsec is a multiplayer 3D space combat game, specifically targeted at Internet game play. Basically, players can join game play sessions in solar systems, and also jump from solar system to solar system using so-called stargates interconnecting them. The emphasis is on fast-paced action, spectacular graphics, and background sound and music supporting the mood of the vast outer space setting. Chapter 5 contains a detailed overview of the game and its underlying concepts.

The Parsec engine offers high-performance graphics utilizing an abstract rendering API, and a modular subsystem architecture encapsulating system dependencies in order to achieve high portability. The networking subsystem is also independent from the underlying host system and transport protocol used. The current implementation supports Win32, Linux, and the MacOS as host systems, OpenGL and Glide as graphics APIs, and TCP/IP and IPX as networking protocols. Since one of the major goals in the development of the Parsec engine was to achieve high portability, porting to additional host systems, graphics APIs, and networking protocols is a rather straightforward process.

In the description of the Parsec engine we will emphasize architectural and design aspects. Nevertheless, we have also tried to provide sufficient detail in order to support developers interested in using, extending, and adding to Parsec, the game, as well as the Parsec engine itself.


1.3 How this thesis is structured

This thesis consists of three major parts.

Part One – Foundations and Related Work – covers fundamental issues like some very important graphics algorithms used in many game engines. It also contains a short review of and introduction to the most important low-level graphics APIs used in 3D computer games, like OpenGL and Glide. The related-work aspect is most prevalent in an overview of the evolution of 3D computer games and consumer 3D hardware over the last decade. In this part we will cover a lot of background material about computer games and the algorithms and techniques they use or have used in the past.

Part Two – Design and Architecture of Parsec – covers the main topic of this thesis, the design and architecture of the Parsec engine. Although the focus is on architectural and design aspects, many implementation issues are covered in detail. The key design goal of portability is central to the entire architecture of the engine. We will first review the game Parsec, then give an overview of its architecture, and continue with more details about the architecture and its implementation. After these more general parts we will devote a single chapter to each of the major engine components.

Part Three – Conclusion – concludes with a comprehensive bibliography and offers some concluding remarks.

This thesis is also intended to provide background and reference information to developers intending to work with the Parsec SDK.

The implementation of all of these components currently consists of about 180,000 lines of code, written in a C-like subset of C++, contained in about 900 source files, which amount to about 7 MB of code. Supported host systems are Win32, MacOS, and Linux. Both OpenGL and Glide (3dfx) are supported as target graphics APIs on all of these host systems. The available implementations of the networking component support the TCP/IP suite of protocols and Novell’s IPX.

The code is very modular, especially with respect to portability between different host systems and graphics APIs. It transparently supports both little-endian (e.g., Intel x86) and big-endian (e.g., PowerPC) systems, and encapsulates many other details differing between target platforms.
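The kind of endianness encapsulation described here can be illustrated with a small sketch. The names below are illustrative, not the engine's actual API; the idea is simply that data stored in files in one fixed byte order is byte-swapped only on hosts whose native order differs.

```cpp
#include <cstdint>
#include <cstring>

// Swap the byte order of a 32-bit word.
inline uint32_t SwapBytes32(uint32_t v) {
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) <<  8) |
           ((v & 0x00ff0000u) >>  8) |
           ((v & 0xff000000u) >> 24);
}

// Detect host byte order at run time by inspecting the first byte of a known value.
inline bool HostIsLittleEndian() {
    uint32_t probe = 1;
    uint8_t first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Convert a value stored on disk in little-endian order to host order.
// On a little-endian host this is the identity; on a big-endian host it swaps.
inline uint32_t LittleEndianToHost(uint32_t v) {
    return HostIsLittleEndian() ? v : SwapBytes32(v);
}
```

Wrapping every file and network read in such a conversion is what lets the rest of the code stay byte-order agnostic.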

It is contained in a hierarchically structured source tree whose layout corresponds to the subsystem architecture and the host-system and graphics-API targets, as will be described in the main part of the thesis (Part Two). Since the system-dependent parts are cleanly separated and isolated, it is easily possible to extract only those source files needed for a specific target platform and/or graphics API.

A more detailed overview of the implementation is contained in chapter 5, as well as in the following chapters on architecture and the chapters devoted to specific subsystems.


PART I:

Foundations and Related Work


… the complexity of the part of the scene that is actually visible. Algorithms with this property are called occlusion culling algorithms or output-sensitive visibility algorithms [Sud96].

In this overview, the distinction between objects and polygons in a scene is somewhat put aside, as we often assume objects to consist of polygons and consider objects at either the object level or the polygon level, as appropriate. Despite the emphasis on polygonal scenes – since they are simply the most important ones in interactive systems – some of the concepts and algorithms we discuss are also applicable to other object representations, e.g., implicit surfaces, constructive solid geometry (CSG), or objects modeled as collections of spline surface patches, if only the objects’ bounding volumes are considered.

Nevertheless, the most advanced and also most recent work mostly considers purely polygonal scenes. Of course, all algorithms are trivially applicable if a non-polygonal scene is converted to an explicit polygonal representation in a preprocessing step. Hence, we do not explicitly consider non-polygonal scene or object representations in our discussion.

There are two large steps which need to be taken in order to prune a given scene to a manageable level of complexity:

The first step is to cull everything not intersecting the viewing frustum at all. Since that part of the scene cannot contribute to the generated image in any way, it should be eliminated from processing by the graphics pipeline as early as possible. Additionally, those parts of the scene have to be culled in large chunks, because we do not want to clip everything and conclude that nothing at all is visible afterwards. Thus, there has to be some scheme to group polygons and objects together, ideally in a hierarchical fashion.

The second step, which needs to be taken after all polygons outside the viewing frustum have already been culled, is the elimination of unnecessary overdraw. Especially in complex, heavily occluded scenes¹ the number of polygons contained in the viewing frustum will still be too high to be handled by z-buffering alone. Everything still present at this step is potentially visible, but only with respect to the entire viewing frustum. A large number of polygons, depending on the scene, will be drawn into the frame buffer only to be overdrawn by polygons nearer to the viewpoint later on. And, depending on drawing order, a large number of polygons will be rasterized by the hardware only to be separately rejected at every single pixel as invisible, because a nearer polygon has already been drawn earlier on and filled the z-buffer accordingly. So, if those polygons could be identified before they are submitted to the hardware, a huge gain in performance would be possible, again depending on the scene. In heavily occluded scenes the speedup gained can be enormous. Naturally, the efficiency of this identification process is an important economical factor.

To summarize, polygons have to be culled in essentially two steps. First, everything not even partially contained in the viewing frustum is identified and culled. Second, everything that is entirely occluded by other objects or polygons is culled. The remaining polygons can be submitted to the hardware z-buffer to resolve the visibility problem at the pixel level, or any other algorithm capable of determining exact visibility can be used. Alternatives to z-buffering for resolving exact visibility are mostly important in software-only rendering systems, though.
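The frustum part of this two-step process boils down to a conservative plane-side test. As a minimal C++ sketch (illustrative names, not taken from any particular engine): a bounding box can be culled as soon as it lies entirely in the negative halfspace of any one frustum plane.

```cpp
#include <array>

// A plane ax + by + cz + d = 0, with the normal pointing into the frustum.
struct Plane { float a, b, c, d; };
// An axis-aligned bounding box.
struct AABB  { float min[3], max[3]; };

// The box is outside the plane if even its "positive vertex" (the corner
// farthest along the plane normal) lies in the negative halfspace.
inline bool BoxOutsidePlane(const AABB& box, const Plane& p) {
    float x = (p.a >= 0.0f) ? box.max[0] : box.min[0];
    float y = (p.b >= 0.0f) ? box.max[1] : box.min[1];
    float z = (p.c >= 0.0f) ? box.max[2] : box.min[2];
    return p.a * x + p.b * y + p.c * z + p.d < 0.0f;
}

// Conservative frustum test: cull only when provably outside one plane.
// Boxes straddling the frustum boundary are conservatively kept.
inline bool BoxOutsideFrustum(const AABB& box, const std::array<Plane, 6>& frustum) {
    for (const Plane& p : frustum)
        if (BoxOutsidePlane(box, p)) return true;
    return false;
}
```

The same test is what a hierarchical scheme applies at every node, so a single rejection high up in the hierarchy culls a large chunk of the scene at once.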

¹ These are scenes with a high depth complexity, i.e., if all polygons comprising such a scene are rendered into the graphics hardware’s frame buffer as seen from any given viewpoint, pixels are drawn many times over, instead of only once. If no occlusion culling is employed, each pixel is contained in the projection of many polygons, and they all have to be rendered in order to, say, have a z-buffer determine which single polygon is really visible at that particular pixel. Hence, a major goal of occlusion culling is to reduce a given scene’s depth complexity as perceived by most levels of the graphics pipeline.

Algorithms

2.1 Hierarchical Subdivision

An important scheme employed by virtually every occlusion culling algorithm is hierarchical partitioning of space. If a partition high up in such a spatial hierarchy can be classified as being wholly invisible, the hierarchy need not be descended any further at that point, and therefore anything subdividing that particular partition to a finer level need not be checked at all. If a significant part of a scene can be culled at coarse levels of subdivision, the number of primitives needing to be clipped against the viewing frustum can be greatly reduced.

Hierarchical subdivision is the most important approach to exploiting spatial coherence in a synthetic scene.

We are now going to look at some of the most common spatial partitioning schemes employing some sort of hierarchy, and their corresponding data structures.

Hierarchical Bounding Boxes

A simple scheme to impose some sort of hierarchy onto a scene is to first construct a bounding box for each object and then successively merge nearby bounding boxes into bigger ones until the entire scene is contained in a single huge bounding box. This yields a tree of ever smaller bounding boxes which can be used to discard many invisible objects at once, provided their encompassing bounding box is not visible [Möl99].
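The merging step can be sketched as follows. This is a hypothetical greedy heuristic for illustration only (the text notes that good merge heuristics are hard to pin down): repeatedly merge the pair of boxes whose union has the smallest volume, until a single root box encloses the scene.

```cpp
#include <algorithm>
#include <vector>

struct Box { float min[3], max[3]; };

// Smallest box enclosing both inputs.
Box Merge(const Box& a, const Box& b) {
    Box r;
    for (int i = 0; i < 3; ++i) {
        r.min[i] = std::min(a.min[i], b.min[i]);
        r.max[i] = std::max(a.max[i], b.max[i]);
    }
    return r;
}

float Volume(const Box& b) {
    return (b.max[0] - b.min[0]) * (b.max[1] - b.min[1]) * (b.max[2] - b.min[2]);
}

// Greedy bottom-up merging; O(n^3) as written, which is fine for a sketch.
// A real builder would also record the tree structure, not just the root.
Box BuildRoot(std::vector<Box> boxes) {
    while (boxes.size() > 1) {
        std::size_t bi = 0, bj = 1;
        float best = Volume(Merge(boxes[0], boxes[1]));
        for (std::size_t i = 0; i < boxes.size(); ++i)
            for (std::size_t j = i + 1; j < boxes.size(); ++j) {
                float v = Volume(Merge(boxes[i], boxes[j]));
                if (v < best) { best = v; bi = i; bj = j; }
            }
        boxes[bi] = Merge(boxes[bi], boxes[bj]);   // replace pair by its union
        boxes.erase(boxes.begin() + bj);
    }
    return boxes.front();
}
```

Minimizing union volume is only one possible criterion; as the following paragraph explains, any such heuristic will favor some viewpoints and perform badly for others.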

However, this simple approach is not very structured and systematic, and therefore useful heuristics as to which bounding boxes to merge at each level are difficult to develop. The resulting hierarchy also tends to perform well for certain viewpoints and extremely badly for other locations of a synthetic observer.

To summarize, the achievable speedup for rendering complex scenes by using such a scheme alone for occlusion culling is highly dependent on the given scene and, worse, on the actual viewpoint, and is generally rather unpredictable; therefore, simple hierarchical bounding boxes are very often not sufficient as the sole approach to occlusion culling, especially if observer motion is desired. That said, some variant of this idea can nevertheless be very useful when used to provide auxiliary information or setup data.

Octrees

The octree is a well-known spatial data structure that can be used to represent a hierarchical subdivision of three-dimensional space. At the highest level, a single cube represents the entire space of interest. A cube for which a certain subdivision criterion is not yet reached is further subdivided into eight equal-sized cubes – their parent cube’s octants. This process can be naturally represented by a recursive data structure commonly called an octree. Each node of an octree has from one to eight children if it is an internal node; otherwise it is a leaf node.

In the context of occlusion culling, octrees are usually used to hierarchically partition object or world space, respectively. For example, each object could be associated with the smallest fully enclosing octree node. Culling against the viewing frustum can then be done by intersecting frustum and octree, culling everything contained in non-intersecting nodes. This intersection operation can also easily exploit the hierarchy inherent in the octree. First, the root node is checked for intersection. If it intersects the viewing frustum, its children are checked, and this process is repeated recursively for each node. A node not intersecting the viewing frustum at all can be culled together with its entire suboctree.
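The recursive descent just described can be sketched like this (illustrative names; the cube-versus-frustum test is abstracted into a predicate so that the recursion structure itself stays visible):

```cpp
#include <cstddef>

// A minimal octree node: up to eight children, plus whatever payload is
// attached to the node (here just a count of associated objects).
struct OctNode {
    OctNode* child[8] = {nullptr};
    int      objectCount = 0;
};

// Visit every node whose cube intersects the frustum; a node failing the
// visibility predicate is skipped together with its entire suboctree.
template <typename VisiblePred, typename Visit>
void CullOctree(const OctNode* node, VisiblePred cubeVisible, Visit visit) {
    if (!node || !cubeVisible(node))   // whole suboctree culled here
        return;
    visit(node);                       // node at least partially visible
    for (int i = 0; i < 8; ++i)
        CullOctree(node->child[i], cubeVisible, visit);
}
```

The payoff is exactly the property described in the text: one failed test near the root prunes everything beneath it without any further checks.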

This scheme can also be employed to cull entire suboctrees occluded by other objects, provided such an occlusion check for octree cubes is possible and can be done effectively.

One of the most important efficiency problems with using octrees for spatial subdivision of polygonal scenes is caused by the strictly regular subdivision at each level. Since the possible locations of octree nodes are rather inflexible, certain problem cases can occur. Imagine, for instance, a subdivision where each polygon is associated with the smallest enclosing octree cube. This process is very dependent on the location of each of the polygons. If a polygon is located about the center of an octree cube, it has to be associated with that particular cube, even if its size is very small compared to the size of the cube itself. Even though schemes to alleviate this problem have been proposed, this and similar problems are inherent shortcomings of a regular subdivision, even if it is hierarchical.


The two-dimensional version of an octree is called a quadtree and can be used to hierarchically subdivide image space, for instance. Incidentally, image-space pyramids and image-space quadtrees, although very similar, are not the same. A pyramid always subdivides to the same resolution in all paths, i.e., it cannot adapt to different input data as easily; in particular, it is not possible to use different resolutions for different areas of an image.

For an in-depth explanation of octrees and quadtrees, their properties and corresponding algorithms, as well as detailed descriptions of other hierarchical data structures, see [Sam90a] and [Sam90b].

k-D Trees

A k-D tree recursively subdivides k-dimensional space along axis-aligned hyperplanes, making it a more flexible relative of the quadtree and octree. k-D trees have a number of applications – e.g., n-dimensional range queries; see [Ber97], for instance – but the most important areas of application pertaining to occlusion culling are three-dimensional k-D trees to hierarchically group objects in object and world space, respectively, and two-dimensional k-D trees for image-space subdivision.

BSP Trees

Binary Space Partitioning (BSP) trees, as detailed in [Fuc80], can be seen as a generalization of k-D trees. Space is subdivided along arbitrarily oriented hyperplanes (planes in 3-D, lines in 2-D, for instance) as long as a certain termination criterion is not yet reached. The recursive subdivision of space into two halfspaces at each step produces a binary tree where each node corresponds to the partitioning hyperplane. Normally, a scene consisting of polygons is subdivided along the planes of these polygons until every polygon is contained in its own node and the children of leaf nodes are empty halfspaces. Every time space is subdivided, polygons straddling the separating plane need to be split in two. The newly created halves are assigned to the positive and negative halfspaces, respectively. This leads to the problem that the construction of a BSP tree may introduce O(n²) new polygons into a scene, although this normally only occurs for extremely unfortunate subdivisions. Primarily, a BSP tree is not used to merely partition space in a hierarchical fashion; it is a very versatile data structure that can be used in a number of ways. The most important of these is exact visibility determination for arbitrary viewpoints. For entirely static polygonal scenes, a BSP tree can be precalculated once, and its traversal at run time with respect to an arbitrary viewpoint yields a correct back-to-front or front-to-back drawing order in linear time.

At each node it must be determined whether the viewpoint lies in the node’s positive or negative halfspace. If the viewpoint is contained in the node’s positive halfspace, everything in the negative halfspace has to be drawn first. Then, the polygons² attached to the node itself may be drawn. Last, everything in the positive halfspace is drawn. This process is repeated recursively for each node and yields a correct drawing order for all polygons contained in the tree.
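This traversal can be sketched as follows (illustrative names, not any engine's actual data structures). Note how the entire back-to-front order falls out of a single plane-side test per node:

```cpp
#include <vector>

struct Vec3 { float x, y, z; };

struct BspNode {
    Vec3 normal; float d;        // partitioning plane: dot(normal, p) + d = 0
    std::vector<int> polys;      // ids of polygons lying in the plane
    BspNode* front = nullptr;    // positive halfspace
    BspNode* back  = nullptr;    // negative halfspace
};

// Signed distance of a point from the node's plane.
inline float SideOf(const BspNode& n, const Vec3& p) {
    return n.normal.x * p.x + n.normal.y * p.y + n.normal.z * p.z + n.d;
}

// Appends polygon ids in back-to-front order for the given eye position:
// far subtree first, then the node's own polygons, then the near subtree.
void BackToFront(const BspNode* node, const Vec3& eye, std::vector<int>& order) {
    if (!node) return;
    if (SideOf(*node, eye) >= 0.0f) {              // eye on the positive side
        BackToFront(node->back, eye, order);
        order.insert(order.end(), node->polys.begin(), node->polys.end());
        BackToFront(node->front, eye, order);
    } else {                                        // mirror case
        BackToFront(node->front, eye, order);
        order.insert(order.end(), node->polys.begin(), node->polys.end());
        BackToFront(node->back, eye, order);
    }
}
```

Swapping the two recursive calls in each branch would yield front-to-back order instead, which is what occlusion-culling traversals typically want.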

An important observation to make is that each node implicitly defines a convex polytope, namely the intersection of all the halfspaces induced by those hyperplanes encountered when traversing the BSP tree down to the node of interest. This property is very useful with respect to the subdivision of space into convex cells, an operation often needed by occlusion culling algorithms. If you have, for instance, a polygonal architectural model, a BSP tree can be used to subdivide the entire model into a set of convex polytopes – the cells – which, in the 3-D case, are convex polyhedra. For such an application it might be sensible not to attach single polygons to the BSP tree’s nodes, but to attach only separating planes to internal nodes, and entire convex cells together with their constituting polygons to leaf nodes. The model’s polygons would then be contained in the boundaries of those convex cells (their ‘walls’). Such a modified BSP tree approach yields explicitly described convex polytopes instead of their implicit counterparts in an ordinary BSP tree.

² A BSP tree node may contain one or more polygons as long as they are all contained in the node’s plane. It is often useful to maintain two lists of polygons for each node: one list contains polygons facing in the same direction as the separating plane, whereas polygons contained in the other list are all backfacing with respect to the separating plane. With such a scheme, backface culling can be applied to entire lists of polygons at the same time.

During the walkthrough phase, a BSP tree may be used to easily locate the convex cell containing the synthetic observer, cull entire subtrees against the viewing frustum, and obtain a correct drawing order for whole cells [Nay97]. The walls of those cells themselves may be drawn in any order, since they correspond to the boundaries of convex polyhedra and thus cannot obscure one another.

For purposes of occlusion culling it can be extremely useful to attach (preferably axis-aligned) bounding boxes to each of the BSP tree’s nodes, encompassing all children of both subtrees and the node itself. During tree traversal, each node’s bounding box is checked against the viewing frustum, and – provided that no intersection is detected – the entire subtree can be culled.

Also, BSP trees can be effectively combined with the notion of potentially visible sets (see below) to obtain a correct drawing order for the set of potentially visible cells.

2.2 Occlusion Culling

This section reviews several key ideas and data structures used in occlusion culling

Potentially Visible Sets

Many occlusion culling algorithms use the notion of potentially visible sets (PVS), first mentioned in [Air90]. This term denotes the set of polygons or cells in a scene which is potentially visible to a synthetic observer. A polygon or cell contained in the PVS is only potentially visible, which is to say it is not entirely certain that it really can be seen from the exact viewpoint at any specific instant. The reason for this is that most algorithms trade the exact determination of visibility for improvements in computation time and an overall reduction of the complexity of the algorithm. The result of a visibility query is not the exact set of visible polygons or cells, but a reasonably tight conservative³ estimate. For instance, in approaches that precalculate potentially visible sets for each cell of a scene subdivided into convex cells, the viewing frustum cannot be taken into account, since the orientation of the observer at run time is not known beforehand. But the set of all polygons that can be seen from any viewpoint with any orientation in the convex cell is a conservative estimate for the actual set of visible polygons once the exact position and orientation of the observer are known. Moreover, due to algorithm complexity and processing time considerations, maybe not even the former set is calculated exactly, but rather estimated in a conservative manner.

If, as mentioned before, cell-to-cell visibility information is precomputed and attached to each cell, it can be retrieved at run time very quickly to establish a first conservatively estimated PVS. This PVS can subsequently be further pruned by using the then known exact location and orientation of the synthetic observer in a specific cell [Tel91, Tel92b]. This can be done by simply culling the PVS against the actual viewing frustum, for example.
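As a concrete illustration of such a table lookup followed by run-time pruning, a per-cell PVS can be stored as a bit vector. The layout and names below are hypothetical, and the per-cell frustum test is reduced to a precomputed bit mask for brevity:

```c
/* Sketch of a precomputed PVS per cell, stored as a bit vector and
 * pruned against the viewing frustum at run time. 'frustum_mask'
 * stands in for testing each cell's bounding volume against the
 * actual frustum. */
#include <stdint.h>

#define NUM_CELLS 64

typedef struct {
    /* pvs[i] has bit j set iff cell j is potentially visible from cell i */
    uint64_t pvs[NUM_CELLS];
} pvs_table_t;

/* Look up the conservative PVS of 'cell' and prune it to the cells that
 * also intersect the viewing frustum; returns the number of cells. */
static int query_visible_cells(const pvs_table_t *t, int cell,
                               uint64_t frustum_mask,
                               int out[], int max_out)
{
    uint64_t bits = t->pvs[cell] & frustum_mask; /* lookup + pruning */
    int n = 0;
    for (int j = 0; j < NUM_CELLS && n < max_out; ++j)
        if ((bits >> j) & 1u)
            out[n++] = j;
    return n;
}
```

The expensive part (which cells can ever be seen from which) is done offline; the cheap part (intersecting with the current frustum) is all that remains at run time.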

Portals

Intuitively, a portal [Lue95] is a non-opaque part of a cell's boundary, e.g., a section of a wall where one can see through to the neighboring cell. If the entire scene is subdivided into cells, the only possibility for a sightline to exist between two arbitrary cells is the existence of a portal sequence that can be stabbed with a single line segment. So, if a synthetic observer in cell A is able to see anything in cell B, there must exist a sequence of portals Pi such that a single line pierces all of these portals to connect a point in A with a point in B.

Rephrased: if we want to determine whether anything located in a given cell (including the cell's walls) is visible, through any portals, from the cell the current viewpoint is contained in, we need to calculate whether such a sightline exists [Fun96].

³ A conservative estimate is an overestimation of the set of actually visible polygons. Although this estimate may be unnecessarily large, it is guaranteed to be a superset of the set of all visible polygons, i.e., no visible polygons can be misclassified as being invisible.


Another way to tackle the concept of portals is to imagine a portal as an area light source, where one wants to determine if any of the portal's emitted light is able to reach any point in a given cell. If this is the case, a sightline between those two cells exists and therefore an observer in cell A is possibly able to see some part or all of cell B [Tel92a].

One use of portals is the offline determination of the PVS for a given cell, namely the set of all cells for which a stabbing sequence from a portal of the source cell to a portal of the destination cell exists, regardless of the exact location of the viewpoint within the source cell. This information can be precomputed for each cell in a model and used at run time to instantly retrieve a PVS for any cell of interest.

Alternatively, the concept of portals can be used to dynamically determine a PVS for any given cell entirely at run time, without the use of any precomputed information. The next section conceptually compares both of these approaches.

Static vs Dynamic Visibility Queries

A very important property of an algorithm dealing with the determination of potentially visible sets is its ability or inability to handle dynamic changes in a scene efficiently. This depends on whether the algorithm computes the PVS offline or dynamically at run time.

If dynamic scene changes need to be accommodated quickly, i.e., without a lengthy preprocessing step, the algorithm has to support dynamic visibility queries.

Naturally, dynamic visibility queries suffer from a performance penalty as opposed to static visibility queries. First, PVS information cannot simply be retrieved from a precomputed table but has to be computed on the fly. To alleviate this problem, some algorithms try to exploit temporal coherence (see next section).

Second, due to processing time constraints, the PVS generated by dynamic visibility queries may not be as tight as the PVS generated by table lookups and pruned to the viewing frustum at run time. This incurs a performance penalty at the rasterization level, as the low-level visibility determination (in most cases a z-buffer) has to deal with additional invisible polygons.

One promising approach employed by more recent algorithms is to combine both static and dynamic information to determine a PVS. Some amount of a priori information is precomputed and used to speed up dynamic visibility queries and enhance their effectiveness.

Spatial and Temporal Coherence

In the attempt to make occlusion culling algorithms more effective, it is important to take advantage of any potential coherence as much as possible.

The term spatial coherence subsumes both object space and image space coherence. Object space coherence describes the fact that nearby objects and polygons in object space very often belong to the same 'class', say, with respect to their visibility. A hierarchical subdivision of object space can directly take advantage of this coherence. The hierarchy is used to ensure that nearby objects are considered together first, to be only considered at a finer level if their invisibility cannot be proved at the coarser level.

Image space coherence is the analog to object space coherence in the two-dimensional, already projected, image of a scene. It can be exploited by a subdivision of image space using, e.g., an image space BSP tree. Alternatively, coherence at the pixel level might also be used to advantage by, say, using the fact that adjacent z-buffer entries normally do not have greatly differing depth values.

Temporal coherence can, for example, be exploited by caching some of the information calculated by an algorithm from one frame to subsequent frames, and using this cached information to make an educated guess as to what parts of the scene will actually be visible in the next frame. Such information can be used directly, or to generate some sort of setup information to enhance the performance of a particular algorithm.


3 High-performance Low-level Graphics APIs

Over the last few years, the focus of graphics development in the area of entertainment software has clearly shifted from exclusive software rendering to the support of consumer-level 3D hardware accelerators. Very prominent examples of such accelerators are boards featuring a chipset of the Voodoo Graphics family by 3dfx Interactive. Although powerful graphics boards are already widespread throughout the potential customer base for computer games, most companies still offer optional software rendering in their products. The most important reason for this is that the majority of desktop computers is still not equipped with special-purpose 3D hardware. Nevertheless, this situation is changing rapidly and many future releases will require hardware acceleration.

A very important issue in the development of contemporary computer games is the choice of the underlying graphics API that is used to access the facilities offered by the graphics hardware. One criterion for this decision is whether to support only a single family of accelerators through a proprietary API like 3dfx's Glide, or a multitude of different hardware vendors' products through an industry-standard API like OpenGL or Direct3D. From a marketing point of view, this decision might seem to be easy: clearly, the bigger the customer base, the better. A very important issue, however, is support and quality of an API, in particular the availability and quality of the drivers necessary to access the target hardware. The choice of API has always been far from clear-cut and is still crucial. For this reason, many developers adopt the approach of supporting more than one API, letting the user select at installation or startup time, maybe even on-the-fly from within the program.

If one decides to support more than one graphics API, maybe in addition to proprietary software rendering, the issue of how to effectively develop in such a scenario becomes critically important. Since the Parsec engine supports OpenGL and Glide transparently, this issue is of special interest in the context of this thesis.

In order to support several graphics APIs in the Parsec engine, we have designed an abstract interface layer comprised of several subsystems. This abstract interface is implemented for all supported APIs. However, all other graphics code resides entirely above this interface layer and is therefore independent of the graphics API actually used at any one time. Supporting additional graphics APIs is possible by simply implementing the functionality required by the interface for the new target. Thus, the vast majority of the code does not need to be changed.
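A common way to realize such an interface layer in C is a table of function pointers per subsystem, with one filled-in table per supported API. The structure below is an illustrative sketch of that pattern, not the actual Parsec interface:

```c
/* Sketch of an API abstraction layer: the engine calls through a table
 * of function pointers, and each supported API (OpenGL, Glide, ...)
 * provides one filled-in table. All names are illustrative. */
#include <stddef.h>

typedef struct {
    const char *name;                       /* "opengl", "glide", ... */
    int  (*init_frame)(void);
    void (*draw_triangles)(const float *verts, int num_tris);
    void (*finish_frame)(void);
} gfx_subsys_t;

static const gfx_subsys_t *gfx;             /* active implementation */

/* Engine code above the layer never mentions a concrete API: */
static int render_scene(const float *verts, int num_tris)
{
    if (!gfx->init_frame())
        return 0;
    gfx->draw_triangles(verts, num_tris);
    gfx->finish_frame();
    return 1;
}

/* A trivial 'null driver' stands in for a real implementation here: */
static int  null_calls;
static int  null_init(void)                  { ++null_calls; return 1; }
static void null_draw(const float *v, int n) { (void)v; null_calls += n; }
static void null_finish(void)                { ++null_calls; }
static const gfx_subsys_t gfx_null = { "null", null_init, null_draw, null_finish };
```

Selecting a different API at startup, or even on-the-fly, then amounts to pointing `gfx` at another table.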

This chapter briefly reviews some issues regarding the most important graphics APIs in the current desktop consumer market.

3.1 Glide

Glide is the proprietary, low-level rasterization API for 3dfx Interactive's Voodoo Graphics family of hardware accelerators. In addition to being very high performance, Glide is also very easy to use. It offers an easy-to-understand C application programming interface and, definitely a crucial factor for its tremendous success with game developers, a very good SDK containing all necessary documentation, available for free to anyone interested. There is no special registration or fee necessary to get at the SDK; 3dfx made it freely available for download almost right from the beginning [Glide].

By and large, the Glide rasterization API was definitely a huge factor in 3dfx's market dominance in 1996, which stayed almost untouched by competitors until 1998. 3dfx's Voodoo Graphics hardware was the prevalent hardware accelerator platform, and Glide was the prevalent high-performance, low-level, consumer 3D graphics API.

However, starting with NVIDIA's Riva TNT in 1998, many new low-cost graphics accelerators offering very high performance have become available, and the driver situation has also improved tremendously. Thus, a migration from Glide to APIs like OpenGL or Direct3D, which support a multitude of different hardware accelerators, can be observed.

3.2 OpenGL

OpenGL [Seg98] is a very powerful and widely used graphics API that evolved from Silicon Graphics' IrisGL. Originally meant for programming workstation-class 3D hardware, high-quality OpenGL drivers are available for the most important consumer products of today. This is also due to the fact that workstation and consumer hardware are in the process of converging in many respects. A very important property of OpenGL is that it is supported on many platforms: graphics code written for OpenGL can easily be ported to a PC, a Macintosh, and, of course, an SGI workstation, among others. OpenGL offers an easy-to-use C application programming interface, and is also an extremely well documented API.

For more information about OpenGL see [Seg98, Woo99, OGL]. The SIGGRAPH course on advanced graphics programming techniques using OpenGL [McR00], which has been held over several consecutive years, also offers a wealth of information on OpenGL and graphics programming on contemporary graphics hardware in general.

3.3 Direct3D

Direct3D, which is part of [DirectX], also supports a multitude of graphics hardware, although it is exclusively restricted to Microsoft Windows platforms. Nevertheless, these comprise the most important segment of the computer games market for the desktop, and Direct3D is widely supported by game developers.

A very important difference between OpenGL and Direct3D is that an OpenGL implementation is required to support the entire feature set as defined by the standard, whereas Direct3D exports the functionality of the hardware in the form of capability bits. This implies that OpenGL must emulate missing hardware functionality in software. With Direct3D, the programmer has to explicitly determine what to do should desired features be found missing.
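On the OpenGL side, functionality beyond the mandatory core is exposed through extensions, advertised as a space-separated list returned by glGetString(GL_EXTENSIONS). Testing for one safely is a classic small exercise, since a naive strstr() would also match name prefixes. The helper below is a common idiom, not Parsec code:

```c
/* Check whether an extension name appears in a space-separated
 * extension string. A plain strstr() would match "GL_EXT_texture"
 * inside "GL_EXT_texture3D", so the match is anchored at token
 * boundaries. */
#include <string.h>

static int has_extension(const char *ext_string, const char *name)
{
    size_t len = strlen(name);
    const char *p = ext_string;
    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == ext_string) || (p[-1] == ' ');
        int ends   = (p[len] == ' ') || (p[len] == '\0');
        if (starts && ends)
            return 1;
        p += len;       /* prefix match only; keep searching */
    }
    return 0;
}
```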

3.4 Geometry Processing

A crucial issue is whether an API supports geometry processing, or is rasterization-only. Since Glide exports only the features of the actual hardware, and all Voodoo Graphics accelerators to date do not support geometry acceleration, the API does not support three-dimensional operations like transformation, clipping, and projection. All coordinates in Glide are screen coordinates together with additional attributes that the hardware interpolates over a triangle, e.g., texture coordinates (U,V,W) and color (R,G,B,A). That is, coordinates are already in post-perspective space. As soon as accelerators supporting geometry processing in hardware become widespread, this poses a problem to the API. There are already two major versions of Glide (2.x and 3.x, the latter already supporting a homogeneous clipping space), and the API will probably continue to evolve with the capabilities of 3dfx hardware.

Both OpenGL and Direct3D support geometry processing, allowing programs to exploit geometry processors if available. Current consumer hardware is still restricted to rasterization, but this will probably change in the not-too-distant future. There is also another issue related to whether an API supports geometry processing: if it does, and there is no geometry processing hardware, the driver has to do all geometry calculations. This has important implications. First, the performance of a program will depend even more on the quality of the driver. However, driver quality is already quite high and developers are hard-pressed to outperform the drivers with their own code. This is particularly true when special-purpose features of the host CPU are leveraged. The MMX extensions to the instruction set of the Pentium processor are not really well suited to geometry code due to their integer nature, but AMD's 3DNow! [AMD] and Intel's Streaming SIMD Extensions [Intel], introduced with the Pentium III, feature SIMD floating point instructions that can be put to good use in 3D graphics drivers. Supporting all of these extensions directly in a program requires a lot of development effort that might not be justified. The most important reason, however, to use the geometry support of a graphics API is to be prepared for the future widespread availability of hardware with dedicated geometry processing support.

That said, Parsec currently does most geometry calculations before submitting primitives to the driver. OpenGL has to be configured to use an orthogonal projection in order to use it for rasterization only [Woo99].
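Using OpenGL for rasterization only amounts to installing an orthographic projection so that the engine's already-projected screen-space vertices pass through unchanged. The sketch below builds the matrix that glOrtho(l, r, b, t, n, f) specifies, in OpenGL's column-major order; a real program would simply call glOrtho, so this is for illustration:

```c
/* Build the orthographic projection matrix specified by
 * glOrtho(l, r, b, t, n, f), column-major as OpenGL expects. */
static void ortho_matrix(float m[16], float l, float r, float b, float t,
                         float n, float f)
{
    for (int i = 0; i < 16; ++i)
        m[i] = 0.0f;
    m[0]  =  2.0f / (r - l);        /* scale x into [-1, 1] */
    m[5]  =  2.0f / (t - b);        /* scale y into [-1, 1] */
    m[10] = -2.0f / (f - n);        /* scale z into [-1, 1] */
    m[12] = -(r + l) / (r - l);     /* translation column */
    m[13] = -(t + b) / (t - b);
    m[14] = -(f + n) / (f - n);
    m[15] =  1.0f;
}
```

Loading this as the projection matrix (sized to match the viewport) and keeping the modelview at identity makes OpenGL act as a pure rasterizer for pre-transformed vertices.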


4 Game Technology Evolution

This chapter provides an overview of the technology employed in 3D computer games over the last decade. Technology in this context means both software technology, e.g., the algorithms and techniques employed, as well as hardware technology, which has become important with the advent of powerful low-cost consumer graphics hardware in 1996.

Section 4.1 covers seminal computer games, and section 4.2 is devoted to 3D hardware accelerators.

4.1 Seminal 3D Computer Games

In this section, we illustrate the evolution of 3D computer games over the last eight years by looking at several seminal games of that period. Prior to 1992, computer games were either not three-dimensional at all, used simple wireframe rendering, or used flat-shaded polygons at most. In some cases, 3D information was pre-rendered or drawn into bitmaps and used at run time to produce a three-dimensional impression, although there was no actual 3D rendering code used in the game itself.

Ultima Underworld (Looking Glass Technologies, 1992)

In Ultima Underworld, a role-playing game set in the famous Ultima universe created by Origin Systems, players were able to walk around in a fully texture-mapped [Hec89, Hec91] 3D world for the very first time. Most role-playing games prior to Ultima Underworld were not three-dimensional at all, e.g., using a top-down or isometric perspective and tile-based 2D graphics like the earlier Ultima games. One could not walk around in these worlds in the first-person perspective, so the feeling of "actually being there" was rather limited.

Another earlier approach was to combine hand-drawn graphics tiles with a very restricted first-person view, which was originally made popular in the 1980s by the game Dungeon Master, by FTL. However, the world in Dungeon Master was not actually 3D; player positions were constrained to a predefined grid and the only allowed rotation of the view was in steps of 90-degree angles, so one could only look straight ahead, straight to the left or right, or to the back. This approach creates a limited three-dimensional impression without using any actual 3D calculations at run time.

The world in Ultima Underworld, however, contained basically no technological viewpoint restrictions and was, at least to a certain extent, fully 3D. The player position was not constrained to a simple grid anymore; players were able to seamlessly walk over actual texture-mapped floor polygons. Rotations about the principal axes were also allowed, most notably looking to the left or right with arbitrary rotation angles, and looking up or down with proper perspective foreshortening of polygons. This is especially remarkable since it wasn't until several years later that most 3D games were using an actual rotation about the horizontal axis for this purpose, instead of not allowing the player to look up or down at all, or faking the rotation by using a shear transformation.

However, the flexibility allowed for by Ultima Underworld's 3D engine had its price in terms of performance. Since the whole world was texture-mapped and polygons could be viewed at arbitrary oblique angles, the texture mapper was simply not able to texture the entire screen in real time, at least not on most consumer hardware available in 1992. For this reason, the game used a smaller window for rendering the 3D view of the world, and used the remaining area of the screen for user interface elements like icons, the player character's inventory, an area for text output, and the like.

For a role-playing game like Ultima Underworld its rather slow speed was no problem, however, since it wasn't a fast-paced action game after all, emphasizing game play and design much more than many 3D action games of the following years.

Characters and objects in Ultima Underworld were not rendered as actual 3D objects, but as simple two-dimensional sprites (billboards) instead. That is, they consisted of bitmaps that were scaled in size according to the viewing distance, and animation was done using a sequence of pre-drawn frames, like in traditional 2D animation.

See figure 4.1 for a screenshot of Ultima Underworld.

Figure 4.1: Ultima Underworld (1992)

Wolfenstein 3D (id Software, 1992)

Shortly after Ultima Underworld, a small game developer named id Software released what would eventually found a new computer game genre: the first-person shooter, commonly abbreviated as FPS. This game, Wolfenstein 3D, emphasized fast-paced action above all, where the player was able to run around in first-person perspective with high frame rates, and the goal of the game was primarily to shoot everything in sight. However, this simple formula was easy to grasp, and the fast action game play combined with equally fast frame rates contributed to a remarkable success.

The most important contribution of Wolfenstein 3D, however, was that it was to become the prototype for graphically and technologically much more sophisticated first-person shooters in the following years, like Doom and Quake. So, the questionable game content notwithstanding, Wolfenstein had a much higher impact on games in the following years in terms of technology than Ultima Underworld.

In contrast to Ultima Underworld, the game world in Wolfenstein was highly restricted. The player had only three degrees of freedom (DOF): two degrees for translation, and one degree for rotation. That is, movement was constrained to a single plane and the view point could only be rotated about the vertical axis. So, it was possible to look to the left and right with arbitrary viewing angles, but nothing else.

In order to avoid the negative performance impact of texturing the entire screen, only wall polygons were texture-mapped. The floors and ceilings were simply filled with a solid color.

See figure 4.2 for a screenshot of Wolfenstein 3D.

Figure 4.2: Wolfenstein 3D (1992)


In addition to texturing only the walls, the placement of these walls was restricted in a way that all of them were placed at 90-degree angles to each other, and all walls were of the same height. This restriction of the world geometry, combined with the restriction on the allowed motion of the player, also made the use of an extremely simplified texture mapper possible. Walls could be rendered as a series of pixel columns, where the depth coordinate was constant for each column. So, the perspective correction had to be performed just once for each column, leading to an extremely optimized (and equally restricted) texture mapper.

The visibility determination in a world like Wolfenstein's also profited tremendously from the simplicity of the game world. The game used a simple ray-casting algorithm, where a single ray was cast to determine the entire visibility (and horizontal texture coordinate) of an entire screen column. Since all walls were of the same height, there was always exactly one wall visible per screen column, or none at all. Thus, as soon as the cast ray hit the nearest wall, the visibility determination for the corresponding column was done.
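A minimal form of this column-based ray cast can be sketched as follows (illustrative, not id Software's actual code): the ray steps from grid line to grid line until it enters a wall cell, and the resulting perpendicular distance determines the height of the wall slice drawn for that screen column.

```c
/* Grid ray cast in the Wolfenstein style: one ray per screen column,
 * stepped through a cell grid with a DDA until it hits a wall. */
#include <math.h>

#define MAP_W 8
#define MAP_H 8
static const int map[MAP_H][MAP_W] = {   /* 1 = wall, 0 = empty */
    {1,1,1,1,1,1,1,1},
    {1,0,0,0,0,0,0,1},
    {1,0,0,0,0,0,0,1},
    {1,0,0,0,0,0,0,1},
    {1,0,0,0,0,0,0,1},
    {1,0,0,0,0,0,0,1},
    {1,0,0,0,0,0,0,1},
    {1,1,1,1,1,1,1,1},
};

/* Cast a ray from (px,py) in direction (dx,dy); returns the
 * perpendicular distance to the first wall hit. */
static double cast_ray(double px, double py, double dx, double dy)
{
    int mx = (int)px, my = (int)py;
    int step_x = dx < 0 ? -1 : 1, step_y = dy < 0 ? -1 : 1;
    double ddx = fabs(1.0 / dx), ddy = fabs(1.0 / dy); /* per-cell ray length */
    double sx = (dx < 0 ? (px - mx) : (mx + 1.0 - px)) * ddx;
    double sy = (dy < 0 ? (py - my) : (my + 1.0 - py)) * ddy;
    int side = 0;
    for (;;) {
        if (sx < sy) { sx += ddx; mx += step_x; side = 0; }
        else         { sy += ddy; my += step_y; side = 1; }
        if (map[my][mx])                   /* hit the nearest wall */
            break;
    }
    return side == 0 ? sx - ddx : sy - ddy;
}
```

With walls all the same height, this single distance per column is everything needed to draw the column: the wall slice height is simply inversely proportional to it.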

Similar to Ultima Underworld, all characters and objects in Wolfenstein 3D were two-dimensional sprites.

Doom (id Software, 1993)

When Doom first appeared in 1993, it managed to lift many of the restrictions that made its direct predecessor Wolfenstein 3D a very simple and tremendously restricted game. It is still one of the most successful computer games of all time.

Doom combined fast game play and high frame rates with a fully texture-mapped world like in Ultima Underworld, although the world geometry was still much more constrained. The basic premise was similar to Wolfenstein, but the complexity of Doom's levels was truly remarkable for a game running on consumer hardware at the time.

Doom's levels were represented as a single large 2D BSP tree, which was most of all used for visibility determination purposes. Levels could be constructed using some kind of two-dimensional floor plan, i.e., a top-down view, where each line segment corresponds to a wall and is displayed as a polygon in 3D by the game engine at run time. Of course, this underlying 2D nature of Doom's levels led to a lot of necessary restrictions like "no rooms above rooms," where it was impossible to model multi-storied buildings or anything else where there are two floor or ceiling heights at the same position of a top-down view. Sometimes, such a combination of 2D geometry actually used to define some sort of restricted 3D geometry via lofting or extrusion is called 2.5D.

Doom was able to texture-map the entire screen at really high frame rates by once again cleverly exploiting the combination of a constrained world geometry with restrictions on the allowed rotation of the view point. Since all wall polygons were extruded orthogonally to the floor from the level representation at run time, all visible polygons were either parallel to the floor plane or orthogonal to it. Combined with the fact that view point rotation was only possible about the vertical axis, this yields the observation that the depth coordinate is always constant for an entire horizontal or vertical polygon span, for floor/ceiling and wall polygons, respectively. Sometimes, such an approach to texture mapping is called "constant-z texture mapping."

See figure 4.3 for a screenshot of Doom.

Figure 4.3: Doom (1993)


When using a BSP tree for visibility determination there are two basic orders in which the tree can be traversed to obtain correct visibility. The simplest is back-to-front rendering, in which the polygons are simply drawn from farthest to closest, in essence using a painter's algorithm [Fol90]. Since the BSP tree yields a correct visibility order for an arbitrary viewpoint, it is easy to let nearer polygons overdraw polygons farther away and thus obtain a correct image.

The problem with back-to-front rendering, however, is that it draws a lot of unnecessary polygons, since it starts with the polygons that are most likely not to be visible at all. Most polygons of a level are invisible most of the time, because they are overdrawn by nearer polygons completely overlapping them.

A much better, but also more sophisticated, approach is to use front-to-back rendering instead. With this order of traversal, drawing starts with the nearest polygon and progresses to the farthest polygon. The major difference is that as soon as the entire screen has been filled by polygons drawn up to a certain position in the BSP tree, further traversal is completely unnecessary, since all visible polygons have already been drawn at that time. Thus, Doom used front-to-back traversal of the 2D BSP tree representing the entire level, kept track of covered pixel spans (which was necessary to prevent occluded parts of polygons farther away from overdrawing visible parts of polygons nearer to the view point), and stopped traversal as soon as it knew that the entire screen had been covered. This tracking of already covered screen areas/spans was much simpler in Doom's 2.5D case than it would have been in the general 3D case.
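The traversal order can be sketched as follows: which subtree is "near" depends on which side of a node's splitting plane the viewpoint lies. The 2D structures are illustrative; back-to-front is the same routine with the two recursive calls swapped:

```c
/* Front-to-back BSP traversal: visit the subtree on the viewpoint's
 * side of each splitting line first, then the node's polygon, then
 * the far subtree. */
typedef struct bsp {
    float nx, ny, d;           /* 2D splitting line: nx*x + ny*y + d = 0 */
    int polygon;               /* id of the polygon stored at this node  */
    struct bsp *front, *back;
} bsp_t;

static int order[16], count;   /* records the emitted drawing order */

static void draw(int polygon) { order[count++] = polygon; }

static void front_to_back(const bsp_t *n, float ex, float ey)
{
    if (!n)
        return;
    if (n->nx * ex + n->ny * ey + n->d >= 0.0f) {   /* eye on front side */
        front_to_back(n->front, ex, ey);            /* near subtree first */
        draw(n->polygon);
        front_to_back(n->back, ex, ey);
    } else {
        front_to_back(n->back, ex, ey);
        draw(n->polygon);
        front_to_back(n->front, ex, ey);
    }
}
```

In Doom's case, draw() would also update the record of covered spans, and the recursion would be cut off once the screen is known to be fully covered.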

Doom's use of a BSP tree for level representation had the consequence that dynamically moving geometry was not really possible. Doors, for instance, were actually modeled as very small rooms, where the ceiling height was modified on-the-fly. It wouldn't have been possible to model swinging doors, or the like. This was due to the fact that BSP trees are not really suited to handling dynamic geometry, especially not in real time. Also, the building process of a BSP tree, the so-called BSP tree compilation, is very time-consuming, so there was a significant delay between designing a level and actually being able to try it out in the game.

Similarly to Ultima Underworld and Wolfenstein 3D, Doom still used simple sprites for characters and objects.

Another important aspect of Doom, albeit not graphics-related, was that it supported Novell's IPX protocol for network game play on local area networks. Thus, it became the first multiplayer game with sophisticated 3D graphics that was played by a staggering number of people all over the world.

Another trend that started with Doom and had an extremely wide impact on computer games being developed after it was that it was also highly user-extensible. Players could easily substitute graphics, sound effects, and the like, and even create their own levels with a wide variety of level editors that were developed by community members and made available for free in most cases. Today, many games come with a level editor out of the box, and a thriving community is working on new content, as well as game modifications using source code released by the game developers themselves.

Descent (Parallax Software, 1994)

In 1994, a previously unknown company called Parallax Software released a fully 3D, 360-degree, six-degrees-of-freedom action game: Descent.

Descent was the first computer game that was able to render arbitrarily oriented polygons with perspective-correct texture mapping at frame rates high enough for an action game. Most astoundingly, Descent featured a true three-dimensional world where the player could navigate a small spaceship through texture-mapped underground mines. It was a first-person game, but the fact that the player was controlling a spacecraft, instead of a character walking on foot, allowed all axes to be basically equal.

The visibility determination in Descent was not done with a variant of the BSP tree approach, but instead with a subdivision of the entire world into convex cells interconnected via portals. A cell in Descent consisted of a solid "six-face," a cube that could be distorted arbitrarily, as long as it stayed a convex solid. Thus, each cell had exactly six walls, which could be either solid (mapped with a texture and impassable) or a portal, where the player could see through and fly to the neighboring cell.


See figure 4.4 for a screenshot of Descent.

Figure 4.4: Descent (1994)

In order for a cell structure connected via portals to be traversed effectively, some kind of adjacency graph is usually used. Each node in the graph corresponds to a cell of the subdivision, and each edge describes a connection between two adjacent portals. Using this information it is very easy to start in a specific cell and jump from cell to cell by passing through interconnecting portals.

In such a system, traversal starts with the cell that contains the view point, and some kind of depth-first algorithm is employed to recursively render the entire world. After rendering all solid walls of the cell containing the view point, all portals of this cell are touched, and for each one of them everything is rendered that is visible through that particular portal.
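In outline, the depth-first cell-and-portal traversal might look like the sketch below. A real renderer would also clip the view frustum to each portal before recursing, which is omitted here; all names are illustrative:

```c
/* Depth-first traversal of a cell/portal adjacency graph. Each cell
 * has up to six faces; a face is either a solid wall (-1) or the index
 * of the neighboring cell it opens into. A 'visited' mark keeps the
 * recursion from cycling through loops in the mine. */
#define NUM_CELLS      32
#define FACES_PER_CELL 6

typedef struct {
    int neighbor[FACES_PER_CELL];   /* -1: solid wall, else adjacent cell */
} cell_t;

static cell_t cells[NUM_CELLS];
static int visited[NUM_CELLS];
static int rendered, render_order[NUM_CELLS];

static void render_from(int cell)
{
    visited[cell] = 1;
    render_order[rendered++] = cell;    /* draw this cell's solid walls */
    for (int f = 0; f < FACES_PER_CELL; ++f) {
        int n = cells[cell].neighbor[f];
        if (n >= 0 && !visited[n])      /* recurse through each portal; a   */
            render_from(n);             /* real engine clips to the portal  */
    }
}
```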

Cells as building blocks of a world can also be used to store additional attributes, like lighting information, or whether that area (volume) is poisonous, has different gravity, etc. In Descent, the level designers could assign an ambient light intensity for each of the eight cell vertices. At run time, vertices could be lit dynamically by moving light sources like lasers, and the like, combining the dynamic lighting value with the static ambient light intensity on-the-fly. Lit polygons were rendered by applying Gouraud shading [Rog98] to the textures of a cell's walls.

One of the major properties of Descent's world representation was that the cell subdivision was not built during a post-process after designing a level, as would be done in a BSP-based structure. Instead, the cell subdivision was an intrinsic part of the representation and the design process itself. The data structure used at run time was basically constructed in its entirety while a level was being built. That is, since the level designers used the exact same geometric building blocks the engine itself used, and designated cell walls manually as either being solid (by assigning a texture) or being a portal (by not assigning a texture), no post-process subdivision and portal extraction was necessary. The adjacency information of cells was also manually created while constructing a level, since new cells had to be explicitly attached to already existing cells when it should be possible to pass from one cell to another. There were also no problems with t-junctions [McR00], say, an edge of one cell touching a face of another cell in its middle. This was achieved by requiring that neighboring cells always share exactly four vertices, so the connection would be created from one face to another face, over the entire surface of each of these faces.

Many of these constraints on the design process could be viewed as a liability, but in the case of Descent they were very much tailored to the type of game. With such a "distorted-cube"-based approach to level building it was actually very easy to build the interior of mines, mostly consisting of rather long tunnels with many twists and bends. All in all, Descent was a unique combination of game design and technology, where each of these crucial parts was ideally tailored to its counterpart.

The characters in Descent, mostly hostile robots, were also not restricted to two-dimensional sprites anymore, like in nearly all texture-mapped 3D games that preceded it. Instead, the robots prowling Descent's sprawling underground mines consisted of texture-mapped and diffusely lit polygons. In addition to these polygonal 3D objects, however, Descent still used billboards for some objects, like power-ups floating in its tunnels, and weapons projectiles, for instance.


Descent also used many other clever restrictions in order to achieve its high performance. For example, all cell walls were textured using textures of a single size, namely 64x64. This allowed the use of texture-mapping routines hard-wired for this size, and made a lot of other administrational tasks with respect to textures easier. Texture mapping was done using a linear interpolation of texture coordinates for sub-spans of the horizontal spans of each polygon [Wol90], usually performing the perspective division only every 32 pixels. There were also a lot of highly specific assembly routines for texture mapping, like one texture mapper using a texture without any lighting, one for texture mapping with lookup into a global ambient light or fading table, one for texturing and Gouraud shading at the same time, and so on.
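The subspan scheme can be illustrated as follows: texture coordinates are perspective-correct only if u/z, v/z, and 1/z are interpolated linearly and divided per pixel, so the expensive division is performed once per subspan and plain linear interpolation is used in between. This is a floating-point sketch of the idea; Descent's actual inner loops were fixed-point assembly:

```c
/* Fill len pixels of a horizontal span with texture coordinates.
 * The linearly varying quantities are u/z, v/z and 1/z ("uoz",
 * "voz", "ooz"); the true perspective division happens only once
 * per SUBSPAN pixels. */
#define SUBSPAN 32

static void map_span(float uoz, float voz, float ooz,
                     float duoz, float dvoz, float dooz, /* per-pixel deltas */
                     int len, float out_u[], float out_v[])
{
    float u0 = uoz / ooz, v0 = voz / ooz;   /* exact at span start */
    int x = 0;
    while (x < len) {
        int n = (len - x < SUBSPAN) ? len - x : SUBSPAN;
        /* one perspective division at the end of the subspan: */
        float uoz1 = uoz + n * duoz;
        float voz1 = voz + n * dvoz;
        float ooz1 = ooz + n * dooz;
        float u1 = uoz1 / ooz1, v1 = voz1 / ooz1;
        for (int i = 0; i < n; ++i) {       /* linear in between */
            out_u[x + i] = u0 + (u1 - u0) * i / n;
            out_v[x + i] = v0 + (v1 - v0) * i / n;
        }
        x += n;
        uoz = uoz1; voz = voz1; ooz = ooz1;
        u0 = u1; v0 = v1;
    }
}
```

When z is constant across the span (dooz = 0), the linear interpolation is exact; the shorter the subspans, the closer the result follows the true perspective hyperbola in the general case.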

Descent still used fixed point and integer arithmetic for all of its geometrical computations like transformations, as did practically all of its predecessors.

Quake (id Software, 1996)

Quake was the first first-person shooter that used true three-dimensional environments. It was also able to render geometry that was several orders of magnitude more complex than what Descent had been able to use. In order to achieve this, its developers had to use quite a lot of 3D graphics techniques that had previously never been used in a computer game.

See figure 4.5 for a screenshot of Quake.

Figure 4.5: Quake (1996)

The basic level structure in Quake used one huge 3D BSP tree representing the entire level. In some ways this could be viewed as being simply an extension from using a 2D BSP tree in Doom for 2.5D geometry, to using the 3D variant for arbitrary 3D geometry. In reality, however, 3D BSP trees are much more complicated to use for real-time rendering of complex environments than their 2D counterparts. This is true for many aspects of the transition from a rather restricted 2D-like environment to full 3D.

First, all polygons can basically be oriented arbitrarily, which mandates a texture-mapper that is able to map such polygons in real-time. Quake achieved this by performing the necessary perspective division only every 16 pixels and linearly interpolating in between. The difference between this approximation and the true perspective hyperbola was virtually unnoticeable, since the length of these subspans was chosen accordingly.
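The subspan scheme can be sketched as follows. This is a simplified illustration in C, not Quake's actual code: only the u texture coordinate is shown, everything is done in doubles for clarity, and the key point is that u/z and 1/z interpolate linearly in screen space, so a true divide is needed only at subspan boundaries:

```c
/* Sketch of perspective-correct texture mapping with linear subspans:
 * a true perspective division every SUBSPAN pixels, cheap linear
 * interpolation in between. */
#define SUBSPAN 16

/* Fill out[0..len-1] with texture u-coordinates along a scanline span,
 * given u/z and 1/z at the span start and their constant per-pixel steps
 * (both are linear in screen space). */
static void span_u(double u_over_z, double inv_z,
                   double du_over_z, double dinv_z,
                   double *out, int len)
{
    double u0 = u_over_z / inv_z;          /* exact u at subspan start */
    for (int x = 0; x < len; x += SUBSPAN) {
        int n = (len - x < SUBSPAN) ? (len - x) : SUBSPAN;
        double uz1 = u_over_z + n * du_over_z;
        double iz1 = inv_z + n * dinv_z;
        double u1 = uz1 / iz1;             /* exact u at subspan end */
        for (int i = 0; i < n; ++i)        /* linear steps in between */
            out[x + i] = u0 + (u1 - u0) * i / n;
        u_over_z = uz1;
        inv_z    = iz1;
        u0       = u1;
    }
}
```

For a polygon at constant depth the linear approximation is exact; for slanted polygons the error within a 16-pixel subspan stays below the visible threshold, which is precisely the observation Quake exploited.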

Prior to Quake, not many games requiring extremely high frame rates were using floating point arithmetic instead of fixed point arithmetic, since the FPUs of the processors at the time were simply not fast enough. In 1996, however, it became feasible to use floating point operations heavily even in computer games, and Quake did so, all the way down to the subspan level of its texture-mapper. The perspective division for texture mapping was still done using the FPU; linear interpolation and pixel arithmetic were then done in fixed point or integer arithmetic. Nearly all higher-level operations like transformation, clipping, and projection were done entirely in floating point.


There are two things to keep in mind with respect to this and the primary target platform of Quake, the Intel Pentium processor. First, conversion from floating point to integer (or fixed point) values was always (and still is) an inherently slow operation. Thus, actual pixel arithmetic was always faster using integers, even if the performance of the Pentium processor’s FPU was comparable to or faster than integer operations. After all, in order for a frame buffer or texture to be accessed one always needs integer coordinates (offsets). Second, there was a very specific reason for this exact point of transition from floating point to integer arithmetic. The Pentium processor was able to interleave floating point and integer operations, since its FPU and integer unit are separate and able to operate in parallel. Actually, Quake’s texture-mapper was able to interleave the perspective division with the interpolation and rendering of the actual pixels for a 16-pixel subspan, getting the much-feared division essentially for free.

Second, although a 3D BSP tree yields a correct visibility order for polygons, there are a lot of polygons that are contained in the view frustum but still not visible, because they are occluded by nearer polygons. A very powerful solution to this problem is the notion of potentially visible sets (PVS). With potentially visible sets, for each cell in the world (each leaf of the BSP tree), a set of other – potentially visible – cells is computed in a preprocess and stored along with the level data. At run-time, the set of potentially visible cells is retrieved for the cell containing the viewpoint and only polygons belonging to those cells are even considered for rendering. The PVS information is a so-called conservative estimate for visibility. That is, polygons not actually visible may be contained in such a set, but at least all visible polygons are required to be contained. Naturally, the tighter the visibility estimate is, the fewer unnecessary polygons will be rendered at run-time.
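A common way to store such a PVS is one bit vector per leaf, with one bit per potentially visible leaf. The following C sketch assumes this row-of-bits layout (a typical representation, not necessarily Quake's exact on-disk format, which is additionally compressed):

```c
#include <stdint.h>

/* Sketch of a run-time PVS query: pvs holds, for every leaf, a packed bit
 * vector with one bit per leaf in the level, precomputed in a preprocess.
 * Returns nonzero if to_leaf is potentially visible from from_leaf. */
static int pvs_visible(const uint8_t *pvs, int num_leaves,
                       int from_leaf, int to_leaf)
{
    int row_bytes = (num_leaves + 7) / 8;        /* bits packed into bytes */
    const uint8_t *row = pvs + from_leaf * row_bytes;
    return (row[to_leaf >> 3] >> (to_leaf & 7)) & 1;
}
```

At render time the engine finds the leaf containing the viewpoint, fetches its row once, and walks only the leaves whose bits are set; everything else is culled before any per-polygon work is done.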

Another challenging problem with using a 3D BSP tree is how to render dynamically moving objects. The tree can be used to obtain a correct visibility order for the static world polygons, but polygons that are not part of the tree structure cannot be handled easily. One approach is to clip dynamic polygons into the leaves (convex cells) of the BSP tree, rendering them when the cell is rendered. This is a rather time-consuming process, however. Therefore, Quake uses a different approach to handle dynamically moving objects, like enemy characters.

Before we discuss how moving objects are handled in Quake, it is important to realize that standard z-buffering was basically never used for visibility determination in fast, software-rendered computer games. All the more complicated approaches like using BSP trees, portals, and the like, are – apart from other very important uses like collision detection, and higher level occlusion culling – in a way a faster but more complicated approach to visibility determination than the simple z-buffer approach. However, z-buffering with the necessary comparison per pixel, interpolation of z values, conditional stores into the z-buffer, etc. was in almost all cases too slow to be used in software rendering. (Note that this changed significantly with the introduction of hardware accelerators, which made the use of simple z-buffering for visibility detection at the pixel level feasible for the first time.)

Quake, however, used a clever variant of z-buffering to combine moving objects with the BSP-rendered world. While rendering the static world, filling the entire screen, a z-buffer was filled at the same time. From a performance point of view this is vastly different from a full z-buffering approach, since each entry in the z-buffer gets written exactly once, and there are no z value comparisons and z-buffer reads necessary. This is possible since the depth information is not used for rendering the world. After the entire screen has been filled with world polygons, the z-buffer contains a valid z-footprint for the entire scene. The relatively small number of pixels of moving objects’ polygons can then be rendered using standard z-buffering.
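The asymmetry between the two passes can be made explicit in code. A minimal per-pixel sketch (illustrative, using a float z-buffer with smaller meaning nearer; Quake's actual buffer used fixed point):

```c
/* Sketch of Quake's split z-buffer usage.  The world pass only writes
 * depth -- visibility among world polygons is already resolved by the BSP
 * traversal order -- while moving objects do the conventional
 * compare-and-write against the filled buffer. */

/* World pass: unconditional store, no read, no comparison. */
static void zfill_pixel(float *zbuf, int idx, float z)
{
    zbuf[idx] = z;
}

/* Object pass: standard z-test; returns nonzero if the pixel is visible
 * (nearer than whatever the world pass left there) and should be drawn. */
static int ztest_pixel(float *zbuf, int idx, float z)
{
    if (z < zbuf[idx]) {
        zbuf[idx] = z;
        return 1;
    }
    return 0;
}
```

Since world pixels vastly outnumber object pixels, almost all depth traffic is the cheap write-only kind, which is what made the scheme affordable in software.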

Quake also introduced an entirely new approach to lighting a level, using so-called light maps. Static lighting information was precomputed for patches covering all polygons of the level and stored in special texture maps, at a much lower resolution than the texture maps themselves, say, a light map texel every 32 texture map texels. In the original Quake simple light casting was used for calculating light maps; however, a full radiosity [Ash94] preprocess was employed later on.
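Conceptually, shading a surface texel then amounts to looking up the much coarser light map at the same surface position and modulating the base texel with it. A hypothetical sketch (the names and the 16:1 resolution ratio are illustrative assumptions, and a single 8-bit intensity channel is used for simplicity):

```c
/* Sketch of light-mapped texturing: the light map covers the same surface
 * as the base texture, but at a fraction of the resolution; each base
 * texel is modulated by the nearest (coarser) light value. */
#define LM_SCALE 16   /* illustrative texture-to-light-map resolution ratio */

static unsigned char lit_texel(const unsigned char *texture, int tex_w,
                               const unsigned char *lightmap, int lm_w,
                               int u, int v)
{
    unsigned char texel = texture[v * tex_w + u];
    unsigned char light = lightmap[(v / LM_SCALE) * lm_w + (u / LM_SCALE)];
    return (unsigned char)((texel * light) / 255);   /* modulate */
}
```

Because the light map varies slowly, it can be stored and filtered at low resolution without visible artifacts, which is what makes precomputed lighting so cheap in both memory and bandwidth.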

In every computer game another important aspect is the approach to building a level. As has also become apparent in the section on Descent, level construction and representation is a very important aspect of every rendering engine. Quake introduced the CSG (constructive solid geometry) modeling paradigm to computer games. When building a Quake level, the designer is always using solid building blocks and combining them with the already constructed part of the level via boolean set operations (like in CSG trees). This is a very powerful modeling paradigm, easy to use, and among other useful properties it guarantees that the entire level will be solid in the end, since single polygons are never used during construction.

Quake (more specifically, QuakeWorld) also extended the multiplayer game play of Doom to full-blown Internet game play, which was played by thousands of people all over the world at the same time.


GLQuake (id Software, 1996)

Quake used a proprietary software-renderer that had been developed from scratch, like practically all computer games at that and earlier times. Quite soon after the original Quake had been released, however, id Software released a Quake executable that was able to take advantage of the hardware acceleration offered by boards using the Voodoo Graphics accelerator by 3dfx Interactive. To distinguish this version of Quake from the standard version it was called GLQuake, since it used the OpenGL API that had previously been almost exclusively used on expensive high-end graphics workstations.

The timing was almost perfect, and GLQuake together with Voodoo Graphics accelerators achieved the breakthrough for consumer hardware acceleration in 1996. Previous attempts of hardware accelerators to take hold in the consumer marketplace had failed, most of all due to mediocre performance and the lack of a killer application, i.e. a top-notch computer game. GLQuake, however, became that killer application. Software-rendered Quake had severe performance problems in resolutions significantly higher than 320x200, say, 640x480, on all but the fastest computers of its time. This was mostly a problem of the lowest level of the graphics pipeline, i.e. the rasterizer – turning polygons into actual pixels and mapping them with a texture. The Voodoo Graphics accelerator was perfect for this kind of work and relieved the host CPU of the burden of performing a perspective division for each pixel (or every n-th pixel), which was the most important factor preventing 3D computer games from going to resolutions of 640x480 and higher.

See figure 4.6 for a direct comparison of software-rendered Quake and hardware-rendered GLQuake, at the same frame rate.

Figure 4.6: Quake vs GLQuake (1996)

In addition to taking over the main load of rasterization, the Voodoo Graphics accelerators were also able to use bilinear filtering instead of point-sampling textures, which led to a much higher image quality. MIP-mapping could also be performed in real-time for each pixel individually, instead of just choosing an approximately suitable MIP-map level for an entire polygon, as software-rendered Quake had done for performance reasons.

In contrast to software-rendered Quake, where light maps were combined with the base texture maps before actually using them for texturing, using a sophisticated surface caching system, GLQuake rendered the light maps directly as a second pass, using alpha-blending. If the hardware supported it (Voodoo 2, Riva TNT, ...), base textures and light maps could even be rendered in a single pass, using single-pass multi-texturing.

Quake 3 Arena (id Software, 1999)

The rendering engine of Quake 3 Arena is without a doubt the current state of the art in real-time 3D rendering technology in computer games, and consumer applications in general. It is also the first high-profile computer game that requires a 3D hardware accelerator – software-rendering is not even optional anymore.


Quake 3 still uses much of the underlying technology introduced with the original Quake for the first time. This is largely due to the fact that it is the same type of game, at least from a graphics technology point of view. Regarding game play, Quake 3 is no typical single-player game with a story to guide the player along, but is instead geared entirely towards multiplayer Internet gaming. Single-player is only possible insofar as the multiplayer game can be played against computer-controlled bots.

The entire world is still represented using a single, huge 3D BSP tree, and potentially visible set (PVS) information is the most important approach to occlusion culling. For the first time in a game, Quake 3 uses curved surfaces as a fundamental building block in addition to standard polygonal solid blocks.

In contrast to most applications offering curved surfaces, Quake 3 employs quadratic Bézier patches with 3x3 control points per patch, where usually cubic patches are used (needing 4x4 control points). Quadratic patches are of course faster to evaluate than their cubic counterparts, but the developers of Quake 3 have stated that one of the main reasons for using them is also ease of use for the level designers, since it is easier to work with a smaller number of control points. Naturally, this also conserves memory. Note that these quadratic patches are of course converted to triangles before they are submitted to the hardware accelerator for actual rendering.
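Evaluating such a biquadratic patch is a straightforward tensor product of the quadratic Bernstein basis. A minimal sketch in C, evaluating one coordinate of the surface (repeat per axis for x, y, z):

```c
/* Quadratic Bernstein basis function i in {0,1,2} at parameter t. */
static double bern2(int i, double t)
{
    double b[3] = { (1 - t) * (1 - t), 2 * t * (1 - t), t * t };
    return b[i];
}

/* Evaluate a biquadratic Bezier patch (3x3 control values, as in Quake 3)
 * at parameters (s,t) in [0,1]x[0,1]. */
static double patch_eval(const double ctrl[3][3], double s, double t)
{
    double p = 0.0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            p += bern2(i, s) * bern2(j, t) * ctrl[i][j];
    return p;
}
```

For rendering, the patch is sampled on a regular (s,t) grid at a tessellation level chosen by the engine, and the resulting vertices are connected into triangles before submission to the hardware.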

Quake 3 Arena is a very good example of the current trend of not only using ever more polygons to provide smoother surfaces as well as more surface detail, but also of aggressively pursuing higher rendering quality, with the use of multi-pass rendering.

The appearance of surfaces is no longer mostly described by a single texture, a light map, and maybe a couple of other attributes, but is defined by a very flexible and general shading description – a so-called shader. Quake 3’s shaders form a simple shading language targeted at real-time rendering. Whereas in a full-blown shading language like Pixar’s RenderMan [Ups90], used for feature films, the shader (a function written in a programming language) is most of the time evaluated for each pixel, the textual definitions of Quake 3’s shaders are parsed at loading time, converted to a form suitable for real-time rendering, and then used to control the graphics hardware at run-time.

See figure 4.7 for a screenshot of Quake 3.

Figure 4.7: Quake 3 Arena (1999)

Quake 3’s shaders are most of all a very flexible way to control how a surface should be rendered using multiple passes of maybe different kinds. Say, one pass for the light map, four passes for an animated base texture combining four textures on-the-fly, one pass for a reflected image of the surrounding environment, and one pass for volumetric fogging. The shading language can also be used for all kinds of animations, not only texture animations, where several textures can be transformed and composited in real-time, but also color animations, animation of the opacity value, and even vertex animations.
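As an illustration, a two-pass surface in this style might be declared as follows. This is a hypothetical example – the texture paths are invented – but `map`, `$lightmap`, `blendFunc`, and `tcMod scroll` are actual keywords of Quake 3's shader language:

```
// Hypothetical shader script: a wall whose base texture slowly scrolls,
// modulated by the precomputed light map in a second pass.
textures/example/scrolling_wall
{
    {
        map textures/example/wall.tga
        tcMod scroll 0.1 0.0             // animate the texture coordinates
    }
    {
        map $lightmap                    // second pass: the light map,
        blendFunc GL_DST_COLOR GL_ZERO   // multiplied onto the first pass
    }
}
```

Each inner brace block is one rendering pass (stage); the engine parses these definitions at load time and translates them into state changes and blend modes for the graphics hardware.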


In principle, shaders do not really offer anything that would not have been possible with custom C code prior to their introduction in Quake 3. However, they represent a tremendous increase in flexibility and give much more power and freedom to the artists and level designers. Without shaders, an artist would always have to find a programmer to implement a specific special effect. If a certain effect can be achieved with a shader, however, the artists can immediately do this themselves. Of course, a simple but powerful custom language for shading specification is much easier to use for everyone, including the programmers, so shaders open up a much wider field for experimentation with high-quality surface rendering.

4.2 Consumer 3D Hardware

In this section, we illustrate the evolution of consumer 3D hardware accelerators over the last four years, starting with the seminal Voodoo Graphics chip introduced by 3dfx Interactive in 1996.

Voodoo Graphics (3dfx Interactive, 1996)

As already touched upon in the section about GLQuake, the Voodoo Graphics accelerator was the first consumer 3D hardware to actually get used by many computer games and achieve a breakthrough for 3D hardware acceleration in the consumer marketplace in general. One of the major properties of this accelerator that made this possible in 1996 was that boards employing a Voodoo Graphics chipset offered no support for 2D graphics at all. They were exclusive 3D boards and could only be used in conjunction with conventional 2D graphics cards. The video signal of the 2D board was either passed through to the display monitor, or not used at all, and the video signal of the 3D board sent to the monitor instead. Of course, in such a configuration it was impossible to render into a window on the 2D desktop. But since computer games are normally played in fullscreen mode anyway, this was no big problem. It restricted use of these accelerators largely to games, however. Graphics applications like 3D modeling programs, programs for scientific visualization, and the like, were usually not able to take advantage of these early consumer 3D hardware accelerators.

The Voodoo Graphics featured dedicated frame buffer and texture memory, as opposed to a unified memory architecture where frame buffer and texture memory share the same RAM. Thus, it was not possible to increase the amount of texture memory when the frame buffer memory was not used in its entirety. This was no real restriction at the time, though, since the only two screen resolutions supported by the first cards featuring a Voodoo Graphics chipset were 512x384 and 640x480. These boards had 2MB frame buffer memory and 2MB texture memory. Dedicated memory for textures has a major performance advantage, since the memory bandwidth to and from texture RAM is exclusively available for texture accesses. In unified memory configurations, on the other hand, the bandwidth needs to be shared between frame buffer and texture traffic. This can be alleviated using a more sophisticated scheme for RAM access; however, in 1996 it was a very sensible decision performance-wise. Naturally, if the texture memory and frame buffer are not shared, it is not possible to render directly into a texture. That is, for rendering into a texture there is no way around reading back the frame buffer and copying it into texture memory to achieve the desired effect. Computer games in 1996 – and this still holds true for most of today’s games – did not use algorithms requiring rendering into a texture each frame (e.g., dynamic creation of environment maps). So, this was practically no restriction at all at the time.

With respect to color and depth buffer precision, the Voodoo Graphics offered a 16-bit color buffer and a 16-bit depth buffer. In 1996, this was indeed a tremendous step ahead, since practically all 3D games had been using 8-bit color indexes in conjunction with a 256-entry palette, prior to the widespread use of hardware 3D accelerators. In 3D applications, the issue of color resolution doesn’t only pertain to the resolution of the colors as they are stored in the color buffer, but also to the precision of colors contained in texture maps. Usually, the maximum color resolution of the color buffer and the texture data is the same. In the case of palettized 8-bit games this meant only using textures containing 8-bit indexes into a global palette. The Voodoo Graphics supported a number of texture map formats and also featured an on-board hardware palette for use of palettized textures. The highest supported color resolution was 16 bits per texel, in 565 or 1555 configurations, respectively. This, together with the elimination of a global palette shared by the entire screen, led to tremendously increased image quality. However, the first games exploiting hardware acceleration on Voodoo Graphics boards still retained nearly all the restrictions that came from using a global palette in the software-renderer, for the simple reason that this issue is very central to the design of a graphics engine and the artwork used, and couldn’t be changed very easily or quickly. So, it took quite some time until most games really started exploiting the capabilities offered by then-recent graphics accelerators.
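The 565 format mentioned above packs red, green, and blue into a single 16-bit word (5 bits red, 6 bits green, 5 bits blue – the extra green bit matching the eye's higher green sensitivity). A small C sketch of the conversion from 8-bit-per-channel color:

```c
#include <stdint.h>

/* Pack 8-bit-per-channel RGB into a 16-bit 565 texel by truncating each
 * channel to its available bit depth. */
static uint16_t pack565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) |   /* 5 bits red,   bits 15..11 */
                      ((g >> 2) << 5)  |   /* 6 bits green, bits 10..5  */
                       (b >> 3));          /* 5 bits blue,  bits  4..0  */
}
```

The 1555 variant trades one green bit for a 1-bit alpha channel, which is enough for simple transparency masking.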

Textures were supported up to a resolution of 256x256, with a couple of additional restrictions. The aspect ratio could not exceed 1:8 or 8:1, respectively, so it was not possible to use a 256x16 texture, for instance.


Furthermore, texture widths and heights always had to be a power of two. This makes tiling textures (wrapping them around) a lot easier, since the wrap-around can be achieved with simple bit masking, instead of using a modulo operation. Constraints like this only became real restrictions a couple of years later, say in 1999, when the first games without any software-rendering started to appear. Previously, most constraints had been present due to the use of a software-renderer anyway.
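The masking trick is a one-liner: for a power-of-two size, taking a coordinate modulo the size is the same as masking off the high bits. A tiny sketch (the function name is invented for illustration):

```c
/* Wrap a non-negative texture coordinate for tiling.  For size == 2^n,
 * the bitwise AND below is equivalent to coord % size, but avoids the
 * (at the time, much slower) integer division. */
static int wrap_pow2(int coord, int size)
{
    return coord & (size - 1);
}
```

This is exactly why power-of-two sizes were attractive in both software rasterizers and early hardware: the wrap-around per texel costs a single AND.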

A very important reason why the Voodoo Graphics was able to achieve the breakthrough for consumer 3D accelerators, where all earlier attempts had failed, was that nearly all of its features could be used without any performance hit at all. That is, practically all of the flashy new features could be turned on simultaneously without decreasing the frame rate. Earlier accelerators were generally much slower and also got, say, four times slower as soon as depth buffering was turned on. This was not the case on the Voodoo Graphics, and so most of these features actually got used.

Voodoo 2 (3dfx Interactive, 1998)

The Voodoo 2 was the logical successor to the original Voodoo Graphics chipset, offering a lot of improvements in detail, but most of all offering tremendously increased performance.

The most important feature introduced with the Voodoo 2 was single-pass multi-texturing. By offering two texture mapping units (TMUs), the Voodoo 2 was able to texture each pixel with two textures at the same time. This feature was mostly used for light-mapping, made popular with the original GLQuake in 1996, and henceforth used by a very high number of games, many of them using the licensed Quake engine.

With single-pass multi-texturing it was also possible to use trilinear filtering without a frame-rate hit, where odd MIP-map levels had to be accessed by one TMU and even levels by the other TMU, using blending between these two levels for going from bilinearly filtered textures to trilinear filtering with MIP-mapping.
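The blend step itself is just a linear interpolation between the two bilinear samples, which is exactly the work the Voodoo 2 split across its two TMUs. A minimal sketch (single-channel doubles for clarity):

```c
/* Trilinear filtering as a blend of two bilinear samples taken from
 * adjacent MIP levels n and n+1; frac is the fractional part of the
 * computed MIP level for this pixel. */
static double trilinear(double bilinear_level_n,
                        double bilinear_level_n1,
                        double frac)
{
    return bilinear_level_n * (1.0 - frac) + bilinear_level_n1 * frac;
}
```

With one TMU delivering the level-n sample and the other the level-(n+1) sample in the same clock, this blend comes essentially for free, which is why trilinear filtering carried no frame-rate penalty on the Voodoo 2.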

The standard configuration of Voodoo 2 boards was 4MB frame buffer memory and two times (one for each TMU) 2MB or 4MB texture memory, yielding 8MB and 12MB video memory configurations. This led to an increased screen resolution of 800x600 and a lot more textures being able to be resident in texture memory at the same time.

The Voodoo 2 also offered a couple of additional features like enhanced dithering to 16-bit colors (color and depth buffer precision was still 16 bits), and the possibility for SLI (scan-line interleaving) configurations, where two boards could be combined for effectively doubling the fill-rate, one board rendering even scan-lines, the other board rendering odd scan-lines. But SLI was a pure brute-force approach, requiring twice the texture memory without any real benefit (identical textures had to be resident on each board), and also using two PCI slots (in addition to the one slot needed by the 2D accelerator anyway).

Riva TNT (NVIDIA Corporation, 1998)

In 1998, NVIDIA was the first competitor of 3dfx that was able to come close in terms of performance, and even surpass the line of Voodoo Graphics accelerators in terms of features, with its Riva TNT graphics accelerator. The Riva TNT was the first consumer graphics chip that was able to offer high-performance, high-quality rendering with OpenGL.

The TNT, which stands for twin-texel, was a single-pass multi-texturing architecture offering a 32-bit color buffer, a 24-bit depth buffer, and even an 8-bit stencil buffer. It used a unified memory architecture, and most TNT boards featured 16MB memory, allowing for resolutions of up to 1280x1024. In contrast to the Voodoo Graphics, these boards were combined 2D and 3D graphics boards, so they required only a single PCI or AGP slot for a full 2D and 3D acceleration solution. More importantly, they allowed hardware-accelerated rendering into a window on the normal 2D desktop. This, in combination with support of a fully compliant and quite robust implementation of OpenGL 1.1, brought the breakthrough for consumer hardware acceleration in applications other than 3D games. The TNT also offered maximum texture sizes of up to 2048x2048.

In retrospect, the two main reasons why OpenGL was able to establish itself as a feasible high-performance API in the game development community were probably Quake (GLQuake, Quake 2, ...) on the one hand, and NVIDIA’s TNT accelerator on the other. GLQuake was the first high-profile game using OpenGL instead of Glide or Direct3D. The Voodoo Graphics, however, although it supported a subset of OpenGL to allow Quake to run, did not support a full OpenGL implementation until 1999, on the Voodoo 3. Thus, a major factor for OpenGL’s widespread use was the availability of a high-performance, high-quality, fully compliant OpenGL platform, in the form of the Riva TNT and its successors.

GeForce 256 (NVIDIA Corporation, 1999)

The next major breakthrough for consumer 3D hardware accelerators came in late 1999, with the introduction of the GeForce 256 accelerator by NVIDIA, which is still the state of the art. The most important property of the GeForce is that it features full geometry acceleration, for the first time in the consumer price range. Previous accelerators were basically all rather simple rasterizers, in a way more 2D than 3D, where most of the 3D work had to be done by the driver, i.e. in software. With geometry acceleration, however, actual 3D primitives and transformation matrices get submitted to the hardware, which takes over most transformations, clipping, and projection, as well as rasterization of primitives.

By now, NVIDIA is probably the most important driving force behind the evolution of OpenGL, and consequently the GeForce supports a huge number of OpenGL extensions. Among them are, for instance, hardware bump mapping (which was first supported on a consumer card by the Matrox G400) and support for cubic environment maps. The GeForce hardware is thus able to perform texture lookups into texture maps that actually consist of six textures (the six sides of the cube that represents the cubic environment map) in real-time.
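The core of a cube-map lookup – which the GeForce performs per pixel in hardware – is selecting which of the six faces a direction vector hits, namely the face of the coordinate with the largest absolute value. A C sketch of the face-selection step (face numbering is an illustrative convention):

```c
#include <math.h>

/* Select the cube-map face hit by direction (x,y,z).
 * Faces are numbered 0..5 as +x, -x, +y, -y, +z, -z. */
static int cube_face(double x, double y, double z)
{
    double ax = fabs(x), ay = fabs(y), az = fabs(z);
    if (ax >= ay && ax >= az)       /* x has the largest magnitude */
        return x >= 0.0 ? 0 : 1;
    if (ay >= az)                   /* y dominates */
        return y >= 0.0 ? 2 : 3;
    return z >= 0.0 ? 4 : 5;        /* z dominates */
}
```

The two remaining coordinates, divided by the dominant one, then give the 2D texture coordinates within the selected face.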

The breakthrough achieved by the GeForce 256 is especially significant for computer graphics professionals, since it marks the point where consumer graphics hardware has definitely surpassed extremely expensive high-end workstation hardware of just a short time ago, with a product carrying a price tag that makes it even attractive for just playing computer games.

Figure 4.8 shows a highly-tessellated sphere environment-mapped using a cubic environment map, which is supported in hardware by the GeForce 256, and the corresponding environment map itself.

Figure 4.8: Cubic environment-mapping on NVIDIA’s GeForce


PART II:

Design and Architecture of Parsec


5 Overview

This chapter provides background information on the Parsec engine, the Parsec game, and the vision of the Parsec project, as well as some thoughts on what is needed to build a computer game and a game engine in general.

We then proceed by giving an overview of the chapters that comprise this part of the thesis, which constitutes its main part.

5.1 The engine

The Parsec engine is a high performance multiplayer 3D game engine that emphasizes portability and extensibility. It has been developed in parallel with an actual computer game – Parsec [Parsec]. Parsec is a fast-paced multiplayer cross-platform 3D space combat game. The development of the Parsec engine and Parsec is a collaborative effort involving several programmers, as well as an artist and a musician – collectively referred to as the Parsec project.

The Parsec engine solves both the problems of portability between different host systems, as well as different graphics APIs. Naturally, there always has to be at least some system-dependent code which is not portable at all. However, by encapsulating this type of code beneath accordingly designed abstract interfaces, it is possible to minimize the amount of system-dependent code, making the majority of the code-base immediately portable. In order to port to a new target platform in such a scenario, only these abstract interfaces have to be implemented for the new target. Thus, although there is no such thing as immediate portability of high performance 3D game code, it is possible to come very close with an appropriate design. In fact, if the interfaces hiding system-dependencies have been designed with minimization of porting work in mind, this process can be quite fast.

The Parsec engine offers powerful features for rendering with an abstract graphics API, in order to be independent from API-specifics. This API offers lots of low-level features, comparable to what OpenGL offers, for instance, as well as additional high-level facilities. A powerful example of this is Parsec’s shading language, which can be used to specify the appearance of objects in verbal form. An important consequence of this is that it allows an artist to directly specify how certain parts of an object should be shaded and animated, without the need for writing special-purpose OpenGL or other graphics code. The area of real-time shading languages has recently become an area of active research [Pee00].
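In a C-style codebase, such an abstract interface is typically realized as a table of function pointers that each platform port fills in. The following sketch is hypothetical – it is not Parsec's actual interface, and all names are invented – but it illustrates the encapsulation principle described above:

```c
/* Hypothetical sketch of a system-dependent subsystem hidden behind an
 * abstract interface: a table of function pointers.  Each target platform
 * provides one concrete table; all portable code calls only through it. */
typedef struct {
    int  (*init)(void);            /* returns nonzero on success */
    void (*play_sample)(int id);   /* platform-specific sound output */
    void (*shutdown)(void);
} AudioSubsys;

/* A stub implementation, as a port for a new platform would provide. */
static int  stub_init(void)      { return 1; }
static void stub_play(int id)    { (void)id; /* no-op on this platform */ }
static void stub_shutdown(void)  { }

static const AudioSubsys stub_audio = { stub_init, stub_play, stub_shutdown };
```

Porting the engine then amounts to writing one such table per subsystem for the new platform; the bulk of the code, which only ever calls through the table, compiles unchanged.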

Developing for an abstract graphics API allows for developing graphics code that immediately runs on all supported actual graphics APIs. That is, this approach makes it easily possible to develop for multiple high-performance graphics APIs simultaneously, which is also covered in [Had99].

Host system-dependencies are encapsulated in several subsystems. There are subsystems with corresponding abstract interface specifications for audio (sound and music output), input (support for input devices like keyboards, mice, and joysticks), networking (essential for a multiplayer game engine), video (management of resolution and color depth, as well as fullscreen and windowed modes), and miscellaneous system services (timing, file path conventions, etc.).

All in all, this architecture employed by the Parsec engine yields an extremely portable game engine without compromising performance.


Extensibility

The Parsec engine is extensible in the sense that new game code and facilities may easily be added, as long as these fit into the framework provided. Extensibility in our case means extending the game by adding new special effects, weapons, types of objects, console commands, heads-up displays, and even new modes of game play. Even further, so-called total conversions are theoretically possible, which refers to game code that has been changed to such an extent as to almost create an entirely new game.

The engine itself may also be extended with new features. However, extensibility of the Parsec engine does not mean that it is easily possible to add general game engine features for constructing an entirely different kind of game than the engine was initially built for.

The intended purpose of the so-called Parsec SDK, a major portion of the game and engine source code, as well as tools, is to allow interested programmers to extend Parsec and the Parsec engine in exactly the way described above.

The Parsec engine is tightly bound to a specific game genre, or type of game, since a major goal was to provide the technology necessary to create Parsec, not to develop a game engine for arbitrarily general purposes. Thus, the engine is closely coupled with the genre of space combat games. It is not a general game engine that can easily be used for a wide variety of games. In this respect it is similar to other game engines that focus on a specific type of game, like the Quake, Quake II, or Quake 3 Arena engines [Quake], or the Unreal [Unreal] engine. These engines, however, are rather tightly bound to the genre of first-person shooters, i.e., they are primarily indoor engines. The setting preferred by the Parsec engine is a huge outer space arena with several spacecraft travelling through it.

In contrast to game or genre-bound engines, general game engines like NetImmerse [NetImm] can potentially be used for a wide variety of games. The major difference between these two types of engines is the level of abstraction and the type of services provided – whether they are high-level and very general, or rather low-level and more application-specific. There are many trade-offs involved in this decision, mostly concerning performance and flexibility.

In the case of the Parsec engine, the engine code itself is also intertwined to a certain extent with the game code. The modularity achieved through the subsystem structure is primarily used to ensure portability and extensibility of the engine and game as a whole, but is not meant to provide an isolated framework with the actual game code entirely factored out. Therefore, the Parsec SDK contains the game code of Parsec, as well as the high-level engine code, in order to allow extensions and game code changes to take place wherever they are needed.

A major reason for this decision is historical, since the initial goal was to create a game and a game engine at the same time, which should be easy to port as a whole. That is, the goal was not to develop a general framework and build a game on top of that at a later time. This has certainly influenced the architecture a lot and is very similar to the approach taken by the Quake engine [Quake], for instance.

Engine code

The entire code of the Parsec engine, as well as the game, is written in a C-like subset of C++. This means that the code employs things like C++ comments and inline variable declaration, but does not use C++ classes, templates, and the like. However, it makes use of derived structures containing member variables, but no member functions.

Since all engine functions are embedded in the global namespace, the information which subsystem or logical group of functions a function belongs to is encoded in the name of the function itself. This is achieved by using unique function name prefixes.

Using C++ simply as a more powerful version of the C programming language has advantages with respect to portability between different compilers and may have certain performance benefits, but is also very much a matter of personal preference. Although the Parsec engine is not object-oriented in the sense that it employs classes with member functions, integrating code with data, its overall architecture could nevertheless be considered very object-oriented, and it is also extremely modular.
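As a purely hypothetical illustration of this style, the following sketch shows a derived structure containing only member variables, together with globally visible functions whose subsystem membership is encoded in an invented OBJ_ name prefix; none of these names are taken from the actual Parsec code.

```cpp
#include <cassert>
#include <cstring>

// C++ used as a "better C": structures may derive from one another,
// but contain only member variables, never member functions.
struct GenObject {
    int   ObjectNumber;   // unique object id
    float Position[3];    // world-space position
};

// A derived structure adds data for a more specific object type.
struct ShipObject : GenObject {
    int CurDamage;        // accumulated damage
};

// All functions live in the global namespace; a unique prefix
// (here the invented "OBJ_") encodes the owning subsystem.
void OBJ_InitShip( ShipObject *ship, int objno )
{
    memset( ship, 0, sizeof( ShipObject ) );
    ship->ObjectNumber = objno;
}

int OBJ_FetchObjectNumber( const GenObject *obj )
{
    return obj->ObjectNumber;
}
```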

Since the Parsec engine should not be viewed as being entirely separate from Parsec, the game, it is certainly interesting to have an overview of the game itself. The next section tries to summarize the background and intentions of Parsec – the game.

5.2 The game

Parsec most of all is a fast-paced space shooter, where the emphasis is more on fun and action than on realism.

As far as game play itself is concerned, it has probably more in common with the network game play of first-person shooters like Quake, than with sophisticated space simulations like Elite.

It is currently a pure multiplayer game, i.e., there are no AI-controlled bots or other spacecraft not navigated by humans. The main arena of game play is the global Internet; more specifically, a distributed universe of Parsec gameservers and players able to compete in numerous galaxies and solar systems.

After starting up the game, the player simply chooses a ship and selects a galaxy to play in. Players can do this in a number of ways. They can either select the desired galaxy in the starmap, enter stellar coordinates, choose a gameserver from a list of servers, or enter the DNS/IP address of a specific gameserver manually. A gameserver directly corresponds to a galaxy in the Parsec universe.

This is the procedure of joining a client-server game. There is also a much simpler peer-to-peer game for local area networks (LANs), where the game automatically finds other players on the local segment via broadcasting. In this scenario, there is no need for a central server at all.

Every ship has some basic equipment right from the start, but to get most of the interesting gear like additional weapons, energy, invulnerability, and the like, players have to pick up power-ups that are floating through outer space. When a player is shot down, most of the collected weapons and devices are lost and the ship is once again back down to the basic equipment. Other players can then reclaim the power-ups that have been lost by the fallen. After being shot down, players can of course immediately join the action again, as is customary for action games.

The basic mode of game play is death-match, where players simply try to collect as many kills as possible. Basically, there are no explicit game sessions; players can join and leave a game at any time. However, it is possible to set kill and time limits on the server for specific solar systems, or for peer-to-peer play in general, in order to determine the winner of a game after one of these limits has been reached.

In the game, most of the screen is dominated by an outer space backdrop, comprised of large area nebulae, planets, and thousands of tiny stars. This background is rendered using an image-based rendering approach, as detailed in [Had98].

See figure 5.1 for a view from within the cockpit of a Parsec spacecraft outwards to the vast emptiness of outer space.

Figure 5.1: View from the cockpit of one of Parsec’s spacecraft

The most important geometric objects in Parsec are player spacecraft, followed by hundreds of power-ups. Weapons and special effects are rendered as a mixture of geometry and particle systems, or particle systems alone. The red sphere that surrounds the enemy spacecraft in the center of the screen of figure 5.1 depicts its protective shield and consists entirely of particles.

5.3 The vision

As already mentioned, the Parsec project is a collaborative effort of several people, including programmers, artists, and musicians. The goal was to develop a high performance multiplayer game engine and a 3D action game that is easily portable between different hardware platforms. Since the main target of the multiplayer game is the global Internet, this emphasis on portability was not hard to decide upon. In the Parsec universe, players on many different platforms should be allowed to play together seamlessly.

The architecture was designed in a way to allow easy porting, but actual ports were planned to be done by the initial project members. The major platforms targeted are Win32, MacOS, and Linux. Support for both the OpenGL and Glide graphics APIs has also been implemented.

The client-server game exclusively uses the TCP/IP protocol suite. However, for playing on LANs the IPX protocol is also supported. In order to be a part of the global Parsec universe TCP/IP is mandatory, though. Another goal of the Parsec project was that sufficiently programming-savvy players should be able to extend the game, as well as the engine, for themselves. People not able to extend the game in the way they would like can then also download and use extensions developed by others. This kind of user-extensibility has been spear-headed by the original Quake engine and gained wide support among developers and game players alike.

In order to facilitate user-developed extensions, we have developed the Parsec SDK, which contains all platform-independent game and engine code. This kind of code constitutes the majority of the code-base. Additions, extensions, and modifications made using the SDK should be able to immediately run on all supported platforms and graphics APIs, provided certain minimal rules are obeyed.

5.4 How to build a game

This section gives a very brief overview of what goes into developing a computer game.

Naturally, the first thing that is needed is an idea of what kind of game it should become, what the game play should be like, etc. This process is commonly referred to as game design, and for the game itself is of much higher importance than technological considerations. [Rol99] focuses on game design, as well as management aspects like team building and motivation. [Sal99] also contains useful material on the topic of game design. This thesis, however, is exclusively concerned with the technology behind a computer game, which is usually synonymous with the term game engine. Still, the features that must be supported by the engine that should drive the game are tightly related to the kind of game it is, maybe only the general genre, maybe even more specific.

In order to develop a game, an already existing game engine, like the Unreal engine [Unreal], might be used. The game engines driving today’s most successful 3D action games are available for licensing. Or, a custom game engine might be developed from scratch. Hybrid approaches are of course also possible. Many game companies license existing engine code and add new features and functionality from there, to gain an advantage and head-start in development time and effort, while still not losing the flexibility of customizing the engine for themselves. This kind of hybrid approach was chosen for the Half-Life game engine developed by Valve Software [HalfLife], which is based upon the Quake II engine [Quake].

For the development of Parsec we chose to create our own game engine entirely from scratch, building upon no previously developed material. This has the major advantage of yielding an engine that is ideally suited for the actual game. However, this approach of course incurs significant development time.

Hence, we will now continue our discussion with how to build a game engine.

5.5 How to build an engine

The development of a game engine is first guided by the most fundamental decision: whether it should become an entirely general game engine that can subsequently be used to develop almost every kind of game, or should be focused on a single genre, or maybe even developed for only a single game.

We have chosen to focus on the genre of space combat games, and, more specifically, to tailor the engine to the exact kind of game we had in mind.

The second extremely important decision is what the architecture of the game should look like. An important consideration in this respect is whether the engine will be developed exclusively for a single target platform, Win32 and Direct3D, for instance, or should target multiple platforms simultaneously, maybe even be portable in a very general sense. This decision guides the design of the architecture in many ways.

We have chosen to put a strong emphasis on portability, which has probably become the single most important influence on the architecture of the Parsec engine. The architecture consists of several subsystems and corresponding abstract interface specifications. Everything else is also decoupled from actual system-dependencies via the concepts of abstraction and modularity.

Another important decision is what programming language, or maybe set of programming languages, to use. This also by and large decides whether the engine code will be procedural or object-oriented. The Quake, Quake II, and Quake 3 Arena engines, for instance, employ ANSI C and a procedural programming paradigm. However, the Unreal engine, for example, is constructed in an extremely object-oriented way and uses C++ as its programming language of choice, supplemented by a special-purpose scripting language called UnrealScript. The Half-Life engine uses a kind of hybrid approach, extending the inherently procedural Quake II engine with object-oriented game code written in C++.

For the Parsec engine, we have chosen a procedural programming paradigm and a C-like subset of C++, which is almost identical to ANSI C, as programming language. However, its architecture uses many object-oriented concepts.

Parts of a game engine

A game engine naturally consists of many parts and has to offer a wide variety of facilities. One of the most visible parts is certainly the graphics code. However, audio services for playing sound and music, networking services, and user input services supporting different input devices are also very important. In pure multiplayer games, the networking component might even be considered to be more important than all the flashy graphics. Still, the most prevalent part of a high-profile game is almost always its graphics, which is especially true for the technology-driven genre of 3D action games.

At a somewhat higher level of abstraction, a game engine has to manage the world consisting of different kinds of objects, be able to perform collision detection between these objects, and maybe even feature sophisticated physics. Computer-controlled opponents are very important for most games, if they are not pure multiplayer games. Therefore, an AI component may be essential.

Many games employ scripting languages to create content related to game play. Instead of using the rather low-level programming languages the engine itself is built in, many engines support a scripting language which is easier to use, more powerful, and can often be used by the level designers themselves, instead of requiring the assistance of a dedicated programmer. Things implemented in a scripting language might range from determining what switch has to be activated in order to open a certain door, to interactively progressing through

5.6 Chapters outline

This section gives an overview of each of the following chapters in this part of the thesis – Part II: Design and Architecture of Parsec.

Chapter 6 (Architecture Overview) provides a high-level overview of the architecture of the Parsec engine. It describes the separation into game code and engine code. It then continues by outlining the structure of the engine code, its separation into several subsystems, describing the major components, and how this structure is utilized to achieve portability through modularity.

Chapter 7 (Architecture Details) continues the architectural discussion with a more detailed coverage of specific topics, like code structure and naming conventions, how subsystems may be bound dynamically in order to allow on-the-fly switching, and considerations concerning portability and extensibility. It also describes the central game loop, which provides the backbone for what would otherwise be somewhat random use of engine features. Furthermore, it contains a detailed discussion of how developing for multiple graphics APIs in an abstract and portable manner is possible and achieved within the Parsec engine. The notion of shaders is very important in this respect and will be introduced on this occasion.
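The central game loop mentioned above can be sketched in a few lines. The names used here (GameLoopState, Game_MainLoop, and so on) are invented for illustration and deliberately leave out the actual input, simulation, and rendering work.

```cpp
#include <cassert>

// Invented sketch of a central game loop; not the actual Parsec code.
struct GameLoopState {
    bool quit_requested;
    int  frames_rendered;
};

// One iteration: sample input, advance the simulation, render a frame.
void Game_RunFrame( GameLoopState *state )
{
    // Sys_ProcessInput( state );   // poll input devices (omitted)
    // Game_UpdateWorld( state );   // advance game logic (omitted)
    // R_RenderFrame( state );      // draw the current frame (omitted)
    state->frames_rendered++;
}

// The backbone of the engine: repeat frames until the game ends.
// max_frames stands in for the open-ended loop of a real game.
int Game_MainLoop( GameLoopState *state, int max_frames )
{
    while ( !state->quit_requested && state->frames_rendered < max_frames ) {
        Game_RunFrame( state );
    }
    return state->frames_rendered;
}
```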

Chapter 8 (Low-level Subsystem Specifications) describes the part of the Parsec engine which is responsible for encapsulating all system-dependencies and is thus key to its portability. This part consists of several subsystems categorized into two different types. The first type is used to isolate code depending on the underlying host system (Win32 or Linux, for instance). The second type encapsulates code depending on the underlying graphics API (OpenGL or Glide, for example). These two different kinds of subsystems comprise practically everything that enables the Parsec engine to be portable between host systems, as well as graphics APIs. They are what allows the majority of the engine and game code to be entirely independent from such low-level details, and thus immediately port over to other target platforms, provided the low-level subsystems have been implemented for them. Note that this is achieved without compromising performance, which is an extremely crucial issue in any game engine.

Chapter 9 (Managing Objects) is the first of several chapters devoted to specific aspects and components of the Parsec engine. It describes the object subsystem, which is responsible for managing geometric objects and the entire world geometry. This includes the actual data structures used to represent objects, how and when they are rendered, what capabilities are offered, and so on. This chapter also includes a detailed discussion of the API offered by the object subsystem.

Chapter 10 (The ITER Interface) is devoted to the abstract immediate-mode rendering interface which is used to write graphics code that can then potentially run on any underlying graphics API. That is, the ITER interface is an internal graphics API which can be used to write graphics code independent of any actual API present at run-time. The term immediate-mode denotes that the level of abstraction is conceptually on the same level as, for instance, the OpenGL API, which basically means at the single-primitive level, like single triangles, strips of triangles, and the like. This is in contrast to retained-mode, or scene graph, APIs, where entire scenes are submitted to the API for rendering as a whole. The ITER interface also employs an abstract and API-independent specification of how primitives should be shaded. A much more powerful version of these so-called shaders is also available, and is the subject of the following chapter.

Chapter 11 (Shaders) builds upon the abstract specification of how to shade primitives already introduced by the previous chapter and extends this notion to a simple, yet powerful, shading language. This chapter specifies and describes this language and its components, like color animations and texture animations. Shaders are especially useful to give objects a more dynamic appearance, since they offer a lot of facilities for animating the surface of primitives, instead of simply using lighted, but otherwise static, textures for shading them.
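As a rough sketch of the single-primitive abstraction level described for the ITER interface, the following invented structures and function illustrate what immediate-mode submission of a triangle, together with a reference to an abstract shader specification, might look like. None of these names belong to the actual ITER API.

```cpp
#include <cassert>
#include <cstddef>

// Invented immediate-mode interface at the single-primitive level,
// roughly the abstraction level described in the text above.
struct IterVertex {
    float x, y, z;        // position
    float u, v;           // texture coordinates
    unsigned int color;   // packed RGBA color
};

struct IterTriangle {
    IterVertex vtx[3];
    int        shader_id; // abstract shading specification to apply
};

// Counter standing in for actual rendering work, for illustration.
static int submitted_triangles = 0;

// Submit a single triangle for rendering; a real implementation would
// translate this into calls of the underlying graphics API
// (OpenGL or Glide, for instance).
void ITER_DrawTriangle( const IterTriangle *tri )
{
    if ( tri != NULL ) {
        submitted_triangles++;
    }
}
```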

Chapter 12 (The Particle System) describes the quite powerful particle system of the Parsec engine, which is used to render many special effects, from explosions to missile propulsion fumes. It introduces the notions of particles and particle systems, and also includes a detailed discussion of how the subsystem works, as well as coverage of the corresponding API functions available.
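A minimal, invented sketch of what the core of such a particle system might look like is given below: particles carry a position, a velocity, and a remaining lifetime, and an update step advances them and expires the dead ones. The structure and function names are illustrative only; the actual Parsec particle system API is described in chapter 12.

```cpp
#include <cassert>

// Illustrative particle representation (not the actual Parsec layout).
struct Particle {
    float pos[3];    // current position
    float vel[3];    // velocity per second
    float lifetime;  // remaining lifetime in seconds
};

// Advance a particle cluster by one time step dt, compacting the array
// so that expired particles are removed. Returns the number of
// particles still alive.
int PART_AnimateCluster( Particle *particles, int count, float dt )
{
    int alive = 0;
    for ( int i = 0; i < count; i++ ) {
        particles[i].lifetime -= dt;
        if ( particles[i].lifetime <= 0.0f )
            continue;  // particle has expired
        for ( int axis = 0; axis < 3; axis++ )
            particles[i].pos[axis] += particles[i].vel[axis] * dt;
        particles[alive++] = particles[i];
    }
    return alive;
}
```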

Chapter 13 (Networking) briefly summarizes the networking subsystem, which enables multiplayer game play. It introduces the basic notions of state information vs. remote-events, and describes the three major layers that comprise the structure of this subsystem. The game-code interface layer constitutes the networking API for the game code. The Parsec protocol layer encapsulates the underlying game protocol, which might be a peer-to-peer protocol, a pure client-server protocol (the so-called gameserver protocol), or a mixture of both (the so-called slotserver protocol). Finally, the packet API layer encapsulates the underlying transport protocol, TCP/IP or IPX, for instance.
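The distinction between state information and remote-events can be illustrated with an invented packet layout, where continuously updated state (such as a ship position) travels alongside a list of discrete events appended since the last packet. This is an illustrative sketch only, not the actual Parsec protocol.

```cpp
#include <cassert>
#include <cstring>

// Invented remote-event types, for illustration only.
enum RemoteEventType {
    RE_CREATEOBJECT = 1,
    RE_PLAYERLEAVE  = 2,
};

struct RemoteEvent {
    int event_type;
    int object_id;
};

// One outgoing packet: a state snapshot plus a remote-event list.
struct GamePacket {
    float       pos[3];     // state information: current ship position
    int         num_events;
    RemoteEvent events[8];  // remote-events appended this frame
};

// Append a remote-event to an outgoing packet; returns 0 if full.
int NET_AppendEvent( GamePacket *pkt, int type, int objid )
{
    if ( pkt->num_events >= 8 )
        return 0;
    pkt->events[pkt->num_events].event_type = type;
    pkt->events[pkt->num_events].object_id  = objid;
    pkt->num_events++;
    return 1;
}
```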

Chapter 14 (The Command Console) introduces the command console in detail, as well as the API that can be used by game code to register new console commands and variables. Parsec’s command console basically is a translucent window that can be overlayed onto the screen at any time, to be able to enter commands at a UNIX-like console prompt. It enables access to a lot of engine features and game play variables, and allows data to be loaded, listed, and manipulated, among lots of other things. The console is a very important facility during development and debugging, but also offers a lot of functionality ideal for power users.
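An API for registering console commands might look roughly like the following sketch, where game code registers a named handler function that the console invokes when the command is typed at the prompt. The names (CON_RegisterCommand and so on) are invented and not the actual Parsec API.

```cpp
#include <cassert>
#include <cstring>

// Handler signature for an invented console command API.
typedef int (*ConCommandFunc)( const char *args );

struct ConCommand {
    const char     *name;
    ConCommandFunc  func;
};

static ConCommand con_commands[64];
static int        num_con_commands = 0;

// Game code calls this to make a new command available at the prompt.
void CON_RegisterCommand( const char *name, ConCommandFunc func )
{
    if ( num_con_commands < 64 ) {
        con_commands[num_con_commands].name = name;
        con_commands[num_con_commands].func = func;
        num_con_commands++;
    }
}

// Look up a command typed at the console prompt and execute it.
int CON_ExecuteCommand( const char *name, const char *args )
{
    for ( int i = 0; i < num_con_commands; i++ ) {
        if ( strcmp( con_commands[i].name, name ) == 0 )
            return con_commands[i].func( args );
    }
    return -1; // unknown command
}

// Sample command used for illustration below.
static int last_arg_len = -1;

int CMD_Example( const char *args )
{
    last_arg_len = (int)strlen( args );
    return 0;
}
```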

6 Architecture Overview

This chapter provides a high-level overview of the architecture of the Parsec engine. Later chapters will then be concerned with more detailed descriptions of individual engine components and subsystems, as well as supply further rationale for the architectural decisions that have been made.

The Parsec engine can conceptually be viewed as consisting of two major parts, although in reality they are intertwined to a certain extent:

• Game-specific code (GAME CODE)

• General engine code (ENGINE CODE)

The game code contains everything directly related to game play, like the actual game logic and glue to make everything work together properly. That is, it contains what can be considered the actual game, building upon the engine code, which is much more general in nature.

The engine code itself is likewise comprised of two major parts. The first part, called the engine core, contains system-independent code, providing general services and facilities. The second part, called low-level subsystems, contains system-dependent code residing beneath an abstract interface structure encapsulating all system-dependent details.

Figure 6.1 depicts this very high-level view of Parsec’s structure.

Figure 6.1: High-level Parsec structure

This structure is very modular and cleanly encapsulates all system-dependencies in largely autonomous subsystems. The game code and the engine core are written in a portable manner and do not depend on the host system or the underlying graphics API in any way. Since they constitute the majority of the code-base, porting effort is thus minimized.
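How such an encapsulation of system-dependencies can work in principle is sketched below: the engine core calls a function declared once in a system-independent header, while each platform supplies its own definition, here selected at compile time. The SYSs_ name and the function itself are invented for illustration.

```cpp
#include <cassert>
#include <cstring>

// Declaration visible to the system-independent engine core:
const char *SYSs_HostSystemName( void );

// One definition per target platform, selected at compile time; in a
// real engine these would live in separate per-platform source files.
#if defined( _WIN32 )
const char *SYSs_HostSystemName( void ) { return "Win32"; }
#elif defined( __APPLE__ )
const char *SYSs_HostSystemName( void ) { return "MacOS"; }
#else
const char *SYSs_HostSystemName( void ) { return "Linux"; }
#endif

// Engine-core code like this compiles unchanged on every platform,
// since it only ever sees the abstract declaration above.
int SYS_NameIsValid( void )
{
    return strlen( SYSs_HostSystemName() ) > 0;
}
```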

As we have already stated in the previous chapter, the Parsec SDK contains practically all system-independent code in order to allow the development of extensions and modifications. In the terminology we have just described, that means that it is comprised of the entire game code and the engine core.

The following sections are going to look at each of the components shown in figure 6.1 in more detail.

6.1 The game code

This part of the code basically contains everything that transforms a somewhat general engine into an actual game. Thus, it takes facilities provided by the engine code and uses them to create an interactive gaming experience, where the technology underpinnings represented and comprised by the engine step back, and the content delivered to the player becomes important.

The game code determines everything that constitutes the behavior of the game: what the player can do; how many missiles can be fired at what rate; how many hitpoints they cause; even whether there are any missiles at all and what other types of weapons are available; what the player has to do in order to get them; things like these. All of the above is part of the game code.
