The same is true for games. Done well, sound and music convey critical information to the player as well as incite powerful emotional reactions. One of my favorite examples of powerful music in any game is the original Halo from Bungie. When the music segues into a driving combat tune, you can tell what is coming up—lots of carnage, hopefully on the Covenant side of things!
I'm biased, of course, but an excellent example of sound design and technology comes from Thief: Deadly Shadows by Ion Storm. This game integrated the physics, portal, and AI subsystems with the sound system. AI characters would receive propagated sound effect events that happened anywhere near them, and they would react accordingly. If you got clumsy and stumbled Garrett, the main character in Thief, into a rack of swords, AI characters around the corner and down the hall would hear it, and they'd come looking for you.
Another great example is from Mushroom Men: The Spore Wars for the Wii by Red Fly Studio. In this game, the sound system was actually integrated into the graphics and particle systems, creating a subtle but effective effect that had each sparkle of a particle effect perfectly timed with the music. They called this the "Metronome."
In this chapter, I'll take you as far as I can into the world of sound. We'll explore both sound effects and music. With a little work and imagination, you should be able to take what you learn here and create your own sound magic.
How Sound Works
Imagine someone on your street working with a hammer. Every time the hammer strikes a nail, or perhaps the poor schmuck's finger, a significant amount of energy is released, causing heat, deformation of the hammer, deformation of whatever was hit, and vibrations in all the objects concerned as they return to an equilibrium state. A more complete description of the situation would also include high-amplitude vibration of Mr. Schmuck's vocal cords. Either way, those vibrations are propagated through the air as sound waves.
When these sound waves strike an object, sometimes they make the object vibrate at the same frequency. This only happens if the object is resonant with the frequency of the sound waves. Try this: Go find two guitars and make sure they are properly tuned. Then hold them close together and pluck the biggest, fattest string of one of them. You should notice that the corresponding string on the second guitar will vibrate, too, and you never touched it directly.
The experiment with the guitars is similar to how the mechanical parts of your ear work. Your ears have tiny hairs, each having a slightly different length and resonant frequency. When sound waves get to them and make different sets of them vibrate, they trigger chemical messages in your brain, and your conscious mind interprets the signals as different sounds. Some of them sound like a hammer striking a nail, and others sound more like words you'd rather not say in front of little kids.
The tone of a sound depends on the sound frequency, or how fast the vibrations hit your ear. Vibrations are measured in cycles per second, or Hertz (abbreviated Hz). The lowest tone a normal human ear can hear is 20Hz, which is so low you almost feel it more than you hear it! As the frequency rises, the tone of the sound gets higher until you can't hear it anymore. The highest frequency most people can hear is about 20,000Hz, or 20 kilohertz (kHz).
The intensity of a sound is related to the number of air molecules pushed around by the original vibration. You can look at this as the "pressure" applied to anything by a sound wave. A common measurement of sound intensity is the decibel, or dB. This measurement is on a logarithmic scale, which means that a small increase in the dB level can be a dramatic increase in the intensity of the sound. Table 13.1 shows the dB levels for various common sounds.
The reason the scale is a logarithmic one has to do with the sensitivity of your ears. Normal human hearing can detect sounds over an amazing range of intensity, with the lowest being near silence and the highest being something that falls just shy of blowing your eardrums out of your head. The power difference between the two is over one million times. Since the range is so great, it is convenient to use a nonlinear, logarithmic scale to measure the intensity of sound.
Did you ever wonder why the volume knob on expensive audio gear is marked with negative dB? This is because volume is actually attenuation, or the level of change from the base level of a sound. Decibels measure relative sound intensity, not absolute intensity, which means that negative decibels measure the amount of sound reduction. Turning the volume to 3dB lower than the current setting reduces the power to your speakers by half. Given that, and I can put this in writing, all the stereo heads out there will be happy to know that if you set your volume level to 0dB, you'll be hearing the sound at the level intended by the audio engineer. This is, of course, usually loud enough to get complaints from your neighbors.
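To make that relationship concrete, here is a small, hypothetical helper (not part of the sound system in this chapter) that converts a 0-100 volume slider into an attenuation in decibels, using the power ratio definition of a decibel:

#include <cmath>

// Convert a 1-100 volume slider into an attenuation in decibels.
// 100 maps to 0dB (the level the audio engineer intended); lower values
// map to negative dB, and every halving of power costs about 3dB.
float VolumeToDecibels(int volume)          // volume: 1-100
{
    float coeff = volume / 100.0f;          // 0.01 to 1.0
    return 10.0f * log10f(coeff);           // e.g. 50 gives about -3dB
}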
Digital Recording and Reproduction
If you happen to have some speakers with the cones exposed, like my nice Boston Acoustics setup, you can watch these cones move in and out in a blur when you crank the music. It turns out that the speakers are moving in correlation to the plot of the sound wave recorded in the studio.
Table 13.1 Decibel Levels for Different Sounds
150 dB   Mr. Mike screaming when he beats his nephew Chris at Guitar Hero
You've probably seen a graphic rendering of a sound wave; it looks like some random up-and-down wiggling at various frequencies and amplitudes (see Figure 13.1). This scratching is actually a series of values that map to an energy value of the sound at a particular moment in time. This energy value is the power level sent into a speaker magnet to get the speaker cone to move, either in or out. The frequency, or tone, of the sound is directly related to the number of up/down wiggles you see in the graphic representation of the waveform. The speaker is reproducing, to the best of its ability, the identical waveform of the sound that was recorded in the studio.
If you zoom into the waveform, you'll see these energy values plotted as points above and below the X-axis (see Figure 13.2).
If all the points were in a straight line at value 0.0f, there would be complete silence. The odd thing is, if all the points were in a straight line at 1.0, you would get a little "pop" at the very beginning and silence thereafter. The reason is the speaker cone would sit at the maximum position of its movement, making no vibrations at all. The amplitude, or height, of the waveform is a measure of the sound's intensity. Quiet sounds only wiggle close to the 0.0 line, whereas loud noises wiggle all the way from 1.0f to -1.0f. You can also imagine a really loud noise, like an explosion, with an energy level that my Boston Acoustics can't reproduce and that can't be accurately recorded anyway because of the energies involved. Figure 13.3 shows what happens to a sound wave that fails to record the amplitude of a high-energy sound.
Instead of a nice waveform, the tops and bottoms are squared off. This creates a nasty buzzing noise because the speaker cones can't follow a nice smooth waveform.
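As a small illustration (not code from this chapter's sound system), clipping is simply what happens when samples are clamped to the legal range:

// Clamp raw sample values to the legal -1.0 to 1.0 range.
// Any peak that exceeded the range comes out squared off, which is
// exactly the buzzing distortion described above.
void ClipSamples(float *samples, int count)
{
    for (int i = 0; i < count; ++i)
    {
        if (samples[i] > 1.0f)  samples[i] = 1.0f;
        if (samples[i] < -1.0f) samples[i] = -1.0f;
    }
}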
Figure 13.1 A typical sound wave.
Figure 13.2 A closer view of a sound wave.
Audio engineers say that a recording like this had the "levels too hot," and they had to re-record it with the input levels turned down a bit. If you ever saw those recording meters on a mixing board, you'd notice that the input levels jumped into the red when the sound was too hot, creating the clipped waveforms. The same thing can happen when you record sounds straight to your desktop with a microphone, so keep an eye on those input levels.
Crusty Geezers Say the Wildest Things
On the Microsoft Casino project, the actors were encouraged to come up with extemporaneous barks for their characters. Not surprisingly, some of them had to be cut from the game. One was cut by Microsoft legal because they thought it sounded too much like the signature line, "I'll be back," from Arnold Schwarzenegger. Another was cut because it made disparaging remarks toward the waitresses at the Mirage Resorts. My favorite one of all time, though, was a bit of speech from a crusty old geezer: "You know what I REALLY love about Vegas??? The hookers!!!"
Sound Files
Sound files have many different formats, the most popular being WAV, MP3, OGG, and MIDI. The WAV format stores raw sound data, the aural equivalent of a BMP or TGA file, and is therefore the largest. MP3 and OGG files are compressed sound file formats and can achieve about a 10:1 compression ratio over WAV, with only a barely perceptible loss in sound quality. MIDI files are almost like little sound programs and are extremely tiny, but the sound quality is completely different—it sounds like those video games from the 1980s. So why would you choose one over the other?
MIDI was popular for downloadable games and games on handheld platforms because its files were so small and efficient. These days MIDI is more a choice for style than anything else, since even handheld devices are fully capable of playing most sound formats. The WAV format takes a lot of memory, but it is incredibly easy on your CPU budget. MP3s and OGGs will save your memory budget but will hit your CPU for each stream you decompress into a hearable sound.
Figure 13.3 A clipped sound wave.
If you're short on media space, you can store everything in MP3 or OGG and decompress the data in memory at load time. This is a pretty good idea for short sound effects that you hear often, like weapons fire and footsteps. Music and background ambiance can be many minutes long and are almost always played in their compressed form.
Always Keep Your Original High-Fidelity Audio Recordings
Make sure that all of your original sound is recorded in high-resolution WAV format, and plan to keep it around until the end of the project. If you convert all your audio to a compressed format such as MP3, you'll lose sound quality, and you won't be able to reconvert the audio stream to a higher bit-rate if the quality isn't good enough. This is exactly the same thing as storing all your artwork in high-resolution TGAs or TIFFs. You'll always have the original work stored in the highest possible resolution in case you need to mess with it later.
A Quick Word About Threads and Synchronization
Sound systems run in a multithreaded architecture. I'm talking about real multithreading here and not cooperative multitasking. What's the difference? You should already be familiar with the Process and ProcessManager classes from Chapter 7, "Controlling the Main Loop." These classes are cooperative, which means it is up to them to decide when to return control to the calling routine. For those of you who remember coding in the old DOS or Windows 3.x days, this is all we had without some serious assembly-level coding. In a way, it was a lot safer, for reasons you'll see in a minute, but it was a heck of a lot harder to get the computer to accomplish many tasks at once.
A classic task in games is to play some neat music in the background while you are playing the game. Like I said at the start of this chapter, sound creates emotion in your game. But what is really going on in the background to make sound come out of your speakers?
Sound data is pushed into the sound card, and the sound card's driver software converts this data into electric signals that are sent to your speakers. The task of reading new data into the sound card and converting it into a usable format takes some CPU time away from your computer. While modern sound cards have CPUs of their own, getting the data from the digital media into the sound card still takes your main CPU. Since sound data is played at a linear time scale, it's critical to push data into the sound card at the right time. If it is pushed too early, you'll overwrite music that is about to be played. If it is pushed too late, the sound card will play some music you've already heard, only to skip ahead when the right data gets in place.
This is the classic reader/writer problem, where you have a fixed memory area with a writer that needs to stay ahead of the reader. If the reader ever overtakes the writer or vice versa, the reader reads data that is either too old or too new. When I heard about this in college, the example presented was always some horribly boring data being read and written, such as employee records or student class enrollment records. I would have paid a lot more attention to this class if they had told me the same solutions could be applied to computer game sound systems.
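To make the problem concrete, here is a toy sketch (illustrative only; DirectSound handles the real bookkeeping for you) of a circular sound buffer in which the writer must stay ahead of the play cursor without lapping it:

// A toy circular sound buffer: the mixer (reader) consumes samples while
// the game (writer) refills the region the reader has already passed.
struct RingBuffer
{
    static const int SIZE = 65536;
    unsigned char data[SIZE];
    int playCursor;    // where the reader is
    int writeCursor;   // where the writer will put new data
};

// Returns how many bytes can safely be written without overtaking the
// play cursor. Writing more than this would corrupt sound that hasn't
// been played yet; writing too little risks the reader catching up and
// replaying stale data.
int SafeBytesToWrite(const RingBuffer &rb)
{
    if (rb.writeCursor >= rb.playCursor)
        return RingBuffer::SIZE - (rb.writeCursor - rb.playCursor) - 1;
    else
        return rb.playCursor - rb.writeCursor - 1;
}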
What makes this problem complicated is that there must be a way to synchronize the reader and writer to make sure the writer process only writes when it knows it is safely out of the reader's way. Luckily, the really nasty parts of this problem are handled at a low level in DirectSound, but you should always be aware of it so you don't pull the rug out from under the sound system's feet, so to speak. Let me give you an example.
In your game, let's assume there's a portable stereo sitting on a desk, and it is playing music. You take your gun and fire an explosive round into the radio and destroy it. Hopefully, the music the radio is playing stops when the radio is destroyed, and the memory used by the music is returned to the system. You should be able to see how order-dependent all this is. If you stop the music too early, it looks like the radio was somehow self-aware and freaked out just before it was sent to radio nirvana. If you release all the radio's resources before you notify the sound system, the sound system might try to play some sound data from a bogus area of memory.
Worse still, because the sound system runs in a different thread, you can't count on a synchronous response when you tell the sound system to stop playing a sound. Granted, the sound system will respond to the request in a few milliseconds, far shorter than any human can perceive, but far longer than you could count on using the memory currently allocated to the sound system for something that is still active. All these complications require a little architecture to keep things simple for programmers who are attaching sounds to objects or music to a game.
Game Sound System Architecture
Just like a graphics subsystem, audio subsystems can have a few different implementations. DirectSound, Miles Audio, WWise, and FMod are a few examples. It's a good idea to create an implementation-agnostic wrapper for your sound system so that you are free to choose the implementation right for your game. The audio system presented in this chapter can use DirectSound or Miles, and the only change you have to make to your high-level game code is one line of code. Figure 13.4 shows the class hierarchy for our sound system.
The sound system inherits from IAudio. This object is responsible for the list of sounds currently active. As you might predict, you only need one of these for your game. The Audio base class implements some implementation-generic routines, and the DirectSoundAudio class completes the implementation with DirectSound-specific calls.
The sound system needs access to the bits that make up the raw sound. The IAudioBuffer interface defines the methods for an implementation-generic sound buffer. AudioBuffer is a base class that implements some of the IAudioBuffer interface, and the DirectSoundAudioBuffer completes the implementation of the interface class using DirectSound calls. Each instance of a sound effect will use one of these buffer objects.
A Resource encapsulates sound data, presumably loaded from a file or your resource cache. If you had five explosions going off simultaneously, you'd have one Resource object and five DirectSoundAudioBuffer objects.
Sound Resources and Handles
If you want to play a sound in your game, the first thing you do is load it. Sound resources are loaded exactly the same as other game resources; they will likely exist in a resource file. Sound effects can be tiny or quite long. Your game may have thousands of these things, or tens of thousands as many modern games have. Just as you saw in Chapter 8, "Loading and Caching Game Data," you shouldn't store each effect in its own file; rather, you should pull it from a resource cache.
A resource cache is convenient if you have many simultaneous sounds that use the same sound data, such as weapons fire. You should load this resource once, taking up only one block of memory, and have the sound driver create many "players" that will use the same resource.
Figure 13.4 Sound system class hierarchy.
The concept of streaming sound, compressed or otherwise, is beyond the scope of this chapter. The sound system described here uses the resource cache to load the sound data from a resource file, decompresses it if necessary, and manages DirectSound audio buffers if you happen to have the same sound being played multiple times. As usual, I'm exchanging clarity for performance, specifically memory usage, so take this into account when looking at this system. A commercial-grade sound system would only load the compressed sound into memory and use a thread to decompress bits of it as it is played, saving a ton of memory. With that caveat in mind, the first thing to do is define three classes to help the resource cache load and decompress WAV and OGG files:
class SoundResourceExtraData : public IResourceExtraData
{
   friend class WaveResourceLoader;
   friend class OggResourceLoader;

public:
   SoundResourceExtraData();
   virtual ~SoundResourceExtraData() { }
   virtual std::string VToString() { return "SoundResourceExtraData"; }
   enum SoundType GetSoundType() { return m_SoundType; }
   WAVEFORMATEX const *GetFormat() { return &m_WavFormatEx; }
   int GetLengthMilli() const { return m_LengthMilli; }

protected:
   enum SoundType m_SoundType;      // is this an Ogg, WAV, etc.?
   bool m_bInitialized;             // has the sound been initialized?
   WAVEFORMATEX m_WavFormatEx;      // description of the PCM format
   int m_LengthMilli;               // how long the sound is in milliseconds
};
class WaveResourceLoader : public IResourceLoader
{
public:
   virtual bool VUseRawFile() { return false; }
   virtual unsigned int VGetLoadedResourceSize(char *rawBuffer,
      unsigned int rawSize);
   virtual bool VLoadResource(char *rawBuffer, unsigned int rawSize,
      shared_ptr<ResHandle> handle);

protected:
   bool ParseWave(char *wavStream, size_t length, shared_ptr<ResHandle> handle);
};
class OggResourceLoader : public IResourceLoader
{
public:
   virtual bool VUseRawFile() { return false; }
   virtual unsigned int VGetLoadedResourceSize(char *rawBuffer,
      unsigned int rawSize);
   virtual bool VLoadResource(char *rawBuffer, unsigned int rawSize,
      shared_ptr<ResHandle> handle);

protected:
   bool ParseOgg(char *oggStream, size_t length, shared_ptr<ResHandle> handle);
};
The SoundResourceExtraData class stores data that will be used by DirectSound. It is initialized when the resource cache loads the sound. Take a look at the protected members first. The m_SoundType member stores an enumeration that defines the different sound types you support: WAV, OGG, and so on. The next Boolean stores whether the sound has been initialized, which is to say that the sound is ready to play. The next data member, m_WavFormatEx, stores information about the sound so that DirectSound can play it. This includes how many channels are included in the sound, its sample rate, its bits per sample, and other data. The last member is a convenience member used to grab the length of the sound in milliseconds, which is nice to have if you are timing something, like an animation, to coincide with the end of the sound.
A real game would keep compressed sounds in memory and send bits and pieces of them into the audio hardware as they were needed, saving precious memory space. For longer pieces such as music, the system might even stream bits of the compressed music from digital media and then uncompress those bits as they were consumed by the audio card. As you can see, that kind of system could use its own book to describe it thoroughly. The resource cache will use implementations of the IResourceLoader interface to determine what kind of resource the sound is and the size of the loaded resource, and to actually load the resource into the memory the resource cache allocates.
Stream Your Music
A better solution for music files, which tend to be huge in an uncompressed form, is to stream them into memory as the sound data is played. This is a complicated subject, so for now we'll simply play uncompressed sound data that is loaded completely into memory. Notice that even though a multimegabyte OGG file is loaded into a decompressed buffer, taking up perhaps 10 times as much memory, it loads many times faster. As you might expect, the Vorbis decompression algorithm is much faster than your hard drive.
Loading the WAV Format with WaveResourceLoader
WAV files are what old-school game developers call a chunky file structure. Each chunk is preceded by a unique identifier, which you'll use to parse the data in each chunk. The chunks can also be hierarchical; that is, a chunk can exist within another chunk. Take a quick look at the code below, and you'll see what I'm talking about. The first identifier, RIFF, is a clue that the file has an IFF, or Interchange File Format, structure, which is basically the same thing as saying a chunky format. If the next identifier in the file is WAVE, you can be sure the file is a WAV audio file.
You'll notice the identifier is always four bytes and is immediately followed by a 4-byte integer that stores the length of the chunk. Chunky file formats allow parsing code to ignore chunks they don't understand, which is a great way to create extensible file formats. As you'll see next, we're only looking for two chunks from our WAV file, but that doesn't mean that other chunks aren't there:
bool WaveResourceLoader::ParseWave(char *wavStream, size_t bufferLength,
   shared_ptr<ResHandle> handle)
{
   shared_ptr<SoundResourceExtraData> extra =
      static_pointer_cast<SoundResourceExtraData>(handle->GetExtra());
   DWORD file = 0, fileEnd = 0, length = 0, type = 0, pos = 0;

   // mmioFOURCC — converts four chars into a 4 byte integer code.
   // The first 4 bytes of a valid wav file is 'R','I','F','F'
   type = *((DWORD *)(wavStream+pos)); pos+=sizeof(DWORD);
   if (type != mmioFOURCC('R', 'I', 'F', 'F'))
      return false;

   length = *((DWORD *)(wavStream+pos)); pos+=sizeof(DWORD);
   type = *((DWORD *)(wavStream+pos)); pos+=sizeof(DWORD);

   // 'W','A','V','E' for a legal wav file
   if (type != mmioFOURCC('W', 'A', 'V', 'E'))
      return false;    // not a WAV

   // Find the end of the file
   fileEnd = length - 4;
   memset(&extra->m_WavFormatEx, 0, sizeof(WAVEFORMATEX));
   bool copiedBuffer = false;

   // Load the wav format and the wav data.
   // Note that these blocks can be in either order.
   while (file < fileEnd)
   {
      type = *((DWORD *)(wavStream+pos));   pos+=sizeof(DWORD); file+=sizeof(DWORD);
      length = *((DWORD *)(wavStream+pos)); pos+=sizeof(DWORD); file+=sizeof(DWORD);

      switch (type)
      {
         case mmioFOURCC('f', 'm', 't', ' '):
            memcpy(&extra->m_WavFormatEx, wavStream+pos, length); pos+=length;
            extra->m_WavFormatEx.cbSize = (WORD)length;
            break;

         case mmioFOURCC('d', 'a', 't', 'a'):
            copiedBuffer = true;
            if (length != handle->Size())
            {
               GCC_ERROR("Wav resource size does not equal buffer size"); return 0;
            }
            memcpy(handle->WritableBuffer(), wavStream+pos, length); pos+=length;
            break;
      }
      file += length;

      // Once the data block has been copied, record the length and return.
      if (copiedBuffer)
      {
         extra->m_LengthMilli =
            (handle->Size() * 1000) / extra->GetFormat()->nAvgBytesPerSec;
         return true;
      }

      // Increment the pointer past the block we just read,
      // and make sure the pointer is word aligned.
      if (length & 1) { ++pos; ++file; }
   }

   // If we get here, the file didn't contain both chunks we were looking for.
   return false;
}
The ParseWave() method has two parts. The first part initializes local and output variables and makes sure the WAV file has the right beginning tag, RIFF, signifying that the file is the IFF type, and that the identifier immediately following is WAVE. If either of these two checks fails, the method returns false.
The code then flows into a while loop that is looking for two blocks: fmt and data. They can arrive in any order, and there may be other chunks interspersed. That's fine, because we'll just ignore them and continue looking for the two we care about. Once they are found, we return with success. If for some reason we get to the end of the file and we didn't find the two chunks we were looking for, we return false, indicating a failure.
Loading the OGG Format
The ParseOgg() method decompresses an OGG stream already in memory. The OggVorbis_File object can load from a normal file or a memory buffer. Loading from a memory buffer is a little trickier since you have to "fake" the operations of an ANSI FILE * object with your own code.
The first task is to create a structure that will keep track of the memory buffer, the size of this buffer, and where the "read" position is:
struct OggMemoryFile
{
   unsigned char* dataPtr;    // Pointer to the data in memory
   size_t dataSize;           // Size of the data
   size_t dataRead;           // Bytes read so far
};
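The Vorbis library pulls its data through a set of file-style callbacks, and those callback implementations are not reproduced here. A minimal sketch of what they might look like, using the ov_callbacks signatures from the public Vorbis API (and assuming the usual <cstring> and vorbis/vorbisfile.h headers), is shown below:

size_t VorbisRead(void* data_ptr, size_t byteSize, size_t sizeToRead, void* data_src)
{
   OggMemoryFile *pVorbisData = static_cast<OggMemoryFile *>(data_src);
   if (!pVorbisData)
      return 0;

   // Don't read past the end of the in-memory "file."
   size_t actualSizeToRead = byteSize * sizeToRead;
   size_t spaceToEOF = pVorbisData->dataSize - pVorbisData->dataRead;
   if (actualSizeToRead > spaceToEOF)
      actualSizeToRead = spaceToEOF;

   memcpy(data_ptr, pVorbisData->dataPtr + pVorbisData->dataRead, actualSizeToRead);
   pVorbisData->dataRead += actualSizeToRead;
   return actualSizeToRead;
}

int VorbisSeek(void* data_src, ogg_int64_t offset, int origin)
{
   OggMemoryFile *pVorbisData = static_cast<OggMemoryFile *>(data_src);
   if (!pVorbisData)
      return -1;

   switch (origin)
   {
      case SEEK_SET: pVorbisData->dataRead = (size_t)offset; break;
      case SEEK_CUR: pVorbisData->dataRead += (size_t)offset; break;
      case SEEK_END: pVorbisData->dataRead = pVorbisData->dataSize + (size_t)offset; break;
      default: return -1;
   }
   return 0;
}

int VorbisClose(void*) { return 0; }   // nothing to close for a memory buffer

long VorbisTell(void* data_src)
{
   OggMemoryFile *pVorbisData = static_cast<OggMemoryFile *>(data_src);
   return pVorbisData ? (long)pVorbisData->dataRead : -1;
}

// The callbacks are handed to the Vorbis library as a group:
ov_callbacks oggCallbacks = { VorbisRead, VorbisSeek, VorbisClose, VorbisTell };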
The ParseOgg() method looks like this:
bool OggResourceLoader::ParseOgg(char *oggStream, size_t length,
   shared_ptr<ResHandle> handle)
{
   // (The beginning of the method fills in an OggMemoryFile with the raw
   // stream, hooks up the Vorbis callbacks, and opens the stream into an
   // OggVorbis_File called vf.)

   // ok, now the tricky part
   // the vorbis_info struct keeps most of the interesting format info
   vorbis_info *vi = ov_info(&vf, -1);

   // get the total number of PCM samples
   DWORD bytes = (DWORD)ov_pcm_total(&vf, -1);

   // now read in the bits
   while (ret && pos < bytes)
This method shows you how to decompress an OGG memory buffer using the Vorbis API. The method will decompress the OGG stream into a PCM buffer that is essentially identical to the results you saw earlier with the WaveResourceLoader. The first part of the method initializes the OggMemoryFile structure and sets up the callback functions for Vorbis. Then a structure called vorbis_info is used to initialize the members of the WAVEFORMATEX stored with the resource handle.
After the memory buffer is double-checked to be big enough to handle the decompressed OGG stream, the ov_read function is called in a loop to decompress it.
If you feel sufficiently energetic one weekend, this is where you'll want to play around if you'd like to implement decompression of the OGG stream in real time. Instead of decompressing the entire buffer, you'll decompress a part of it, save the stream where you left off, and let DirectSound play the buffer. Before DirectSound finishes playing the buffer, you'll run the decompression loop again into a different buffer. If your timing is right, DirectSound will be playing from one buffer while you are decompressing into another. If you think this is touchy work, you are right; it is for this reason that sound systems were typically fraught with weird bugs and instability. Imagine what would happen if the source OGG stream were thrown out of the resource cache, causing a cache miss and a huge delay in providing DirectSound with the data it needs to create the illusion of a continuous sound from a single uncompressed stream.
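In outline, such a streaming scheme alternates between two halves of a DirectSound buffer. This is not implemented in this chapter's code; the sketch below assumes a helper that is called regularly, with the actual ov_read() decode loop left as a comment:

// Pseudocode for streaming: decode into whichever half of the buffer
// DirectSound is NOT currently playing from.
void StreamingUpdate(LPDIRECTSOUNDBUFFER pDSB, DWORD halfSize, int &lastFilledHalf)
{
   DWORD playCursor = 0;
   pDSB->GetCurrentPosition(&playCursor, NULL);

   int playingHalf = (playCursor < halfSize) ? 0 : 1;
   int halfToFill  = 1 - playingHalf;
   if (halfToFill == lastFilledHalf)
      return;    // that half is already full of fresh data

   VOID *locked = NULL;
   DWORD lockedBytes = 0;
   if (SUCCEEDED(pDSB->Lock(halfToFill * halfSize, halfSize,
                            &locked, &lockedBytes, NULL, NULL, 0)))
   {
      // Decode with ov_read() in a loop here until lockedBytes are filled,
      // remembering where the OGG stream left off for next time.
      pDSB->Unlock(locked, lockedBytes, NULL, 0);
      lastFilledHalf = halfToFill;
   }
}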
Always Show Something Moving
Any time you have a while loop that might take some time, such as decompressing a large OGG file, it's a good idea to create a callback function that your game can use to monitor the progress of the routine. This might be important for creating a progress bar or some other animation that will give your players something to look at other than a completely stalled screen. Console games are usually required to have on-screen animations during loads, but this is a good idea for PC games, too.
If you are just lifting this OGG code into your game and ignoring the rest of this chapter, don't forget to link the Vorbis libraries into your project. Since there's no encoding going on here, you can just link the following libraries: libvorbisfile_static.lib, libvorbis_static.lib, and libogg_static.lib. If you are compiling your code under Visual Studio, you can add the following lines of code to one of your CPP files. In the GameCode4 source, they are in GameCode4.cpp, where all of the other #pragma comment() statements are:
#pragma comment(lib, "libogg_static.lib")
#pragma comment(lib, "libvorbis_static.lib")
#pragma comment(lib, "libvorbisfile_static.lib")
To learn more about the OGG format, go to www.xiph.org/. The technology is open source, the sound is every bit as good as MP3, and you don't have to worry about paying expensive license fees. In other words, unless you have money to burn, use OGG for sound data compression. Lots of audio tools support OGG, too. You can go to the Xiph website to find out which ones.
IAudioBuffer Interface and AudioBuffer Class
Now that you've got a sound in memory, it's time to play it. IAudioBuffer exposes methods such as volume control, pausing, and monitoring individual sounds while they are in memory. IAudioBuffer, and a partial implementation AudioBuffer, are meant to be platform agnostic. You'll see the DirectSound-specific implementation shortly. Here's the interface class:
class IAudioBuffer
{
public:
virtual ~IAudioBuffer() { }
virtual void *VGet()=0;
virtual shared_ptr<ResHandle> const VGetResource()=0;
virtual bool VRestore()=0;
virtual bool VPlay(int volume, bool looping)=0;
virtual bool VPause()=0;
virtual bool VStop()=0;
virtual bool VResume()=0;
virtual bool VTogglePause()=0;
virtual bool VIsPlaying()=0;
virtual bool VIsLooping() const=0;
virtual void VSetVolume(int volume)=0;
virtual int VGetVolume() const=0;
virtual float VGetProgress() const=0;
};
The first method is a virtual destructor, which will be overridden by classes that implement the interface. If this destructor weren't virtual, it would be impossible to release audio resources grabbed for this sound effect.
The next method, VGet(), is used to grab an implementation-specific handle to the allocated sound. When I say implementation-specific, I'm talking about the piece of data used by the audio system implementation to track sounds internally. In the case of a DirectSound implementation, this would be an LPDIRECTSOUNDBUFFER. This is for internal use only, for whatever class implements the IAudio interface to call. Your high-level game code will never call this method unless it knows what the implementation is and wants to do something really specific.
The next method, VRestore(), is primarily for Windows games since it is possible for them to lose control of their sound buffers, requiring their restoration. The audio system will double-check to see if an audio buffer has been lost before it sends commands to the sound driver to play the sound. If it has been lost, it will call the VRestore() method, and everything will be back to normal. Hopefully, anyway.
The next few methods control the play status of an individual sound effect.
true if you want the sound to loop VPause(), VStop(), VResume(), and
VPause() let you control the progress of a sound
The volume methods do exactly what you’d think they do: set and retrieve the rent volume of the sound The method that sets the volume will do so instantly, ornearly so If you want a gradual fade, on the other hand, you’ll have to use something
cur-a little higher level Luckily, we’ll do exactly that later on in this chapter
The last method, VGetProgress(), returns a floating-point number between 0.0fand 1.0f and is meant to track the progress of a sound as it is being played If thesound effect is one-fourth of the way through playing, this method will return 0.25f
All Things Go from 0.0 to 1.0
Measuring things like sound effects in terms of a coefficient ranging from 0.0
to 1.0 instead of a number of milliseconds is a nice trick This abstraction gives
you some flexibility if the actual length of the sound effect changes, especially
if it is timed with animations, or animations are tied to sound, which is very
frequently the case If either the sound changes or the animation changes, it is
easy to track one versus the other.
With the interface defined, we can write a little platform-agnostic code and create the AudioBuffer class. The real meat of this class is the management of the smart pointer to a SoundResource. This guarantees that the memory for your sound effect can't go out of scope while the sound effect is being played.
class AudioBuffer : public IAudioBuffer
{
public:
   virtual shared_ptr<ResHandle> VGetResource() { return m_Resource; }
   virtual bool VIsLooping() const { return m_isLooping; }
   virtual int VGetVolume() const { return m_Volume; }

protected:
   AudioBuffer(shared_ptr<ResHandle> resource)
   {
      m_Resource = resource;
      m_isPaused = false;
      m_isLooping = false;
      m_Volume = 0;
   }

   shared_ptr<ResHandle> m_Resource;   // sound data managed by the resource cache
   bool m_isPaused;                    // is the sound paused?
   bool m_isLooping;                   // is the sound looping?
   int m_Volume;                       // the current volume (0-100)
};
This class holds the precious smart pointer to your sound data managed by the resource cache and implements the IAudioBuffer interface. VIsLooping() and VGetVolume() tell you if your sound is a looping sound and the current volume setting. VGetResource() returns a smart pointer to the sound resource, which manages the sound data.
We're nearly to the point where you have to dig into DirectSound. Before that happens, take a look at the classes that encapsulate the system that manages the list of active sounds: IAudio and Audio.
IAudio Interface and Audio Class
IAudio has three main purposes: create, manage, and release audio buffers:
class IAudio
{
public:
virtual bool VActive()=0;
virtual IAudioBuffer *VInitAudioBuffer(shared_ptr<ResHandle> soundResource)=0;
virtual void VReleaseAudioBuffer(IAudioBuffer* audioBuffer)=0;
virtual void VStopAllSounds()=0;
virtual void VPauseAllSounds()=0;
virtual void VResumeAllSounds()=0;
virtual bool VInitialize()=0;
virtual void VShutdown()=0;
};
VActive() is something you can call to determine if the sound system is active. As rare as it may be, a sound card might be disabled or not installed. It is also likely that during initialization or game shutdown, you'll want to know if the sound system has a heartbeat.
The next two methods, VInitAudioBuffer() and VReleaseAudioBuffer(), are called when you want to launch a new sound or tell the audio system you are done with it and it can release audio resources back to the system. This is important, so read it twice: you'll call these for each instance of a sound, even if it is exactly the same effect. You might want to play the same sound effect at two different volumes, such as when two players are firing the same type of weapon at each other, or you have multiple explosions going off at the same time in different places.
You'll notice that the only parameter to the initialize method is a shared pointer to a ResHandle object. This object contains the single copy of the actual decompressed PCM sound data. The result of the call, assuming it succeeds, is a pointer to an object that implements the IAudioBuffer interface. What this means is that the audio system is ready to play the sound.
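For example, playing the same explosion resource twice at different volumes might look something like this. This is a usage sketch; the explosionHandle variable is hypothetical, while g_Audio and the interface methods come from the classes in this chapter:

// Two independent audio buffers created from one shared sound resource.
IAudioBuffer *nearBlast = g_Audio->VInitAudioBuffer(explosionHandle);
IAudioBuffer *farBlast  = g_Audio->VInitAudioBuffer(explosionHandle);
if (nearBlast) nearBlast->VPlay(100, false);   // full volume, no looping
if (farBlast)  farBlast->VPlay(40, false);     // same sound, quieter

// ... later, when each sound is finished ...
g_Audio->VReleaseAudioBuffer(nearBlast);
g_Audio->VReleaseAudioBuffer(farBlast);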
The next three methods are system-wide sound controls, mostly for doing things like pausing and resuming sounds when needed, such as when the player of a Windows game Alt-Tabs away from your game. It's extremely annoying to have game sound effects continue in the background if you are trying to check email or convince your boss you aren't playing a game.
The last two methods, VInitialize() and VShutdown(), are used to create and tear down the sound system. Let's take a look at a platform-agnostic partial implementation of the IAudio interface:
class Audio : public IAudio
{
public:
Audio();
virtual void VStopAllSounds();
virtual void VPauseAllSounds();
virtual void VResumeAllSounds();
virtual void VShutdown();
static bool HasSoundCard(void);
bool IsPaused() { return m_AllPaused; }
protected:
typedef std::list<IAudioBuffer *> AudioBufferList;
   AudioBufferList m_AllSamples;   // List of all currently allocated buffers
bool m_AllPaused; // Has the sound system been paused?
bool m_Initialized; // Has the sound system been initialized?
};
We'll use STL to organize the active sounds in a linked list called m_AllSamples. This is probably good for almost any game because you'll most likely have only a handful of sounds active at one time. Linked lists are great containers for a small number of objects. Since the sounds are all stored in the linked list, and each sound object implements the IAudioBuffer interface, you can define routines that perform an action on every sound in the system.
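For example, the pause-everything routine is nothing more than a walk down that list. This is a sketch consistent with the class shown above; the real implementation may differ in small details:

void Audio::VPauseAllSounds()
{
   // Ask every active audio buffer to pause, and remember the global state.
   for (AudioBufferList::iterator i = m_AllSamples.begin();
        i != m_AllSamples.end(); ++i)
   {
      (*i)->VPause();
   }
   m_AllPaused = true;
}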
We'll create our platform-specific code around DirectSound.
You'll need to extend this code if you want to play MP3 or MIDI. Still, DirectSound can make a good foundation for a game's audio system. Let's take a look at the implementation of DirectSoundAudio first, which extends the Audio class we just discussed:
class DirectSoundAudio : public Audio
{
public:
DirectSoundAudio() { m_pDS = NULL; }
virtual bool VActive() { return m_pDS != NULL; }
virtual IAudioBuffer *VInitAudioBuffer(
shared_ptr<ResHandle> soundResource);
virtual void VReleaseAudioBuffer(IAudioBuffer* audioBuffer);
virtual void VShutdown();
virtual bool VInitialize(HWND hWnd);
protected:
   IDirectSound8* m_pDS;
};
The only piece of data in this class is a pointer to an IDirectSound8 object, which is DirectSound's gatekeeper, so to speak. Initialization, shutdown, and creating audio buffers are all done through this object. One way to look at this is that DirectSoundAudio is a C++ wrapper around IDirectSound8. Let's look at initialization and shutdown first:
bool DirectSoundAudio::VInitialize(HWND hWnd)
{
   if (m_Initialized)
      return true;

   m_Initialized = false;
   m_AllSamples.clear();

   SAFE_RELEASE(m_pDS);

   HRESULT hr;

   // Create IDirectSound using the primary sound device
   if( FAILED( hr = DirectSoundCreate8( NULL, &m_pDS, NULL ) ) )
      return false;

   // Set DirectSound coop level
   if( FAILED( hr = m_pDS->SetCooperativeLevel( hWnd, DSSCL_PRIORITY) ) )
      return false;

   if( FAILED( hr = SetPrimaryBufferFormat( 8, 44100, 16 ) ) )
      return false;

   m_Initialized = true;
   return true;
}
This code is essentially lifted straight from the DirectX sound samples, so it might look pretty familiar. When you set the cooperative level on the DirectSound object, you're telling the sound driver you want more control over the sound system, specifically how the primary sound buffer is structured and how other applications run at the same time. The DSSCL_PRIORITY level is better than DSSCL_NORMAL because you can change the format of the output buffer. This is a good setting for games that still want to allow background applications like Microsoft Messenger or Outlook to be able to send something to the speakers.
Trang 26Why bother? If you don’t do this, and set the priority level to DSSCL_NORMAL,you’re basically informing the sound driver that you’re happy with whatever pri-mary sound buffer format is in place, which might not be the same sound formatyou need for your game audio The problem is one of conversion Games use tons
of audio, and the last thing you need is for every sound to go through some sion process so it can be mixed in the primary buffer If you have 100,000 audiofiles and they are all stored in 44KHz, the last thing you want is to have each one
conver-be converted to 22KHz, conver-because it’s a waste of time Take control and use
DSSCL_PRIORITY
The call to SetPrimaryBufferFormat() sets your primary buffer format to a vor you want; most likely, it will be 44KHz, 16-bit, and some number of channelsthat you feel is a good trade-off between memory use and the number of simulta-neous sound effects you’ll have in your game For the purposes of this class, I’mchoosing eight channels, but in a commercial game you could have 32 channels oreven more The memory you’ll spend with more channels is dependent on yoursound hardware, so be cautious about grabbing a high number of channels—youmight find some audio cards won’t support it
// !WARNING! - Setting the primary buffer format and then using this
// for DirectMusic messes up DirectMusic!
//
// If you want your primary buffer format to be 22KHz stereo, 16-bit
// call with these parameters: SetPrimaryBufferFormat(2, 22050, 16);
Trang 27if( FAILED( hr = m_pDS->CreateSoundBuffer( &dsbd, &pDSBPrimary, NULL ) ) )
return DXUT_ERR( L ”CreateSoundBuffer”, hr );
WAVEFORMATEX wfx;
ZeroMemory( &wfx, sizeof(WAVEFORMATEX) );
wfx.wFormatTag = (WORD) WAVE_FORMAT_PCM;
wfx.nChannels = (WORD) dwPrimaryChannels;
wfx.nSamplesPerSec = (DWORD) dwPrimaryFreq;
wfx.wBitsPerSample = (WORD) dwPrimaryBitRate;
wfx.nBlockAlign = (WORD) (wfx.wBitsPerSample / 8 * wfx.nChannels);
wfx.nAvgBytesPerSec = (DWORD) (wfx.nSamplesPerSec * wfx.nBlockAlign);
if( FAILED( hr = pDSBPrimary->SetFormat(&wfx) ) )
return DXUT_ERR( L ”SetFormat”, hr );
SAFE_RELEASE( pDSBPrimary );
return S_OK;
}
You have to love DirectSound This method essentially makes two method calls, and
the rest of the code simply fills in parameters The first call is to
CreateSound-Buffer(), which actually returns a pointer to the primary sound buffer where all
your sound effects are mixed into a single sound stream that is rendered by the
sound card The second call to SetFormat() tells the sound driver to change the
primary buffer’s format to one that you specify
The shutdown method, by contrast, is extremely simple:
The base class’sVShutdown() is called to stop and release all the sounds still active
The SAFE_RELEASE on m_pDS will release the IDirectSound8 object and shut
down the sound system completely
The last two methods of the DirectSoundAudio class allocate and release audio
buffers An audio buffer is the C++ representation of an active sound effect In our
Game Sound System Architecture 417
Trang 28platform-agnostic design, an audio buffer is created from a sound resource, ably something loaded from a file or more likely a resource file.
presum-IAudioBuffer *DirectSoundAudio::VInitAudioBuffer(shared_ptr<ResHandle> resHandle) {
// If it ’s a midi file, then do nothing at this time
// maybe we will support this in the future
GCC_ERROR( “MP3s and MIDI are not supported”);
// Create the direct sound buffer, and only request the flags needed
// since each requires some overhead and limits if the buffer can
if( FAILED( hr = m_pDS->CreateSoundBuffer( &dsbd, &sampleHandle, NULL ) ) )
418 Chapter 13 n Game Audio
Trang 29Notice theswitchstatement at the beginning of this code? It branches on the sound
type, which signifies what kind of sound resource is about to play: WAV, MP3, OGG,
or MIDI In our simple example, we’re only looking at WAV data or OGG data that
has been decompressed, so if you want to extend this system to play other kinds of
sound formats, you’ll hook that new code in right there For now, those other
for-mats are short circuited and will force a failure
The call toIDirectSound8::CreateSoundBuffer() is preceded by setting
vari-ous values of a DSBUFFERDESC structure that informs DirectSound what kind of
sound is being created Take special note of the flags, since that member controls
what can happen to the sound An example is the DSBCAPS_CTRLVOLUME flag,
which tells DirectSound that we want to be able to control the volume of this sound
effect Other examples include DSBCAPS_CTRL3D, which enables 3D sound, or
DSBCAPS_CTRLPAN, which enables panning control Take a look at the DirectSound
docs to learn more about this important structure
After we’re sure we’re talking about a sound data format we support, there are two
things to do First, the sound data is passed onto DirectSound’s
CreateSoundBuf-fer() method, which creates an IDirectSoundBuffer8 object Next, the
Direct-Sound sound buffer is handed to our C++ wrapper class, DirectSound
AudioBuffer, and inserted into the master list of sound effects managed byAudio
Releasing an audio buffer is pretty trivial:
void DirectSoundAudio::VReleaseAudioBuffer(IAudioBuffer *sampleHandle)
{
sampleHandle->VStop();
m_AllSamples.remove(sampleHandle);
}
The call to IAudioBuffer::VStop() stops the sound effect, and it is then
removed from the list of active sounds
Game Sound System Architecture 419
Trang 30The second piece of this platform-dependent puzzle is the implementation of the
DirectSoundAudioBuffer, which picks up and defines the remaining mented virtual functions from the IAudioBuffer interface
class DirectSoundAudioBuffer : public AudioBuffer
{
public:
   DirectSoundAudioBuffer(LPDIRECTSOUNDBUFFER sample, shared_ptr<ResHandle> resource);
   virtual void *VGet();
   virtual bool VRestore();
   virtual bool VPlay(int volume, bool looping);
   virtual bool VPause();
   virtual bool VStop();
   virtual bool VResume();
   virtual bool VTogglePause();
   virtual bool VIsPlaying();
   virtual void VSetVolume(int volume);
   virtual float VGetProgress() const;

protected:
   LPDIRECTSOUNDBUFFER m_Sample;   // the DirectSound buffer this class wraps
};
bool DirectSoundAudioBuffer::VPlay(int volume, bool looping)
{
   if (!g_Audio->VActive())
      return false;

   VStop();

   m_Volume = volume;
   m_isLooping = looping;

   LPDIRECTSOUNDBUFFER pDSB = (LPDIRECTSOUNDBUFFER)VGet();
   if (!pDSB)
      return false;

   VSetVolume(volume);

   DWORD dwFlags = looping ? DSBPLAY_LOOPING : 0L;
   return (S_OK==pDSB->Play( 0, 0, dwFlags ) );
}
void DirectSoundAudioBuffer::VSetVolume(int volume)
{
   if (!g_Audio->VActive())
      return;

   LPDIRECTSOUNDBUFFER pDSB = (LPDIRECTSOUNDBUFFER)VGet();
   GCC_ASSERT(pDSB);

   // Convert 0-100 into DirectSound's attenuation range.
   // Don't forget to use a logarithmic scale!
   const int gccDSBVolumeMin = DSBVOLUME_MIN;
   float coeff = (float)volume / 100.0f;
   float logarithmicProportion = coeff > 0.1f ? 1+log10(coeff) : 0;
   float range = (DSBVOLUME_MAX - gccDSBVolumeMin);
   float fvolume = ( range * logarithmicProportion ) + gccDSBVolumeMin;

   GCC_ASSERT(fvolume >= gccDSBVolumeMin && fvolume <= DSBVOLUME_MAX);
   HRESULT hr = pDSB->SetVolume( LONG(fvolume) );
   GCC_ASSERT(hr == S_OK);
}
Most of the previous code has a similar structure and is a lightweight wrapper around IDirectSoundBuffer8. The first few lines check to see if the audio system is running, the audio buffer has been initialized, and parameters have reasonable values. Take note of the VSetVolume method; it has to renormalize the volume value from 0–100 to a range compatible with DirectSound, and it does so with a logarithmic scale, since sound intensity is logarithmic in nature.
on them The first, VRestore(), is called to restore sound buffers if they are ever lost
If that happens, you have to restore it with some DirectSound calls and then fill it with
sound data again—it doesn’t get restored with its data intact TheVRestore()method
calls RestoreBuffer() to restore the sound buffer, and if that is successful, it calls
FillBufferWithSound() to put the sound data back where it belongs
bool DirectSoundAudioBuffer::VRestore()
{
HRESULT hr;
BOOL bRestored;
// Restore the buffer if it was lost
if( FAILED( hr = RestoreBuffer( &bRestored ) ) )
return NULL;
if( bRestored )
{
// The buffer was restored, so we need to fill it with new data
if( FAILED( hr = FillBufferWithSound( ) ) )
return NULL;
}
return true;
}
This implementation of RestoreBuffer() is pretty much lifted from the
Direct-Sound samples Hey, at least I admit to it! If you’re paying attention, you’ll notice
an unfortunate bug in the code—see if you can find it:
HRESULT DirectSoundAudioBuffer::RestoreBuffer( BOOL* pbWasRestored )
{
   HRESULT hr;

   if( ! m_Sample )
      return CO_E_NOTINITIALIZED;
   if( pbWasRestored )
      *pbWasRestored = FALSE;

   DWORD dwStatus;
   if( FAILED( hr = m_Sample->GetStatus( &dwStatus ) ) )
      return DXUT_ERR( L"GetStatus", hr );

   if( dwStatus & DSBSTATUS_BUFFERLOST )
   {
      // Since the app could have just been activated, then
      // DirectSound may not be giving us control yet, so
      // restoring the buffer may fail.
      // If it does, sleep until DirectSound gives us control but fail if
      // it goes on for more than 1 second.
      do
      {
         hr = m_Sample->Restore();
         if( hr == DSERR_BUFFERLOST )
            Sleep( 10 );
      }
      while( ( hr = m_Sample->Restore() ) == DSERR_BUFFERLOST );

      if( pbWasRestored != NULL )
         *pbWasRestored = TRUE;

      return S_OK;
   }
   else
   {
      return S_FALSE;
   }
}
Whether and how you fix it is up to you. The lesson here is that just because you grab something directly from a DirectX sample doesn't mean you should install it into your game unmodified!
The next method is FillBufferWithSound(). Its job is to copy the sound data from a sound resource into a prepared and locked sound buffer. There's also a bit of code to handle the special case where the sound resource has no data—in that case, the sound buffer gets filled with silence. Notice that "silence" isn't necessarily a buffer with all zeros.
HRESULT DirectSoundAudioBuffer::FillBufferWithSound( void )
{
   HRESULT hr;
   VOID *pDSLockedBuffer = NULL;     // DirectSound buffer pointer
   DWORD dwDSLockedBufferSize = 0;   // Size of DirectSound buffer
   DWORD dwWavDataRead = 0;          // Data to read from the wav file

   if( ! m_Sample )
      return CO_E_NOTINITIALIZED;
   // Make sure we have focus, and we didn't just switch in from
   // an app which had a DirectSound device
   if( FAILED( hr = RestoreBuffer( NULL ) ) )
      return DXUT_ERR( L"RestoreBuffer", hr );
int pcmBufferSize = m_Resource->Size();
shared_ptr<SoundResourceExtraData> extra =
static_pointer_cast<SoundResourceExtraData>(m_Resource->GetExtra());
// Lock the buffer down
if( FAILED( hr = m_Sample->Lock( 0, pcmBufferSize,
&pDSLockedBuffer, &dwDSLockedBufferSize, NULL, NULL, 0L ) ) )
      return DXUT_ERR( L"Lock", hr );
   if( pcmBufferSize == 0 )
   {
      // Wav is blank, so just fill with silence
      FillMemory( (BYTE*) pDSLockedBuffer,
                  dwDSLockedBufferSize,
                  (BYTE)(extra->GetFormat()->wBitsPerSample == 8 ? 128 : 0 ) );
   }
   else
   {
      // Copy the PCM data into the buffer
      CopyMemory( pDSLockedBuffer, m_Resource->Buffer(), pcmBufferSize );

      if( pcmBufferSize < (int)dwDSLockedBufferSize )
      {
         // If the buffer sizes are different fill in the rest with silence
         FillMemory( (BYTE*) pDSLockedBuffer + pcmBufferSize,
                     dwDSLockedBufferSize - pcmBufferSize,
                     (BYTE)(extra->GetFormat()->wBitsPerSample == 8 ? 128 : 0 ) );
      }
   }

   // Unlock the buffer, we don't need it anymore.
   m_Sample->Unlock( pDSLockedBuffer, dwDSLockedBufferSize, NULL, 0 );
return S_OK;
}
There's also some special-case code that handles the case where the DirectSound buffer is longer than the sound data—any space left over is filled with silence.
There's one last method to implement in the IAudioBuffer interface, the VGetProgress() method:
float DirectSoundAudioBuffer::VGetProgress() const
{
   DWORD progress = 0;
   m_Sample->GetCurrentPosition( &progress, NULL );

   float length = (float)m_Resource->Size();
   return (float)progress / length;
}
This useful little routine calculates the current progress of a sound buffer as it is being played. Sound plays at a constant rate, so things like music and speech will sound exactly as they were recorded. It's up to you, the skilled programmer, to get your game to display everything exactly in sync with the sound. You do this by polling the sound effect's progress when your game is about to start or change an animation.
Perhaps you have an animation of a window cracking and then shattering. You'd launch the sound effect and animation simultaneously, call VGetProgress() on your sound effect every frame, and set your animation progress accordingly. This is especially important because players can detect even tiny miscues between sound effects and animation.
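A sketch of that kind of synchronization is shown below. This is hypothetical game-side code, not part of the sound system; SetAnimationTime() stands in for whatever your animation system exposes:

// Called every frame while the shatter animation and its sound are active.
void UpdateShatterAnimation(IAudioBuffer *shatterSound, float animLengthSeconds)
{
   float progress = shatterSound->VGetProgress();      // 0.0 to 1.0
   SetAnimationTime(progress * animLengthSeconds);     // hypothetical animation call
}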
The SoundProcess class ties individual sounds into the Process system from Chapter 7:
class SoundProcess : public Process
{
public:
   virtual void VOnUpdate(const int deltaMilliseconds);
virtual void VOnInitialize();
virtual void VKill();
virtual void VTogglePause();
   void Play(const int volume, const bool looping);

protected:
   shared_ptr<ResHandle> m_handle;   // the sound resource this process plays
   int m_Volume;                     // volume to play at (0-100)
   bool m_isLooping;                 // should the sound loop?
};
This class provides a single object that manages individual sounds. Many of the methods are re-implementations of some IAudioBuffer methods, and while this isn't the best C++ design, it can make things a little easier in your code.
As you might expect, the parameters to initialize this object are a ResHandle and the initial sound settings. One parameter needs a little explanation: typeOfSound. Every process has a type, and sound processes use this to distinguish themselves into sound categories such as sound effects, music, ambient background effects, or speech. This creates an easy way for a game to turn off or change the volume level of a particular type of sound, which most gamers will expect. If players want to turn down the music level so they can hear speech better, it's a good idea to let them.
void SoundProcess::VOnInitialize()
{
if ( m_handle == NULL || m_handle->GetExtra() == NULL)
return;
//This sound will manage its own handle in the other thread
IAudioBuffer *buffer = g_Audio->VInitAudioBuffer(m_handle);
   Play(m_Volume, m_isLooping);
}
The VOnUpdate method monitors the sound effect as it's being played. Once it is finished, it kills the process and releases the audio buffer. If the sound is looping, it will play until some external call kills the process. Again, you don't have to do it this way in your game. Perhaps you'd rather have the process hang out until you kill it.
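The VOnUpdate implementation isn't in the surviving listing; a minimal sketch consistent with that description, assuming the process stores its IAudioBuffer in a member named m_AudioBuffer, might look like this:

void SoundProcess::VOnUpdate(const int deltaMilliseconds)
{
   // A looping sound plays until something else kills this process; a
   // one-shot sound kills its own process once the buffer stops playing.
   if (!m_isLooping && m_AudioBuffer && !m_AudioBuffer->VIsPlaying())
   {
      VKill();
   }
}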
This class overloads the VKill() method to coordinate with the audio system. If the process is going to die, so should the sound effect.
Notice that the base class's VKill() is called at the end of the method, rather than the beginning. You can look at VKill() as similar to a destructor, which means this calling order is a safer way to organize the code.
As advertised, the remaining methods do nothing more than pass calls into the IAudioBuffer object.