Programming Windows, part 10


as recording voice rather than music. Halving the sampling rate to 22.05 kHz reduces the upper range of reproducible sound by one octave to 10 kHz. Halving it again to 11.025 kHz gives us a frequency range to 5 kHz. Sampling rates of 44.1 kHz, 22.05 kHz, and 11.025 kHz, as well as 8 kHz, are the standards commonly supported by waveform audio devices.

You might think that a sampling rate of 11.025 kHz is adequate for recording a piano because the highest frequency of a piano is 4186 Hz. However, 4186 Hz is the highest fundamental of a piano. Cutting off all sine waves above 5000 Hz reduces the overtones that can be reproduced and will not accurately capture and reproduce the piano sound.

The Sample Size

The second parameter in pulse code modulation is the sample size, measured in bits. The sample size determines the difference between the softest sound and the loudest sound that can be recorded and played back. This is known as the dynamic range.

Sound intensity is the square of the waveform amplitude (that is, the composite of the maximum amplitudes that each sine wave reaches over the course of one cycle). As is the case with frequency, human perception of sound intensity is logarithmic.

The difference in intensity between two sounds is measured in bels (named after Alexander Graham Bell, the inventor of the telephone) and decibels (dB). A bel is a tenfold increase in sound intensity. One dB is one tenth of a bel in equal multiplicative steps. Hence, one dB is an increase in sound intensity by a factor of 1.26 (that is, the 10th root of 10), or an increase in waveform amplitude by a factor of 1.12 (the 20th root of 10). A decibel is about the smallest increase in sound intensity that the ear can perceive. The difference in intensity between sounds at the threshold of hearing and sounds at the threshold of pain is about 100 dB.

You can calculate the dynamic range in decibels between two sounds with the following formula:

$$\mathrm{dB} = 20 \log_{10} \frac{A_1}{A_2}$$

where A1 and A2 are the amplitudes of the two sounds. With a sample size of 1 bit, the dynamic range is zero, because only one amplitude is possible.

With a sample size of 8 bits, the ratio of the largest amplitude to the smallest amplitude is 256. Thus, the dynamic range is

$$20 \log_{10} 256 \approx 48.16$$

or 48 decibels. A 48-dB dynamic range is about the difference between a quiet room and a power lawn mower. Doubling the sample size to 16 bits yields a dynamic range of

$$20 \log_{10} 65536 \approx 96.33$$

or 96 decibels. This is very nearly the difference between the threshold of hearing and the threshold of pain and is considered just about ideal for the reproduction of music.

Both 8-bit and 16-bit sample sizes are supported under Windows. When storing 8-bit samples, the samples are treated as unsigned bytes. Silence would be stored as a string of 0x80 values. The 16-bit samples are treated as signed integers, so silence would be stored as a string of zeros.
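The difference between the two representations is easiest to see in a conversion. The following sketch maps an unsigned 8-bit sample to a signed 16-bit sample; the function name pcm8_to_pcm16 is an assumption made for illustration, not something the SINEWAVE or RECORD programs contain.

```c
#include <stdint.h>

/* Hypothetical helper (not a Windows API): convert one unsigned 8-bit
   sample, where 0x80 means silence, to a signed 16-bit sample, where
   0 means silence. */
int16_t pcm8_to_pcm16 (uint8_t sample)
{
     /* Remove the 0x80 bias, then scale by 256 to span the 16-bit range. */
     return (int16_t) (((int) sample - 128) * 256) ;
}
```

An 8-bit silence value of 0x80 maps to 0, the lowest 8-bit value 0x00 maps to -32768, and the highest value 0xFF maps to 32512.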

To calculate the storage space required for uncompressed audio, multiply the duration of the sound in seconds by the sampling rate. Double that if you're using 16-bit samples rather than 8-bit samples. Double that again if you're recording in stereo. For example, an hour of CD-quality sound (or 3600 seconds at 44,100 samples per second with 2 bytes per sample in stereo) requires 635 megabytes, not coincidentally very close to the storage capacity of a CD-ROM.
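The storage arithmetic can be sketched in a few lines of C. The function name pcm_storage_bytes and its parameter names are hypothetical; the calculation is simply duration times sampling rate times bytes per sample times channels.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed for uncompressed PCM audio.
   bits_per_sample is assumed to be a multiple of 8 (8 or 16 here). */
uint64_t pcm_storage_bytes (uint32_t seconds, uint32_t sample_rate,
                            uint32_t bits_per_sample, uint32_t channels)
{
     assert (bits_per_sample % 8 == 0) ;

     return (uint64_t) seconds * sample_rate
                 * (bits_per_sample / 8) * channels ;
}
```

For one hour of CD-quality sound, pcm_storage_bytes (3600, 44100, 16, 2) yields 635,040,000 bytes, which matches the 635 megabytes quoted above.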

Generating Sine Waves in Software

For our first exercise in waveform audio, we're not going to save sounds to files or play back recorded sounds. We're going to use the low-level waveform audio APIs (that is, the functions beginning with the prefix waveOut) to create an audio sine wave generator called SINEWAVE. This program generates sine waves from 20 Hz (the bottom of human perception) to 5,000 Hz (two octaves short of the top of human perception) in 1 Hz increments.

As you know, the standard C run-time library includes a function called sin that returns the sine of an angle given in radians. (2π radians equals 360 degrees.) The sin function returns a value ranging from -1 to 1. (We used this function in another program called SINEWAVE way back in Chapter 5.) Thus, it should be easy to use the sin function to generate sine wave data to output to the waveform audio hardware. Basically, you fill a buffer with data representing the waveform (in this case, a sine wave) and pass it to the API. (It's a little more complicated than that, but I'll get to the details shortly.) When the waveform audio hardware finishes playing the buffer, you pass it a second buffer, and so forth.

When first considering this problem (and not knowing anything about PCM), you might think it reasonable to divide one cycle of the sine wave into a fixed number of samples, for example, 360. For a 20-Hz sine wave, you output 7200 samples every second. For a 200-Hz sine wave, you output 72,000 samples per second. That might work, but it's not the way to do it. For a 5000-Hz sine wave, you'd need to output 1,800,000 samples per second, which would surely tax the DAC! Moreover, for the higher frequencies, this is much more precision than is needed.

With pulse code modulation, the sample rate is a constant. Let's assume the sample rate is 11,025 Hz because that's what I use in the SINEWAVE program. If you wish to generate a sine wave of 2,756.25 Hz (exactly one-quarter the sample rate), each cycle of the sine wave is just 4 samples. For a sine wave of 25 Hz, each cycle requires 441 samples. In general, the number of samples per cycle is the sample rate divided by the desired sine wave frequency. Once you know the number of samples per cycle, you can divide 2π radians by that number and use the sin function to get the samples for one cycle. Then just repeat the samples for one cycle over and over again to create a continuous waveform.

The problem is that the number of samples per cycle may well be fractional, so this approach won't work well either. You'd get a discontinuity at the end of each cycle.
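To make the fractional-cycle problem concrete, this small sketch computes the number of samples per cycle as the sample rate divided by the frequency. The function name samples_per_cycle is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: samples in one cycle of a sine wave at the
   given frequency, for a fixed PCM sample rate. The result is a
   double because it is usually not a whole number. */
double samples_per_cycle (double sample_rate, double freq)
{
     assert (freq > 0.0) ;

     return sample_rate / freq ;
}
```

At 11,025 Hz, a 2,756.25-Hz tone gives exactly 4 samples per cycle and a 25-Hz tone gives 441, but a 1000-Hz tone gives 11.025, so a table holding a single cycle cannot simply be repeated.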

The key to making this work correctly is to maintain a static "phase angle" variable. This angle is initialized at 0. The first sample is the sine of 0 degrees. The phase angle is then incremented by 2π times the frequency, divided by the sample rate. Use this phase angle for the second sample, and continue in this way. Whenever the phase angle gets above 2π radians, subtract 2π radians from it. But don't ever reinitialize it to 0.

For example, suppose you want to generate a sine wave of 1000 Hz with a sample rate of 11,025 Hz. That's about 11 samples per cycle. The phase angles (and here I'll give them in degrees to make this a little more comprehensible) for approximately the first cycle and a half are 0, 32.65, 65.31, 97.96, 130.61, 163.27, 195.92, 228.57, 261.22, 293.88, 326.53, 359.18, 31.84, 64.49, 97.14, 129.80, 162.45, 195.10, and so forth. The waveform data you put in the buffer are the sines of these angles, scaled to the number of bits per sample. When creating the data for a subsequent buffer, you keep incrementing the last phase angle value without reinitializing it to zero.
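The phase-angle bookkeeping can be sketched as a single step function. The name advance_phase and the TWO_PI constant are assumptions introduced for this illustration; the function only advances and wraps the angle and leaves the call to sin to the caller.

```c
#define TWO_PI 6.283185307179586

/* Hypothetical helper: advance a phase angle (in radians) by one
   sample period and wrap it back into the range 0 to 2*pi. The angle
   is never reset to zero, so the waveform stays continuous from one
   cycle (and one buffer) to the next. */
double advance_phase (double phase, double freq, double sample_rate)
{
     phase += TWO_PI * freq / sample_rate ;

     if (phase >= TWO_PI)
          phase -= TWO_PI ;

     return phase ;
}
```

Starting from 0 with a 1000-Hz tone at 11,025 Hz, eleven calls bring the angle to about 359.18 degrees, and the twelfth wraps it to about 31.84 degrees, in agreement with the sequence listed above.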

A function called FillBuffer that does this, along with the rest of the SINEWAVE program, is shown in Figure 22-2.

Figure 22-2. The SINEWAVE program


BOOL CALLBACK DlgProc (HWND, UINT, WPARAM, LPARAM) ;

TCHAR szAppName [] = TEXT ("SineWave") ;

int WINAPI WinMain (HINSTANCE hInstance, HINSTANCE hPrevInstance,

PSTR szCmdLine, int iCmdShow)

pBuffer [i] = (BYTE) (127 + 127 * sin (fAngle)) ;

fAngle += 2 * PI * iFreq / SAMPLE_RATE ;

static BOOL bShutOff, bClosing ;

static HWAVEOUT hWaveOut ;

static HWND hwndScroll ;

static int iFreq = FREQ_INIT ;

static PBYTE pBuffer1, pBuffer2 ;

static PWAVEHDR pWaveHdr1, pWaveHdr2 ;

static WAVEFORMATEX waveformat ;

int iDummy ;


{

case WM_INITDIALOG:

hwndScroll = GetDlgItem (hwnd, IDC_SCROLL) ;

SetScrollRange (hwndScroll, SB_CTL, FREQ_MIN, FREQ_MAX, FALSE) ;

SetScrollPos (hwndScroll, SB_CTL, FREQ_INIT, TRUE) ;

SetDlgItemInt (hwnd, IDC_TEXT, FREQ_INIT, FALSE) ;

case SB_LINELEFT: iFreq -= 1 ; break ;

case SB_LINERIGHT: iFreq += 1 ; break ;

case SB_PAGELEFT: iFreq /= 2 ; break ;

case SB_PAGERIGHT: iFreq *= 2 ; break ;

iFreq = max (FREQ_MIN, min (FREQ_MAX, iFreq)) ;

SetScrollPos (hwndScroll, SB_CTL, iFreq, TRUE) ;

SetDlgItemInt (hwnd, IDC_TEXT, iFreq, FALSE) ;

// Allocate memory for 2 headers and 2 buffers

pWaveHdr1 = malloc (sizeof (WAVEHDR)) ;
pWaveHdr2 = malloc (sizeof (WAVEHDR)) ;
pBuffer1 = malloc (OUT_BUFFER_SIZE) ;
pBuffer2 = malloc (OUT_BUFFER_SIZE) ;

if (!pWaveHdr1 || !pWaveHdr2 || !pBuffer1 || !pBuffer2)
{
     // Free only the blocks that were actually allocated

     if (pWaveHdr1) free (pWaveHdr1) ;
     if (pWaveHdr2) free (pWaveHdr2) ;
     if (pBuffer1) free (pBuffer1) ;
     if (pBuffer2) free (pBuffer2) ;

     MessageBeep (MB_ICONEXCLAMATION) ;
     MessageBox (hwnd, TEXT ("Error allocating memory!"), szAppName,
                 MB_ICONEXCLAMATION | MB_OK) ;
     return TRUE ;
}


case MM_WOM_OPEN:

SetDlgItemText (hwnd, IDC_ONOFF, TEXT ("Turn Off")) ;

// Send two buffers to waveform output device

FillBuffer (pBuffer1, iFreq) ;

waveOutWrite (hWaveOut, pWaveHdr1, sizeof (WAVEHDR)) ;

FillBuffer (pBuffer2, iFreq) ;

waveOutWrite (hWaveOut, pWaveHdr2, sizeof (WAVEHDR)) ;

// Fill and send out a new buffer

FillBuffer (((PWAVEHDR) lParam)->lpData, iFreq) ;

waveOutWrite (hWaveOut, (PWAVEHDR) lParam, sizeof (WAVEHDR)) ;

return TRUE ;

case MM_WOM_CLOSE:

waveOutUnprepareHeader (hWaveOut, pWaveHdr1, sizeof (WAVEHDR)) ;

waveOutUnprepareHeader (hWaveOut, pWaveHdr2, sizeof (WAVEHDR)) ;

free (pWaveHdr1) ;


SINEWAVE.RC (excerpts)

//Microsoft Developer Studio generated resource script

#include "resource.h"

#include "afxres.h"

/////////////////////////////////////////////////////////////////////////////

// Dialog

SINEWAVE DIALOG DISCARDABLE 100, 100, 200, 50

STYLE WS_MINIMIZEBOX | WS_VISIBLE | WS_CAPTION | WS_SYSMENU

CAPTION "Sine Wave Generator"

FONT 8, "MS Sans Serif"

BEGIN

SCROLLBAR IDC_SCROLL,8,8,150,12

RTEXT "440",IDC_TEXT,160,10,20,8

LTEXT "Hz",IDC_STATIC,182,10,12,8

PUSHBUTTON "Turn On",IDC_ONOFF,80,28,40,14

END

RESOURCE.H (excerpts)

// Microsoft Developer Studio generated include file

// Used by SineWave.rc

#define IDC_STATIC -1

#define IDC_SCROLL 1000

#define IDC_TEXT 1001

#define IDC_ONOFF 1002

Note that the OUT_BUFFER_SIZE, SAMPLE_RATE, and PI identifiers used in the FillBuffer routine are defined at the top of the program. The iFreq argument to FillBuffer is the desired frequency in Hz. Notice that the result of the sin function is scaled to range between 0 and 254. For each sample, the fAngle argument to the sin function is increased by 2π radians times the desired frequency divided by the sample rate.

SINEWAVE's window contains three controls: a horizontal scroll bar used for selecting the frequency, a static text field that indicates the currently selected frequency, and a push button labeled "Turn On." When you press the button, you should hear a sine wave from the speakers connected to your sound board, and the button text will change to "Turn Off." You can change the frequency by moving the scroll bar with the keyboard or mouse. To turn off the sound, push the button again.

The SINEWAVE code initializes the scroll bar during the WM_INITDIALOG message so that the minimum frequency is 20 Hz and the maximum frequency is 5000 Hz. Initially, the scroll bar is set to 440 Hz. In musical terms, this is the A above middle C, the note used for tuning an orchestra. DlgProc alters the static variable iFreq on receipt of WM_HSCROLL messages. Notice that Page Left and Page Right cause DlgProc to decrease or increase the frequency by one octave.


When DlgProc receives a WM_COMMAND message from the button, it first allocates four blocks of memory: two for WAVEHDR structures, discussed shortly, and two for buffers, called pBuffer1 and pBuffer2, to hold the waveform data.

The second argument to waveOutOpen is a device ID. This allows the function to be used on machines that have multiple sound boards installed. The argument can range from 0 to one less than the number of waveform output devices installed in the system. You can get the number of waveform output devices by calling waveOutGetNumDevs and find out about each of them by calling waveOutGetDevCaps. If you wish to avoid this device interrogation, you can use the constant WAVE_MAPPER (defined as equaling -1) to select the device the user has indicated as the Preferred Device in the Audio tab of the Multimedia applet of the Control Panel. Or the system could select another device if the preferred device can't handle what you need to do and another device can.

The third argument is a pointer to a WAVEFORMATEX structure. (More about this shortly.) The fourth argument is either a window handle or a pointer to a callback function in a dynamic-link library. This argument indicates the window or callback function that receives the waveform output messages. If you use a callback function, you can specify program-defined data in the fifth argument. The dwFlags argument can be set to either CALLBACK_WINDOW or CALLBACK_FUNCTION to indicate what the fourth argument is. You can also use the flag WAVE_FORMAT_QUERY to check whether the device can be opened without actually opening it. A few other flags are available.

The third argument to waveOutOpen is defined as a pointer to a structure of type WAVEFORMATEX, defined in MMSYSTEM.H as shown below:

typedef struct waveformat_tag
{
     WORD  wFormatTag ;       // waveform format = WAVE_FORMAT_PCM
     WORD  nChannels ;        // number of channels = 1 or 2
     DWORD nSamplesPerSec ;   // sample rate
     DWORD nAvgBytesPerSec ;  // bytes per second
     WORD  nBlockAlign ;      // block alignment
     WORD  wBitsPerSample ;   // bits per sample = 8 or 16
     WORD  cbSize ;           // 0 for PCM
}
WAVEFORMATEX, * PWAVEFORMATEX ;

This is the structure you use to specify the sample rate (nSamplesPerSec), the sample size (wBitsPerSample), and whether you want monophonic or stereophonic sound (nChannels). Some of the information in this structure may seem redundant, but the structure is designed for sampling methods other than PCM, in which case the last field is set to a nonzero value and other information follows.

For PCM, set the nBlockAlign field to the product of nChannels and wBitsPerSample, divided by 8. This is the total number of bytes per sample. Set the nAvgBytesPerSec field to the product of nSamplesPerSec and nBlockAlign. SINEWAVE initializes the fields of the WAVEFORMATEX structure and calls waveOutOpen like this:

waveOutOpen (&hWaveOut, WAVE_MAPPER, &waveformat,

(DWORD) hwnd, 0, CALLBACK_WINDOW)

The waveOutOpen function returns MMSYSERR_NOERROR (defined as 0) if the function is successful and a nonzero error code otherwise. If waveOutOpen returns nonzero, SINEWAVE cleans up and displays a message box indicating an error.
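The PCM rules for nBlockAlign and nAvgBytesPerSec described earlier can be sketched with a portable stand-in structure. PCMFORMAT and set_pcm_format are hypothetical names introduced here so that the example compiles without the Windows headers; only the field names are borrowed from WAVEFORMATEX.

```c
#include <stdint.h>

/* Hypothetical stand-in for the PCM-related fields of WAVEFORMATEX.
   This is not the real Windows structure. */
typedef struct
{
     uint16_t nChannels ;
     uint32_t nSamplesPerSec ;
     uint32_t nAvgBytesPerSec ;
     uint16_t nBlockAlign ;
     uint16_t wBitsPerSample ;
}
PCMFORMAT ;

/* Fill in the derived fields the way the text describes for PCM. */
void set_pcm_format (PCMFORMAT * fmt, uint16_t channels,
                     uint32_t rate, uint16_t bits)
{
     fmt->nChannels       = channels ;
     fmt->nSamplesPerSec  = rate ;
     fmt->wBitsPerSample  = bits ;

     /* bytes per sample across all channels */
     fmt->nBlockAlign     = (uint16_t) (channels * bits / 8) ;

     /* bytes per second */
     fmt->nAvgBytesPerSec = rate * fmt->nBlockAlign ;
}
```

For the mono, 8-bit, 11,025-Hz format that SINEWAVE uses, this gives nBlockAlign equal to 1 and nAvgBytesPerSec equal to 11025. For CD-quality stereo, the values are 4 and 176,400.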

Now that the device is open, SINEWAVE continues by initializing the fields of the two WAVEHDR structures, which are used to pass buffers through the API. WAVEHDR is defined like so:

typedef struct wavehdr_tag
{
     LPSTR lpData ;                    // pointer to data buffer
     DWORD dwBufferLength ;            // length of data buffer
     DWORD dwBytesRecorded ;           // used for recorded
     DWORD dwUser ;                    // for program use
     DWORD dwFlags ;                   // flags
     DWORD dwLoops ;                   // number of repetitions
     struct wavehdr_tag FAR * lpNext ; // reserved
     DWORD reserved ;                  // reserved
}
WAVEHDR, * PWAVEHDR ;

SINEWAVE sets the lpData field to the address of the buffer that will contain the data, dwBufferLength to the size of this buffer, and dwLoops to 1. All other fields can be set to 0 or NULL. If you want to play a repeated loop of sound, you can specify that with the dwFlags and dwLoops fields.

Next, SINEWAVE calls waveOutPrepareHeader for the two headers. Calling this function prevents the structure and buffer from being swapped to disk.

So far, all of this preparation has been in response to the button click to turn on the sound. But a message is waiting in the program's message queue. Because we specified in waveOutOpen that we wish to use a window procedure for receiving waveform output messages, the waveOutOpen function posted an MM_WOM_OPEN message to the program's message queue. The wParam message parameter is set to the waveform output handle. To process the MM_WOM_OPEN message, SINEWAVE calls FillBuffer twice to fill pBuffer1 and pBuffer2 with sine wave data. SINEWAVE then passes the two WAVEHDR structures to waveOutWrite. This is the function that actually starts the sound playing by passing the data to the waveform output hardware.

When the waveform hardware is finished playing the data passed to it in the waveOutWrite function, the window is posted an MM_WOM_DONE message. The wParam parameter is the waveform output handle, and lParam is a pointer to the WAVEHDR structure. SINEWAVE processes this message by calculating new values for the buffer and resubmitting the buffer by calling waveOutWrite.

SINEWAVE could have been written using just one WAVEHDR structure and one buffer. However, there would be a slight delay between the time the waveform hardware finished playing the data and the time the program processed the MM_WOM_DONE message to submit a new buffer. The "double-buffering" technique that SINEWAVE uses prevents gaps in the sound.

When the user clicks the "Turn Off" button to turn off the sound, DlgProc receives another WM_COMMAND message. For this message, DlgProc sets the bShutOff variable to TRUE and calls waveOutReset. The waveOutReset function stops sound processing and generates an MM_WOM_DONE message. When bShutOff is TRUE, SINEWAVE processes MM_WOM_DONE by calling waveOutClose. This in turn generates an MM_WOM_CLOSE message. Processing of MM_WOM_CLOSE mostly involves cleaning up. SINEWAVE calls waveOutUnprepareHeader for the two WAVEHDR structures, frees all the memory blocks, and sets the text of the button back to "Turn On."

If the waveform hardware is still playing a buffer, calling waveOutClose by itself will have no effect. You must call waveOutReset first to halt the playing and to generate an MM_WOM_DONE message. DlgProc also processes the WM_SYSCOMMAND message when wParam is SC_CLOSE. This results from the user selecting "Close" from the system menu. If waveform audio is still playing, DlgProc calls waveOutReset. Regardless, EndDialog is eventually called to close the dialog box and end the program.

A Digital Sound Recorder

Windows includes a program called Sound Recorder that lets you digitally record and play back sounds. The program shown in Figure 22-3 (RECORD1) is not quite as sophisticated as Sound Recorder because it doesn't do any file I/O or allow sound editing. However, it does show the basics of using the low-level waveform audio API for both recording and playing back sounds.

Figure 22-3. The RECORD1 program


TCHAR szAppName [] = TEXT ("Record1") ;

static DWORD dwDataLength, dwRepetitions = 1 ;

static HWAVEIN hWaveIn ;

static HWAVEOUT hWaveOut ;

static PBYTE pBuffer1, pBuffer2, pSaveBuffer, pNewBuffer ;

static PWAVEHDR pWaveHdr1, pWaveHdr2 ;

static TCHAR szOpenError[] = TEXT ("Error opening waveform audio!");

static TCHAR szMemError [] = TEXT ("Error allocating memory!") ;

static WAVEFORMATEX waveform ;


// Allocate buffer memory

pBuffer1 = malloc (INP_BUFFER_SIZE) ;

pBuffer2 = malloc (INP_BUFFER_SIZE) ;

if (!pBuffer1 || !pBuffer2)

{

if (pBuffer1) free (pBuffer1) ;

if (pBuffer2) free (pBuffer2) ;


// Open waveform audio for fast output

EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), FALSE) ;

EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), TRUE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG), FALSE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END), FALSE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REV), FALSE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REP), FALSE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_SPEED), FALSE) ;

SetFocus (GetDlgItem (hwnd, IDC_RECORD_END)) ;

// Add the buffers

waveInAddBuffer (hWaveIn, pWaveHdr1, sizeof (WAVEHDR)) ;

waveInAddBuffer (hWaveIn, pWaveHdr2, sizeof (WAVEHDR)) ;


// Free the buffer memory

waveInUnprepareHeader (hWaveIn, pWaveHdr1, sizeof (WAVEHDR)) ;

waveInUnprepareHeader (hWaveIn, pWaveHdr2, sizeof (WAVEHDR)) ;

EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), TRUE) ;

EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE) ;

SetFocus (GetDlgItem (hwnd, IDC_RECORD_BEG)) ;

if (dwDataLength > 0)

{

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG), TRUE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REP), TRUE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REV), TRUE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_SPEED), TRUE) ;

SetFocus (GetDlgItem (hwnd, IDC_PLAY_BEG)) ;

EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), FALSE) ;

EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG), FALSE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), TRUE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END), TRUE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REP), FALSE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REV), FALSE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_SPEED), FALSE) ;

SetFocus (GetDlgItem (hwnd, IDC_PLAY_END)) ;


EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REV), TRUE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_REP), TRUE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_SPEED), TRUE) ;

SetFocus (GetDlgItem (hwnd, IDC_PLAY_BEG)) ;


RECORD.RC (excerpts)

//Microsoft Developer Studio generated resource script

#include "resource.h"

#include "afxres.h"

/////////////////////////////////////////////////////////////////////////////

// Dialog

RECORD DIALOG DISCARDABLE 100, 100, 152, 74

STYLE WS_MINIMIZEBOX | WS_VISIBLE | WS_CAPTION | WS_SYSMENU

CAPTION "Waveform Audio Recorder"

BEGIN

PUSHBUTTON "Record",IDC_RECORD_BEG,28,8,40,14

PUSHBUTTON "End",IDC_RECORD_END,76,8,40,14,WS_DISABLED

PUSHBUTTON "Play",IDC_PLAY_BEG,8,30,40,14,WS_DISABLED

PUSHBUTTON "Pause",IDC_PLAY_PAUSE,56,30,40,14,WS_DISABLED

PUSHBUTTON "End",IDC_PLAY_END,104,30,40,14,WS_DISABLED

PUSHBUTTON "Reverse",IDC_PLAY_REV,8,52,40,14,WS_DISABLED

PUSHBUTTON "Repeat",IDC_PLAY_REP,56,52,40,14,WS_DISABLED

PUSHBUTTON "Speedup",IDC_PLAY_SPEED,104,52,40,14,WS_DISABLED

END

RESOURCE.H (excerpts)

// Microsoft Developer Studio generated include file

// Used by Record.rc

#define IDC_RECORD_BEG 1000

#define IDC_RECORD_END 1001

#define IDC_PLAY_BEG 1002

#define IDC_PLAY_PAUSE 1003

#define IDC_PLAY_END 1004

#define IDC_PLAY_REV 1005

#define IDC_PLAY_REP 1006

#define IDC_PLAY_SPEED 1007

The RECORD.RC and RESOURCE.H files will also be used in the RECORD2 and RECORD3 programs.

The RECORD1 window has eight push buttons. When you first run RECORD1, only the Record button is enabled. When you press Record, you can begin recording. The Record button becomes disabled, and the End button is enabled. Press End to stop recording. At this point, the Play, Reverse, Repeat, and Speedup buttons also become enabled. Pressing any of these buttons plays back the sound: Play plays it normally, Reverse plays it in reverse, Repeat causes the sound to be repeated indefinitely (as with a tape loop), and Speedup plays the sound back twice as fast. You can end playback by pressing the second End button, or you can pause the playback by pressing Pause. When pressed, the Pause button changes into a Resume button to resume playing back the sound. If you record another sound, it replaces the existing sound in memory.

At any time, the only buttons that are enabled are those that perform valid operations. This requires a lot of calls to EnableWindow in the RECORD1 source code, but the program doesn't have to check whether a particular push-button operation is valid. Of course, it also makes the operation of the program more intuitive.

RECORD1 takes a number of shortcuts to simplify the code. First, if multiple waveform audio hardware devices are installed, RECORD1 uses the default one. Second, the program records and plays back at the standard 11.025 kHz sampling rate with an 8-bit sample size regardless of whether a higher sampling rate or sample size is available. The only exception is for the speed-up function, where RECORD1 plays back the sound at the 22.050 kHz sampling rate, thus playing it twice as fast and an octave higher in frequency.

Recording a sound involves opening the waveform audio hardware for input and passing buffers to the API to receive the sound data.

RECORD1 maintains several memory blocks. Three of these blocks are very small, at least initially, and are allocated during the WM_INITDIALOG message in DlgProc. The program allocates two WAVEHDR structures pointed to by pWaveHdr1 and pWaveHdr2. These structures are used to pass buffers to the waveform APIs. The pSaveBuffer pointer points to a buffer for storing the complete recorded sound; this is initially allocated as a 1-byte block. Later on, during recording, the buffer is increased in size to accommodate all the sound data. (If you record for a long period of time, RECORD1 recovers gracefully when it runs out of memory during recording, and lets you play back that portion of the sound successfully stored.) I'll refer to this buffer as the "save buffer" because it is used to save the accumulated sound data. Two more memory blocks, 16K in size and pointed to by pBuffer1 and pBuffer2, are allocated during recording to receive sound data. These buffers are freed when recording is complete.

Each of the eight buttons generates a WM_COMMAND message to DlgProc, the dialog procedure for RECORD1's window. Initially, only the Record button is enabled. Pressing this generates a WM_COMMAND message with wParam equal to IDC_RECORD_BEG. To process this message, RECORD1 allocates the two 16K buffers for receiving sound data, initializes the fields of a WAVEFORMATEX structure and passes it to the waveInOpen function, and sets up the two WAVEHDR structures.

The waveInOpen function generates an MM_WIM_OPEN message. During this message, RECORD1 shrinks the save buffer down to 1 byte in preparation for receiving data. (Of course, the first time you record something, the save buffer is already 1 byte in length, but during subsequent recordings, it could be much larger.) During the MM_WIM_OPEN message, RECORD1 also enables and disables the appropriate push buttons. Next, the program passes the two WAVEHDR structures and buffers to the API using waveInAddBuffer. Some flags are set, and recording begins with a call to waveInStart.

At a sampling rate of 11.025 kHz with an 8-bit sample size, the 16K buffer will be filled in approximately 1.5 seconds. At that time, RECORD1 receives an MM_WIM_DATA message. In response to this message, the program reallocates the save buffer based on the dwDataLength variable and the dwBytesRecorded field of the WAVEHDR structure. If the reallocation fails, RECORD1 calls waveInClose to stop recording.

If the reallocation is successful, RECORD1 copies the data from the 16K buffer into the save buffer. It then calls waveInAddBuffer again. This process continues until RECORD1 runs out of memory for the save buffer or the user presses the End button.
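The grow-and-copy step performed for each filled input buffer can be sketched in portable C. The function append_bytes and its parameters are hypothetical; RECORD1 does this work inline in its message handler, and this sketch only illustrates the reallocation pattern, including leaving the existing data intact when memory runs out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append `count` bytes from `chunk` to a growing
   heap buffer. Returns 1 on success. Returns 0 if the reallocation
   fails, in which case the old buffer and length are left untouched
   so that the data saved so far can still be played back. */
int append_bytes (unsigned char ** save, size_t * length,
                  const unsigned char * chunk, size_t count)
{
     unsigned char * grown ;

     grown = (unsigned char *) realloc (*save, *length + count) ;

     if (grown == NULL)
          return 0 ;

     memcpy (grown + *length, chunk, count) ;

     *save    = grown ;
     *length += count ;

     return 1 ;
}
```

A caller would start with a 1-byte block and a length of 0, much as RECORD1 does with its save buffer, and call append_bytes once for every filled input buffer.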

The End button generates a WM_COMMAND message with wParam equal to IDC_RECORD_END. Processing this message is simple. RECORD1 sets the bEnding flag to TRUE and calls waveInReset. The waveInReset function causes recording to stop and generates an MM_WIM_DATA message containing a partially filled buffer. RECORD1 responds to this final MM_WIM_DATA message normally, except that it closes the waveform input device by calling waveInClose.

The waveInClose function generates an MM_WIM_CLOSE message. RECORD1 responds to this message by freeing the 16K input buffers and enabling and disabling the appropriate push buttons. In particular, if the save buffer contains data, which it almost always will unless the first reallocation fails, then the play buttons are enabled.

After recording a sound, the save buffer contains the total accumulated sound data. When the user selects the Play button, DlgProc receives a WM_COMMAND message with wParam equal to IDC_PLAY_BEG. The program responds by initializing the fields of a WAVEFORMATEX structure and calling waveOutOpen.

The waveOutOpen call again generates an MM_WOM_OPEN message. During this message, RECORD1 enables and disables the appropriate push buttons (allowing only Pause and End), initializes the fields of the WAVEHDR structure with the save buffer, prepares it by calling waveOutPrepareHeader, and begins playing it with a call to waveOutWrite.

Normally, the sound will continue until all the data in the buffer has been played. At that time, an MM_WOM_DONE message is generated. If there are additional buffers to be played, a program can pass them out to the API at that time. RECORD1 plays only one big buffer, so the program simply unprepares the header and calls waveOutClose. The waveOutClose function generates an MM_WOM_CLOSE message. During this message, RECORD1 enables and disables the appropriate buttons, allowing the sound to be played again or a new sound to be recorded.

I've also included a second End button so that the user can stop playing the sound at any time before the save buffer has finished playing. This End button generates a WM_COMMAND message with wParam equal to IDC_PLAY_END, and the program responds by calling waveOutReset. This function generates an MM_WOM_DONE message that is processed normally.

RECORD1's window also includes a Pause button. Processing this button is easy. The first time it's pushed, RECORD1 calls waveOutPause to halt the sound and sets the text in the Pause button to Resume. Pressing the Resume button starts the playback going again by a call to waveOutRestart.

To make the program just a little more interesting, I've also included buttons labeled "Reverse," "Repeat," and "Speedup." These buttons generate WM_COMMAND messages with wParam values equal to IDC_PLAY_REV, IDC_PLAY_REP, and IDC_PLAY_SPEED.

Playing the sound in reverse involves reversing the order of the bytes in the save buffer and playing the sound normally. RECORD1 includes a small function named ReverseMemory to reverse the bytes. It calls this function during the WM_COMMAND message before playing the block and again at the end of the MM_WOM_CLOSE message to restore it to normal.

The Repeat button plays the sound over and over again. This is not complicated because the API includes a provision for repeating a sound. It involves setting the dwLoops field in the WAVEHDR structure to the number of repetitions and setting the dwFlags field to WHDR_BEGINLOOP for the beginning buffer in the loop and to WHDR_ENDLOOP for the end buffer. Because RECORD1 uses only one buffer for playing the sound, these two flags are combined in the dwFlags field.

Playing the sound twice as fast is also quite easy. When initializing the fields of the WAVEFORMATEX structure in preparation for opening waveform audio for output, the nSamplesPerSec and nAvgBytesPerSec fields are set to 22050 rather than 11025.
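The effect of doubling the rate can be checked with a little arithmetic: the same number of bytes played at twice the bytes-per-second figure lasts half as long. The helper pcm_seconds below is hypothetical and only divides the data length by the nAvgBytesPerSec value.

```c
#include <assert.h>

/* Hypothetical helper: seconds needed to play (or record) `bytes`
   bytes of PCM data at the given average bytes-per-second rate. */
double pcm_seconds (unsigned long bytes, unsigned long avg_bytes_per_sec)
{
     assert (avg_bytes_per_sec > 0) ;

     return (double) bytes / (double) avg_bytes_per_sec ;
}
```

For example, 11,025 bytes of 8-bit mono data lasts 1 second at 11025 bytes per second but only half a second at 22050. The same function confirms that a 16K (16,384-byte) input buffer fills in about 1.5 seconds at 11025 bytes per second.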

The MCI Alternative

You may find, as I do, that RECORD1 seems inordinately complex It is particularly tricky to deal with the

interaction between the waveform audio function calls and the messages they generate, and then in the midst of all this, to deal with possible memory shortages as well But maybe that's why it's called the "low-level" interface As I noted earlier in this chapter, Windows also includes the high-level Media Control Interface

Trang 21

For waveform audio, the primary differences between the low-level interface and MCI is that MCI records sound data to a waveform file and plays back the sound by reading the file This makes it difficult to perform the "special effects" that RECORD1 implements because you'd have to read in the file, manipulate it, and write it back out before playing the sound This is a typical versatility vs ease-of-use trade-off The low-level interface gives you flexibility, but MCI (for the most part) is easier

MCI is implemented in two different but related forms. The first form uses messages and data structures to send commands to multimedia devices and receive information from them. The second form uses ASCII text strings. The text-based interface was originally created to allow multimedia devices to be controlled from simple scripting languages. But it also provides very easy interactive control, as was demonstrated in the TESTMCI program shown earlier in this chapter.

The RECORD2 program shown in Figure 22-4 uses the message and data structure form of MCI to implement another digital audio recorder and player. Although it uses the same dialog box template as RECORD1, it does not implement the three special effects buttons.

mciGetErrorString (dwError, szErrorStr,

sizeof (szErrorStr) / sizeof (TCHAR)) ;

static BOOL bRecording, bPlaying, bPaused ;

static TCHAR szFileName[] = TEXT ("record2.wav") ;

static WORD wDeviceID ;

dwError = mciSendCommand (0, MCI_OPEN,

MCI_WAIT | MCI_OPEN_TYPE | MCI_OPEN_ELEMENT, (DWORD) (LPMCI_OPEN_PARMS) &mciOpen) ;

mciSendCommand (wDeviceID, MCI_RECORD, MCI_NOTIFY,

(DWORD) (LPMCI_RECORD_PARMS) &mciRecord) ;

// Enable and disable buttons

EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), FALSE);

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG), FALSE);

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE);

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END), FALSE);

SetFocus (GetDlgItem (hwnd, IDC_RECORD_END)) ;

mciSendCommand (wDeviceID, MCI_STOP, MCI_WAIT,

(DWORD) (LPMCI_GENERIC_PARMS) &mciGeneric) ;

// Save the file

mciSendCommand (wDeviceID, MCI_CLOSE, MCI_WAIT,

EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE);

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE);

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END), FALSE);

mciSendCommand (wDeviceID, MCI_PLAY, MCI_NOTIFY,

(DWORD) (LPMCI_PLAY_PARMS) &mciPlay) ;

EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), FALSE);

EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE);

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG), FALSE);

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), TRUE) ;

EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END), TRUE) ;

SetFocus (GetDlgItem (hwnd, IDC_PLAY_END)) ;

mciSendCommand (wDeviceID, MCI_PAUSE, MCI_WAIT,

(DWORD) (LPMCI_GENERIC_PARMS) & mciGeneric);

SetDlgItemText (hwnd, IDC_PLAY_PAUSE, TEXT ("Resume")) ;
bPaused = TRUE ;

mciSendCommand (wDeviceID, MCI_PLAY, MCI_NOTIFY,

(DWORD) (LPMCI_PLAY_PARMS) &mciPlay) ;

SetDlgItemText (hwnd, IDC_PLAY_PAUSE, TEXT ("Pause")) ;
bPaused = FALSE ;

mciSendCommand (wDeviceID, MCI_STOP, MCI_WAIT,

mciSendCommand (wDeviceID, MCI_CLOSE, MCI_WAIT,

EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), TRUE) ;
EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE);
EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG), TRUE) ;
EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE);
EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END), FALSE);
SetFocus (GetDlgItem (hwnd, IDC_PLAY_BEG)) ;

RECORD2 uses only two MCI function calls, the most important being this one:

error = mciSendCommand (wDeviceID, message, dwFlags, dwParam)

The first argument is a numeric identification number for the device. You use this ID number much like a handle. You obtain the ID when you open the device, and then you use it in subsequent mciSendCommand calls. The second argument is a constant beginning with the prefix MCI. These are called MCI command messages, and RECORD2 demonstrates seven of them: MCI_OPEN, MCI_RECORD, MCI_STOP, MCI_SAVE, MCI_PLAY, MCI_PAUSE, and MCI_CLOSE.

The dwFlags argument is generally composed of zero or more bit flag constants combined with the C bitwise OR operator. These generally indicate various options. Some options are specific to particular command messages, and some are common to all messages. The dwParam argument is generally a long pointer to a data structure that indicates options and obtains information from the device. Many of the MCI messages are associated with data structures unique to the message.

The mciSendCommand function returns zero if the function is successful and an error code otherwise. To report this error to the user, you can obtain a text string that describes the error:

mciGetErrorString (error, szBuffer, dwLength)

This is the same function used in the TESTMCI program.

When the user presses the Record button, RECORD2's window procedure receives a WM_COMMAND message with wParam equal to IDC_RECORD_BEG. RECORD2 begins by opening the device. This involves setting the fields of an MCI_OPEN_PARMS structure and calling mciSendCommand with the MCI_OPEN command message. For recording, the lpstrDeviceType field is set to the string "waveaudio" to indicate the device type. The lpstrElementName field is set to a zero-length string. The MCI driver uses a default sampling rate and sample size, but you can change that using the MCI_SET command. During recording, the sound data is stored on the hard disk in a temporary file and is ultimately transferred to a standard waveform file. I'll discuss the format of waveform files later in this chapter. For playing back the sound, MCI uses the sampling rate and sample size defined in the waveform file.

If RECORD2 cannot open a device, it uses mciGetErrorString and MessageBox to tell the user what the problem is. Otherwise, on return from the mciSendCommand call, the wDeviceID field of the MCI_OPEN_PARMS structure contains the device ID used in subsequent calls.

To begin recording, RECORD2 calls mciSendCommand with the MCI_RECORD command message and the MCI_WAVE_RECORD_PARMS data structure. Optionally, you can set the dwFrom and dwTo fields of this structure (and use bit flags that indicate these fields are set) to insert a sound into an existing waveform file, the name of which would be specified in the lpstrElementName field of the MCI_OPEN_PARMS structure. By default, any new sound is inserted at the beginning of an existing file.

RECORD2 sets the dwCallback field of the MCI_WAVE_RECORD_PARMS structure to the program's window handle and includes the MCI_NOTIFY flag in the mciSendCommand call. This causes a notification message to be sent to the window procedure when recording has been completed. I'll discuss this notification message shortly.

When done recording, you press the first End button to stop. This generates a WM_COMMAND message with wParam equal to IDC_RECORD_END. The window procedure responds by calling mciSendCommand three times: The MCI_STOP command message stops recording, the MCI_SAVE command message transfers the sound data from the temporary file to the file specified in an MCI_SAVE_PARMS structure ("record2.wav"), and the MCI_CLOSE command message deletes any temporary files or memory blocks that might have been created and closes the device.

For playback, the lpstrElementName field of the MCI_OPEN_PARMS structure is set to the filename "record2.wav". The MCI_OPEN_ELEMENT flag included in the third argument to mciSendCommand indicates that the lpstrElementName field is a valid filename. MCI knows from the filename extension WAV that you wish to open a waveform audio device. If multiple waveform hardware is present, it opens the first device. (It's also possible to use something other than the first waveform device by setting the lpstrDeviceType field of the MCI_OPEN_PARMS structure.)

Playing involves an mciSendCommand call with the MCI_PLAY command message and an MCI_PLAY_PARMS structure. Any part of the file can be played, but RECORD2 chooses to play it all.

RECORD2 also includes a Pause button for pausing the playback of a sound file. This button generates a WM_COMMAND message with wParam equal to IDC_PLAY_PAUSE. The program responds by calling mciSendCommand with the MCI_PAUSE command message and an MCI_GENERIC_PARMS structure. The MCI_GENERIC_PARMS structure is used for any message that requires no information except an optional window handle for notification. If the playback is already paused, the button resumes play by calling mciSendCommand again with the MCI_PLAY command message.

Playback can also be terminated by pressing the second End button. This generates a WM_COMMAND message with wParam equal to IDC_PLAY_END. The window procedure responds by calling mciSendCommand twice, first with the MCI_STOP command message and then with the MCI_CLOSE command message.

Now here's a problem: Although you can manually terminate playback by pressing the End button, you may want to play the entire sound file. How does the program know when the file has completed? That is the job of the MCI notification message.

When calling mciSendCommand with the MCI_RECORD and MCI_PLAY messages, RECORD2 includes the MCI_NOTIFY flag and sets the dwCallback field of the data structure to the program's window handle. This causes a notification message, called MM_MCINOTIFY, to be posted to the window procedure under certain circumstances. The wParam message parameter is a status code, and lParam is the device ID.

You'll receive an MM_MCINOTIFY message with wParam equal to MCI_NOTIFY_ABORTED when mciSendCommand is called with the MCI_STOP or MCI_PAUSE command messages. This happens when you press the Pause button or either of the two End buttons. RECORD2 can ignore this case because it already properly handles these buttons. During playback, you'll receive an MM_MCINOTIFY message with wParam equal to MCI_NOTIFY_SUCCESSFUL when the sound file has completed. To handle this case, the window procedure sends itself a WM_COMMAND message with wParam equal to IDC_PLAY_END to simulate the user pressing the End button. The window procedure then responds normally by stopping the play and closing the device.

During recording, you'll receive an MM_MCINOTIFY message with wParam equal to MCI_NOTIFY_SUCCESSFUL when you run out of hard disk space for storing the temporary sound file. (I wouldn't exactly call this a "successful" completion, but that's what happens.) The window procedure responds by sending itself a WM_COMMAND message with wParam equal to IDC_RECORD_END. The window procedure stops recording, saves the file, and closes the device, as is normal.

The MCI Command String Approach

At one time, the Windows multimedia interface included a function called mciExecute, with the following syntax:

bSuccess = mciExecute (szCommand) ;

The only argument was the MCI command string. The function returned a Boolean value: nonzero if the function was successful and zero if not. The mciExecute function was functionally equivalent to calling mciSendString (the string-based MCI function used in TESTMCI) with NULL or zero for the last three arguments and then mciGetErrorString and MessageBox if an error occurred.

Although mciExecute is no longer part of the API, I've included such a function in the RECORD3 version of the digital tape recorder and player. This is shown in Figure 22-5. Like RECORD2, the program uses the RECORD.RC resource script and RESOURCE.H from RECORD1.

mciGetErrorString (error, szErrorStr,

sizeof (szErrorStr) / sizeof (TCHAR)) ;

mciExecute (TEXT ("stop mysound")) ;

mciExecute (TEXT ("save mysound record3.wav")) ;

mciExecute (TEXT ("close mysound")) ;

mciExecute (TEXT ("pause mysound")) ;

SetDlgItemText (hwnd, IDC_PLAY_PAUSE, TEXT ("Resume")) ;
bPaused = TRUE ;

}

else

// Begin playing again

{

mciExecute (TEXT ("play mysound")) ;

SetDlgItemText (hwnd, IDC_PLAY_PAUSE, TEXT ("Pause")) ;
bPaused = FALSE ;

}

mciExecute (TEXT ("stop mysound")) ;

mciExecute (TEXT ("close mysound")) ;

EnableWindow (GetDlgItem (hwnd, IDC_RECORD_BEG), TRUE) ;
EnableWindow (GetDlgItem (hwnd, IDC_RECORD_END), FALSE);
EnableWindow (GetDlgItem (hwnd, IDC_PLAY_BEG), TRUE) ;
EnableWindow (GetDlgItem (hwnd, IDC_PLAY_PAUSE), FALSE);
EnableWindow (GetDlgItem (hwnd, IDC_PLAY_END), FALSE);
SetFocus (GetDlgItem (hwnd, IDC_PLAY_BEG)) ;

When you begin exploring the message-based and the text-based interfaces to MCI, you'll find that they correspond closely. It's easy to guess that MCI translates the command strings into the corresponding command messages and data structures. RECORD3 could use the MM_MCINOTIFY messages like RECORD2, but it chooses not to, an implication of the mciExecute function. The drawback of this is that the program doesn't know when it's finished playing the waveform file. Therefore, the buttons do not automatically change state. You must manually press the End button so that the program will know that it's ready to record or play again.

Notice the use of the alias keyword in the MCI open command. This allows all the subsequent MCI commands to refer to the device using the alias name.

The Waveform Audio File Format

If you take a look at uncompressed (that is, PCM) WAV files under a hexadecimal dump program, you'll find they have a format as shown in Figure 22-6.

Figure 22-6 The WAV file format

This format is an example of a more extensive format known as RIFF (Resource Interchange File Format). RIFF was intended to be the all-encompassing format for multimedia data files. It is a tagged file format, where the file consists of "chunks" of data that are identified by a preceding 4-character ASCII name and a 4-byte (32-bit) chunk size. The value of the chunk size does not include the 8 bytes required for the chunk name and size.

A waveform audio file begins with the text string "RIFF", which identifies it as a RIFF file. This is followed by a 32-bit chunk size, which is the size of the remainder of the file, or the file size less 8 bytes.
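To make the 8-byte chunk header concrete, here is a minimal sketch of decoding one from a block of bytes. The ChunkHeader type and the read_chunk_header function are names invented for this example. The size is assembled byte by byte because RIFF stores it in little-endian order, which keeps the code independent of the byte order of the machine running it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory form of a RIFF chunk header */
typedef struct {
    char     id[5];   /* 4-character tag plus a terminating NUL */
    uint32_t size;    /* size of the chunk data, excluding these 8 bytes */
} ChunkHeader;

/* Decode the 8-byte chunk header that starts at p */
ChunkHeader read_chunk_header(const unsigned char *p)
{
    ChunkHeader h;

    memcpy(h.id, p, 4);
    h.id[4] = '\0';

    /* The 32-bit size is stored least significant byte first */
    h.size = (uint32_t)p[4] | ((uint32_t)p[5] << 8) |
             ((uint32_t)p[6] << 16) | ((uint32_t)p[7] << 24);
    return h;
}
```

Applied to the first 8 bytes of a waveform file, the id comes back as "RIFF" and size plus 8 should equal the length of the file.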

The chunk data begins with the text string "WAVE", which identifies it as a waveform audio chunk. This is followed by the text string "fmt " (notice the blank to make this a 4-character string), which identifies a sub-chunk containing the format of the waveform audio data. The "fmt " string is followed by the size of the format information, in this case 16 bytes. The format information is the first 16 bytes of the WAVEFORMATEX structure, or, as it was defined originally, a PCMWAVEFORMAT structure that includes a WAVEFORMAT structure.

The nChannels field is either 1 or 2, for monaural or stereo sound. The nSamplesPerSec field is the number of samples per second; the standard values are 11025, 22050, and 44100 samples per second. The nAvgBytesPerSec field is the sample rate in samples per second times the number of channels times the size of each sample in bits, divided by 8 and rounded up. The standard sample sizes are 8 and 16 bits. The nBlockAlign field is the number of channels times the sample size in bits, divided by 8 and rounded up. Finally, the format concludes with a wBitsPerSample field, which is the sample size in bits for a single channel.
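The derived fields follow directly from the channel count, sample rate, and sample size. The sketch below applies the formulas as stated in the text; the PcmFormat structure and make_pcm_format function are illustrative names only and do not reproduce the layout of WAVEFORMATEX (which also begins with a format tag).

```c
#include <stdint.h>

/* Hypothetical holder for the PCM format values discussed above */
typedef struct {
    uint16_t channels;           /* 1 = monaural, 2 = stereo */
    uint32_t samples_per_sec;    /* for example 11025, 22050, 44100 */
    uint32_t avg_bytes_per_sec;  /* derived */
    uint16_t block_align;        /* derived: bytes per sample frame */
    uint16_t bits_per_sample;    /* sample size for one channel */
} PcmFormat;

PcmFormat make_pcm_format(uint16_t channels, uint32_t rate, uint16_t bits)
{
    PcmFormat f;

    f.channels        = channels;
    f.samples_per_sec = rate;
    f.bits_per_sample = bits;

    /* channels * bits / 8, rounded up */
    f.block_align = (uint16_t)(((uint32_t)channels * bits + 7) / 8);

    /* rate * channels * bits / 8, rounded up; 64-bit to avoid overflow */
    f.avg_bytes_per_sec =
        (uint32_t)(((uint64_t)rate * channels * bits + 7) / 8);
    return f;
}
```

For CD-quality sound (2 channels, 44100 samples per second, 16 bits) this gives a block alignment of 4 bytes and 176,400 bytes per second. For the standard 8-bit and 16-bit sizes the byte rate is also simply the sample rate times the block alignment.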

The format information is followed by the text string "data", followed by a 32-bit data size, followed by the waveform data itself. The data are simply the consecutive samples in the same format as that used in the low-level waveform audio facilities. If the sample size is 8 bits or less, each sample consists of 1 byte for monaural or 2 bytes for stereo. If the sample size is between 9 and 16 bits, each sample is 2 bytes for monaural or 4 bytes for stereo. For stereo waveform data, each sample consists of the left value followed by the right value.

For sample sizes of 8 bits or less, the sample byte is interpreted as an unsigned value. For example, for an 8-bit sample size, silence is equivalent to a string of 0x80 bytes. For sample sizes of 9 bits or more, the sample is interpreted as a signed value, and silence is equivalent to a string of 0 values.
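The two conventions differ only by an offset and a scale factor, so converting between them is straightforward. The following is a sketch with invented function names (pcm8_to_pcm16 and pcm16_to_pcm8), not code from any program in this chapter:

```c
#include <stdint.h>

/* Convert an unsigned 8-bit sample (silence = 0x80) to a signed
   16-bit sample (silence = 0). */
int16_t pcm8_to_pcm16(uint8_t s)
{
    return (int16_t)(((int)s - 0x80) * 256);
}

/* Convert a signed 16-bit sample to an unsigned 8-bit sample,
   discarding the low-order 8 bits of precision. */
uint8_t pcm16_to_pcm8(int16_t s)
{
    return (uint8_t)((s / 256) + 0x80);
}
```

Silence maps to silence in both directions: 0x80 becomes 0, and 0 becomes 0x80. Going from 16 bits to 8 bits loses precision, so a round trip starting from 16-bit data does not generally restore the original values.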

One of the important rules for reading tagged files is to ignore chunks you're not prepared to deal with. Although a waveform audio file requires "fmt " and "data" sub-chunks (in that order), it can also contain other sub-chunks. In particular, a waveform audio file might contain a sub-chunk labeled "INFO", and sub-sub-chunks within that sub-chunk that provide information about the waveform audio file.
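The "ignore what you don't recognize" rule can be sketched as a loop that steps from one sub-chunk header to the next until it finds the tag it wants. The function name find_subchunk and its parameters are assumptions made for this example. One detail comes from the RIFF specification rather than from the discussion above: chunk data with an odd size is followed by a pad byte so that the next chunk starts on an even offset.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value, the byte order RIFF files use */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Search the sub-chunks in body (the bytes that follow "WAVE") for the
   4-character tag. Returns a pointer to that sub-chunk's data and stores
   its size in *size_out, or returns NULL if the tag is not present. */
const unsigned char *find_subchunk(const unsigned char *body, size_t body_len,
                                   const char *tag, uint32_t *size_out)
{
    size_t pos = 0;

    while (pos <= body_len && body_len - pos >= 8) {
        uint32_t size = le32(body + pos + 4);

        if (size > body_len - pos - 8)
            return NULL;             /* chunk claims more data than exists */

        if (memcmp(body + pos, tag, 4) == 0) {
            *size_out = size;
            return body + pos + 8;
        }

        /* Not the chunk we want: skip header, data, and any pad byte */
        pos += 8 + (size_t)size + (size & 1u);
    }
    return NULL;
}
```

A reader written this way finds "fmt " and "data" even when unfamiliar sub-chunks sit between or after them.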

Experimenting with Additive Synthesis

For many years, going back to Pythagoras at least, people have attempted to analyze musical tones. At first it seems very simple, but then it gets complex. Bear with me if I repeat a little of what I've already said about sound.

Musical tones, except for some percussive sounds, have a particular pitch or frequency. This frequency can range across the spectrum of human perception, from 20 Hz to 20,000 Hz. The notes of a piano, for example, have a frequency range from 27.5 Hz to 4186 Hz. Another characteristic of musical tones is volume or loudness. This corresponds to the overall amplitude of the waveform producing the tone. A change in loudness is measured in decibels. So far, so good.

And then there is an unwieldy thing called "timbre." Very simply, timbre is that quality of sound that lets us distinguish between a piano and a violin and a trumpet all playing the same pitch at the same volume.

The French mathematician Fourier discovered that any periodic waveform, no matter how complex, can be represented by a sum of sine waves whose frequencies are integral multiples of a fundamental frequency. The fundamental, also called the first harmonic, is the frequency of periodicity of the waveform. The first overtone, also called the second harmonic, has a frequency twice the fundamental; the second overtone, or third harmonic, has a frequency three times the fundamental, and so forth. The relative amplitudes of the harmonics govern the shape of the waveform.

For example, a square wave can be represented as a sum of sine waves where the amplitudes of the even harmonics (that is, 2, 4, 6, etc.) are zero and the amplitudes of the odd harmonics (1, 3, 5, etc.) are in the proportions 1, 1/3, 1/5, and so forth. In a sawtooth wave, all harmonics are present and the amplitudes are in the proportions 1, 1/2, 1/3, 1/4, and so forth.

To the German scientist Hermann Helmholtz (1821-1894), this was the key in understanding timbre. In his classic book On the Sensations of Tone (1885, republished by Dover Press in 1954), Helmholtz posited that the ear and brain break down complex tones into their component sine waves and that the relative intensities of these sine waves are what we perceive as timbre. Unfortunately, it proved to be not quite that simple.

Electronic music synthesizers came to widespread public attention in 1968 with the release of Wendy Carlos's album Switched on Bach. The synthesizers available at that time (such as the Moog) were analog synthesizers. Such synthesizers use analog circuitry to generate various audio waveforms such as square waves, triangle waves, and sawtooth waves. To make these waveforms sound more like real musical instruments, they are subjected to some changes over the course of a single note. The overall amplitude of the waveform is shaped by an "envelope." When a note begins, the amplitude begins at zero and rises, usually very quickly. This is known as the attack. The amplitude then remains constant as the note is held. This is known as the sustain. The amplitude then falls to zero when the note ends; this is known as the release.
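An envelope of this kind can be expressed as a function of time that returns a multiplier between 0 and 1. Here is a minimal sketch with straight-line attack and release segments; the function name, the parameters, and the choice of linear ramps are all assumptions for illustration (real synthesizers often use curved segments).

```c
/* Attack-sustain-release amplitude envelope.
   t        time since the note began
   attack   duration of the rise from 0 to full amplitude
   note_len time at which the note is released (at least attack)
   release  duration of the fall back to 0
   All four values use the same time unit. Returns 0.0 to 1.0. */
double asr_envelope(double t, double attack, double note_len, double release)
{
    if (t < 0.0)
        return 0.0;
    if (t < attack)
        return t / attack;                          /* attack: ramp up   */
    if (t < note_len)
        return 1.0;                                 /* sustain: hold     */
    if (t < note_len + release)
        return 1.0 - (t - note_len) / release;      /* release: ramp down */
    return 0.0;
}
```

Multiplying each waveform sample by the envelope value at that sample's time shapes the note.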

The waveforms are also put through filters that attenuate some of the harmonics and turn the simple waveforms into something more complex and musically interesting. The cut-off frequencies of these filters can be controlled by an envelope so that the harmonic content of the sound changes over the course of the note.

Because these synthesizers begin with harmonically rich waveforms, and some of the harmonics are attenuated using filters, this form of synthesis is known as "subtractive synthesis."

Even while working with subtractive synthesis, many people involved in electronic music saw additive synthesis as the next big thing.

In additive synthesis you begin with a number of sine wave generators tuned in integral multiples so that each sine wave corresponds to a harmonic. The amplitude of each harmonic can be controlled independently by an envelope. Additive synthesis is not practical using analog circuitry because you'd need somewhere between 8 and 24 sine wave generators for a single note and the relative frequencies of these sine wave generators would have to track each other precisely. Analog waveform generators are notoriously unstable and prone to frequency drift.

However, for digital synthesizers (which can generate waveforms digitally using lookup tables) and computer-generated waveforms, frequency drift is not a problem and additive synthesis becomes feasible. So here's the general idea: You record a real musical tone and break it down into harmonics using Fourier analysis. You can then determine the relative strength of each harmonic and regenerate the sound digitally using multiple sine waves. When people began experimenting with applying Fourier analysis on real musical tones and generating these tones from multiple sine waves, they discovered that timbre is not quite as simple as Helmholtz believed.

The big problem is that the harmonics of real musical tones are not in strict integral relationships. Indeed, the term "harmonic" is not even appropriate for real musical tones. The various sine wave components are inharmonic and more correctly called "partials."

It was discovered that the inharmonicity among the partials of real musical tones is vital in making the tone sound "real." Strict harmonicity yields an "electronic" sound. Each partial changes in both amplitude and frequency over the course of a single note. The relative frequency and amplitude relationships among the partials are different for different pitches and intensities from the same instrument. The most complex part of a real musical tone occurs during the attack portion of the note, when there is much inharmonicity. It was discovered that this complex attack portion of the note was vital in the human perception of timbre.

In short, the sound of real musical instruments is more complex than anyone imagined. The idea of analyzing musical tones and coming up with relatively few simple envelopes for controlling the amplitudes and frequencies of the partials was clearly not practical.

Some analyses of real musical sounds were published in early issues (1977 and 1978) of the Computer Music Journal (at the time published by People's Computer Company and now published by the MIT Press). The three-part series "Lexicon of Analyzed Tones" was written by James A. Moorer, John Grey, and John Strawn, and it showed the amplitude and frequency graphs of partials of a single note (less than half a second long) played on a violin, oboe, clarinet, and trumpet. The note used was the E flat above middle C. Twenty partials are used for the violin, 21 for the oboe and clarinet, and 12 for the trumpet. In particular, Volume II, Number 2 (September 1978) of the Computer Music Journal contains numerical line-segment approximations for the various frequency and amplitude envelopes for the oboe, clarinet, and trumpet.

So, with the waveform support in Windows, it is fairly simple to type these numbers into a program, generate multiple sine wave samples for each partial, add them up, and send the samples out to the waveform audio sound board, thereby reproducing the sounds originally recorded over 20 years ago. The ADDSYNTH ("additive synthesis") program is shown in Figure 22-7.

Figure 22-7 The ADDSYNTH Program

TCHAR szAppName [] = TEXT ("AddSynth") ;

// Sine wave generator

dAmp = sin (* pdAngle) ;

* pdAngle += 2 * PI * dFreq / SAMPLE_RATE ;

static double dAngle [MAX_PARTIALS] ;

double dAmp, dFrq, dComp, dFrac ;

int i, iPrt, iMsecTime, iCompMaxAmp, iMaxAmp, iSmp ;

// Calculate the composite maximum amplitude

for (i = 0 ; i < ins.pprt[iPrt].iNumAmp ; i++)

iMaxAmp = max (iMaxAmp, ins.pprt[iPrt].pEnvAmp[i].iValue) ;

hFile = CreateFile (szFileName, GENERIC_WRITE, 0, NULL,

CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL) ;

WriteFile (hFile, "RIFF", 4, &dwWritten, NULL) ;

WriteFile (hFile, &iChunkSize, 4, &dwWritten, NULL) ;

WriteFile (hFile, "WAVEfmt ", 8, &dwWritten, NULL) ;

WriteFile (hFile, &iPcmSize, 4, &dwWritten, NULL) ;

WriteFile (hFile, &waveform, sizeof (WAVEFORMATEX) - 2, &dwWritten, NULL) ;
WriteFile (hFile, "data", 4, &dwWritten, NULL) ;

WriteFile (hFile, &iNumSamples, 4, &dwWritten, NULL) ;

WriteFile (hFile, pBuffer, iNumSamples, &dwWritten, NULL) ;

if (MakeWaveFile (ins, szFileName))

EnableWindow (GetDlgItem (hwnd, idButton), TRUE) ;

static TCHAR * szTrum = TEXT ("Trumpet.wav") ;

static TCHAR * szOboe = TEXT ("Oboe.wav") ;

static TCHAR * szClar = TEXT ("Clarinet.wav") ;

TestAndCreateFile (hwnd, insTrum, szTrum, IDC_TRUMPET) ;

TestAndCreateFile (hwnd, insOboe, szOboe, IDC_OBOE) ;

TestAndCreateFile (hwnd, insClar, szClar, IDC_CLARINET) ;

SetDlgItemText (hwnd, IDC_TEXT, TEXT (" ")) ;

SetFocus (GetDlgItem (hwnd, IDC_TRUMPET)) ;

ADDSYNTH DIALOG DISCARDABLE 100, 100, 176, 49

STYLE WS_MINIMIZEBOX | WS_CAPTION | WS_SYSMENU

CAPTION "Additive Synthesis"

Each instrument consists of a collection of partials (12 for the trumpet and 21 each for the oboe and clarinet) stored as an array of structures of type PRT. The PRT structure stores the number of points in the amplitude and frequency envelopes and a pointer to the ENV array. The INS structure contains the total time of the tone in milliseconds, the number of partials, and a pointer to the PRT array that stores the partials.
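Because each envelope is a list of points joined by straight lines, finding the amplitude or frequency at an arbitrary moment is a matter of linear interpolation between the two surrounding points. The sketch below shows one way this could be done. The ENV layout here is an assumption for illustration: the iValue field name appears in the listing, but iTime and the envelope_value function are invented for this example.

```c
/* One point of a line-segment envelope (assumed layout) */
typedef struct {
    int iTime;     /* milliseconds from the start of the note */
    int iValue;    /* amplitude or frequency at that time */
} ENV;

/* Return the envelope value at msec by interpolating linearly between
   the surrounding points. The n points must be in order of increasing
   time. Before the first point or after the last, the nearest point's
   value is used. */
double envelope_value(const ENV *env, int n, double msec)
{
    int i;

    if (n <= 0)
        return 0.0;
    if (msec <= env[0].iTime)
        return env[0].iValue;

    for (i = 1; i < n; i++) {
        if (msec <= env[i].iTime) {
            double span = env[i].iTime - env[i - 1].iTime;
            double frac;

            if (span <= 0.0)
                return env[i].iValue;        /* coincident points */

            frac = (msec - env[i - 1].iTime) / span;
            return env[i - 1].iValue +
                   frac * (env[i].iValue - env[i - 1].iValue);
        }
    }
    return env[n - 1].iValue;
}
```

A synthesis loop would call a function like this once per partial per sample, for both the amplitude and the frequency envelope, and feed the results to the sine wave generator.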

ADDSYNTH has three push buttons labeled "Trumpet," "Oboe," and "Clarinet." PCs are not yet quite fast enough to do all the additive synthesis calculations in real time, so the first time you run ADDSYNTH, these buttons will be disabled until the program calculates the samples and creates the TRUMPET.WAV, OBOE.WAV, and CLARINET.WAV sound files. The push buttons are then enabled and you can play the three sounds by using the
