July 1998

Low-level wave audio, Part I

by Kent Reisdorph

The TMediaPlayer component that comes with C++Builder is great for playing wave files, MIDI files, and AVI files. Sometimes, however, you need to take a lower-level approach to audio programming. For example, you might need to manipulate wave data in order to change volume, add special effects, and so on. In this series of articles, we'll focus on low-level wave audio at the system level. We'll begin this month by discussing Microsoft's Media Control Interface (MCI), then we'll cover RIFF files. In part 2, we'll go into detail about waveform output. Part 3 will cover recording waveform audio.

 

What is MCI?

MCI refers to Microsoft's Media Control Interface--not the telephone giant. As the heart of multimedia in Windows, MCI handles wave input and output, MIDI input and output, AVI video playback and capture, and more. In fact, MCI even has the capability to control videodisc players and VCRs. Basically, MCI can control any device that can install an MCI driver. For example, you can use MCI to play Autodesk Animator animations--provided that the Autodesk drivers are installed properly. MCI provides a relatively high-level interface between you and the system's hardware. This high-level interface takes a huge burden off the programmer because she doesn't have to write device drivers for every known sound card, video card, or other hardware. In fact, you don't really care about the hardware the user has installed because MCI takes care of all of that. All you have to know is the basic MCI functions and how to use them.

The MCI command set includes commands for audio playing, recording, saving, and positioning; also, video capture, video output control (display window size and position), playback in forward, reverse, fast, or slow, and so on. Utility commands allow you to query a device to identify commands that device supports, discern the driver manufacturer and version, and determine whether the device is capable of certain operations (such as stereo output and volume control). All in all, MCI has a very rich command set.

In 32-bit Windows, the MCI functions are exported by WINMM.DLL (MMSYSTEM.DLL is its 16-bit counterpart). To use the MCI functions, you simply include MMSYSTEM.H in your source units and start calling the functions you need. Basically, the 32-bit MCI API has four levels; we've listed them here in order of increasing complexity:
- The PlaySound function
- The string interface
- The command interface
- Low-level routines

The PlaySound function is a simple means of playing a waveform audio file from within a 32-bit Windows program. We won't discuss the PlaySound function here, but you can look it up in the Win32 API help file for more information. The string interface provides a very high-level approach to MCI. Using the string interface you can write code such as

 

const char* cmd =
  "play test.wav from 0 to 3000 wait";
mciSendString(cmd, 0, 0, 0);
This example plays the first three seconds of a wave file called TEST.WAV. As you might expect, the string interface doesn't give you as much control over MCI as you might need in all circumstances. The MCI command interface is much more complex than the string interface, but it also gives you more control. The heart of the MCI command interface is the mciSendCommand function. (Although we won't go into detail on the MCI command interface here because of its complexity, you can check out an article by Kent featuring mciSendCommand at http://www.borland.com/borlandcpp/news/cobb/bcj3_1a.html.)
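To make the shape of such a command string concrete, here's a minimal sketch that assembles one from its parts. The helper name makePlayCommand is our own invention and isn't part of MCI. We're assuming that positions are given in milliseconds (the time format the example above relies on) and that putting the filename in double quotes is accepted, which matters when the name contains spaces.

```cpp
#include <string>

// Hypothetical helper (not part of MCI): builds a "play" command string
// like the one passed to mciSendString in the example above.
// Assumes positions are expressed in milliseconds.
std::string makePlayCommand(const std::string& file,
                            unsigned fromMs, unsigned toMs) {
    // The filename is quoted so that names containing spaces stay in one piece.
    return "play \"" + file + "\" from " + std::to_string(fromMs) +
           " to " + std::to_string(toMs) + " wait";
}
```

You would pass the result to mciSendString with the string's c_str() member. Keeping the string building apart from the call makes it easy to log or inspect the command when something doesn't play.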

Finally, the low-level MCI routines provide you with the most control over multimedia operations. As you might expect, that kind of power comes with a price--you have to do a lot of the work that the higher-level interfaces do for you. However, when you need that kind of control, then the MCI low-level routines are just the ticket.

Why MCI?

You might be wondering, "Why use MCI? I thought DirectX was the way to handle multimedia in a Windows program." Don't believe everything you hear! While DirectX has some great features, it also has several drawbacks. First, it's an ActiveX library, which means you have to distribute DirectX with your application and make sure it's properly installed. In addition, your application is at the mercy of other application developers who also use DirectX. If an unknowing or uncaring developer installs an older version of DirectX on a user's machine, your application may stop working properly.

If you're a component developer and you need sound in one of your components, then using MCI is the only way to go. A VCL component that relies on an ActiveX control wouldn't be very well received by the public (components that encapsulate ActiveX controls excluded, of course). Also, since MCI is installed as part of Windows, you don't have to worry about your users installing anything but your component.

 

RIFF files

Wave audio is stored in RIFF files. RIFF is an acronym for resource interchange file format. Working with RIFF files is the less glamorous part of dealing with low-level audio. Still, it's a subject that we must cover. A RIFF file is organized into sections called chunks. The RIFF architecture allows for a hierarchical method of storing data in a file with chunks containing subchunks, as well as data. These subchunks may contain data, as well as their own subchunks. In the case of a wave file, there isn't a lot of data to store, so the file format is fairly straightforward.

A wave file contains both a format chunk that holds the wave format header and a data chunk that contains the actual waveform data. The hierarchy looks like this:

 

Root chunk
  - Wave Format Chunk
      - Data
  - Data Chunk
      - Data

You navigate a RIFF file by descending and ascending through the chunk layers. Now that you've had a brief introduction to RIFF files, let's look at how to read a wave file.
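Before we turn to the multimedia file functions, it may help to see the chunk layout spelled out by hand. The following is only a sketch: it walks a wave file image that's already in memory and doesn't use the mmio functions at all. It relies on the standard RIFF layout, in which each chunk starts with a four-character ID and a 32-bit little-endian size, and chunk data is padded to an even number of bytes. The names ChunkEntry, readLE32, and listRiffChunks are our own.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One entry per chunk found directly inside the RIFF (root) chunk.
struct ChunkEntry {
    std::string id;          // four-character chunk ID, such as "fmt " or "data"
    std::uint32_t size;      // size of the chunk's data, excluding the 8-byte header
    std::size_t dataOffset;  // offset of the chunk's data from the start of the buffer
};

// Reads a 32-bit little-endian value, the byte order RIFF uses for sizes.
static std::uint32_t readLE32(const unsigned char* p) {
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Lists the subchunks of a RIFF file image held in memory.
// Returns an empty list if the buffer doesn't start with a RIFF header.
std::vector<ChunkEntry> listRiffChunks(const std::vector<unsigned char>& file) {
    std::vector<ChunkEntry> chunks;
    if (file.size() < 12 ||
        std::string(file.begin(), file.begin() + 4) != "RIFF")
        return chunks;

    // Bytes 4..7 hold the root chunk size; bytes 8..11 hold the form type ("WAVE").
    std::size_t pos = 12;
    while (pos + 8 <= file.size()) {
        ChunkEntry e;
        e.id = std::string(file.data() + pos, file.data() + pos + 4);
        e.size = readLE32(file.data() + pos + 4);
        e.dataOffset = pos + 8;
        if (e.size > file.size() - e.dataOffset)
            break;  // the chunk claims more data than the buffer holds
        chunks.push_back(e);
        // Skip the data plus one pad byte when the size is odd.
        pos = e.dataOffset + e.size + (e.size & 1);
    }
    return chunks;
}
```

For a typical wave file, the list contains a "fmt " entry followed by a "data" entry. Descending into a chunk with the mmio functions amounts to locating one of these entries and positioning the file at its dataOffset.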

Reading a wave file

While reading a wave file isn't a trivial pursuit, neither is it very difficult. Although I can't say that once you've read a wave file you'll never forget how, I can guarantee you that this task is definitely something you can master. Reading a wave file requires the structure and functions in Table A.

Table A: Wave file structure and functions
Item          Description
MMCKINFO      Forms the chunk information structure.
mmioOpen      Opens the file.
mmioDescend   Descends into a chunk.
mmioAscend    Ascends out of a chunk.
mmioRead      Reads data from a chunk.
mmioClose     Closes the file.

We'll separate this operation into sections to make it easier to understand. However, we won't go into intimate detail on each of these functions; we simply don't have time. Instead, you'll learn by the best teacher: example.

 

Step 1: Open the wave file

The first step, naturally, is to open the file. The mmioOpen command looks like this:
HMMIO mmioOpen(LPSTR szFilename,
  LPMMIOINFO lpmmioinfo,
  DWORD dwOpenFlags);

where szFilename is the address of a string containing the filename of the file to open. The lpmmioinfo parameter is the address of an MMIOINFO structure containing extra parameters used by mmioOpen. The final parameter, dwOpenFlags, specifies flags for the open operation. Using mmioOpen is pretty easy, as we show here:

 

HMMIO handle =
  mmioOpen("test.wav", 0, MMIO_READ);
if (!handle) {
  MessageBox(Handle,
    "Error opening file.",
    "Error Message", 0);
  return;
}
This code opens the file and assigns the resulting file handle to the handle variable. The handle is checked for validity, and an error message displays if the file wasn't opened successfully.

Step 2: Read the RIFF chunk

The next step is to read the file's root chunk. The root chunk in a RIFF file is called the RIFF chunk. We need the RIFF chunk to access the format and data chunks. You'll read the RIFF chunk by calling mmioDescend, which has the following format:

 

MMRESULT mmioDescend(HMMIO hmmio,
  LPMMCKINFO lpck,
  LPMMCKINFO lpckParent,
  UINT wFlags);
In this code, hmmio is the file handle of an open RIFF file and lpck is the address of an application-defined MMCKINFO structure. Next, lpckParent is the address of an optional application-defined MMCKINFO structure that identifies the parent of the chunk being searched for. Finally, wFlags specifies search parameters. Our example specifies the MMIO_FINDRIFF flag. The following code shows how to read the RIFF chunk:

 

MMCKINFO ChunkInfo;
memset(&ChunkInfo, 0, sizeof(MMCKINFO));
// mmioDescend returns zero on success.
MMRESULT Res = mmioDescend(handle,
  &ChunkInfo, 0, MMIO_FINDRIFF);
if (Res)
  MessageBox(0, "Error", "Error", 0);
First, we declare an instance of the MMCKINFO structure, ChunkInfo, and zero it out. Next, we call the mmioDescend function, passing a pointer to the chunk structure. Once again, we check the return value from mmioDescend to be sure the function succeeded. If the call to mmioDescend succeeds, the ChunkInfo variable contains the chunk information for the file's RIFF chunk. (Note: Because of space considerations, in subsequent examples we won't include the code that checks the return value of each function. You should check the return values of each function in your own code.)

 

Step 3: Read the wave format header chunk

Now that we have the RIFF chunk, we can use it to extract the format and data chunks. Here's the code for extracting the wave format header chunk:
MMCKINFO FormatChunkInfo;
FormatChunkInfo.ckid =
  mmioStringToFOURCC("fmt", 0);
mmioDescend(handle,
  &FormatChunkInfo,
  &ChunkInfo,
  MMIO_FINDCHUNK);
WAVEFORMATEX waveFmt;
mmioRead(handle,
  (char*)&waveFmt,
  FormatChunkInfo.cksize);
The first line of this code snippet declares another instance of the MMCKINFO structure. (We need a second structure to hold the sub-chunk information.) The second line uses the mmioStringToFOURCC macro to convert four characters into a FOURCC value and assign that value to the ckid member of the MMCKINFO structure. The mmioStringToFOURCC macro is defined as

 

FOURCC mmioStringToFOURCC(LPCSTR sz,
  UINT wFlags);
where sz is the address of the null-terminated string we want to convert to a four-character code and wFlags specifies conversion options. MCI uses FOURCC values to identify chunks. A FOURCC value is simply a DWORD created out of four characters. The four characters that identify the wave format header are fmt and a space. Keep in mind that we have to specify only three characters, because mmioStringToFOURCC will use blank spaces to pad the string out to four characters.
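If you're curious how four characters become a DWORD, the packing can be sketched in a few lines of portable code. This isn't the Windows implementation. It's a stand-in we wrote for illustration (makeFourCC is our own name) that follows the convention of placing the first character in the low-order byte and padding short strings with spaces.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for illustration: packs up to four characters into
// a 32-bit code, first character in the low-order byte, padding short
// strings with spaces.
std::uint32_t makeFourCC(const std::string& s) {
    std::uint32_t code = 0;
    for (int i = 0; i < 4; ++i) {
        unsigned char c = ' ';  // pad character for strings under four characters
        if (i < static_cast<int>(s.size()))
            c = static_cast<unsigned char>(s[i]);
        code |= static_cast<std::uint32_t>(c) << (8 * i);
    }
    return code;
}
```

With this sketch, makeFourCC("fmt") and makeFourCC("fmt ") produce the same value, which is why we can leave off the trailing space when we name the format chunk.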

Once the ckid member has been set, we call mmioDescend to descend from the RIFF chunk into the fmt chunk. After the call to mmioDescend completes, the FormatChunkInfo structure will be filled with the chunk's information, including the size of the chunk's data. Next, we create an instance of the WAVEFORMATEX structure. This structure will hold the wave format header.

Finally, we use the mmioRead function to read the wave header into the waveFmt structure. The function is declared as

 

LONG mmioRead(HMMIO hmmio,
  HPSTR pch, LONG cch);
where hmmio is the handle of the file to be read, pch is the address of a buffer to contain the data read from the file, and cch is the number of bytes to read from the file. We must cast the address of the waveFmt structure to a char*, since that's the type mmioRead requires for this parameter. Notice that we pass the size of the chunk as the size parameter (the cksize member of MMCKINFO contains the size of the chunk's data). This ensures that we read only as many bytes as the chunk actually contains. At this point, the wave header structure, waveFmt, contains the wave format information about the wave file (sample rate, bits per sample, mono or stereo, and so on).
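For PCM data, the fields in the format header are related to one another by simple arithmetic, and a quick sketch shows how. The PcmFormat structure below is a hypothetical stand-in for the relevant WAVEFORMATEX members, so that the code compiles without the Windows headers. The function names are ours, and the formulas assume uncompressed PCM.

```cpp
#include <cstdint>

// Hypothetical stand-in for the PCM-relevant members of WAVEFORMATEX.
struct PcmFormat {
    std::uint16_t channels;       // nChannels: 1 = mono, 2 = stereo
    std::uint32_t samplesPerSec;  // nSamplesPerSec
    std::uint16_t bitsPerSample;  // wBitsPerSample
};

// Bytes in one sample frame (one sample for every channel).
// For PCM this is the value stored in nBlockAlign.
std::uint32_t blockAlign(const PcmFormat& f) {
    return f.channels * f.bitsPerSample / 8u;
}

// Bytes consumed per second of audio.
// For PCM this is the value stored in nAvgBytesPerSec.
std::uint32_t avgBytesPerSec(const PcmFormat& f) {
    return f.samplesPerSec * blockAlign(f);
}

// Playing time, in milliseconds, of a data chunk holding dataBytes bytes.
std::uint64_t durationMs(const PcmFormat& f, std::uint32_t dataBytes) {
    std::uint32_t rate = avgBytesPerSec(f);
    if (rate == 0)
        return 0;  // avoid dividing by zero on a malformed header
    return static_cast<std::uint64_t>(dataBytes) * 1000u / rate;
}
```

For 16-bit stereo audio at 44,100 samples per second, each frame is 4 bytes and one second of sound occupies 176,400 bytes. Passing the cksize of the data chunk to durationMs gives you the length of the recording.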

 

Step 4: Read the data chunk

Now we get to the good stuff--we're ready to read the wave data. To read the wave data, we need to ascend out of the format chunk (where we are now) and descend into the data chunk. We'll first create another MMCKINFO structure to hold the data-chunk's information, and set its ckid data member to the ID of the data chunk. Here's how it looks:
MMCKINFO DataChunkInfo;
mmioAscend(handle,
  &FormatChunkInfo, 0);
DataChunkInfo.ckid =
  mmioStringToFOURCC("data", 0);
mmioDescend(handle,
  &DataChunkInfo,
  &ChunkInfo,
  MMIO_FINDCHUNK);
This code is almost identical to the previous code where we descended into the format chunk. Notice that the string value of the FOURCC for the data chunk is data. Here, we're introducing the mmioAscend function. Its definition is

 

MMRESULT mmioAscend(HMMIO hmmio,
  LPMMCKINFO lpck,
  UINT wFlags);
where hmmio is the file handle of an open RIFF file. The lpck parameter is the address of an application-defined MMCKINFO structure previously filled by the mmioDescend or mmioCreateChunk function. The wFlags parameter is reserved and must be zero. Now that we have the chunk information for the data chunk, we can actually read the data. Remember that the cksize member of the MMCKINFO structure contains the size of the data in the chunk. We'll use this size to allocate a buffer for the data and to read the data. Here's the code:

 

unsigned int size =
  DataChunkInfo.cksize;
char* data1 = new char[size];
mmioRead(handle, data1, size);
mmioClose(handle, 0);
That's all there is to it. The data1 character array now holds all of the wave file's data. (We'll do something with that data next.) After we've read the data, we close the file with the mmioClose function, which simply accepts the handle of the file to close and flags for the close operation.

 

Writing a wave file

Now we'll take a quick look at how to write a wave file. You already know most of what you need to know to do so. Most of the code is just a variation on the code we used previously to read a wave file. An obvious exception is the use of mmioWrite to write the data to the file where we used mmioRead when we read the file. The mmioWrite function accepts three parameters, as shown here:

 

LONG mmioWrite(HMMIO hmmio,
  char _huge* pch, LONG cch);

You use the first parameter, hmmio, to specify the handle of the file. The latter two, pch and cch, are the address of the buffer to be written to the file and the number of bytes to write to the file, respectively.

We're going to write the data we just read to a new file. Just to add spice, we'll reverse the data so that the wave file plays backwards. Because the wave format won't change, and we're writing the exact number of bytes that we just read, we'll use the same FormatChunkInfo and DataChunkInfo structures that we used when we read the file. Since they contain all the necessary data, we'll just reuse them to write the file.

You must follow these steps to write the wave file:

  1. Reverse the wave data.
  2. Create and open the new file.
  3. Create the RIFF chunk.
  4. Create the fmt chunk.
  5. Write the fmt chunk data.
  6. Ascend out of the fmt chunk.
  7. Create the data chunk.
  8. Write the data chunk.
  9. Close the file.

We'll give you the code all in one go. Here's how it looks:
// Create the new data buffer.
char* data2 = new char[size];

// Copy the original data into the new
// buffer in reverse order. The last
// valid index is size - 1.
for (unsigned int i = 0; i < size; i++) {
  data2[size - 1 - i] = data1[i];
}

// Open a new file.
handle = mmioOpen(
  "test.wav", 0, MMIO_CREATE | MMIO_WRITE);

// Write the RIFF chunk.
mmioCreateChunk(
  handle, &ChunkInfo, MMIO_CREATERIFF);

// Create and write the format chunk.
mmioCreateChunk(handle, &FormatChunkInfo, 0);
mmioWrite(handle,
  (char*)&waveFmt, sizeof(WAVEFORMATEX) - 2);

// Ascend out of the format chunk.
mmioAscend(handle, &FormatChunkInfo, 0);

// Create and write the data chunk.
mmioCreateChunk(handle, &DataChunkInfo, 0);
mmioWrite(handle, data2, DataChunkInfo.cksize);

// Close the file.
mmioClose(handle, 0);
This code is fairly straightforward, especially once you understand the structure of a wave file. However, you should notice one thing: The line that writes the wave format header looks like this:

 

mmioWrite(handle,
  (char*)&waveFmt, sizeof(WAVEFORMATEX) - 2);
Note how we subtract 2 from the size of a WAVEFORMATEX structure when we write the structure. If we don't do this subtraction, then the Sound Recorder program that comes with Windows 95 won't be able to play the wave file we created (although it works fine with the Windows NT Sound Recorder). The reason is that the Win95 Sound Recorder expects the wave format header to be a PCMWAVEFORMAT structure rather than a WAVEFORMATEX structure. The latter is two bytes longer than the former, so we just cheat a little and subtract two bytes when we write the structure to the file. The extra two bytes we're cutting off hold the cbSize member, which is used by formats such as ADPCM and isn't needed for PCM wave files. We took a shortcut here, but any code you write should do the right thing based on the wave-format type.
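As a rough idea of what "the right thing" might look like, here's a sketch that picks the number of format bytes to write from the format tag. The function and constant names are our own. The sizes come from the structures discussed above: 16 bytes for PCMWAVEFORMAT, and 18 bytes for WAVEFORMATEX plus whatever extra bytes its cbSize member announces.

```cpp
#include <cstdint>

// Values mirrored from the Windows headers so the sketch stands alone.
const std::uint16_t kFormatPcm = 1;       // WAVE_FORMAT_PCM
const std::uint32_t kPcmFormatSize = 16;  // sizeof(PCMWAVEFORMAT)
const std::uint32_t kFormatExSize = 18;   // sizeof(WAVEFORMATEX), including cbSize

// Hypothetical helper: returns how many bytes of format data belong in the
// fmt chunk for a given format tag and cbSize value.
std::uint32_t fmtChunkBytes(std::uint16_t formatTag, std::uint16_t cbSize) {
    if (formatTag == kFormatPcm)
        return kPcmFormatSize;      // PCM: leave off the cbSize member
    return kFormatExSize + cbSize;  // other formats: header plus extra bytes
}
```

In the listing above, you'd pass the result as the byte count to mmioWrite in place of sizeof(WAVEFORMATEX) - 2, using the wFormatTag and cbSize members of waveFmt as the arguments.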

When you play the TEST.WAV produced by this code, you'll hear a wave file that plays backwards (a la Pink Floyd or the Electric Light Orchestra). You can find the complete example program for this code on our Web site at www.cobb.com/cpb; click on the Source Code hyperlink.

The example program, RIFFTEST, takes a wave file, reverses its data, and saves it to a new wave file. The contents of each chunk are displayed in a memo control so you can view them. The program also allows you to play both the original file and the converted file.
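One caveat about the reversal itself: the listing reverses the buffer one byte at a time, which suits 8-bit mono data. With 16-bit or stereo data, a byte-by-byte reversal also swaps the bytes inside each sample and the order of the channels. A safer approach is to reverse whole sample frames. Here's a minimal sketch of that idea. The function name reverseFrames is ours, and we're assuming you pass the nBlockAlign member of the wave format header as the frame size.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a copy of data with its sample frames in
// reverse order. Each frame is blockAlign bytes long, and the bytes inside
// a frame keep their original order.
std::vector<char> reverseFrames(const std::vector<char>& data,
                                std::size_t blockAlign) {
    if (blockAlign == 0)
        return data;  // nothing sensible to do without a frame size

    std::vector<char> out(data.size());
    const std::size_t frames = data.size() / blockAlign;
    for (std::size_t i = 0; i < frames; ++i) {
        const char* src = data.data() + i * blockAlign;
        char* dst = out.data() + (frames - 1 - i) * blockAlign;
        std::copy(src, src + blockAlign, dst);
    }

    // Any bytes left over after the last whole frame are copied unchanged.
    const std::size_t used = frames * blockAlign;
    std::copy(data.data() + used, data.data() + data.size(),
              out.data() + used);
    return out;
}
```

With a block alignment of 1 (8-bit mono), this gives the same result as a plain byte reversal. With a block alignment of 4 (16-bit stereo), each group of four bytes moves as a unit, so every sample still reads correctly when the file plays backwards.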

 

Conclusion

RIFF files aren't very exciting. However, if you're going to do low-level wave audio work, then you need to know about RIFF files. Next month, we'll continue this series and discuss how to play wave files directly from a data buffer.