Low-Level Control of *.wav Data (Part I)

14 Jun 2013
How to play .wav data at the low-level using waveOut* functions

Introduction

I've been looking for a way to grab and control low-level data for audio playback. In the grand scheme of things, this research was all for a larger project (which I won't go over in these articles). This article is Part I of IV and introduces the waveOut API. It gives us a quick look into low-level control of wav data and shows how to play that data on a specified device. This article assumes you have a solid grounding in C/C++ programming and a good knowledge of memory and pointers.

What really got me started down this path of low-level control was the sudden disappearance of a simple *.wav file recorder on Windows systems. In an attempt to create my own recorder/player for simple *.wav files, I stumbled upon waveOut*/waveIn*/mmio*. This article will only cover waveOut. The other two will follow in the additional articles.

What is waveOut and Why is it Important?

WaveOut is Microsoft's low-level API for audio playback. Using its functions, we can get at the actual, no-joke data sent to the audio channel. Think of it this way: all data on a computer is stored, interpreted, and manipulated as binary values, and sound is no different. In a simple 8-bit .wav file, each sample is one byte with a value from 0x00 to 0xFF, giving us 256 possible values. Each value is not a pitch but an amplitude: a snapshot of how far the speaker cone is pushed at that instant, with 0x80 as the resting (silent) midpoint for 8-bit data. Pitch and loudness come from how quickly and how far those values swing from one sample to the next. 16-bit samples work the same way, only as signed two-byte values centered on 0. This is the form in which the soundcard receives the data in order to play the sound you want.

WaveOut gives us the ability to do things with that data. Whatever we want, actually. For example, if we wanted to grab a wav file and garble the sound up, we could grab the data and use a bit-wise operation to screw up the data to give us a garbled (or most likely white-noise) effect. But waveOut doesn't just give us access to the data -- it has some nice helper functions to control volume, playback rate, etc. As such, for low-level playback, you can't get much lower than this (in actuality, you can, but you'd be right on the hardware itself, probably programming in something darn close to machine code).
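To make the "garble it with a bit-wise operation" idea concrete, here is a minimal sketch in plain C. The helper name garble_samples is hypothetical (it is not part of waveOut); it simply XORs every byte of a raw sample buffer with a mask, and it does not need windows.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the waveOut API:
   XOR every byte of a raw sample buffer with a mask. */
void garble_samples(uint8_t *data, size_t len, uint8_t mask)
{
    size_t i;

    /* an empty buffer is fine, a missing one is a caller bug */
    assert(data != NULL || len == 0);

    for (i = 0; i < len; i++)
    {
        data[i] ^= mask;
    }
}
```

Because XOR is its own inverse, running the same mask over the buffer a second time restores the original data, which makes this an easy effect to experiment with.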

If you are curious as to what functions are available via waveOut, check them out in the MSDN.

All Roads Lead to waveOut

Here's the generalized process for playing back data using waveOut.

  1. Open your file using mmio
  2. Grab the format, plug it into WAVEFORMATEX, grab the data, plug into a buffer
  3. Close your file from mmio (or optionally wait until all has been accomplished)
  4. Open your waveOut device
  5. Prepare a header for the waveOut device, specifying the appropriate format and buffer
  6. Write it to the soundcard
  7. Stop playback (or wait until finished)
  8. Unprepare the header, free the buffer
  9. Close the device

Anyway, that's the generalized process. In the meat and potatoes, I'm going to show you how to build your format and buffer from scratch, skipping steps 1-3. The main reason for this is so you learn the basics without getting too overwhelmed from all the extra stuff dealing with the .wav file itself. Additionally, you'll see I leave out reading/writing wav files until the last article for the main reason that once we have a successful playback and recorder, we have all the necessary data to just simply "plug in" to read/write.

The Meat and Potatoes

Setting Us Up For Success

In my opinion, the best way to control waveOut routines is with an actual window. I find the Windows message loop surprisingly easy and quite the sure-fire way to go here. However, a windowed program adds to the learning curve (because of all the extra coding), so, for this article, we'll be using a plain console application (it looks like a DOS program, but it is an ordinary Win32 console program). Create a new empty console application project in VS 2008 named "WavPlay". Create a .c or .cpp file named wavPlay for your main source body.

We'll be using Win32 most definitely in Part IV for our multi-buffering playback, if not Part II for the recording process.

Since this is a console program, we'll jump straight into playing -- why try making things fancy with menus and the like when all we need is to see how it works? So, first things first: find our devices and create our playing format.

I like structs to group my variables together, so here we've got our output device struct "spkr".

struct wOut
{
 int devnum, count;
 HWAVEOUT handle;
 WAVEFORMATEX fmt;
 WAVEHDR hdr;
 char *buffer;
}spkr;

To find out how many output devices we have on the system, we use the function waveOutGetNumDevs which returns an unsigned integer. Set spkr.count to equal waveOutGetNumDevs.

Everything's Always About Order with this Guy, Isn't It?

Next, we set up our format. To do this, we use a WAVEFORMATEX struct, supplied to us by the friendly people at Microsoft.

A WAVEFORMATEX struct looks like this (found on MSDN Library):

typedef struct{
  WORD  wFormatTag;
  WORD  nChannels;
  DWORD nSamplesPerSec;
  DWORD nAvgBytesPerSec;
  WORD  nBlockAlign;
  WORD  wBitsPerSample;
  WORD  cbSize;
} WAVEFORMATEX;

Let's set this up:

  • wFormatTag = WAVE_FORMAT_PCM; PCM stands for Pulse-Code Modulation, which simply means the samples are stored as plain, uncompressed numbers. This is the format you'll use for ordinary uncompressed .wav files.
  • nChannels = 2; This specifies between 1 or 2 channels. It is your designator between Mono or Stereo.
  • nSamplesPerSec = 44100; Common values for this are 8.0, 11.025, 22.05, or 44.1kHz for your WAVE_FORMAT_PCM. You'll notice the values I gave you are in kHz. What does this mean for your computer? How does it know it's kHz? It doesn't. But you do the math for it--k stands for kilo, which means 1000. 44.1 * 1000 = 44100. That's how you'll set this variable.
  • nAvgBytesPerSec = nSamplesPerSec * nBlockAlign; NOTE that you'll need to set this variable after you set both nSamplesPerSec and nBlockAlign because otherwise you're setting this variable to 0, or worse, garbage, which will cause your program to, in the best of cases, not work properly.
  • nBlockAlign = (nChannels * wBitsPerSample) / 8; nBlockAlign tells us the size in bytes of one block: one sample for every channel. Remember the bytes that store the sound data? With 8-bit mono, a block is a single byte; with 16-bit stereo, it is 2 channels * 2 bytes = 4 bytes. Data is stored in a linear fashion in the file; with this block alignment, the playback device knows how much data to grab and interpret as a single instant of sound. Note that you'll need to set this variable after you've set nChannels and wBitsPerSample.
  • wBitsPerSample = 16; For WAVE_FORMAT_PCM, 8 or 16 are your options. Obviously, the higher you go, the higher the quality.
  • cbSize = 0; For WAVE_FORMAT_PCM, this value should be zero. cbSize is a variable telling the playback device how much extra data is attached to the WAVEFORMATEX structure and should be taken into account. Since WAVE_FORMAT_PCM is the standard, this shouldn't be an issue and so should be set to 0.
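The arithmetic behind the derived fields can be tried without windows.h. The sketch below uses a hypothetical stand-in struct (pcm_format) that mirrors only the size-related members of WAVEFORMATEX; the struct and both function names are assumptions made for illustration, not part of the Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the size-related fields of WAVEFORMATEX. */
struct pcm_format
{
    uint16_t channels;          /* mirrors nChannels       */
    uint32_t samples_per_sec;   /* mirrors nSamplesPerSec  */
    uint16_t bits_per_sample;   /* mirrors wBitsPerSample  */
    uint16_t block_align;       /* mirrors nBlockAlign     */
    uint32_t avg_bytes_per_sec; /* mirrors nAvgBytesPerSec */
};

/* Fill in the fields in dependency order: block_align first,
   then avg_bytes_per_sec, which is computed from it. */
void pcm_format_fill(struct pcm_format *f, uint16_t channels,
                     uint32_t hz, uint16_t bits)
{
    assert(f != NULL);

    f->channels = channels;
    f->samples_per_sec = hz;
    f->bits_per_sample = bits;
    f->block_align = (uint16_t)((channels * bits) / 8);
    f->avg_bytes_per_sec = hz * f->block_align;
}

/* Whole milliseconds of audio held by a buffer of `bytes` bytes. */
uint32_t pcm_buffer_ms(const struct pcm_format *f, uint32_t bytes)
{
    assert(f != NULL && f->avg_bytes_per_sec != 0);

    return (uint32_t)(((uint64_t)bytes * 1000u) / f->avg_bytes_per_sec);
}
```

For 44.1kHz, 16-bit stereo this gives a block alignment of 4 bytes and 176400 bytes per second, so a buffer of 176400 bytes holds exactly one second of sound.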

Beam Me Up, Scotty! ... Scotty? ... Scotty!!!

So now that we have the number of devices and have specified our format, let's open the device. So to show what we have here, let's look at some code:

/* wavPlay.c */
#include <stdio.h>    /* printf */
#include <stdlib.h>   /* malloc, free, rand, srand */
#include <windows.h>
#include <mmsystem.h>
#pragma comment(lib, "winmm.lib")

struct wOut
{
 int devnum, count;
 HWAVEOUT handle;
 WAVEFORMATEX fmt;
 WAVEHDR hdr;
 char *buffer;
}spkr;

int n;
BOOL devfound=FALSE;

void setformat(WAVEFORMATEX *fmt, int channels, int Hz, WORD bitspersample);

int main(void)
{
 memset(&spkr, 0, sizeof(spkr));
 setformat(&spkr.fmt, 2, 44100, 16);
 spkr.count = waveOutGetNumDevs();

 if(spkr.count > 0)
 {
  do
  {
   /* WAVE_FORMAT_QUERY only asks whether the format is supported;
      the device is not actually opened */
   if(waveOutOpen(&spkr.handle, spkr.devnum, &spkr.fmt, 0, 
        0, CALLBACK_NULL | WAVE_FORMAT_QUERY) == MMSYSERR_NOERROR)
   {
    devfound = TRUE;
    break;
   }else{
    spkr.devnum++;
    if(spkr.devnum >= spkr.count)
    {
     printf("No supported devices connected.\n");
     break;
    }
   }
  }while(TRUE);
 }

 if(devfound)
 {
  if(waveOutOpen(&spkr.handle, spkr.devnum, &spkr.fmt, 0, 
           0, CALLBACK_NULL) == MMSYSERR_NOERROR)
  {
   /* todo */

   /* stop and close playback (only if the open succeeded) */
   waveOutReset(spkr.handle);
   waveOutClose(spkr.handle);
  }
 }

 return 0;
}

void setformat(WAVEFORMATEX *fmt, int channels, int Hz, WORD bitspersample)
{
 if(fmt)
 {
  fmt->wFormatTag = WAVE_FORMAT_PCM;
  fmt->cbSize = 0;
  fmt->nChannels = (WORD)channels;
  fmt->nSamplesPerSec = Hz;
  fmt->wBitsPerSample = bitspersample;
  /* nBlockAlign must be set before nAvgBytesPerSec, which depends on it */
  fmt->nBlockAlign = (fmt->nChannels * fmt->wBitsPerSample) / 8;
  fmt->nAvgBytesPerSec = fmt->nSamplesPerSec * fmt->nBlockAlign;
 }
}

Let's go over this, just to recap. First, we have our struct with our data, then our routine declaration, and then we move into main. We initialize our spkr struct to zero, then set the format of our WAVEFORMATEX struct to 44.1kHz, stereo, with 16 bits per sample. Lastly, we determine how many devices are on the computer.

Now you'll see the new lines in, right underneath our waveOutGetNumDevs. It's waveOutOpen - where we actually are calling down to our soundcard to open up an electrical line to our speakers. Now don't get too excited just yet. If you read through it, you'll see that we're using a flag called WAVE_FORMAT_QUERY to check to see if the format we're trying to use is supported by the device. If so, then we break out of our do/while loop. If not, it tries the next device until all devices are exhausted, at which point, the program will print out that it can't find a suitable device and will exit out.

Following that if we've found a device, we try to open it. If we can't, we exit out. If we can, we run into our "todo" comment. Notice here you're introduced to two new commands: waveOutReset and waveOutClose. I think waveOutClose is pretty easy to guess but waveOutReset might not be.

Let's run another metaphor: wave data on the audio channel is like people in a line. Each person is standing in line at, say, a post office, and each person is holding a packet they need to send off. Normally, when we write this data to the audio channel, each person is placed behind the previous one to wait for the post office to get to them and process their mail. Once they've sent their mail, they're excused from the line and leave the building. Well, maybe the user doesn't want to process all the people. Maybe it's closing time and everyone needs to go home right now. waveOutReset does that exact thing for us. It takes all the data buffers waiting in line to be played (including the one currently playing) and tells them, "You're done. Leave.", and they do. They stop playing, and the soundcard goes quiet. At this point, it is then safe to call waveOutClose. Please note that waveOutClose will not close a device that still has buffers queued or playing: the call fails with WAVERR_STILLPLAYING and the device stays open. So if you skip waveOutReset (or don't wait for playback to finish), you can end up leaking an open device handle without noticing, which is why you should check the return value.

Wait...Another Struct? What is it with this guy?

Here's where things get fun. Well, I suppose that depends on your interpretation of the term. Now, in order to actually put sound onto the channel, we have to grab two more friends - a WAVEHDR struct (found on MSDN Library) and an LPSTR buffer (LPSTR is just a typedef for char*, which is what I like to write, just to be a pain).

typedef struct wavehdr_tag {
  LPSTR              lpData;
  DWORD              dwBufferLength;
  DWORD              dwBytesRecorded;
  DWORD_PTR          dwUser;
  DWORD              dwFlags;
  DWORD              dwLoops;
  struct wavehdr_tag  *lpNext;
  DWORD_PTR          reserved;
} WAVEHDR, *LPWAVEHDR;

That is what a WAVEHDR struct looks like. Initially it can feel kind of daunting, but don't worry, we can ignore it all except for three crucial pieces: lpData, dwBufferLength, and dwFlags. In order to put our data onto a soundcard we have to "prepare it", "write it", and then "unprepare it". Filling in this WAVEHDR struct is the first step to preparing it.

To set up our WAVEHDR struct, we set dwFlags to 0, dwBufferLength to the size in bytes of our buffer (I'll show you in a second here), and lpData to the buffer pointer itself. Normally, we'd allocate our buffer to the size required by the file, but here we don't have a file, so I'm allocating a size of my own choosing and filling the buffer with random noise. Keep in mind that at 44.1kHz, 16-bit stereo (176400 bytes per second), the 3000 bytes used here last only about 17 milliseconds, so expect a brief click; use a larger buffer if you want a longer burst. To fill the buffer, I'll need a call to srand, so going back up to main():

int main(void)
{
 srand(GetTickCount());
 memset(&spkr, 0, sizeof(spkr));
 ...

Then moving back down to our clause under if(devfound), waveOutOpen...

...
spkr.buffer = (char*)malloc(3000);
if(spkr.buffer)
{
 for(n=0;n<3000;n++)
 {
  spkr.buffer[n] = (char)(rand() % 256); /* any byte value 0x00-0xFF */
 }

 spkr.hdr.dwFlags = 0;
 spkr.hdr.lpData = spkr.buffer;
 spkr.hdr.dwBufferLength = 3000;
}
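Random bytes give white noise. If you would rather hear a recognizable tone, the buffer can be filled with a simple waveform instead. Below is a minimal sketch, assuming 16-bit stereo PCM as set up earlier; the helper name fill_square_wave and its parameters are hypothetical, and it uses only integer math so it compiles without windows.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill `frames` stereo frames of 16-bit PCM
   with a square wave. `samples` must hold 2 * frames values
   (left and right interleaved). */
void fill_square_wave(int16_t *samples, size_t frames,
                      uint32_t sample_rate, uint32_t freq_hz,
                      int16_t amplitude)
{
    size_t i;
    uint32_t half_period;

    assert(samples != NULL);
    assert(freq_hz > 0 && sample_rate >= 2 * freq_hz);

    /* number of frames spent high, then the same number spent low */
    half_period = sample_rate / (2 * freq_hz);

    for (i = 0; i < frames; i++)
    {
        int16_t v = ((i / half_period) % 2 == 0)
                        ? amplitude
                        : (int16_t)-amplitude;

        samples[2 * i]     = v; /* left channel  */
        samples[2 * i + 1] = v; /* right channel */
    }
}
```

In the player, you would allocate frames * 4 bytes, call something like fill_square_wave((int16_t*)spkr.buffer, frames, 44100, 440, 8000), and set dwBufferLength to the same byte count. This relies on the machine storing 16-bit values little-endian, which is the case on x86 Windows.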

Now that we have the struct set up and our buffer filled with random noise, we make a call to waveOutPrepareHeader, passing spkr.handle, &spkr.hdr (the address of the header, not the header itself), and sizeof(spkr.hdr). Once this call to waveOutPrepareHeader has returned successfully, we make a call to waveOutWrite, which places the data on the audio channel and causes immediate playback.

Think of it this way: you have a very-old-school megaphone record player (the kind with that big-ol' brass horn built on it) and you want to play some music. Well, you first place your record on it (the data) and you release the brake (prepare the header) and then you lower the needle (write the data). You'll notice that when you've released the brake before you release the needle, the disc starts to spin. Well this is how the speakers are once you've opened them. No sound is coming from them, but they cycle and cycle and cycle until data is present. Once data is present, they play it immediately, because they've cycled to its starting point in the blink of an eye.

Here's how it should look:

if(devfound)
{
  if(waveOutOpen(&spkr.handle, spkr.devnum, 
     &spkr.fmt, 0, 0, CALLBACK_NULL) == MMSYSERR_NOERROR)
  {
   spkr.buffer = (char*)malloc(3000);
   if(spkr.buffer)
   {
    /* fill the buffer with random bytes (noise) */
    for(n=0;n<3000;n++)
    {
     spkr.buffer[n] = (char)(rand() % 256);
    }
  
    spkr.hdr.dwFlags = 0;
    spkr.hdr.lpData = spkr.buffer;
    spkr.hdr.dwBufferLength = 3000;

    /* the header is passed by address so the driver can update dwFlags */
    if(waveOutPrepareHeader(spkr.handle, 
          &spkr.hdr, sizeof(spkr.hdr)) == MMSYSERR_NOERROR)
    {
     if(waveOutWrite(spkr.handle, &spkr.hdr, sizeof(spkr.hdr)) == MMSYSERR_NOERROR)
     {
      /* poll until the driver marks the buffer as played */
      while(!(spkr.hdr.dwFlags & WHDR_DONE))
      {
       Sleep(10); /* don't burn the CPU while waiting */
      }
     }

     waveOutUnprepareHeader(spkr.handle, &spkr.hdr, sizeof(spkr.hdr));
    }

    free(spkr.buffer);
   }

   /* stop and close playback (only if the open succeeded) */
   waveOutReset(spkr.handle);
   waveOutClose(spkr.handle);
  }
}

Alright, let's go over it. We've prepared the header and we wrote it out to the channel. So next you see a while loop. What? Why? Well, waveOutWrite returns immediately, while the sound is still playing. If we were to call waveOutUnprepareHeader right after calling waveOutWrite, the call would fail with WAVERR_STILLPLAYING because the buffer is still in the queue. So we loop, checking our header for the WHDR_DONE flag. The driver slaps that flag onto the header once it has finished playing the buffer; at that point we leave the while loop and unprepare the header, free our buffer, stop our playback, and close the device. All in that specific order.
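The dwFlags field is a set of bits, which is why the loop tests it with & rather than ==. The sketch below walks through the states a header passes through. The SKETCH_ constants mirror the WHDR_DONE, WHDR_PREPARED, and WHDR_INQUEUE values from mmsystem.h so that it compiles without windows.h; the enum and both functions are hypothetical, and mark_done only simulates what the driver does when a buffer finishes.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors of the mmsystem.h flag values; in a real Windows build,
   use WHDR_DONE, WHDR_PREPARED and WHDR_INQUEUE from the header. */
#define SKETCH_WHDR_DONE     0x00000001u
#define SKETCH_WHDR_PREPARED 0x00000002u
#define SKETCH_WHDR_INQUEUE  0x00000010u

/* Hypothetical names for the stages a header goes through. */
enum hdr_state
{
    HDR_UNPREPARED,    /* before waveOutPrepareHeader        */
    HDR_PREPARED_IDLE, /* prepared, not yet written          */
    HDR_QUEUED,        /* written, waiting or playing        */
    HDR_FINISHED       /* played; safe to unprepare and free */
};

enum hdr_state classify_header(uint32_t flags)
{
    if (!(flags & SKETCH_WHDR_PREPARED)) return HDR_UNPREPARED;
    if (flags & SKETCH_WHDR_INQUEUE)     return HDR_QUEUED;
    if (flags & SKETCH_WHDR_DONE)        return HDR_FINISHED;
    return HDR_PREPARED_IDLE;
}

/* Simulates the driver finishing a buffer: the header leaves the
   queue and gains the done bit. */
uint32_t mark_done(uint32_t flags)
{
    assert(flags & SKETCH_WHDR_PREPARED);

    return (flags & ~SKETCH_WHDR_INQUEUE) | SKETCH_WHDR_DONE;
}
```

Only a header in the finished state should be handed to waveOutUnprepareHeader, and only after that call succeeds should its buffer be freed.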

If you try freeing your memory before you've unprepared the header, you are asking for trouble. I've done it more times than I'm proud to admit. If the header is not unprepared first, the device still thinks the buffer belongs to it, and it may try to read from the memory location you've just freed up. See the problem? Thought you might. Anyway, remember when dealing with memory to take the utmost care, and honestly, if I were you (and just FYI I do take care of this issue in my library) I would check the return value of waveOutUnprepareHeader and only free the buffer once it has succeeded. For the purposes of this tutorial, however, I've written it this way just to show you the general idea and to give you some room to play around with this.

Anyway, this is basically the nuts and bolts. This is what's necessary to get it done. So here's me hoping that the article here is enough to get you to build it on your own, without the source code (in this way you actually learn how to do it on your own).

Take that, O'Malley!

You'll find that waveOut, waveIn, and mmio are all very simple APIs. The thing about them that you'll find truly annoying is how much you find that you hate it and think you don't understand it until you finally do. It's like pointers in a way. You just don't understand it until you understand it. Once you do though, it's the easiest thing on the planet.

Additionally, although I haven't, and most likely won't ever, it couldn't hurt for you to wrap this into a class. It'd be a very simple process and would clean up your code a bit. I won't because I love vanilla C, but for you, it could very well be the way to go--depends upon your style and attitude towards classes. In creating a class, it'd simplify the process for you, and make it so you wouldn't have to remember all the necessary nitty-gritty, all the variables involved... basically all the very specificities required for low-level audio manipulation.

Coming Up...

As I stated in my introduction, this is Part I of IV. Part II, I'll cover waveIn and how to record data. Part III, we'll be delving into mmio and how to save/read audio data from a file. Lastly, Part IV, we'll be taking a small step back into waveOut. This time for multiple-buffer playback.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

suendisra
Software Developer
United States
I have been programming in C since 2004. Even with what I know now, I find that I am continually learning very rewarding stuff every single day.

Last Updated 14 Jun 2013
Article Copyright 2013 by suendisra