CWave - A Simple C++ Class to Manipulate WAV Files

darkoman

25 Sep 2008 | CPOL | 4 min read

An article on a simple C++ class for manipulating WAV files

Introduction

This article is about a simple C++ .WAV manipulation class called CWave. The main features of this class are loading, saving, playing and mixing .WAV files. It can handle only PCM .WAV files (8-bit or 16-bit). It is written just to show the developer how to:

Load the .WAV file from the disk into memory (parsing .WAV file's RIFF structure)
Save the .WAV file from memory to the disk (creating .WAV file's RIFF structure)
Play the .WAV file using the default WAVE_MAPPER device
Mix two .WAV files offline (8-bit and 16-bit mixing, not using DirectX)

Background

The main goal here was to find a .WAV file mixing algorithm that would neither create artifacts nor distort the original .WAVs. I searched Google for existing solutions and found, believe it or not, just one or two. I was really surprised, so I tried to 'fix' the proposed solution to get the best output I could. This class holds the results of my research.

Here on The Code Project, one can find different .WAV files loading and playback solutions, using only Platform SDK and not using DirectX, so please take a look at them too before turning to this class only. Some solutions also provide .WAV files recording...

Using the Code

To use the CWave class, do the following:

C++

#include "Wave.h"

// Load .WAV files from the disk
CWave wave1, wave2;
wave1.Load(_T("Enter first .WAV file path here..."));
wave2.Load(_T("Enter second .WAV file path here..."));

// Mix .WAVs (the result is stored in wave1)
wave1.Mix(wave2);

// Start .WAV playback (will run in a separate thread, so your program continues)
wave1.Play();

// Wait some time (e.g. 10 seconds, for the .WAV file to finish playback)
Sleep(10000);

// Save the .WAV file on the disk
wave1.Save(_T("Enter destination .WAV file path here..."));
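
The fixed 10-second wait above only fits clips of about that length. As a rough sketch, the playing time can be estimated from the values the getters listed below return. The helper name `PlaybackMillis` is hypothetical (it is not part of CWave), and the sketch assumes that `GetSize()` reports the size of the raw PCM data in bytes.

```cpp
#include <cstdint>

// Hypothetical helper (not part of CWave): estimates how long a PCM buffer
// takes to play, in milliseconds. Assumes dataBytes is the size of the raw
// sample data only, without any RIFF headers.
inline std::uint32_t PlaybackMillis(std::uint32_t dataBytes,
                                    std::uint16_t channels,
                                    std::uint32_t sampleRate,
                                    std::uint16_t bitsPerSample)
{
    // Bytes consumed per second of audio.
    const std::uint64_t bytesPerSecond =
        static_cast<std::uint64_t>(sampleRate) * channels * (bitsPerSample / 8);
    if (bytesPerSecond == 0)
        return 0; // invalid format, nothing sensible to report

    // Round up so that a wait of this length never cuts the sound short.
    const std::uint64_t ms =
        (static_cast<std::uint64_t>(dataBytes) * 1000 + bytesPerSecond - 1) / bytesPerSecond;
    return static_cast<std::uint32_t>(ms);
}
```

Under those assumptions, the wait could be written as `Sleep(PlaybackMillis(wave1.GetSize(), wave1.GetChannels(), wave1.GetSampleRate(), wave1.GetBitsPerSample()));`.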

Here is the list of the public methods of the CWave class:

C++

BOOL Load(LPTSTR lpszFilePath);
BOOL Save(LPTSTR lpszFilePath);
BOOL Play();
BOOL Stop();
BOOL Pause();
BOOL Mix(CWave& wave);
BOOL IsValid()           { return (m_lpData != NULL); }
BOOL IsPlaying()         { return (!m_bStopped && !m_bPaused); }
BOOL IsStopped()         { return m_bStopped; }
BOOL IsPaused()          { return m_bPaused; }
LPBYTE GetData()         { return m_lpData; }
DWORD GetSize()          { return m_dwSize; }
SHORT GetChannels()      { return m_Format.channels; }
DWORD GetSampleRate()    { return m_Format.sampleRate; }
SHORT GetBitsPerSample() { return m_Format.bitsPerSample; }

General Notes About .WAV Files Mixing

Please see the inner structure of the typical .WAV file below:

So, as you can see, parsing this file type should not be too difficult. This is called the RIFF structure, and it is built from different 'chunks'. The first 'chunk' is the DESCRIPTOR, which identifies the RIFF file type. Next is the FORMAT 'chunk', which describes the data format. The .WAV sound data can be 8-bit or 16-bit, mono or stereo, can have different sampling rates, can be compressed or not, etc. We use this information to initialize the sound input or sound output devices on the host PC.
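
To make the chunk layout a bit more concrete, here is a minimal sketch of walking the chunks of a .WAV file that has already been read into memory. It is not the code used inside CWave; the names `WavInfo`, `ReadU16`, `ReadU32` and `ParseWav` are made up for this illustration, and only the canonical "fmt " and "data" chunks are handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Illustrative type, not the one used inside CWave.
struct WavInfo {
    std::uint16_t audioFormat = 0;   // 1 means uncompressed PCM
    std::uint16_t channels = 0;
    std::uint32_t sampleRate = 0;
    std::uint16_t bitsPerSample = 0;
    std::size_t   dataOffset = 0;    // where the sample bytes start
    std::uint32_t dataSize = 0;      // number of sample bytes
};

// RIFF stores all integers in little-endian byte order.
inline std::uint16_t ReadU16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}
inline std::uint32_t ReadU32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Walks the chunks of an in-memory .WAV image. Returns true once both the
// "fmt " and the "data" chunk have been found.
inline bool ParseWav(const std::vector<std::uint8_t>& file, WavInfo& info) {
    const std::uint8_t* p = file.data();
    const std::size_t n = file.size();

    // Descriptor: "RIFF" <size> "WAVE"
    if (n < 12 || std::memcmp(p, "RIFF", 4) != 0 || std::memcmp(p + 8, "WAVE", 4) != 0)
        return false;

    bool haveFmt = false, haveData = false;
    std::size_t pos = 12;
    while (pos + 8 <= n && !(haveFmt && haveData)) {
        const std::uint32_t size = ReadU32(p + pos + 4);
        const std::size_t body = pos + 8;
        if (size > n - body)
            return false; // chunk claims more bytes than the file has

        if (std::memcmp(p + pos, "fmt ", 4) == 0 && size >= 16) {
            info.audioFormat   = ReadU16(p + body);
            info.channels      = ReadU16(p + body + 2);
            info.sampleRate    = ReadU32(p + body + 4);
            info.bitsPerSample = ReadU16(p + body + 14);
            haveFmt = true;
        } else if (std::memcmp(p + pos, "data", 4) == 0) {
            info.dataOffset = body;
            info.dataSize = size;
            haveData = true;
        }
        pos = body + size + (size & 1); // chunks are padded to an even size
    }
    return haveFmt && haveData;
}
```

Unknown chunks are simply skipped by their declared size, which is what makes the RIFF layout easy to parse.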

The next 'chunk', called the DATA 'chunk', holds the sound data. Please see below:

So we actually work with this sound data, which represents samples of the original audio signal taken at some fixed sampling rate, typically 11.025 kHz, 22.05 kHz, or 44.1 kHz. The data samples can be 8-bit, 16-bit, 24-bit, or 32-bit. The data can be uncompressed (like PCM) or compressed (like MP3). Audio decompressors take care of the compressed sound data, and the sound output devices can 'play' this data. The sound data can also be converted from one format to another using the codecs available through the Windows ACM (Audio Compression Manager).
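
For the two PCM formats this class handles, the raw bytes map to amplitudes differently: 8-bit samples are stored unsigned with 128 as silence, while 16-bit samples are signed little-endian integers. The following sketch converts both to floating-point values in the range [-1, 1). The function names are invented for this illustration and are not part of CWave.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// 8-bit PCM is stored unsigned, with 128 as the silence level.
inline float DecodeSample8(std::uint8_t raw) {
    return (static_cast<int>(raw) - 128) / 128.0f;
}

// 16-bit PCM is stored as signed little-endian integers, with silence at 0.
inline float DecodeSample16(std::uint8_t lo, std::uint8_t hi) {
    const int v = lo | (hi << 8);                // 0..65535
    const int s = (v >= 32768) ? v - 65536 : v;  // reinterpret as signed
    return s / 32768.0f;
}

// Decodes a whole PCM buffer. In stereo data the samples are interleaved
// (left, right, left, right, ...), so the output keeps that order.
// Returns an empty vector for bit depths other than 8 and 16.
inline std::vector<float> DecodePcm(const std::vector<std::uint8_t>& data,
                                    int bitsPerSample) {
    std::vector<float> out;
    if (bitsPerSample == 8) {
        for (std::uint8_t b : data)
            out.push_back(DecodeSample8(b));
    } else if (bitsPerSample == 16) {
        for (std::size_t i = 0; i + 1 < data.size(); i += 2)
            out.push_back(DecodeSample16(data[i], data[i + 1]));
    }
    return out;
}
```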

The main goal here was to 'make a mixture' of two different .WAV files. The DirectX component called DirectSound can do this easily, but I wanted to use an old-fashioned method, using the Platform SDK and the old Microsoft winmm.lib library for sound playback. The two sound data buffers were mixed using the following equation:

destination = destination + source

Is that simple, or what? Well, it is... almost. This is the main sound mixing equation you will find, if you find one at all. I am sure that sound experts will disagree with this statement, but I could not find a better solution, at least not in the open-source world. In most cases the output of this method is very good (and sometimes even excellent). Often, though, artifacts can be heard on the speakers. This is a consequence of the method itself: it does simple wave superposition, so if the wave amplitudes differ significantly, the output may be distorted.

To avoid this, I used the median value for the 8-bit .WAV files and got good output, but with decreased volume. Increasing the amplitude by a factor of log₁₀(20) (about 1.3) gave better results. For 16-bit .WAV files, I checked the absolute product of the amplitudes. If it was above 0.5, I used simple superposition (with no volume correction). If it was below 0.5, I used the lower amplitude. The output was finally good enough for both mono and stereo .WAV files.
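
The two simplest cases can be sketched as follows. This is only one reading of the description above, not the code inside CWave::Mix: `MixSum16` is plain superposition with clamping added so the sum cannot wrap around, and `MixAverage8` assumes that the 'median value' of two samples is their average, taken around the 8-bit silence level of 128 and then scaled by log₁₀(20). Both function names are hypothetical. The 16-bit amplitude-product rule is left out because the text does not pin it down precisely enough to reproduce.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Plain superposition (destination = destination + source) for 16-bit
// samples, clamped so that the sum cannot wrap around when it leaves the
// 16-bit range.
inline std::int16_t MixSum16(std::int16_t dst, std::int16_t src) {
    const int sum = static_cast<int>(dst) + static_cast<int>(src);
    return static_cast<std::int16_t>(std::max(-32768, std::min(32767, sum)));
}

// Assumed reading of the 8-bit rule: average the two samples, then raise the
// level by log10(20) (about 1.3). 8-bit PCM is unsigned with 128 as silence,
// so the arithmetic is done on the offset from 128.
inline std::uint8_t MixAverage8(std::uint8_t dst, std::uint8_t src) {
    const double gain = std::log10(20.0);
    const double a = static_cast<int>(dst) - 128;
    const double b = static_cast<int>(src) - 128;
    const double mixed = (a + b) / 2.0 * gain;

    // Round to the nearest integer and keep the result inside 8 bits.
    const long rounded = std::lround(mixed);
    const long clamped = std::max(-128L, std::min(127L, rounded));
    return static_cast<std::uint8_t>(clamped + 128);
}
```

Applied sample by sample over two buffers of equal format, either function gives an offline mix. Clamping avoids wrap-around, but loud passages are still clipped, which is one source of the artifacts mentioned above.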

I am, however, sure that this solution is not the best nor the final one. My goal was to give developers a possibility for further research and improvement of the sound data processing technique. I am also hoping that this would be a good starting point for all CodeProject developers who want to improve this section.

Points of Interest

I have always been interested in the processing of sound data. While working on this article, I learned many things about the .WAV file format and about sound data processing techniques. It was not easy to understand and implement, but the final results justify the effort.

History

24th September, 2008: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

darkoman

Software Developer (Senior) Elektromehanika d.o.o. Nis

Serbia

He has a master's degree in Computer Science from the Faculty of Electronics in Nis (Serbia), and has worked as a C++/C# application developer for Windows platforms since 2001. He likes traveling, reading and meeting new people and cultures.
