License: The Code Project Open License (CPOL)
CWave - A Simple C++ Class to Manipulate WAV Files
By darkoman
An article on a simple C++ class for manipulating WAV files
C++, Windows, Dev
This article presents CWave, a simple C++ class for manipulating .WAV files. Its main features are loading, saving, playing and mixing .WAV files. It handles only PCM .WAV files (8-bit or 16-bit). It was written to show the developer how to load, save and mix .WAV files, and how to play them through the WAVE_MAPPER device.
The main goal was to find a mixing algorithm for .WAV files that neither creates artifacts nor distorts the original sounds. I searched with Google for existing solutions and found, believe it or not, just one or two. I was really surprised, so I tried to 'fix' the proposed solution to get the best output I could. This class holds the results of that research.
Here on The Code Project, you can find other .WAV loading and playback solutions that use only the Platform SDK, without DirectX, so please take a look at them too before settling on this class. Some of those solutions also provide .WAV recording.
To use the CWave class, do the following:
//
#include "Wave.h"
//
//
// Load .WAV files from the disk
CWave wave1, wave2;
wave1.Load(_T("Enter first .WAV file path here..."));
wave2.Load(_T("Enter second .WAV file path here..."));
//
// Mix .WAVs
wave1.Mix(wave2);
//
// Start .WAV playback (will run in a separate thread, so your program continues)
wave1.Play();
//
// Wait some time (e.g. 10 seconds, for the .WAV file to finish playback)
Sleep(10000);
//
// Save the .WAV file on the disk
wave1.Save(_T("Enter destination .WAV file path here..."));
//
Here is the list of the public methods of the CWave class:
BOOL Load(LPTSTR lpszFilePath);
BOOL Save(LPTSTR lpszFilePath);
BOOL Play();
BOOL Stop();
BOOL Pause();
BOOL Mix(CWave& wave);
BOOL IsValid() {return (m_lpData != NULL);}
BOOL IsPlaying() {return (!m_bStopped && !m_bPaused);}
BOOL IsStopped() {return m_bStopped;}
BOOL IsPaused() {return m_bPaused;}
LPBYTE GetData() {return m_lpData;}
DWORD GetSize() {return m_dwSize;}
SHORT GetChannels() {return m_Format.channels;}
DWORD GetSampleRate() {return m_Format.sampleRate;}
SHORT GetBitsPerSample() {return m_Format.bitsPerSample;}
Please see the inner structure of the typical .WAV file below:
So, as you can see, parsing this file type should not be too difficult. This is called the RIFF structure, and it is built from different 'chunks'. The first 'chunk' is the DESCRIPTOR, which identifies the RIFF file type. Next is the FORMAT 'chunk', which describes the data format. The .WAV sound data can be 8-bit or 16-bit, mono or stereo, can have different sampling rates, can be compressed or not, etc. We use this information to initialize the sound input or sound output devices on the host PC.
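The chunk walk described above can be sketched as follows. This is a minimal illustration, not the CWave implementation: the WavFormat struct and the ParseWav, ReadU16 and ReadU32 functions are hypothetical names introduced here, and the sketch assumes the whole file has already been read into memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical container for the FORMAT chunk fields (not CWave's own type).
struct WavFormat {
    uint16_t audioFormat;   // 1 = uncompressed PCM
    uint16_t channels;      // 1 = mono, 2 = stereo
    uint32_t sampleRate;    // samples per second, per channel
    uint16_t bitsPerSample; // 8 or 16 for the files CWave handles
    uint32_t dataOffset;    // byte offset of the sound data in the buffer
    uint32_t dataSize;      // size of the sound data in bytes
};

// RIFF stores integers little-endian; read them byte by byte so the
// code does not depend on the byte order of the host.
static uint16_t ReadU16(const uint8_t* p) {
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}
static uint32_t ReadU32(const uint8_t* p) {
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Walks the chunks of a RIFF/WAVE image held in memory.
// Returns true when both a "fmt " and a "data" chunk were found.
bool ParseWav(const std::vector<uint8_t>& buf, WavFormat& out) {
    if (buf.size() < 12) return false;
    if (std::memcmp(&buf[0], "RIFF", 4) != 0) return false; // DESCRIPTOR id
    if (std::memcmp(&buf[8], "WAVE", 4) != 0) return false; // RIFF form type

    bool haveFmt = false, haveData = false;
    std::size_t pos = 12; // first chunk follows the 12-byte descriptor
    while (pos + 8 <= buf.size() && !(haveFmt && haveData)) {
        const uint8_t* hdr = &buf[pos];
        const uint32_t size = ReadU32(hdr + 4);
        const std::size_t body = pos + 8;
        if (size > buf.size() - body) return false; // truncated chunk

        if (std::memcmp(hdr, "fmt ", 4) == 0 && size >= 16) {
            const uint8_t* f = &buf[body];
            out.audioFormat   = ReadU16(f);
            out.channels      = ReadU16(f + 2);
            out.sampleRate    = ReadU32(f + 4);
            out.bitsPerSample = ReadU16(f + 14);
            haveFmt = true;
        } else if (std::memcmp(hdr, "data", 4) == 0) {
            out.dataOffset = static_cast<uint32_t>(body);
            out.dataSize   = size;
            haveData = true;
        }
        pos = body + size + (size & 1u); // chunks are padded to an even size
    }
    return haveFmt && haveData;
}
```

Unknown chunks are skipped by size, so extra chunks that some tools write between FORMAT and DATA do not break the walk.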
The next 'chunk', called the DATA 'chunk', holds the sound data. Please see below:
So, the class actually works with this sound data, which holds samples of the original audio signal taken at a fixed sampling rate, typically about 11 kHz, 22 kHz or 44 kHz (11,025 Hz, 22,050 Hz or 44,100 Hz). The samples can be 8-bit, 16-bit, 24-bit or 32-bit. The data can be uncompressed (PCM) or compressed (for example MP3). Audio decompressors (codecs) decode the compressed sound data, and sound output devices 'play' it. Sound data can also be converted from one format to another using codecs installed under the Windows Audio Compression Manager (ACM).
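To make the relationship between these format fields concrete, here is a small sketch of the usual PCM arithmetic. The PcmFormat struct and the three helper functions are hypothetical names used only for illustration; they are not part of CWave.

```cpp
#include <cstdint>

// Hypothetical description of an uncompressed PCM stream.
struct PcmFormat {
    uint16_t channels;      // 1 = mono, 2 = stereo
    uint32_t sampleRate;    // samples per second, per channel
    uint16_t bitsPerSample; // 8, 16, 24 or 32
};

// Bytes occupied by one sample frame (one sample for every channel).
uint32_t BlockAlign(const PcmFormat& f) {
    return f.channels * (f.bitsPerSample / 8u);
}

// Bytes of sound data consumed per second of playback.
uint32_t BytesPerSecond(const PcmFormat& f) {
    return f.sampleRate * BlockAlign(f);
}

// Playback length of dataSize bytes of PCM data, in seconds.
double DurationSeconds(const PcmFormat& f, uint32_t dataSize) {
    const uint32_t bps = BytesPerSecond(f);
    return bps != 0 ? static_cast<double>(dataSize) / bps : 0.0;
}
```

For 16-bit stereo at 44,100 Hz this gives 4 bytes per frame and 176,400 bytes per second. Assuming GetSize() returns the size of the sound data in bytes, the same arithmetic applied to the values from GetSize(), GetChannels(), GetSampleRate() and GetBitsPerSample() could replace a hard-coded wait time during playback.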
The main goal here was to mix two different .WAV files. DirectSound (a DirectX component) can do this easily, but I wanted to use an old-fashioned method based on the Platform SDK and the old Microsoft winmm.lib library for sound playback. The two sound data buffers are mixed using the following equation:
destination = destination + source
Is that simple, or what? Well, it is... almost. This is the main sound mixing equation you are likely to find, if you find one at all. I am sure that sound experts will disagree, but I could not find a better solution in the open-source world. In most cases the output is very good (sometimes even excellent), but often some artifacts can be heard on the speakers. The cause is the method itself: it does simple wave superposition, so if the wave amplitudes differ significantly, the output may be distorted. To avoid this, for 8-bit .WAV files I used the median of the two sample values (for two values, this equals their average) and got good output, but with decreased volume. Increasing the amplitude by a factor of log10(20), about 1.3, gave better results. For 16-bit .WAV files, I checked the absolute product of the amplitudes. If it was above 0.5, I used simple superposition (with no volume correction). If it was below 0.5, I used the lower amplitude. The output was finally good enough for both mono and stereo .WAV files.
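The superposition idea can be sketched as below. This is not the CWave code. For 16-bit samples the sketch swaps in plain saturating addition (add, then clamp to the 16-bit range) instead of the amplitude-product rule described above. For 8-bit samples it averages the two samples around the 8-bit silence level of 128 and applies the log10(20) gain. The function names MixSample16, MixSample8 and Mix16 are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// 16-bit PCM is signed. Add in a wider type and clamp (saturate), so that
// loud passages clip instead of wrapping around to the opposite sign.
int16_t MixSample16(int16_t a, int16_t b) {
    const int32_t sum = static_cast<int32_t>(a) + static_cast<int32_t>(b);
    return static_cast<int16_t>(
        std::max<int32_t>(-32768, std::min<int32_t>(32767, sum)));
}

// 8-bit PCM is unsigned with silence at 128. Average the two samples
// around that midpoint, then boost by log10(20) to recover some volume.
uint8_t MixSample8(uint8_t a, uint8_t b) {
    const double gain = std::log10(20.0); // about 1.3
    const double avg = ((a - 128) + (b - 128)) / 2.0;
    const long v = std::lround(avg * gain) + 128;
    return static_cast<uint8_t>(std::max(0L, std::min(255L, v)));
}

// destination = destination + source, over the overlapping part of the
// two buffers. Samples of dst beyond the end of src are left unchanged.
void Mix16(std::vector<int16_t>& dst, const std::vector<int16_t>& src) {
    const std::size_t n = std::min(dst.size(), src.size());
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = MixSample16(dst[i], src[i]);
    }
}
```

Clamping avoids the harsh wrap-around noise that unchecked integer addition produces on overflow, but it still clips, so it trades one kind of distortion for a milder one. The sketch also assumes both buffers share the same sample rate and channel count.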
I am sure, however, that this solution is neither the best nor the final one. My goal was to give developers a basis for further research and improvement of the sound data processing technique. I also hope that this will be a good starting point for all CodeProject developers who want to improve this section.
I have always been interested in sound data processing. While working on this article, I learned many things about the .WAV file format and about sound data processing techniques. It was not easy to understand and implement, but the final results justify the effort.
Last Updated: 25 Sep 2008 Editor: Deeksha Shenoy
Copyright 2008 by darkoman. Everything else Copyright © CodeProject, 1999-2010