WAVE File Processor in C#

Sujoy G

Jul 13, 2007

CPOL

A very simple class containing useful methods to process WAVE audio files

Download source - 19.7 KB

Introduction

I was experimenting with WAVE files and how they can be modified or manipulated programmatically. In voice mail or audio-based marketing campaign applications, where WAVE files are widely used and need to be processed quickly, managing audio files programmatically is very useful.

In this code example, I introduce some useful methods such as ChangeVolume(), StripSilence(), WaveMix(), Validate() and Compare(), along with other associated methods. Most importantly, the code has no dependency on any legacy or third-party audio processing tool.

Background

Initially, I searched several articles and examples but had no luck. Finally, I came across Ehab Mohamed Essa's article Concatenation Wave Files using C# 2005. The article was excellent and covered the fundamentals of the WAVE format, some of which I have adapted in my example for reading and writing headers and for merging two wave files.

Using the Code

Download the attached zip file containing the entire source code. Open the solution in Visual Studio 2005 IDE. The given form is for test purposes only, and uses all available WAVE processing features. Apart from the form, the most important file is clsWaveProcessor.cs which contains all the key methods for WAVE file processing. I will try my best to explain the methods.

Before running the example, keep some source wave files handy, because they are needed to execute all the methods. The source WAVE files must be 16-bit, 8 kHz, mono, which gives a bit rate of 128 kbps. I used this audio specification because, as far as I have found, it is the one used by voice mail and pre-recorded audio campaign applications.
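The 128 kbps figure follows directly from the other three values. As a small sketch (the class and method names here are hypothetical and not part of clsWaveProcessor), the arithmetic for uncompressed PCM looks like this:

```csharp
using System;

public static class WaveFormatMath
{
    // Bits per second for uncompressed PCM:
    // sample rate * bits per sample * number of channels.
    public static int BitRate(int sampleRate, int bitsPerSample, int channels)
    {
        return sampleRate * bitsPerSample * channels;
    }

    // Bytes per second, the value stored in the "byte rate" field of a PCM header.
    public static int ByteRate(int sampleRate, int bitsPerSample, int channels)
    {
        return BitRate(sampleRate, bitsPerSample, channels) / 8;
    }

    // Bytes per sample frame (one sample for every channel), the "block align" value.
    public static int BlockAlign(int bitsPerSample, int channels)
    {
        return channels * bitsPerSample / 8;
    }
}
```

For the format used in this article, BitRate(8000, 16, 1) gives 128000 bits per second (128 kbps), ByteRate(8000, 16, 1) gives 16000 bytes per second, and BlockAlign(16, 1) gives 2 bytes per sample.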

Upon running the example, only the form will be displayed with all available options. Below is the screen shot of the form.

Screenshot - WaveProcessor_ScreenShot.jpg

The first two text boxes take the source audio files as input. In this example, I have made it mandatory to provide two source audio files in the same format. But as you can understand, two files are necessary only for merging. The third (optional) source file is the background audio file, which will be mixed with the two source audio files. Finally, you provide the path and file name of the output audio.

See the click event of the Go button for code like the following:

//
...
if (!wa.ChangeVolume(txtWaveOne.Text, false, 60)) MessageBox.Show(...);
if (!wa.ChangeVolume(txtWaveTwo.Text, false, 60)) MessageBox.Show(...);
if (!wa.ChangeVolume(txtWaveMix.Text, true, 100)) MessageBox.Show(...);
...
//

In this snippet, the volume of the two source files is decreased by 60% and the volume of the background file is increased by 100%. You may need to adjust these values depending on the files you are using for testing.

The rest of the form's functionality is self-explanatory. Now let me explain the heart of this example, the clsWaveProcessor class.

The clsWaveProcessor Class

The class contains all the methods and fields it needs to function. First, look at the constants.

//Constants for default or base format [16bit 8kHz Mono]
    private const short CHNL = 1;
    private const int SMPL_RATE = 8000;
    private const int BIT_PER_SMPL = 16;
    private const short FILTER_FREQ_LOW = -10000;
    private const short FILTER_FREQ_HIGH = 10000;
...

The constant names are mostly self-explanatory. I have set the values for 16-bit 8 kHz mono and the silence filters at ±10000. Despite the FREQ in their names, the filter constants are signed amplitude values of the sampled audio (see the ComplementToSigned(...) private method), not frequencies. Filters can be applied at the frequency level, but then complexity increases. So the idea is to treat samples whose amplitude falls inside the specified range as silence and strip them from the start and end of the file. You can change the values depending on your need. Now let's take a look at the key methods for WAVE processing.

To make successful changes in this class, it is important to understand how WAVE audio samples are stored in the data segment of the file. For 16-bit audio, the samples are stored as a byte stream in which every two adjacent bytes form one sample in 2's complement form. I have implemented the ComplementToSigned(..) method to take two adjacent bytes (a short int) at a given position and return the equivalent signed amplitude (-32,768 to 32,767) of the sampled audio.
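The amplitude-range idea behind the silence filter can be sketched on its own. This is only an illustration under a few assumptions: the samples are already decoded into a short[], one symmetric threshold is used instead of separate LOW and HIGH constants, and the names SilenceTrimSketch and Trim are hypothetical. Unlike StripSilence(), it works on samples rather than raw bytes and does not keep the boundary silent sample.

```csharp
using System;

public static class SilenceTrimSketch
{
    // Returns the samples between the first and last sample whose amplitude
    // reaches the threshold. Returns an empty array if no sample does.
    public static short[] Trim(short[] samples, int threshold)
    {
        // Skip quiet samples at the start.
        int first = 0;
        while (first < samples.Length && IsQuiet(samples[first], threshold)) first++;

        // Skip quiet samples at the end.
        int last = samples.Length - 1;
        while (last >= first && IsQuiet(samples[last], threshold)) last--;

        int count = last - first + 1;
        if (count <= 0) return new short[0];

        short[] result = new short[count];
        Array.Copy(samples, first, result, 0, count);
        return result;
    }

    // A sample counts as quiet when it lies strictly inside (-threshold, +threshold).
    private static bool IsQuiet(short sample, int threshold)
    {
        return sample > -threshold && sample < threshold;
    }
}
```

With a threshold of 10000, the input { 0, 5, -3, 12000, 200, -15000, 7, 0 } is trimmed to { 12000, 200, -15000 }. The quiet sample 200 in the middle is kept, because only the start and end are stripped.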

//
...
private short ComplementToSigned(ref byte[] bytArr, int intPos) 
        {
            short snd = BitConverter.ToInt16(bytArr, intPos);
            if (snd != 0)
                snd = Convert.ToInt16((~snd | 1));
            return snd;
        }
...
//

This is OK for 16-bit mono, but it will change depending on the bit depth and/or number of channels, e.g. 16-bit stereo requires four bytes per sample frame (two bytes for each channel) to get the equivalent signed values. Overloads of the method can be added to meet this requirement.
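To make the byte layout concrete, here is a minimal sketch of reading and writing 16-bit PCM samples by hand, including one stereo frame. It assumes the standard WAVE layout (little-endian samples, left channel before right in stereo). The class PcmDecodeSketch and its methods are hypothetical helpers for illustration; they are not the ComplementToSigned() and SignedToComplement() methods of the attached source.

```csharp
using System;

public static class PcmDecodeSketch
{
    // Reads one signed 16-bit little-endian sample starting at byte offset pos.
    // The low byte comes first; the cast reinterprets the 16 bits as 2's complement.
    public static short ReadSample(byte[] data, int pos)
    {
        return unchecked((short)(data[pos] | (data[pos + 1] << 8)));
    }

    // Writes one signed 16-bit sample back as two little-endian bytes.
    public static void WriteSample(byte[] data, int pos, short value)
    {
        data[pos] = (byte)(value & 0xFF);
        data[pos + 1] = (byte)((value >> 8) & 0xFF);
    }

    // Reads one 16-bit stereo frame (4 bytes): left channel first, then right.
    public static void ReadStereoFrame(byte[] data, int pos, out short left, out short right)
    {
        left = ReadSample(data, pos);
        right = ReadSample(data, pos + 2);
    }
}
```

For example, the bytes 0x00 0x80 decode to -32768 and 0xFF 0x7F decode to 32767, which are the two ends of the 16-bit sample range.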

Let's see the key methods of this example:

//
...
    /// <summary>
    /// Ensure that the wave file at the given path matches
    /// the default or base format [16bit 8kHz Mono]
    /// </summary>
    /// <param name="strPath">Wave file path</param>
    /// <returns>True/False</returns>
    public bool Validate(string strPath)
    {
        if (strPath == null) strPath = "";
        if (strPath == "") return false;

        clsWaveProcessor wa_val = new clsWaveProcessor();
        if (!wa_val.WaveHeaderIN(strPath)) return false;
        if (wa_val.BitsPerSample != BIT_PER_SMPL) return false;
        if (wa_val.Channels != CHNL) return false;
        if (wa_val.SampleRate != SMPL_RATE) return false;
        return true;
    }

    /// <summary>
    /// Compare two wave files to ensure both are in same format
    /// </summary>
    /// <param name="Wave1">ref. to processor object</param>
    /// <param name="Wave2">ref. to processor object</param>
    /// <returns>True/False</returns>
    private bool Compare(ref clsWaveProcessor Wave1, ref clsWaveProcessor Wave2)
    {
        if (Wave1.Channels != Wave2.Channels) return false;
        if (Wave1.BitsPerSample != Wave2.BitsPerSample) return false;
        if (Wave1.SampleRate != Wave2.SampleRate) return false;
        return true;
    }
     
    /// <summary>
    /// Increase or decrease volume of a wave file by percentage
    /// </summary>
    /// <param name="strPath">Source wave</param>
    /// <param name="booIncrease">True - Increase, False - Decrease</param>
    /// <param name="shtPcnt">1-100 in %-age</param>
    /// <returns>True/False</returns>
    public bool ChangeVolume(string strPath, bool booIncrease, short shtPcnt)
    {
        if (strPath == null) strPath = "";
        if (strPath == "") return false;
        if (shtPcnt > 100) return false;

        clsWaveProcessor wain = new clsWaveProcessor();
        clsWaveProcessor waout = new clsWaveProcessor();

        waout.DataLength = waout.Length = 0;

        if (!wain.WaveHeaderIN(@strPath)) return false;

        waout.DataLength = wain.DataLength;
        waout.Length = wain.Length;

        waout.BitsPerSample = wain.BitsPerSample;
        waout.Channels = wain.Channels;
        waout.SampleRate = wain.SampleRate;
        
        byte[] arrfile = GetWAVEData(strPath);


        //change volume
        for (int j = 0; j < arrfile.Length; j += 2)
        {
            short snd = ComplementToSigned(ref arrfile, j);
            //scale in int and clamp, so a large increase cannot wrap around
            int scaled = booIncrease
                ? snd + (snd * shtPcnt) / 100
                : snd - (snd * shtPcnt) / 100;
            if (scaled > short.MaxValue) scaled = short.MaxValue;
            if (scaled < short.MinValue) scaled = short.MinValue;
            byte[] newval = SignedToComplement((short)scaled);
            arrfile[j] = newval[0];
            arrfile[j + 1] = newval[1];
        }

        //write back to the file
        waout.DataLength = arrfile.Length;
        waout.WaveHeaderOUT(@strPath);
        WriteWAVEData(strPath, ref arrfile);
       
        return true;
    }

    /// <summary>
    /// Mix two wave files. The mixed data will be written back to the main wave file.
    /// </summary>
    /// <param name="strPath">Path for source or main wave file.</param>
    /// <param name="strMixPath">Path for wave file to be mixed with source.</param>
    /// <returns>True/False</returns>
    public bool WaveMix(string strPath, string strMixPath)
    {
        if (strPath == null) strPath = "";
        if (strPath == "") return false;

        if (strMixPath == null) strMixPath = "";
        if (strMixPath == "") return false;

        clsWaveProcessor wain = new clsWaveProcessor();
        clsWaveProcessor wamix = new clsWaveProcessor();
        clsWaveProcessor waout = new clsWaveProcessor();

        wain.DataLength = wamix.Length = 0;

        if (!wain.WaveHeaderIN(strPath)) return false;
        if (!wamix.WaveHeaderIN(strMixPath)) return false;

        waout.DataLength = wain.DataLength;
        waout.Length = wain.Length;

        waout.BitsPerSample = wain.BitsPerSample;
        waout.Channels = wain.Channels;
        waout.SampleRate = wain.SampleRate;
        
        byte[] arrfile = GetWAVEData(strPath);
        byte[] arrmix = GetWAVEData(strMixPath);

        for (int j = 0, k = 0; j < arrfile.Length; j += 2, k += 2)
        {
            if (k >= arrmix.Length) k = 0;
            short snd1 = ComplementToSigned(ref arrfile, j);
            short snd2 = ComplementToSigned(ref arrmix, k);
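            // clamp the sum to the range of signed short instead of
            // dropping the sample to zero when the sum is out of range
            int sum = snd1 + snd2;
            if (sum > short.MaxValue) sum = short.MaxValue;
            if (sum < short.MinValue) sum = short.MinValue;
            short o = (short)sum;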
            byte[] b = SignedToComplement(o);
            arrfile[j] = b[0];
            arrfile[j + 1] = b[1];
        }

        //write mixed file
        waout.WaveHeaderOUT(@strPath);
        WriteWAVEData(strPath, ref arrfile);
        
        return true;
    }
    
    /// <summary>
    /// Strip silence or low-level noise from the start and end of a wave file.
    /// The silence thresholds are set in the filter constants
    /// FILTER_FREQ_HIGH and FILTER_FREQ_LOW, which hold sample amplitudes
    /// despite their names. For a given application, some experimentation
    /// may be required in the beginning to decide the HIGH and LOW values
    /// (alternate suggestions are most welcome).
    /// </summary>
    /// <param name="strPath">Path for wave file</param>
    /// <returns>True/False</returns>
    public bool StripSilence(string strPath)
    {
        if (strPath == null) strPath = "";
        if (strPath == "") return false;

        clsWaveProcessor wain = new clsWaveProcessor();
        clsWaveProcessor waout = new clsWaveProcessor();

        waout.DataLength = waout.Length = 0;

        if (!wain.WaveHeaderIN(@strPath)) return false;

        waout.DataLength = wain.DataLength;
        waout.Length = wain.Length;

        waout.BitsPerSample = wain.BitsPerSample;
        waout.Channels = wain.Channels;
        waout.SampleRate = wain.SampleRate;
       
        byte[] arrfile = GetWAVEData(strPath);

        //check for silence
        int startpos = 0;
        int endpos = arrfile.Length - 1;
        //At start
        try
        {
            for (int j = 0; j < arrfile.Length; j += 2)
            {
                short snd = ComplementToSigned(ref arrfile, j);
                if (snd > FILTER_FREQ_LOW && snd < FILTER_FREQ_HIGH) startpos = j;
                else
                    break;
            }
        }
        catch (Exception ex)
        {
            Console.Write(ex.Message);
        }
        //At end
        for (int k = arrfile.Length - 1; k >= 0; k -= 2)
        {
            short snd = ComplementToSigned(ref arrfile, k - 1);
            if (snd > FILTER_FREQ_LOW && snd < FILTER_FREQ_HIGH) endpos = k;
            else
                break;
        }

        if (startpos == endpos) return false;
        if ((endpos - startpos) < 1) return false;

        byte[] newarr = new byte[(endpos - startpos) + 1];

        for (int ni = 0, m = startpos; m <= endpos; m++, ni++)
            newarr[ni] = arrfile[m];

        //write file
        waout.DataLength = newarr.Length;
        waout.WaveHeaderOUT(@strPath);
        WriteWAVEData(strPath, ref newarr);
        
        return true;
    }
  
...
//

The main trick is in manipulating the audio data by doing simple arithmetic. The methods are very generic and can be used in various ways.

Validate(...) - This method ensures that the given WAVE file matches the default format specification set in the constants.
Compare(...) - This simple method compares two WAVE files to check that they are in the same format. For successful operation, all the source files must be in the same format.
ChangeVolume(...) - This is a very simple but tricky method. I searched a lot for such a simple way of changing the volume of WAVE files but finally had to code this one. The method can be used to increase or decrease the volume of a WAVE file by a percentage. It simply adds the calculated percentage of each audio sample to that sample, or subtracts it.
WaveMix(...) - It uses almost the same trick as ChangeVolume() to mix the audio samples of two wave files: the corresponding samples are added together. I think this could be very useful for a lot of developers.
StripSilence(...) - This is also a very useful method. It removes silence or low-level noise from the start and end of the WAVE file. The filter threshold is set in the constants FILTER_FREQ_LOW and FILTER_FREQ_HIGH. Depending on your need, you may have to change these values.
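The per-sample arithmetic behind ChangeVolume() and WaveMix() can be shown in isolation. The following is a sketch only, with hypothetical names (SampleMathSketch, Clamp), operating on already decoded samples. It computes in int and clamps the result to the 16-bit range, which is one common way to handle sums or scaled values that would not fit in a short.

```csharp
using System;

public static class SampleMathSketch
{
    // Limits an intermediate int result to the range of a 16-bit sample.
    public static short Clamp(int value)
    {
        if (value > short.MaxValue) return short.MaxValue;
        if (value < short.MinValue) return short.MinValue;
        return (short)value;
    }

    // Raises or lowers one sample by a percentage of its own value.
    public static short ChangeVolume(short sample, bool increase, int percent)
    {
        int delta = sample * percent / 100;
        return Clamp(increase ? sample + delta : sample - delta);
    }

    // Mixes two samples by adding them and clamping the sum.
    public static short Mix(short a, short b)
    {
        return Clamp(a + b);
    }
}
```

For instance, ChangeVolume(1000, false, 60) returns 400 and ChangeVolume(1000, true, 100) returns 2000, while Mix(20000, 20000) is clamped to 32767 rather than wrapping around to a negative value.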

Of course, other methods such as Merge(...), WaveHeaderIN(...) and WaveHeaderOUT(...) are present in the class, but I have modified them to match the key methods described above. Thanks to Ehab Mohamed Essa for his excellent article explaining these basic methods for WAVE processing.
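For readers who have not seen a WAVE header before, the fields that Validate() and Compare() check can be read roughly as follows. This is a sketch and not the WaveHeaderIN() method from the attached source: it assumes a canonical PCM header in which the "fmt " chunk immediately follows the RIFF/WAVE tags, and the names WaveHeaderSketch and TryRead are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class WaveHeaderSketch
{
    // Reads channels, sample rate and bits per sample from a canonical PCM header.
    // Returns false if the expected tags are missing or the stream is too short.
    // The reader is deliberately not disposed, so the caller's stream stays open.
    public static bool TryRead(Stream stream, out short channels,
                               out int sampleRate, out short bitsPerSample)
    {
        channels = 0; sampleRate = 0; bitsPerSample = 0;
        BinaryReader reader = new BinaryReader(stream);
        try
        {
            if (Tag(reader) != "RIFF") return false;
            reader.ReadInt32();                  // RIFF chunk size
            if (Tag(reader) != "WAVE") return false;
            if (Tag(reader) != "fmt ") return false;
            reader.ReadInt32();                  // fmt chunk size (16 for PCM)
            reader.ReadInt16();                  // audio format (1 = PCM)
            channels = reader.ReadInt16();
            sampleRate = reader.ReadInt32();
            reader.ReadInt32();                  // byte rate
            reader.ReadInt16();                  // block align
            bitsPerSample = reader.ReadInt16();
            return true;
        }
        catch (EndOfStreamException)
        {
            return false;
        }
    }

    // Reads a four-character chunk tag such as "RIFF" or "fmt ".
    private static string Tag(BinaryReader reader)
    {
        return Encoding.ASCII.GetString(reader.ReadBytes(4));
    }
}
```

A file in the base format of this article would yield 1 channel, a sample rate of 8000 and 16 bits per sample. Files that carry extra chunks before "fmt " are not handled by this sketch.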

Summary

I am not much of an author, but I wanted to share my experiments and what I learned with everyone. Suggestions, comments, etc. are most welcome.