The OGG Wrapper (An audio converter)






A wrapper for the libvorbis library that reduces the conversion of PCM (*.wav) to Ogg Vorbis audio (*.ogg), and vice versa, to just two lines of code. It also allows stereo PCM to be converted to mono Vorbis and vice versa.
Introduction
The MP3 format gained popularity in the mid-1990s, but in 1998 the Fraunhofer Society announced plans to start charging licensing fees for its use. This prompted the Xiph.Org Foundation (home of Ogg Vorbis, FLAC, Theora, etc.) to intensify work on the already ongoing Vorbis project, aiming to offer a free and open-source replacement for MP3. Vorbis has not replaced MP3, but it has become a standard and decent format in its own right.
Being free and unencumbered by patents, unlike MP3, has made it popular in both open- and closed-source works such as WebM (a video format widely used with HTML5), Matroska (*.mkv), and numerous games.
The libvorbis library
The libvorbis library can be downloaded from http://www.vorbis.com/ or http://www.xiph.org/downloads/ or from this article’s attachment.
The wrapper
Working directly with the libvorbis library can be very complicated. Unlike the LAME MP3 library, which has built-in functions for some crucial tasks, libvorbis leaves many crucial tasks in the hands of the coder. These tasks include, but are not limited to, interleaving of samples, channel mixing, and setting the encode mode (VBR, CBR, ABR).
This wrapper will help by
- Reducing encoding/decoding to just two lines of code (if you are using the default values of the wrapper)
- Making the setting of parameters very easy
- Mixing channels (encoding a mono PCM to stereo ogg, stereo PCM to mono ogg, etc.)
- Etc.
NOTE that the libvorbis library is written in such a way that an ogg is decoded to a PCM with the same number of channels (a mono ogg will decode to a mono PCM).
Setting up your environment
If you already know how to set up your development environment to compile with libvorbis, please skip to the second part of this section.
I will be using Visual Studio. If you are using another IDE, please find out how to link a static library to your project.
1. Grab the libvorbis library archive from this article and extract it.
2. Click on the “Project” button on the menu bar of Visual Studio, select Property Pages from the drop-down menu (<Your Project Name> Property Pages), then go to the “Configuration Properties” section. Go to the “Linker” subsection, then to the “General” item. Select “Additional Library Directories” from the list of now available options and add the extracted archive path to it.
3. Go to the “Input” item and select “Additional Dependencies”. Add libogg_static.lib, libvorbis_static.lib and libvorbisfile_static.lib to it (each on a separate line).
4. Now go back to the “Configuration Properties” section then to the “C/C++” subsection, then to the “General” item. Select “Additional Include Directories” from the list of now available options and add the extracted archive path to it.
Your environment is now ready for libvorbis.
Secondly, grab the oggHelper_dd-mm-yyyy archive and extract it.
- Add all the files (AudioSettings.h, oggHelper.cpp, oggHelper.h, OggHelper_VorbisSettings.cpp, OggHelper_VorbisSettings.h, WaveFileHeader.cpp and WaveFileHeader.h) to your project.
- #include oggHelper.h in your project:
#include "oggHelper.h"
Using the wrapper
To use the wrapper, create an instance of it as follows
#include "oggHelper.h"
int main()
{
    oggHelper oHelper;
    return 0;
}
Encoding (Converting from PCM to ogg)
There are five overloaded member functions for encoding
BOOL Encode(char* file_in, char* file_out);
BOOL Encode(char* file_in, char* file_out, EncodeSetting es);
BOOL Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc);
BOOL Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc, WNDPROC callbackproc);
//The asynchronous function
void* Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc, WNDPROC callbackproc, BOOL async);
Encode(char* file_in, char* file_out);
This is the Encode overload with the fewest parameters: the PCM file path as file_in and the resulting ogg file path as file_out. It uses the wrapper's default settings: stereo channels, VBR encode mode with a VBR quality of 0.4, (just for completeness) a bitrate of 128kbps for ABR, and every comment set to an empty string.
Parameters
- file_in: the path to the PCM (*.wav) file (including the file name)
- file_out: the output path for the resulting ogg (including the file name)
Return Values
The member function returns 1 on success, 0 on failure.
A usage example
#include "oggHelper.h"
int main()
{
    oggHelper oHelper;
    oHelper.Encode("file.wav", "file.ogg");
    return 0;
}
Encode(char* file_in, char* file_out, EncodeSetting es);
This is another overload of Encode, which allows the encoding environment to be set via the EncodeSetting es parameter. The EncodeSetting struct is
//Encoding setting
struct EncodeSetting
{
    Channel channel;
    Encode_Mode encode_mode;
    Bitrate min_abr_br;
    Bitrate max_abr_br;
    Bitrate abr_br;
    Bitrate cbr_br;
    VBR_Quality vbr_quality;
    //The constructor: used to set default values
    EncodeSetting();
};
- channel is an enum Channel whose value is either Channel::Stereo or Channel::Mono
- encode_mode is an enum Encode_Mode whose value is one of Encode_Mode::VBR, Encode_Mode::ABR, or Encode_Mode::CBR
- min_abr_br, max_abr_br, abr_br, and cbr_br are the bitrates for ABR and CBR, each of type enum Bitrate. The most commonly used bitrate is BR_128kbps
- vbr_quality is the quality value when using VBR. It should be between -0.1 and 1
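Since vbr_quality is only meaningful inside the range -0.1 to 1, a caller taking the value from user input may want to clamp it first. The following is a minimal sketch; clampVbrQuality is a hypothetical helper that is not part of the wrapper, and it assumes the quality is handled as a float.

```cpp
// Hypothetical helper, not part of the wrapper: keeps a requested VBR
// quality inside the documented range of -0.1 to 1.
inline float clampVbrQuality(float quality)
{
    const float kMinQuality = -0.1f;
    const float kMaxQuality = 1.0f;
    if (quality < kMinQuality) return kMinQuality;
    if (quality > kMaxQuality) return kMaxQuality;
    return quality;
}
```

Assuming VBR_Quality accepts a float, the result could be assigned directly, e.g. es.vbr_quality = clampVbrQuality(requested);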
Parameters
- file_in: the path to the PCM (*.wav) file (including the file name)
- file_out: the output path for the resulting ogg (including the file name)
- es: object of the struct EncodeSetting which is used to specify encoding settings
Return Values
The member function returns 1 on success, 0 on failure.
A usage example
#include "oggHelper.h"
int main()
{
    oggHelper oHelper;
    EncodeSetting es;
    es.channel = Stereo;
    es.cbr_br = BR_128kbps;
    es.encode_mode = CBR;
    oHelper.Encode("file.wav", "file.ogg", es);
    return 0;
}
Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc);
This overload allows comments to be set alongside the settings described above. Supported comments are TITLE, VERSION, ALBUM, TRACKNUMBER, ARTIST, PERFORMER, COPYRIGHT, LICENSE, ORGANISATION, DESCRIPTION, GENRE, DATE, LOCATION, CONTACT, and ISRC
struct VorbisComment
{
    char* TITLE;
    char* VERSION;
    char* ALBUM;
    char* TRACKNUMBER;
    char* ARTIST;
    char* PERFORMER;
    char* COPYRIGHT;
    char* LICENSE;
    char* ORGANISATION;
    char* DESCRIPTION;
    char* GENRE;
    char* DATE;
    char* LOCATION;
    char* CONTACT;
    char* ISRC;
    //The constructor: used to set default values
    VorbisComment();
};
Parameters
- file_in: the path to the PCM (*.wav) file (including the file name)
- file_out: the output path for the resulting ogg (including the file name)
- es: object of the struct EncodeSetting which is used to specify encoding settings
- ivc: object of the struct VorbisComment which is used to set comment for the audio file.
Return Values
The member function returns 1 on success, 0 on failure.
A usage example
#include "oggHelper.h"
int main()
{
    oggHelper oHelper;
    //Encode setting
    EncodeSetting es;
    es.channel = Stereo;
    es.cbr_br = BR_128kbps;
    es.encode_mode = CBR;
    //Comment setting
    VorbisComment ivc;
    ivc.ALBUM = "Beautiful Imperfection";
    ivc.ARTIST = "Asa";
    ivc.DATE = "2011";
    oHelper.Encode("file.wav", "file.ogg", es, ivc);
    return 0;
}
BOOL Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc, WNDPROC callbackproc);
This method takes a callback procedure but it is not asynchronous (i.e. it will still block your application until encoding is complete)
Parameters
- file_in: the path to the PCM (*.wav) file (including the file name)
- file_out: the output path for the resulting ogg (including the file name)
- es: object of the struct EncodeSetting which is used to specify encoding settings
- ivc: object of the struct VorbisComment which is used to set comments for the audio file
- callbackproc: the callback function
Return Values
The member function returns 1 on success, 0 on failure.
Notes
In the WNDPROC callbackproc, msg receives an OH_STARTED message at the start of encoding/decoding, an OH_COMPUTED message with wParam holding the percentage of progress (as an int) during encoding/decoding, and an OH_DONE message at the end of encoding/decoding. It receives OH_ERROR if any error occurs, with wParam holding the error code.
A usage example
#include <stdio.h>
#include "oggHelper.h"
//A WNDPROC returns LRESULT
LRESULT CALLBACK proc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    switch(msg)
    {
    case OH_STARTED:
        //Start of encoding / decoding
        printf("Starting encoding ");
        break;
    case OH_COMPUTED:
        //Update of percentage done
        //wParam contains the percentage as int
        //the best way to use this is to pass wParam's value into a progress bar
        printf("%i ", (int)wParam);
        break;
    case OH_DONE:
        //Notifying end of encoding / decoding
        printf("Completed successfully");
        break;
    case OH_ERROR:
        //Error occurred, wParam contains the error code
        printf("Error code = %i\n", (int)wParam);
        break;
    }
    return 0;
}
int main()
{
    oggHelper oHelper;
    //Encode setting
    EncodeSetting es;
    es.channel = Stereo;
    es.cbr_br = BR_128kbps;
    es.encode_mode = CBR;
    //Comment setting
    VorbisComment ivc;
    ivc.ALBUM = "Beautiful Imperfection";
    ivc.ARTIST = "Asa";
    ivc.DATE = "2011";
    oHelper.Encode("file.wav", "file.ogg", es, ivc, proc);
    return 0;
}
void* Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc, WNDPROC callbackproc, BOOL async);
This is the asynchronous overload of Encode (i.e. when used, it won't block the application, and up to 5 files can be encoded at the same time)
Parameters
- file_in: the path to the PCM (*.wav) file (including the file name)
- file_out: the output path for the resulting ogg (including the file name)
- es: object of the struct EncodeSetting which is used to specify encoding settings
- ivc: object of the struct VorbisComment which is used to set comments for the audio file
- callbackproc: the callback function
- async: a Boolean. If TRUE, the function is asynchronous; if FALSE, it behaves exactly like the member BOOL Encode(char* file_in, char* file_out, EncodeSetting es, VorbisComment ivc, WNDPROC callbackproc);. Default is FALSE
Return Values
If async is set to TRUE, the function returns a HANDLE if successful, or -3 if the maximum allowed number of processes has been reached. If any other error occurs, an OH_ERROR message is sent with WPARAM holding the error code.
If async is set to FALSE, the member function returns 1 on success, 0 on failure.
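Because the asynchronous overload returns a void*, the -3 "limit reached" value arrives as a pointer-sized integer rather than as a usable handle. The helper below is a minimal sketch of one way to test for it before waiting on the handle. isLimitReached is a hypothetical name, and the sketch assumes the wrapper produces the value by casting the integer -3 to void*, which the article does not confirm.

```cpp
#include <cstdint>

// Hypothetical helper: returns true when the value returned by the async
// Encode/Decode overload is the -3 sentinel rather than a usable handle.
// Assumption: the wrapper produces the sentinel by casting -3 to void*.
inline bool isLimitReached(void* result)
{
    return reinterpret_cast<std::intptr_t>(result) == -3;
}
```

A caller could then leave any slot for which isLimitReached returns true out of the array passed to WaitForMultipleObjects, and retry that file once an earlier job has finished.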
A usage example
#include <stdio.h>
#include "oggHelper.h"
//A WNDPROC returns LRESULT
LRESULT CALLBACK proc1(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    switch(msg)
    {
    case OH_STARTED:
        //Start of encoding / decoding
        printf("Starting encoding ");
        break;
    case OH_COMPUTED:
        //Update of percentage done
        //wParam contains the percentage as int
        //the best way to use this is to pass wParam's value into a progress bar
        printf("%i ", (int)wParam);
        break;
    case OH_DONE:
        //Notifying end of encoding / decoding
        printf("Completed successfully");
        break;
    case OH_ERROR:
        //Error occurred, wParam contains the error code
        printf("Error code = %i\n", (int)wParam);
        break;
    }
    return 0;
}
//Placeholders: write a full-blown callback for each in real code
LRESULT CALLBACK proc2(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam){return 0;}
LRESULT CALLBACK proc3(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam){return 0;}
int main()
{
    //Encode setting
    EncodeSetting es;
    es.channel = Stereo;
    es.cbr_br = BR_128kbps;
    es.encode_mode = CBR;
    //Comment setting
    VorbisComment ivc;
    ivc.ALBUM = "Beautiful Imperfection";
    ivc.ARTIST = "Asa";
    ivc.DATE = "2011";
    //Handles
    HANDLE oHelperHandle[5];
    oggHelper oHelper;
    oHelperHandle[0] = oHelper.Encode("file1.wav", "file1.ogg", es, ivc, proc1, true);
    oHelperHandle[1] = oHelper.Encode("file2.wav", "file2.ogg", es, ivc, proc2, true);
    oHelperHandle[2] = oHelper.Encode("file3.wav", "file3.ogg", es, ivc, proc3, true);
    //Wait until all three encodes have finished
    WaitForMultipleObjects(3, oHelperHandle, TRUE, INFINITE);
    return 0;
}
Decoding (Conversion of vorbis ogg to PCM)
As said earlier, libvorbis is written in such a way that a vorbis ogg will decode to a PCM with the same channel setting and sample rate. The wrapper has three overloaded member functions for decoding.
//Decode OGG to PCM (with a WAVE header)
BOOL Decode(char* file_in, char* file_out);
BOOL Decode(char* file_in, char* file_out, WNDPROC callbackproc);
//Async function
void* Decode(char* file_in, char* file_out, WNDPROC callbackproc, BOOL async);
BOOL Decode(char* file_in, char* file_out);
This takes two arguments char* file_in, char* file_out
Parameters
- file_in: the path to the ogg (*.ogg) file (including the file name)
- file_out: the output path for the resulting PCM (including the file name)
Return Values
The member function returns 1 on success, 0 on failure.
A usage example
#include "oggHelper.h"
int main()
{
    oggHelper oHelper;
    oHelper.Decode("file.ogg", "file.wav");
    return 0;
}
BOOL Decode(char* file_in, char* file_out, WNDPROC callbackproc);
This method takes a callback procedure but it is not asynchronous (i.e. it will still block your application until decoding is complete)
Parameters
- file_in: the path to the ogg (*.ogg) file (including the file name)
- file_out: the output path for the resulting PCM (including the file name)
- callbackproc: the callback function
Return Values
The member function returns 1 on success, 0 on failure.
A usage example
#include <stdio.h>
#include <stdlib.h>
#include "oggHelper.h"
//A WNDPROC returns LRESULT
LRESULT CALLBACK proc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    switch(msg)
    {
    case OH_STARTED:
        //Start of encoding / decoding
        printf("Starting decoding ");
        break;
    case OH_COMPUTED:
        //Update of percentage done
        //wParam contains the percentage as int
        //the best way to use this is to pass wParam's value into a progress bar
        printf("%i ", (int)wParam);
        break;
    case OH_DONE:
        //Notifying end of encoding / decoding
        printf("Completed successfully");
        break;
    case OH_ERROR:
        //Error occurred, wParam contains the error code
        printf("Error code = %i\n", (int)wParam);
        break;
    }
    return 0;
}
int main()
{
    oggHelper oHelper;
    oHelper.Decode("file.ogg", "file.wav", proc);
    system("pause");
    return 0;
}
void* Decode(char* file_in, char* file_out, WNDPROC callbackproc, BOOL async);
This member function is truly asynchronous when async is set to TRUE.
Parameters
- file_in: the path to the ogg (*.ogg) file (including the file name)
- file_out: the output path for the resulting PCM (including the file name)
- callbackproc: the callback function
- async: a Boolean. If TRUE, the function is asynchronous; if FALSE, it behaves exactly like the member BOOL Decode(char* file_in, char* file_out, WNDPROC callbackproc);
Return Values
If async is set to TRUE, the function returns a HANDLE if successful, or -3 if the maximum allowed number of processes has been reached. If any other error occurs, an OH_ERROR message is sent with WPARAM holding the error code.
If async is set to FALSE, the member function returns 1 on success, 0 on failure.
A usage example
#include <stdio.h>
#include <stdlib.h>
#include "oggHelper.h"
//A WNDPROC returns LRESULT
LRESULT CALLBACK proc1(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    switch(msg)
    {
    case OH_STARTED:
        //Start of encoding / decoding
        printf("Starting decoding ");
        break;
    case OH_COMPUTED:
        //Update of percentage done
        //wParam contains the percentage as int
        //the best way to use this is to pass wParam's value into a progress bar
        printf("%i ", (int)wParam);
        break;
    case OH_DONE:
        //Notifying end of encoding / decoding
        printf("Completed successfully");
        break;
    case OH_ERROR:
        //Error occurred, wParam contains the error code
        printf("Error code = %i\n", (int)wParam);
        break;
    }
    return 0;
}
int main()
{
    oggHelper oHelper;
    HANDLE oHelperHandle[5];
    oHelperHandle[0] = oHelper.Decode("file.ogg", "file.wav", proc1, true);
    //Wait until the decode has finished
    WaitForMultipleObjects(1, oHelperHandle, TRUE, INFINITE);
    system("pause");
    return 0;
}
Understanding the code
Understanding the channel mixing (Working with the 16bits per sample PCM)
The PCM data layout is based on the number of channels and the bits per sample of the PCM.
A 16bits per sample stereo PCM
A 16bits per sample stereo PCM stores each sample in sections of 4 bytes, which is (16(bits) * 2(channels))/8. Since both the left and the right channel have to be represented, the left channel takes 2 bytes and the right channel takes the other 2 bytes.
[Left Channel 2 byte][Right Channel 2 byte]
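The frame-size arithmetic above can be written as a one-line function. This is only an illustrative sketch; bytesPerFrame is a hypothetical name and is not taken from the wrapper's source.

```cpp
#include <cstddef>

// Size in bytes of one PCM frame (one sample for every channel).
// bitsPerSample is assumed to be a multiple of 8.
inline std::size_t bytesPerFrame(unsigned bitsPerSample, unsigned channels)
{
    return (static_cast<std::size_t>(bitsPerSample) * channels) / 8;
}
```

For 16 bits and 2 channels this gives 4, matching the layout shown above.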
In a .wav file (which is what we are dealing with), the data is stored in little endian format, which means the Least Significant Byte (LSB) is stored first while the Most Significant Byte (MSB) is stored last. So the format becomes
[Left LSB][Left MSB][Right LSB][Right MSB]
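As a minimal sketch of reading that layout, the function below reassembles one signed 16-bit sample from its two little-endian bytes. readSampleLE is a hypothetical name chosen for illustration, and the buffer is assumed to hold unsigned bytes.

```cpp
#include <cstdint>

// Reassemble one signed 16-bit sample from two little-endian bytes.
inline std::int16_t readSampleLE(const unsigned char* bytes)
{
    int value = bytes[0] | (bytes[1] << 8);   // LSB first, then MSB
    if (value >= 32768) value -= 65536;       // restore the sign
    return static_cast<std::int16_t>(value);
}
```

For a stereo frame stored in a 4-byte buffer called frame, the left sample would be readSampleLE(frame) and the right sample readSampleLE(frame + 2).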
A 16bits per sample mono PCM
A 16bits per sample mono PCM stores each sample in sections of 2 bytes, which is (16(bits) * 1(channel))/8. Since the sample has only one channel, the channel takes the whole 2 bytes.
[Mono channel 2byte]
In .wav little endian format, it is stored as
[Mono LSB][Mono MSB]
Note that in vorbis, the samples should be in float.
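One common way to obtain such floats from 16-bit integers is to divide by 32768, which is also the scaling used by the snippets in the next section. The sketch below shows that step in isolation; pcm16ToFloat is a hypothetical name.

```cpp
#include <cstdint>

// Scale a signed 16-bit sample to the range [-1.0, 1.0).
inline float pcm16ToFloat(std::int16_t sample)
{
    return sample / 32768.0f;
}
```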
Mixing the channels
Stereo to Mono sampling
To get a mono sample out of a stereo sample, the 4 bytes of the stereo sample have to be reduced to 2 bytes. This can be done by extracting the left and right channels from the stereo sample, adding them together and dividing the sum by 2.
Say the 4 bytes of the stereo sample are read into readbuffer (a buffer of signed char, so that the MSB keeps its sign when shifted). To get the left and right channels out (remember LSB, MSB):
lChannel = ((readbuffer[1]<<8) | (0x00ff & (int)readbuffer[0])) / 32768.f;
rChannel = ((readbuffer[3]<<8) | (0x00ff & (int)readbuffer[2])) / 32768.f;
To get the mono channel,
mChannel = (lChannel + rChannel) * 0.5f;
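Applied to a whole buffer, the same averaging might look like the sketch below. It is not the wrapper's own code: stereo16ToMono and toSample are hypothetical names, and the input is assumed to be interleaved 16-bit little-endian stereo held as unsigned bytes.

```cpp
#include <cstddef>
#include <vector>

// Rebuild a signed 16-bit value from its low and high bytes.
static int toSample(unsigned char lsb, unsigned char msb)
{
    int value = lsb | (msb << 8);
    return value >= 32768 ? value - 65536 : value;
}

// Downmix interleaved 16-bit little-endian stereo PCM to mono floats.
// Trailing bytes that do not form a complete 4-byte frame are ignored.
std::vector<float> stereo16ToMono(const std::vector<unsigned char>& pcm)
{
    std::vector<float> mono;
    mono.reserve(pcm.size() / 4);
    for (std::size_t i = 0; i + 3 < pcm.size(); i += 4)
    {
        float left = toSample(pcm[i], pcm[i + 1]) / 32768.0f;
        float right = toSample(pcm[i + 2], pcm[i + 3]) / 32768.0f;
        mono.push_back((left + right) * 0.5f);   // average the channels
    }
    return mono;
}
```

Averaging, rather than simply adding, keeps the result inside the same [-1, 1) range as the inputs.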
Mono to Stereo sampling
To convert a mono channel sample to stereo, extract the mono channel and use it as both the left and the right channel.
Say the 2 bytes of the mono sample are read into readbuffer,
monoChl = ((readbuffer[1]<<8) | (0x00ff & (int)readbuffer[0])) / 32768.f;
lChannel = monoChl;
rChannel = monoChl;
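libvorbis takes encoder input as one float array per channel (the arrays handed out by vorbis_analysis_buffer), so duplicating a mono signal amounts to filling two arrays with the same values. A minimal sketch for a whole buffer follows; StereoBuffer and monoToStereo are hypothetical names, not the wrapper's own.

```cpp
#include <vector>

// Hypothetical container holding one float array per channel.
struct StereoBuffer
{
    std::vector<float> left;
    std::vector<float> right;
};

// Duplicate a mono signal into both channels of a stereo buffer.
StereoBuffer monoToStereo(const std::vector<float>& mono)
{
    StereoBuffer out;
    out.left = mono;    // left channel is a copy of the mono signal
    out.right = mono;   // right channel is an identical copy
    return out;
}
```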
History
- 22nd October, 2014 - Included "Understanding the code" and updated a section of oggHelper.cpp file
- 29th of September, 2014 - Initial article release