This article explains how to use the DirectShow API for simple audio
conversion, particularly Wav to MP3 conversion. Audio codecs in the DirectShow
API are of three types: native codecs, ACM codecs, and DMO (DirectX Media
Objects) codecs.
There are only a few native audio codecs for audio compression. For MP3
encoding, the only one that I've found is the LAME
DirectShow wrapper from Elecard. Most MP3 encoders are in the ACM (Audio
Compression Manager) format, which was introduced with the Windows Multimedia
CDSEncoder class and its related classes
CDSCodecFormat enumerate ACM codecs and
their respective compression parameters, construct a graph and do the
The GraphBuilder and other filters
The graph consists of five filters:
- File source (async) for reading the input wav file,
- WAV Parser for wav parsing,
- ACM codec for audio compression (in this case, the MP3 ACM codec, wrapped by
the ACM Wrapper Filter),
- WAV Dest, for wav output multiplexing,
- File Writer for writing the output file.
The WAV Dest filter is not one of the standard filters; it needs to be
compiled from the DirectX SDK. For
convenience, the compiled WAV Dest filter is included in the demo zip, but you
have to register it by running regsvr32 wavdest.ax.
ACM codecs and the ACM Wrapper filter
All of the ACM codecs are listed in DirectShow in the Audio Compressors
filter category (CLSID_AudioCompressorCategory) and cannot be instantiated
directly. We have to go through the Device Enumerator to use them.
Note: Depending on your configuration, several ACM codecs for the
same format may be installed on your computer. This can be the case for MP3
codecs. You can set their priority or disable some of them from the Control
Panel, as shown in the following figure.
The DeviceEnumerator or how to browse ACM codecs
The Device Enumerator must be used to retrieve an instance of an ACM codec.
It returns the list of codecs through the IEnumMoniker interface, so we can
get the filter interface (IBaseFilter) by a call to
IMoniker::BindToObject() and the filter name by a call to
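The enumeration described above can be sketched as follows. This is a minimal illustration, not the code of the demo project: the function name ListAudioCompressors is hypothetical, and reading the "FriendlyName" property through IPropertyBag is the usual documented way to obtain a filter name, assumed here rather than taken from the article's source. The non-Windows branch is only a stub so that the snippet compiles everywhere.

```cpp
#include <string>
#include <vector>

#ifdef _WIN32
#include <dshow.h>   // link with strmiids.lib, ole32.lib and oleaut32.lib

// Returns the friendly names of the filters registered in the Audio
// Compressors category. COM must already be initialized (CoInitialize)
// on the calling thread; otherwise the list comes back empty.
std::vector<std::wstring> ListAudioCompressors()
{
    std::vector<std::wstring> names;

    ICreateDevEnum* pDevEnum = nullptr;
    HRESULT hr = CoCreateInstance(CLSID_SystemDeviceEnum, nullptr,
                                  CLSCTX_INPROC_SERVER, IID_ICreateDevEnum,
                                  reinterpret_cast<void**>(&pDevEnum));
    if (FAILED(hr))
        return names;

    IEnumMoniker* pEnum = nullptr;
    hr = pDevEnum->CreateClassEnumerator(CLSID_AudioCompressorCategory,
                                         &pEnum, 0);
    pDevEnum->Release();
    if (hr != S_OK)          // S_FALSE means the category is empty
        return names;

    IMoniker* pMoniker = nullptr;
    while (pEnum->Next(1, &pMoniker, nullptr) == S_OK)
    {
        IPropertyBag* pBag = nullptr;
        hr = pMoniker->BindToStorage(nullptr, nullptr, IID_IPropertyBag,
                                     reinterpret_cast<void**>(&pBag));
        if (SUCCEEDED(hr))
        {
            VARIANT var;
            VariantInit(&var);
            if (SUCCEEDED(pBag->Read(L"FriendlyName", &var, nullptr)) &&
                var.vt == VT_BSTR && var.bstrVal != nullptr)
            {
                names.push_back(var.bstrVal);
            }
            VariantClear(&var);
            pBag->Release();
        }
        // To instantiate the codec itself, bind the same moniker:
        // pMoniker->BindToObject(nullptr, nullptr, IID_IBaseFilter,
        //                        reinterpret_cast<void**>(&pFilter));
        pMoniker->Release();
    }
    pEnum->Release();
    return names;
}
#else
// Non-Windows builds: DirectShow is unavailable, so report no codecs.
std::vector<std::wstring> ListAudioCompressors() { return {}; }
#endif
```

A caller would typically invoke CoInitialize(NULL) first, fill a list box with the returned names, and keep the monikers around if it needs to bind the selected codec later.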
Configuring the ACM codec with IAMStreamConfig interface
Once the desired codec is instantiated, we hold an
IBaseFilter interface that can be used for filter configuration. Since each
IBaseFilter has one or more pins, we have to find the output pin
by using the
IEnumPins interface and
With the output pin, we can query the
IAMStreamConfig interface to configure the following properties:
- Number of channels,
- Samples per second,
- Average bytes per second,
- Bits per sample.
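These properties correspond to fields of the WAVEFORMATEX structure carried in the media type, and they are not independent of each other. The sketch below shows how they relate for uncompressed PCM audio. AudioFormat, MakePcmFormat and BytesPerSecondFromBitrate are hypothetical names used only for illustration, and the block alignment field is added here because the PCM byte rate is derived from it.

```cpp
#include <cstdint>

// Hypothetical stand-in for the WAVEFORMATEX fields discussed above.
struct AudioFormat
{
    std::uint16_t channels;        // nChannels
    std::uint32_t samplesPerSec;   // nSamplesPerSec
    std::uint32_t avgBytesPerSec;  // nAvgBytesPerSec
    std::uint16_t blockAlign;      // nBlockAlign (bytes per sample frame)
    std::uint16_t bitsPerSample;   // wBitsPerSample
};

// For PCM, the block alignment and the byte rate follow from the
// other three values.
AudioFormat MakePcmFormat(std::uint16_t channels,
                          std::uint32_t samplesPerSec,
                          std::uint16_t bitsPerSample)
{
    AudioFormat f{};
    f.channels = channels;
    f.samplesPerSec = samplesPerSec;
    f.bitsPerSample = bitsPerSample;
    f.blockAlign = static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    f.avgBytesPerSec = samplesPerSec * f.blockAlign;
    return f;
}

// For a constant-bitrate compressed stream, the average byte rate is
// simply the bitrate divided by eight.
std::uint32_t BytesPerSecondFromBitrate(std::uint32_t bitsPerSecond)
{
    return bitsPerSecond / 8;
}
```

For example, 16-bit stereo PCM at 44100 Hz has a block alignment of 4 bytes and a byte rate of 176400 bytes per second, while a 128 kbit/s compressed stream averages 16000 bytes per second.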
Note : For some codecs (including MP3), the call to
IAMStreamConfig::SetFormat() must be after the graph
CDSEncoder performs the following tasks:
- Enumerate the Audio codecs (
- Build, render, and run the graph.
class CDSEncoder : public CArray<CDSCodec*, CDSCodec*>
void BuildGraph(CString szSrcFileName, CString szDestFileName,
int nCodec, int nFormat);
HRESULT AddFilterByClsid(IGraphBuilder *pGraph, LPCWSTR wszName,
const GUID& clsid, IBaseFilter **ppF);
BOOL SetFilterFormat(AM_MEDIA_TYPE* pStreamFormat,
CDSEncoder inherits from
CArray; the collection
of codecs is exposed through the
CArray methods, with each codec returned as
CDSCodec performs the following tasks:
- Enumerate the parameters supported by the codec,
- Expose the codec name.
class CDSCodec : public CArray<CDSCodecFormat*, CDSCodecFormat*>
CDSCodec inherits from
CArray; the collection of
parameters supported by the codec is exposed through the
CArray methods, with each
parameter set returned as
CDSCodecFormat exposes the properties of one set of codec parameters:
- Number of channels,
- Samples per second,
- Bytes per second,
- Bits per sample.
The goal of this article is to demonstrate the use of DirectShow for simple audio
conversion. These classes are not as robust as production code needs to be. Please keep this
in mind if you plan to use them in a production environment.
Source Wav format
There is no sample-rate conversion, so the output keeps the sample rate of the
source: a 44 kHz Wav file can only produce 44 kHz output files.
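Because of this limitation, it can help to offer only the codec formats whose sample rate matches the source file. The sketch below is one way to do that. CodecFormat is a hypothetical plain struct mirroring the four properties exposed by CDSCodecFormat, and FormatsMatchingSource is an illustrative helper that does not exist in the demo project.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical plain struct mirroring the properties of CDSCodecFormat.
struct CodecFormat
{
    int channels;
    int samplesPerSec;
    int bytesPerSec;
    int bitsPerSample;
};

// Returns the indices of the formats whose sample rate equals the
// sample rate of the source Wav file.
std::vector<int> FormatsMatchingSource(const std::vector<CodecFormat>& formats,
                                       int sourceSamplesPerSec)
{
    std::vector<int> indices;
    for (std::size_t i = 0; i < formats.size(); ++i)
    {
        if (formats[i].samplesPerSec == sourceSamplesPerSec)
            indices.push_back(static_cast<int>(i));
    }
    return indices;
}
```

Assuming the nFormat argument of BuildGraph is an index into the selected codec's format list, the returned indices are the only values worth passing for a given source file.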
The Windows Media format can be used only with a certificate, which can be obtained
through the Windows Media SDK from Microsoft.