Audio File Saving for the DirectX.Capture Class Library

23 Feb 2008 · CPOL · 12 min read
Enhancements to the DirectX.Capture class for capturing audio to WMA files.
Introduction

Originally, DirectX.Capture supported AVI file saving only. This format is not very practical because it often results in huge files. Sometimes, only the audio needs to be saved (e.g. FM radio or just the TV sound). The enhancements described in this article make it possible to save audio in Windows Media Audio format. It took quite a lot of searching and reading before I came to the solution presented here. I also did not find other complete C# examples on this subject, so I thought an article about it might be interesting.

Considerations

My first idea was to save audio as WAV (about 10 MB per minute) or as MP3 (about 1 MB per minute) and describe the resulting implementation. These are generally known formats, but coding was not very straightforward. Looking around at possible coding examples, I came across the Windows Media format. Comparing these examples with the DirectX.Capture class example, I noticed that minor to major code modifications would be needed.
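As a rough check on these figures, the size per minute can be estimated from the audio parameters. The sketch below assumes CD-style PCM for WAV (44.1 kHz, 16-bit, stereo) and a constant bitrate of 128 kbps for MP3; these parameters are assumptions for illustration, and AudioSizeEstimate is a hypothetical helper that is not part of DirectX.Capture.

```csharp
using System;

// Hypothetical helper for estimating audio file sizes (container overhead is ignored).
public static class AudioSizeEstimate
{
    // Bytes per minute for uncompressed PCM audio.
    public static long PcmBytesPerMinute(int sampleRateHz, int bitsPerSample, int channels)
    {
        return (long)sampleRateHz * (bitsPerSample / 8) * channels * 60;
    }

    // Bytes per minute for a constant-bitrate stream.
    // The bitrate is given in kilobits per second (1 kbit = 1000 bits).
    public static long CompressedBytesPerMinute(int kilobitsPerSecond)
    {
        return (long)kilobitsPerSecond * 1000 / 8 * 60;
    }
}
```

With these assumptions, PcmBytesPerMinute(44100, 16, 2) gives 10,584,000 bytes (about 10 MB), CompressedBytesPerMinute(128) gives 960,000 bytes (about 1 MB), and a 64 kbps WMA stream comes to 480,000 bytes per minute.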

I favored the variant with minor modifications, which uses SetOutputFileName() and RenderStream(). I noticed that ConfigureFilterUsingProfileGuid() was also used, but it was not fully clear to me what to do with it. It was used for changing the Windows Media audio/video format, but the example showed a video-only GUID and I suspected that there were many more possible choices.

The SetOutputFileName() method is very important. It not only sets the filename as the name suggests, but it also adds filters to the graph. For ASF, this will be the ASF file writer; for AVI, this will be the AVI multiplexer and the file writer (two filters!). Selecting the preferred Windows Media audio/video format is therefore essential. So, when starting the implementation for saving Windows Media files, many questions were raised:

  • Which file writer should be used? Looking in GraphEdit (graphedt), the WM ASF file writer seemed to be the one.
  • Which audio/video format should be used? I didn't know.
  • How can the audio/video format be changed? Via the property window that belongs to the ASF file writer. Right-clicking the filter in graphedt opened that window, so I knew it was there.
  • Are there other ways to change the audio/video format? Yes, via a profile representing the audio/video format.
  • Which formats are possible? I didn't know.
  • What is the resulting video resolution? I didn't know.

When I started coding, I got some strange errors from the code. In other words: I had to do more research to find answers to these questions.

IWMProfile Interface

Where should we start? First, go to MSDN. For capturing audio, the article Creating an Audio Capture Graph is a good starting point and contains a lot of information. MSDN also has a special Windows Media Format SDK page; just search for Windows Media Format SDK. There, I found the GUIDs that DirectX offers by default, so that was already one answer I wanted.

The best way to select the desired audio/video file format seems to be the IWMProfile interface. Actually, the selection is done by loading a profile representing the audio/video file format. I noticed that MSDN documents ConfigureFilterUsingProfileGuid() without any note that it should no longer be used. That surprised me a bit, because in DirectShowLib I found a note that the method has become obsolete.

I found an interesting example which showed me that it was possible to get a list of audio/video formats: Windows Media Audio Compressor by Idael Cardoso. This article describes how IWMProfile can be used and the sample program showed me a nice ListBox with a list of Windows Media Audio formats. It helped me in finding more information: C# Windows Media Format SDK Translation by Idael Cardoso.

Then I made up my mind and thought to get it working first, and the rest would follow.

Problem with the ASF File Writer

When I started coding, I had problems with the ASF file writer. For some reason, errors occurred when calling mediaControl.Run(). My general approach was to add the filter in advance so that the property window for the filter becomes accessible before the filter is really used. When capturing actually starts, the added filter(s) are connected (either using RenderStream() or a direct connection). I used that method for WAV, MP3 and AVI file saving, but it caused problems for ASF (and therefore WMA and WMV).

Why is mediaControl.Run() called? Well, while selecting something in the menu, updateMenu() is called... My impression is that these problems have something to do with the filter itself. I noticed that all the examples I saw either configured the filter immediately after SetOutputFileName() or did not configure it at all. So, I went for the first solution. For audio, this is okay because the default choice is good enough. For video, this would not be acceptable.

The Current Implementation

Before describing the solution, I want to explain the current implementation of audio/video capturing to a file. This functionality is listed here and can be found in the function renderGraph in Capture.cs. In the code, mediaSubType is set first. Then SetOutputFileName() is called to add the AVI multiplexer and the file writer, and to store the filename. The next step is to initialize the video rendering, taking a possible compressor filter into account. After this, the audio rendering is initialized. Upon calling mediaControl.Run() in the function StartPreviewIfNeeded(), the capture graph starts:

C#
//Original code fragment renderGraph Capture.cs
Guid mediaSubType = MediaSubType.Avi;
hr = captureGraphBuilder.SetOutputFileName( ref mediaSubType, 
     Filename, out muxFilter, out fileWriterFilter );
if( hr < 0 ) Marshal.ThrowExceptionForHR( hr );

// Render video (video -> mux)
if ( VideoDevice != null )
{
    // Try interleaved first, because if the device supports it,
    // it's the only way to get audio as well as video
    cat = PinCategory.Capture;
    med = MediaType.Interleaved;
    hr = captureGraphBuilder.RenderStream( ref cat, ref med, 
         videoDeviceFilter, videoCompressorFilter, muxFilter ); 
    if( hr < 0 ) 
    {
        med = MediaType.Video;
        hr = captureGraphBuilder.RenderStream( ref cat, ref med, 
             videoDeviceFilter, videoCompressorFilter, muxFilter ); 
        if ( hr == -2147220969 )
            throw new DeviceInUseException( "Video device", hr );
        if( hr < 0 ) Marshal.ThrowExceptionForHR( hr );
    }
}

// Render audio (audio -> mux)
if ( AudioDevice != null )
{
    cat = PinCategory.Capture;
    med = MediaType.Audio;
    hr = captureGraphBuilder.RenderStream( ref cat, ref med, 
         audioDeviceFilter, audioCompressorFilter, muxFilter );
    if( hr < 0 ) Marshal.ThrowExceptionForHR( hr );
}

isCaptureRendered = true;
didSomething = true;
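
The magic number -2147220969 in the fragment above is easier to read in hexadecimal. The sketch below shows the conversion; to my knowledge, 0x80040217 is defined as VFW_E_CANNOT_CONNECT in the DirectShow headers, while the original code reports it as a device-in-use condition. HResultHelper and its members are hypothetical names that are not part of DirectX.Capture.

```csharp
using System;

// Hypothetical helper for working with HRESULT values returned by DirectShow calls.
public static class HResultHelper
{
    // Same value as the literal -2147220969 used in renderGraph.
    public const int VfwECannotConnect = unchecked((int)0x80040217);

    // An HRESULT signals failure when its highest bit is set, i.e. when it is negative.
    public static bool Failed(int hr)
    {
        return hr < 0;
    }

    // Formats an HRESULT the way it usually appears in documentation.
    public static string ToHex(int hr)
    {
        return "0x" + hr.ToString("X8");
    }
}
```

A named constant such as HResultHelper.VfwECannotConnect could replace the literal in the comparison, which makes the intent of the check easier to follow.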

The Enhancements

I chose to present code that enables Windows Media file saving and uses ConfigureFilterUsingProfileGuid() for selecting the proper audio/video format. The format is selected via a GUID, and each GUID corresponds to a specific profile. An advantage of this approach is that the code is easier to understand.

In a future topic, the IWMProfile interface could be explained in some more detail. For now, this will not be explained because it would make the article a bit too long. So, if this article makes you more curious, I think that is very positive.

I also made the choice to show changes made to Capture.cs (in DirectX.Capture) only and not for CaptureTest.cs. Changes to CaptureTest.cs are not really needed unless you want to select between an audio and a video format or if you want to switch between AVI or a Windows Media ASF format, such as WMA or WMV. Such selections could be implemented via a menu option or a button press. I think it is a good learning goal to add user control functionality yourself if the functionality is really needed. Well, now on to the enhancements.

Recording File Mode

The enumeration RecFileModeType is declared, which specifies the possible audio/video recording file choices. The choices are AVI, WMV, and WMA. The WMA choice becomes the default because this example is about audio file saving. The AVI choice was added, so the original functionality is still there and could be used if needed. WMV is not explained in this article. It is a good learning goal to figure out how to use that. Also, the use of WMV is a little more complicated because video could be a video + audio stream or a video stream only. A good implementation should check in advance if conflicts may occur. For this, the IWMProfile interface can be used because IWMProfile can provide information about the streams it uses.

The variable recFileMode contains the audio/video recording file mode and is initialized to RecFileModeType.Wma. It should be accessed via the property RecFileMode. A check has been added that prevents changing the file mode while capturing to a file, to avoid strange side effects. The setter also shows where other functionality could be executed upon changing the value of recFileMode; a nice feature would be the ability to change the filename extension. This code should be put in Capture.cs, preferably near the beginning, where more declarations can be found:

C#
/// <summary>
/// Recording file mode type enumerations
/// </summary>
public enum RecFileModeType
{
    /// <summary> Avi video (+ audio) </summary>
    Avi,
    /// <summary> Wmv video (+ audio) </summary>
    Wmv,
    /// <summary> Wma audio </summary>
    Wma,
}

private RecFileModeType recFileMode = RecFileModeType.Wma;

/// <summary>
/// Recording file modes
/// </summary>
public RecFileModeType RecFileMode
{
    get { return(recFileMode); }
    set
    {
        if(this.graphState == GraphState.Capturing)
        {
            // Value may not be changed now
            return;
        }
        recFileMode = value;
    }
}
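
The filename extension feature mentioned above could be sketched as follows. This is only an illustration: RecFileNames, ExtensionFor and WithExtensionFor are hypothetical names that do not exist in DirectX.Capture, and the enumeration is repeated only to keep the sketch self-contained.

```csharp
using System;
using System.IO;

// Repeated from the article so that this sketch compiles on its own.
public enum RecFileModeType { Avi, Wmv, Wma }

// Hypothetical helper that keeps the filename extension in line with the file mode.
public static class RecFileNames
{
    // Returns the usual extension for a recording file mode.
    public static string ExtensionFor(RecFileModeType mode)
    {
        switch (mode)
        {
            case RecFileModeType.Avi: return ".avi";
            case RecFileModeType.Wmv: return ".wmv";
            case RecFileModeType.Wma: return ".wma";
            default: throw new ArgumentOutOfRangeException("mode");
        }
    }

    // Replaces (or adds) the extension of the filename so that it matches the mode.
    public static string WithExtensionFor(string filename, RecFileModeType mode)
    {
        return Path.ChangeExtension(filename, ExtensionFor(mode));
    }
}
```

The setter of RecFileMode would be one possible place to call WithExtensionFor() on the Filename property, after the check on the graph state.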

Capturing, RenderStream()

The most interesting part is the modification in the capture-specific code with RenderStream(), mentioned earlier. The major difference is that the file recording mode is taken into account.

Keep in mind for saving a WMA file:

  • No audio compressor
  • Filename with filename extension *.wma (or *.asf)
  • RecFileMode = RecFileModeType.Wma

Keep in mind for saving a WMV file:

  • No video compressor
  • No audio compressor
  • Filename with filename extension *.wmv (or *.asf)
  • Some video formats do not have an audio stream; video capturing will fail in the current implementation
  • RecFileMode = RecFileModeType.Wmv

Keep in mind for saving an AVI file:

  • Optionally a video compressor (e.g. DV AVI)
  • Filename with filename extension *.avi
  • RecFileMode = RecFileModeType.Avi

The code explains itself (I hope). Checks have been added so that specific actions can be performed depending on the file format. The first action is the initialization of mediaSubType. The next action is to configure the ASF file writer, which is a one-liner. An interesting detail is the cast of the file writer reference to IConfigAsfWriter for calling ConfigureFilterUsingProfileGuid(). This solution is specific to .NET; it gives an easy way to change the preferred audio/video profile.

There is still one question left: What does WMProfile_V80_64StereoAudio mean? WMProfile_V80_64StereoAudio is the audio recording format I chose as default. There are more choices possible and these will be described later on. For a different choice, a different value must be used. Also, if you want to save video, just select a valid Windows Media video format. The audio and video rendering sections look the same, with one major difference. Upon entering the video rendering section, there is a check on the file format: for audio capturing, no video must be rendered, so that section must be ignored!

C#
// Record captured audio/video in Avi, Wmv or Wma format
Guid mediaSubType; // Media sub type

// Set media sub type
if(RecFileMode == RecFileModeType.Avi)
{
    mediaSubType = MediaSubType.Avi;
}
else
{
    mediaSubType = MediaSubType.Asf;
}

// Initialize the Avi or Asf file writer
hr = captureGraphBuilder.SetOutputFileName( ref mediaSubType, 
     Filename, out muxFilter, out fileWriterFilter );
if( hr < 0 )
{
    Marshal.ThrowExceptionForHR( hr );
}

// For Wma (and Wmv) a suitable profile must be selected.
// This could be done via the property window of the file
// writer, but the filter has only just been created, so the
// property window would have to be shown right now.
// The alternative used here is to configure the Asf file
// writer in code. DShowNET does not provide the interface
// that is needed, so please ensure it is added.
if(RecFileMode == RecFileModeType.Wma)
{
    IConfigAsfWriter lConfig = muxFilter as IConfigAsfWriter;
    if(lConfig == null)
    {
        // The filter does not expose IConfigAsfWriter
        throw new NotSupportedException(
            "IConfigAsfWriter is not available on the file writer" );
    }

    // Obsolete interface?
    // According to MSDN no, according to DirectShowLib yes.
    // For simplicity, it will be used ...
    hr = 
        lConfig.ConfigureFilterUsingProfileGuid(WMProfile_V80_64StereoAudio);
    if(hr < 0)
    {
        // Problems with selecting the audio write format
        // Release resources ... (not done yet)
        Marshal.ThrowExceptionForHR( hr );
    }
}

// Render video (video -> mux) if needed or possible
if((VideoDevice != null)&&(RecFileMode != RecFileModeType.Wma))
{
    // Try interleaved first, because if the device supports it,
    // it's the only way to get audio as well as video
    cat = PinCategory.Capture;
    med = MediaType.Interleaved;
    hr = captureGraphBuilder.RenderStream( ref cat, ref med, 
         videoDeviceFilter, videoCompressorFilter, muxFilter ); 
    if( hr < 0 ) 
    {
        med = MediaType.Video;
        hr = captureGraphBuilder.RenderStream( ref cat, ref med, 
             videoDeviceFilter, videoCompressorFilter, muxFilter ); 
        if ( hr == -2147220969 )
        {
            throw new DeviceInUseException( "Video device", hr );
        }
        if( hr < 0 )
        {
            Marshal.ThrowExceptionForHR( hr );
        }
    }
}

// Render audio (audio -> mux) if possible
if ( AudioDevice != null )
{
    // Keep in mind that certain Wmv formats
    // do not have an audio stream, so when
    // using this code, please ensure you
    // use a format which supports audio!
    cat = PinCategory.Capture;
    med = MediaType.Audio;
    hr = captureGraphBuilder.RenderStream( ref cat, ref med, 
         audioDeviceFilter, audioCompressorFilter, muxFilter );
    if( hr < 0 )
    {
         Marshal.ThrowExceptionForHR( hr );
    }
}

isCaptureRendered = true;
didSomething = true;
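
The "keep in mind" rules listed before this code could be checked before the graph is rendered. The following is a minimal sketch under that assumption; RecSettingsCheck and Check are hypothetical names, the two boolean parameters stand in for "videoCompressorFilter != null" and "audioCompressorFilter != null", and the enumeration is repeated only to keep the sketch self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Repeated from the article so that this sketch compiles on its own.
public enum RecFileModeType { Avi, Wmv, Wma }

// Hypothetical pre-flight check for the recording settings.
public static class RecSettingsCheck
{
    // Returns a list of human-readable problems; an empty list means the settings look fine.
    public static List<string> Check(RecFileModeType mode, string filename,
        bool hasVideoCompressor, bool hasAudioCompressor)
    {
        var problems = new List<string>();
        string ext = (Path.GetExtension(filename) ?? "").ToLowerInvariant();

        switch (mode)
        {
            case RecFileModeType.Wma:
                if (hasAudioCompressor)
                    problems.Add("WMA: no audio compressor should be selected.");
                if (ext != ".wma" && ext != ".asf")
                    problems.Add("WMA: use a *.wma or *.asf filename.");
                break;

            case RecFileModeType.Wmv:
                if (hasVideoCompressor)
                    problems.Add("WMV: no video compressor should be selected.");
                if (hasAudioCompressor)
                    problems.Add("WMV: no audio compressor should be selected.");
                if (ext != ".wmv" && ext != ".asf")
                    problems.Add("WMV: use a *.wmv or *.asf filename.");
                break;

            case RecFileModeType.Avi:
                if (ext != ".avi")
                    problems.Add("AVI: use a *.avi filename.");
                break;
        }
        return problems;
    }
}
```

The sketch cannot detect a WMV profile without an audio stream; that check would need the IWMProfile interface, as mentioned earlier.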

WMA Audio Format

The function ConfigureFilterUsingProfileGuid() needs to be called to set the proper audio format. In this example, seven specific Windows Media audio formats are provided (hard coded) by means of a Guid. Each Guid represents a specific audio format and profile. There are more Windows Media formats, but those are meant for video (with or without audio). The best way to get these formats is to use the IWMProfile functionality. Then you can be sure that you get the formats that really exist.

In a future code sample, the IWMProfile method can be shown. For now, this will be enough. Thanks to the IWMProfile interface, I was able to retrieve all the names that belong to a specific audio format and profile. The following code, with at least the audio format you are going to use, should be put somewhere in Capture.cs:

C#
// Windows Media Audio 8 for Dial-up Modem (Mono, 28.8 Kbps)
private static readonly Guid WMProfile_V80_288MonoAudio = 
             new Guid("7EA3126D-E1BA-4716-89AF-F65CEE0C0C67");

// Windows Media Audio 8 for Dial-up Modem 
// (FM Radio Stereo, 28.8 Kbps)
private static readonly Guid WMProfile_V80_288StereoAudio = 
             new Guid("7E4CAB5C-35DC-45bb-A7C0-19B28070D0CC");

// Windows Media Audio 8 for Dial-up Modem (32 Kbps)
private static readonly Guid WMProfile_V80_32StereoAudio = 
             new Guid("60907F9F-B352-47e5-B210-0EF1F47E9F9D");

// Windows Media Audio 8 for Dial-up Modem 
// (Near CD quality, 48 Kbps)
private static readonly Guid WMProfile_V80_48StereoAudio = 
             new Guid("5EE06BE5-492B-480a-8A8F-12F373ECF9D4");

// Windows Media Audio 8 for Dial-up Modem (CD quality, 64 Kbps)
private static readonly Guid WMProfile_V80_64StereoAudio = 
             new Guid("09BB5BC4-3176-457f-8DD6-3CD919123E2D");

// Windows Media Audio 8 for ISDN (Better than CD quality, 96 Kbps)
private static readonly Guid WMProfile_V80_96StereoAudio = 
             new Guid("1FC81930-61F2-436f-9D33-349F2A1C0F10");

// Windows Media Audio 8 for ISDN (Better than CD quality, 128 Kbps)
private static readonly Guid WMProfile_V80_128StereoAudio = 
             new Guid("407B9450-8BDC-4ee5-88B8-6F527BD941F2");
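
If the format should be selectable by bitrate instead of by name, the GUIDs listed above could be put into a lookup table. This is a minimal sketch: WmaProfiles and TryGetStereoProfile are hypothetical names that are not part of DirectX.Capture, and the two 28.8 Kbps profiles are left out because the table is keyed on whole Kbps values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lookup table for the Windows Media Audio 8 stereo profiles listed in this article.
public static class WmaProfiles
{
    // Key: bitrate in Kbps, value: profile GUID as listed above.
    private static readonly Dictionary<int, Guid> StereoByKbps = new Dictionary<int, Guid>
    {
        { 32,  new Guid("60907F9F-B352-47e5-B210-0EF1F47E9F9D") },
        { 48,  new Guid("5EE06BE5-492B-480a-8A8F-12F373ECF9D4") },
        { 64,  new Guid("09BB5BC4-3176-457f-8DD6-3CD919123E2D") },
        { 96,  new Guid("1FC81930-61F2-436f-9D33-349F2A1C0F10") },
        { 128, new Guid("407B9450-8BDC-4ee5-88B8-6F527BD941F2") },
    };

    // Returns true and the profile GUID if a profile with exactly this bitrate is listed.
    public static bool TryGetStereoProfile(int kbps, out Guid profile)
    {
        return StereoByKbps.TryGetValue(kbps, out profile);
    }
}
```

The resulting Guid could then be passed to ConfigureFilterUsingProfileGuid() in place of the fixed WMProfile_V80_64StereoAudio value.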

IConfigAsfWriter Interface

To get the code working, one more interface needs to be added because DShowNET does not provide the IConfigAsfWriter interface. DirectShowLib does provide this interface, so if you use that library, no extra work is needed. The interface can be added to Capture.cs or another suitable place. Remember to adjust the naming accordingly:

C#
// Requires: using System; using System.Runtime.InteropServices;
[Guid("45086030-F7E4-486a-B504-826BB5792A3B"),
InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IConfigAsfWriter
{
    /// Obsolete?
    [PreserveSig]
    int ConfigureFilterUsingProfileId([In] int dwProfileId);

    /// Obsolete?
    [PreserveSig] 
    int GetCurrentProfileId([Out] out int pdwProfileId);

    /// Obsolete?
    [PreserveSig]
    int ConfigureFilterUsingProfileGuid([In, 
        MarshalAs(UnmanagedType.LPStruct)] Guid guidProfile);

    [PreserveSig]
    int GetCurrentProfileGuid([Out] out Guid pProfileGuid);

    /// Obsolete?
    [PreserveSig]
    int ConfigureFilterUsingProfile([In] IntPtr pProfile);

    /// Obsolete?
    [PreserveSig]
    int GetCurrentProfile([Out] out IntPtr ppProfile);

    [PreserveSig]
    int SetIndexMode([In, 
          MarshalAs(UnmanagedType.Bool)] bool bIndexFile);

    [PreserveSig]
    int GetIndexMode([Out, 
          MarshalAs(UnmanagedType.Bool)] out bool pbIndexFile);
}

Audio Rendering, Testing

The modifications shown in this article have been tested. This does not mean that they will work without any problem. During testing, I noticed that I did not get any audible sound, so I had to make an additional modification. The modification was needed because I was using a Hauppauge PVR150-MCE (Amity2) TV card. This card delivers the audio via the PCI bus and not via a wired connection.

The current DirectX.Capture class will work with TV cards using a wired connection to a sound card. So, for those who have a special TV card, the following modification might be interesting: during the preview, do audio rendering. I added an option via a variable audioviapci to tell the program to use audio rendering or do nothing:

  • A value false must be used for wired audio connections (default choice)
  • A value true must be used when the audio is coming via the PCI bus

The variable audioviapci should be put somewhere in Capture.cs. There is one limitation: the TV card driver must have a capture device for audio. If the TV card driver does not have such a device, then the following code will not work either. How to handle such cases could be the topic for a new article. For capturing audio, this modification is not needed. It is just meant for listening:

C#
// Option for selecting audio rendering via the PCI bus of the TV card
// For wired audio connections the value must be false!
// For TV-cards, like the Hauppauge PVR150, the value must be true!
// This TV-card does not have a wired audio connection. However, this
// option will work only if the TV-card driver has an audio device!
private bool audioviapci = false;

The actual code should be put in the function renderGraph() in Capture.cs. This code should be inserted after the check if (wantPreviewRendered && !isPreviewRendered) in renderGraph() where the video rendering is started.

C#
// Special option to enable rendering audio via PCI bus
if(audioviapci)
{
    med = MediaType.Audio;
    hr = captureGraphBuilder.RenderStream( ref cat, ref med, 
                            audioDeviceFilter, null, null ); 
    if( hr < 0 )
    {
        Marshal.ThrowExceptionForHR( hr );
    }
}

The last parameter in RenderStream() is null. This means the audio is rendered to the default audio renderer. It might be possible that there is still no audible sound. In that case, do the following:

  • Go to the menu Audio Device in Devices and select the proper audio device.
  • Go to the menu Property Pages in Options and click on Video Crossbar. Go to the list box showing Video Decoder Out and select the choice Audio Decoder Out. Then go to the other list box showing Video Tuner In and select the choice Tuner Audio In. Then mark the checkbox Link Related Streams and click on OK.

Points of Interest

There is no demo program added. Please download the full source from the DirectX.Capture Class Library page and download the demo version. The source file that can be downloaded contains the code changes and the original code. Just replace the original file with my version.

Normally, I use conditions to have the original and the new code in the same file. This is handy if there is an error in the code. Then I compare that code with the previous version. I removed the conditions because this source file should be easy to use. I did not remove the conditions completely, so you will be able to find the exact location of the code changes.

I posted the articles DirectShow - TV Finetuning using the IKsPropertySet in C# and Video File Saving in Windows Media Video Format for the DirectX.Capture Class Library, which have a code example that incorporates the code changes mentioned in this article. In addition, that code example has more features, such as capturing audio via video device filter, added TV fine-tuning, added FM radio and de-interlacing. Furthermore, the code example supports either the DShowNET or DirectShowLib interface library via the conditional DSHOWNET. Please have a look at these articles.

Feedback and Improvements

I hope this code helps you in understanding the structure of the DirectX.Capture class. I also hope I provided you an enhancement that might be useful to you. Feel free to post any comments and questions.

History

  • 15 March, 2006 -- Original version posted
  • 15 August, 2007 -- Article and download updated
  • 29 August, 2007 -- Article updated
  • 23 February, 2008 -- Links updated

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Web Developer
Netherlands