Using the DirectShow Video Mixing Renderer 9 filter

1 Feb 2005 · 2 min read
This article describes how to dynamically mix two video files (.mpeg, .mpg, .avi, and .dat). Mixing involves alpha-blending, stretching/shrinking, and positioning each of the two video streams individually, using DirectShow's VMR9 filter.

Sample Image - DirectShowVMR91.jpg

Introduction

This article shows the steps involved in creating and configuring DirectShow’s Video Mixing Renderer Filter 9 (VMR9). The two video streams, one on top of the other, are rendered on a single surface. This surface, in our case, is a PictureBox control. Each stream's alpha value, position and height/width can be adjusted at runtime.

How VMR9 is different

The following diagrams show the difference between rendering two videos with VMR9 and without VMR9.

Without VMR9

Rendering without VMR9

Notice that simply rendering two videos results in two separate Video Renderer filters, which means the videos play on two separate surfaces.

With VMR9

Rendering with VMR9

In this case, the VMR9 filter directs both video streams into its own input pins. This means there is only one renderer, and thus a single rendering surface for both video streams.

How it works

To enhance reusability and readability factors, the functionality of the VMR9 filter has been encapsulated inside a class named myVMR9.

The myVMR9 class

This class has the following private data members:

  • VMR9NormalizedRect *r;
  • IVMRWindowlessControl9 *pWC;
  • IVMRMixerControl9 *pMix;
  • IGraphBuilder *pGB;
  • IBaseFilter *pVmr;
  • IVMRFilterConfig9 *pConfig;
  • IMediaControl *pMC;
  • IMediaSeeking *pMS;

The constructor

The constructor receives a PictureBox's coordinates of type System::Drawing::Rectangle, along with its window handle of type HWND. These two attributes are used by VMR9 for rendering purposes.

public: myVMR9(System::Drawing::Rectangle rect, HWND hwnd)
{
    // initialize video coordinates with normal values
    r = new VMR9NormalizedRect;
    r->left = 0;
    r->top = 0;
    r->right = 1;
    r->bottom = 1;

    pWC = NULL;
    pMix = NULL;
    pGB = NULL;
    pVmr = NULL;
    pConfig = NULL;
    pMC = NULL;
    pMS = NULL;
    // create an instance of the Filter Graph Manager
    CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER, 
        IID_IGraphBuilder, (void **)&pGB);
    // create an instance of the VMR9 filter
    CoCreateInstance(CLSID_VideoMixingRenderer9, NULL, CLSCTX_INPROC,
        IID_IBaseFilter, (void**)&pVmr);
    // add the VMR9 filter to the Graph Manager
    pGB->AddFilter(pVmr, L"Video");    
    // get a pointer to the IVMRFilterConfig9 interface
    pVmr->QueryInterface(IID_IVMRFilterConfig9, (void**)&pConfig);
    // make sure VMR9 is in windowless mode
    pConfig->SetRenderingMode(VMR9Mode_Windowless);
    // enable mixing mode with two input pins, one per video stream
    pConfig->SetNumberOfStreams(2);
    // get a pointer to the IVMRWindowlessControl9 interface 
    pVmr->QueryInterface(IID_IVMRWindowlessControl9, (void**)&pWC);

    // specify the container window that the video should be clipped to;
    // this must be set before positioning the video
    pWC->SetVideoClippingWindow(hwnd);

    // explicitly convert System::Drawing::Rectangle type to RECT type
    RECT rcDest = {0};
    rcDest.bottom = rect.Bottom;
    rcDest.left = rect.Left;
    rcDest.right = rect.Right;
    rcDest.top = rect.Top;

    // set destination rectangle for the video
    pWC->SetVideoPosition(NULL, &rcDest);
    // IVMRMixerControl manipulates video streams
    pVmr->QueryInterface(IID_IVMRMixerControl9, (void**)&pMix);
    // IMediaSeeking seeks to a position in the video stream
    pGB->QueryInterface(IID_IMediaSeeking, (void **)&pMS);
    // IMediaControl controls flow of data through the graph
    pGB->QueryInterface(IID_IMediaControl, (void **)&pMC);
}

The methods

HRESULT play()
{
    pMC->Run();
    return S_OK;
}

HRESULT pause()
{
    pMC->Pause();
    return S_OK;
}

HRESULT stop()
{
    LONGLONG pos = 0;
    pMC->Stop();
    pMS->SetPositions(&pos, AM_SEEKING_AbsolutePositioning,
                      NULL, AM_SEEKING_NoPositioning);
    pMC->Pause();
    return S_OK;
}

HRESULT close()
{
    // make sure resources are freed
    SAFE_RELEASE(pWC);
    SAFE_RELEASE(pMix);
    SAFE_RELEASE(pGB);
    SAFE_RELEASE(pVmr);
    SAFE_RELEASE(pConfig);
    SAFE_RELEASE(pMC);
    SAFE_RELEASE(pMS);
    return S_OK;
}

HRESULT setAlpha(DWORD stream, float alpha)
{
    // set alpha of specified video stream
    pMix->SetAlpha(stream, alpha);
    return S_OK;
}

HRESULT setX(DWORD stream, float x)
{
    // video displacement along x-axis
    r->right = x + (r->right - r->left);
    r->left = x;
    pMix->SetOutputRect(stream, r);
    return S_OK;
}

HRESULT setY(DWORD stream, float y)
{
    // video displacement along y-axis
    r->bottom = y + (r->bottom - r->top);
    r->top = y;
    pMix->SetOutputRect(stream, r);
    return S_OK;
}

HRESULT setW(DWORD stream, float w)
{
    // video stretching/shrinking along x-axis
    r->right = r->left + w;
    pMix->SetOutputRect(stream, r);
    return S_OK;
}

HRESULT setH(DWORD stream, float h)
{
    // video stretching/shrinking along y-axis
    r->bottom = r->top + h;
    pMix->SetOutputRect(stream, r);
    return S_OK;
}

HRESULT renderFiles(String* file1, String* file2)
{
    // marshal each managed String to an unmanaged wide string,
    // render the file, then free the unmanaged copy
    using namespace System::Runtime::InteropServices;

    IntPtr p1 = Marshal::StringToHGlobalUni(file1);
    pGB->RenderFile(static_cast<LPCWSTR>(p1.ToPointer()), NULL);
    Marshal::FreeHGlobal(p1);

    IntPtr p2 = Marshal::StringToHGlobalUni(file2);
    pGB->RenderFile(static_cast<LPCWSTR>(p2.ToPointer()), NULL);
    Marshal::FreeHGlobal(p2);

    pMC->StopWhenReady();
    return S_OK;
}

Now that the VMR9's functionality has been separated from the GUI, Button and TrackBar handlers can simply create a pointer to a myVMR9 object and call the required methods.

Sample screenshot

Additional Information

  • The second video stream opened is rendered on top of the first, i.e., file-2's video appears over file-1's. Therefore, if the first video's alpha value is 100% and the second video's alpha value is 50%, both videos will be equally (50%) visible.
  • Note that the Width and Height trackbars can take negative values. At a width of -100%, the video is laterally inverted (mirrored); similarly, at a height of -100%, the video is upside down.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.



Written By
Software Developer, Pakistan

Comments and Discussions

 
Re: Multi-Screen Expansion (fwsouthern, 15-Jan-05 8:20)
Objective is to provide multi-display projection of both live and encoded video for large audiences including PiP video, for example, a signer for the hearing impaired and text overlaying video. Development computers are both 3G P4's with 2G memory. One uses a 4-port ColorGraphics Preditor Promedia configured multi-screen and the other two 2-port NVidia cards (MMX440) configured multi-head.

There are two different scenarios: "stretched" video (over 2-3 screens) and "simultaneous" video (parallel video instances on 3 screens). In either configuration, you can only use one audio source unless you really want a built-in echo/reverberation capability which is non-adjustable and always enabled.

Filtergraphs have been built for both configurations and suffer from common problems, most noticeably when performing software correction for hardware problems. Even when merely displaying on monitors directly attached to the computer, differences in alignment of different displays cause minor adjustments to have to be made in software to correctly align vertical, horizontal and size of projected video. When remoted to large screen projectors, the problem becomes more critical, particularly in the "stretch" or video span mode where there is no "seam" or division between the projections and normally a small overlap to ensure continuous projection -- complete hardware alignment is virtually impossible and variable from day to day due to such things as temperature and humidity variations (once the equipment is completely warmed up) not to mention differences in different projectors and lenses.

The solution you propose is not practical, particularly in the stretched projection unless you are willing to live with mis-aligned portions of a stretched video (such as having .5" difference in height between adjacent displays, etc.). Most audiences will notice and not be happy campers when there is a visual mis-alignment between adjacent displays.

Both configurations require video synchronization and minimized frame dropping, particularly in the stretched mode. One solution, as I questioned, is use of a single filtergraph, an infinite tee, and multiple VMR's. Another is separate filtergraphs and use of one audio source clock to control all filtergraphs. This solution complicates rather than aids the problem due to increased disk/file access and input to each filtergraph for encoded video and a custom device handler for the live video source providing multiple outputs (not available from most vendors). Also, if either configuration is to support composite or TV display, the filtergraph output rectangle must accommodate full screen size and position adjustments for overscan conversion.

Another issue not addressed in your article is the added overhead of output size enlargement such as source at 640x480 and destination at 1024x768 -- not exactly a power of 2 adjustment. In my tests of both configurations, if I can use 1:1 video input to output size, I can minimize the overhead to the filtergraph(s). Next best appears to be a power of 2 adjustment. From there, the filtergraph degrades significantly, particularly in frame dropping. Even adjusting video size off-line (pre-processing the video) to force a 1:1 input-to-output size relationship has diminished returns -- the larger the video source, the higher the file/disk input requirement and the larger the bitmap (and correspondingly the memory requirement) to be manipulated by the filtergraph. When using stretched mode, it appears better to use source video at the size of the output (such as 1280x480 to two 640x480 displays, etc.). I have also tried splitting the video into separate video files, using separate filtergraphs, and controlling time with the audio clock from one source -- this appears to be almost equal to the single wide video file approach and adds an additional pre-processing approach. It however is subject to multiple disk/file accesses and further degrades synchronization.

Optimal solution appears to be forcing the video input to output size to be 1:1 and adjust the input clipping and output rectangle positions for necessary adjustments. However, this only works if the projection devices actually are synchronized as to same size of projected image -- projected display size seems to be the most common failing requiring software (or external hardware device -- outside the solution parameters) adjustments to match adjacent displays.

I do need separate VMR's (whether in the same or different filtergraphs) due to several installations I am working with where the text size has to be different due to the size of the projection area and the audience coverage desired between different (simultaneous display) areas to be covered. The alternative is to use the filtergraph only for video and a transparent background text overlay -- this has its own failings as it works fine for two filtergraph video displays but fails at the 3rd (or subsequent) displays (text overlay surface "flashes" the text at the refresh rate of the display) and causes problems with color depth requiring use of 24bit or less -- MS has 4 service notices as to losing transparency when >24bit depth is used -- no anticipated solution in the near future.

As you can see, alignment and synchronization are critical issues, particularly in the stretched video mode. Hence, the application of a single filtergraph with multiple VMR's. Even with 3G P4's with 2G memory, this really straps the computer and gives new meaning to 100%+ cpu utilization.

While multi-screen stretched video eliminates the multiple VMR requirement, it does not solve the size and positioning problems in separate projection devices and display device variations encountered from day to day. And forcing different users to intricately align their hardware is not a realistic solution where it "can" be solved in software.

Any further thoughts?