Basic Video Capture and VMR9






Capturing video from a webcam and rendering it in VMR9 windowless mode with DirectShow.
Introduction
Put simply, this small program is a starting point for capturing video from a webcam and rendering it in windowless mode using VMR9. Note that the program lists all the filters available on the system; the user must select the appropriate filters for the program to run correctly. The DirectShow help on MSDN tends to stop abruptly, so I hope this small collection of code gives you a feel for how things work in DirectShow. I have added screenshots to make the program easier to use.
Background
STL, COM programming basics, and DirectShow basics are required, in addition to MFC. If you have no idea how filters are read and used, please refer to my other articles, which fully explain how filters can be used and also explain the BSTR_Compare(..) method. The link to the topic, which is a three-part tutorial, is given below:
Using the code
To use the code, make sure that you have changed the include directory paths in the project. In particular, the paths to the Windows SDK must be configured properly.
The classes
The program has been developed using classes. The following class diagram should give an idea of the contents. Not all methods are explained, as some of them are simple enough for developers with a basic knowledge of DirectShow.
Explanation of classes
CMainGraph
This is the main class. It holds the graph builder reference, the capture graph builder reference, and the methods to control the graph. Filters such as the camera and VMR9 are added to the graph held by this class.
class CMainGraph
{
public:
    CMainGraph(void);
    ~CMainGraph(void);
protected:
    // Main graph pointer
    IGraphBuilder* pGraph;
    // Main capture graph
    ICaptureGraphBuilder2* pCaptureGraph2;
    // System device enumerator
    ICreateDevEnum* pFilterEnum;
    // Device moniker
    IMoniker *pFilterMonik;
    // Pointer to media playback control interface
    IMediaControl* pControl;
public:
    // Main COM initialization function
    void Init_COM(void);
    // Displays a message box when a COM error occurs
    void HR_Failed(HRESULT hr);
    // Run graph
    void Run_Graph();
    // Stop the main graph
    void Stop_Graph(void);
    // Get Device/Filter Enumerator
    ICreateDevEnum* Get_Enumerator(void);
    // Return the main graph pointer
    IGraphBuilder* Get_MainGraph_Ptr(void);
    // Get the main capture graph pointer
    ICaptureGraphBuilder2* Get_CaptureGraph_Ptr(void);
};
- void Init_COM(void);: This method initializes COM. It sets up the references to the other interfaces, such as ICaptureGraphBuilder2 and IMediaControl, and queries the interfaces required to run and stop the graph.
- ICreateDevEnum* Get_Enumerator(void);: This method returns a reference to ICreateDevEnum. The returned pointer is passed on to the two filter classes, camera and VMR9, which use the enumeration to instantiate the filters, i.e., the camera device and the video renderer, respectively.
- IGraphBuilder* Get_MainGraph_Ptr(void);: This method returns pGraph. Objects inheriting from this class obtain the main graph pointer through this method.
- ICaptureGraphBuilder2* Get_CaptureGraph_Ptr(void);: This method returns the ICaptureGraphBuilder2 reference. This interface is also very important for the camera capture functionality.
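The body of HR_Failed(...) is not listed in this article. As a rough sketch of the kind of check and message such a helper could produce, the portable code below mimics the rule behind the SDK's SUCCEEDED/FAILED macros and formats the error code as hexadecimal. HResult, Succeeded, Failed, and DescribeHr are hypothetical names used only for illustration; the real program would use HRESULT and a message box.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-in for the Windows HRESULT type (a signed 32-bit value).
using HResult = std::int32_t;

// Well-known values from the Windows SDK headers.
constexpr HResult kS_OK    = 0;
constexpr HResult kS_FALSE = 1;
constexpr HResult kE_FAIL  = static_cast<HResult>(0x80004005u);

// Same rule as the SDK macros: a negative value (sign bit set) means failure.
constexpr bool Succeeded(HResult hr) { return hr >= 0; }
constexpr bool Failed(HResult hr)    { return hr < 0; }

// Hypothetical formatter for the text a message box could display.
std::string DescribeHr(HResult hr)
{
    char buf[64];
    std::snprintf(buf, sizeof buf, "COM call %s (hr = 0x%08X)",
                  Failed(hr) ? "failed" : "succeeded",
                  static_cast<unsigned>(static_cast<std::uint32_t>(hr)));
    return buf;
}
```

Note that S_FALSE counts as success under this rule, which is why loops over COM enumerators usually compare the result against S_OK rather than relying on SUCCEEDED alone.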
CFilter
The CFilter class inherits from CMainGraph. The class holds fields that represent the filter name, a pointer to IBaseFilter, and a pointer to IMoniker. This class is the parent class of the camera class and the VMR9 class.
class CFilter :
    public CMainGraph
{
public:
    CFilter(void);
    CFilter(IMoniker*);
    ~CFilter(void);
protected:
    // FriendlyName as seen in graphedit.exe
    BSTR bstrFilterName;
    // Pointer to filter interface
    IBaseFilter* pFilter;
    // Pointer to filter moniker
    IMoniker* pFilterMoniker;
public:
    // Displays a message box when a COM error occurs
    void HR_Failed(HRESULT hr);
    // Compares two filter names - true if filter found on system
    bool BSTR_Compare(BSTR bstrFilterName, BSTR bstrDeviceName);
    // Find pin by name
    IPin* Find_Pin(BSTR bstrPinName);
    // Find pin by direction
    IPin* Find_Pin(PIN_DIRECTION PinDir, IPin *pFilterPin);
    // Find a required pin by direction, category, and media type
    IPin* Find_Pin(PIN_DIRECTION PIN_DIR, GUID PIN_CAT, GUID MEDIA_TYPE);
    // Filter initiating function
    IBaseFilter *Filter_Init(IMoniker*);
    // Function that connects two filter pins
    void Filter_Connect(IPin* pPinOut, IPin* pPinIn);
    // Function to add filter to main graph
    void Filter_Addto_Graph(IBaseFilter* pFilter, BSTR bstrName);
    // Set the main graph pointer
    void Set_MainGraph_Ptr(IGraphBuilder* pGraph);
    // Set main capture graph pointer
    void Set_CaptureGraph_Ptr(ICaptureGraphBuilder2* pCG);
};
- IPin* Find_Pin(BSTR bstrPinName);: This is an overloaded method. It finds a pin by the "FriendlyName" of the pin. Once found, the pin is returned and is then used to join the filter with either the camera or the video renderer, depending on the call.
- IPin* Find_Pin(PIN_DIRECTION PIN_DIR, GUID PIN_CAT, GUID MEDIA_TYPE);: The most important of the overloads. It finds a pin according to the direction of the pin, i.e., whether it is an input pin or an output pin. If this is used with the camera filter to find the "Capture" pin, then PIN_DIR would be PINDIR_OUTPUT. We can specify that PIN_CAT is PIN_CATEGORY_CAPTURE, and the last argument specifies whether we have audio only, video only, or mixed; since we are using video only, MEDIA_TYPE will be MEDIATYPE_Video in this case.
- IPin* Find_Pin(PIN_DIRECTION PinDir, IPin *pFilterPin);: The last overload, which finds a pin according to the direction; remember that the pin passed in here is one obtained after the filter has been instantiated successfully.
All three methods use almost the same code to find the pins, i.e., they loop through the pins of the filter. Remember that these methods can only be called after the filter has been initialized successfully. The code inside the methods looks like this:
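IEnumPins *pEPin = NULL; // Pin enumerator
IPin *pPin = NULL;       // Current pin
if (SUCCEEDED(this->pFilter->EnumPins(&pEPin)))
{
    while (pEPin->Next(1, &pPin, NULL) == S_OK) // loop through the filter's pins
    {
        // Compare against the pin name as seen in GraphEdit
        PIN_INFO info;
        if (SUCCEEDED(pPin->QueryPinInfo(&info)))
        {
            if (info.pFilter) info.pFilter->Release(); // QueryPinInfo AddRefs the owning filter
            if (wcscmp(info.achName, bstrPinName) == 0)
            {
                pEPin->Release();
                return pPin; // the caller must Release() the returned pin
            }
        }
        pPin->Release(); // not the pin we want
        pPin = NULL;
    }
    pEPin->Release();
}
return NULL;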
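To make the shared loop-and-compare idea easier to see without the COM plumbing, here is a minimal portable sketch. FakePin, PinDir, FindPinByName, and FindPinByDirection are hypothetical stand-ins, not DirectShow types; real code works with IPin pointers, which must also be released.

```cpp
#include <string>
#include <vector>

// Stand-ins for DirectShow concepts; these are not the real COM interfaces.
enum class PinDir { Input, Output };

struct FakePin {
    std::wstring name; // name as it would appear in GraphEdit
    PinDir       dir;  // direction of the pin
};

// Mirrors the idea of Find_Pin(BSTR): return the first pin with a matching name.
const FakePin* FindPinByName(const std::vector<FakePin>& pins, const std::wstring& name)
{
    for (const FakePin& pin : pins)
        if (pin.name == name) return &pin;
    return nullptr; // no such pin
}

// Mirrors the idea of the direction-based overloads: first pin with the given direction.
const FakePin* FindPinByDirection(const std::vector<FakePin>& pins, PinDir dir)
{
    for (const FakePin& pin : pins)
        if (pin.dir == dir) return &pin;
    return nullptr; // no pin with that direction
}
```

Each overload differs only in the predicate applied inside the loop, which is why the three methods can share almost all of their code.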
- IBaseFilter *Filter_Init(IMoniker*);: The filter is initialized by this method.
- void Filter_Connect(IPin* pPinOut, IPin* pPinIn);: Connects two filters through the given output and input pins.
- void Filter_Addto_Graph(IBaseFilter* pFilter, BSTR bstrName);: Adds the filter to the main graph, i.e., pGraph.
- void Set_MainGraph_Ptr(IGraphBuilder* pGraph);: Since this class also inherits the pGraph pointer, this method sets the pGraph of this class to the pGraph of the main CMainGraph object. Make sure that both the camera and VMR9 filters refer to the same pGraph pointer.
- void Set_CaptureGraph_Ptr(ICaptureGraphBuilder2* pCG);: Same as Set_MainGraph_Ptr(..) explained above, but for the capture graph pointer instead of the pGraph pointer.
CFilterList
The CFilterList class keeps STL lists of the filters. When you run the program, you will see two combo boxes which hold the separate lists of cameras and other filters. So, why did I use STL lists? For the obvious reason that they hold the different types of information about the filters. But first, below is the listing of the class.
class CFilterList
{
public:
    CFilterList(void);
    ~CFilterList(void);
public:
    // STL lists to hold filter/device friendly names
    list<BSTR> listCamFilters;
    list<BSTR>::iterator iterCam;
    list<BSTR> listVRFilters;
    list<BSTR>::iterator iterVR;
    // STL lists to hold monikers
    list<IMoniker*> pListCamFilterMoniker;
    list<IMoniker*>::iterator itermCam;
    list<IMoniker*> pListVRFilterMoniker;
    list<IMoniker*>::iterator itermVR;
    // Filter/Device reader
    void Filter_Read(GUID FILTER_CLSID, ICreateDevEnum* pFilterEnum);
    // Displays a message box when a COM error occurs
    void HR_Failed(HRESULT hr);
    // Compares two filter names - true if filter found on system
    bool BSTR_Compare(BSTR bstrFilterName, BSTR bstrDeviceName);
};
The STL lists hold two types of information related to the filters. The first list, listCamFilters, holds elements of type BSTR, which are the friendly names of the camera filters. The second, listVRFilters, holds the friendly names of the video renderer filters. The second kind of list, pListCamFilterMoniker and pListVRFilterMoniker, holds the monikers of the cameras and video renderers. Each list has its own iterator for walking through it. So, why use STL? Once the devices have been enumerated and their monikers obtained (instantiation is done later by another method), I simply keep all these filters in STL lists. The BSTR lists supply the friendly names shown in the combo boxes, while the IMoniker lists keep the moniker corresponding to each friendly name. In the running program, when an entry in the list of video renderers is selected, its friendly name is picked up and a search is started. The search walks the list of friendly names and the list of monikers together; when BSTR_Compare(...) returns true, the filter has been found, and the moniker at the same position is passed to the filter instantiation method. It may seem complicated, but look at the code below, which handles the video renderer filters, and the rest of the code will be easier to understand.
void CVideoCaptureDlg::OnCbnSelchangeVrList()
{
    // Combo box holding the video renderer names
    CComboBox *pComboVRFilter = static_cast<CComboBox*>(this->GetDlgItem(IDC_VR_LIST));
    int selectedIndex = pComboVRFilter->GetCurSel();
    if (selectedIndex == CB_ERR)
    {
        return; // nothing is selected
    }
    CString strFilterName;
    pComboVRFilter->GetLBText(selectedIndex, strFilterName);
    // Find the required filter moniker
    FLObject.itermVR = FLObject.pListVRFilterMoniker.begin();
    BSTR temp = SysAllocString(strFilterName);
    for (FLObject.iterVR = FLObject.listVRFilters.begin();
         FLObject.iterVR != FLObject.listVRFilters.end();
         FLObject.iterVR++)
    {
        // Check whether this entry of the video renderer list matches the selection
        if (FLObject.BSTR_Compare(temp, *FLObject.iterVR) == true)
        {
            // Initiate filter
            VMR9Object.pVMR9 = VMR9Object.Filter_Init(*FLObject.itermVR);
            if (VMR9Object.pVMR9 != NULL)
            {
                // Add to main graph
                VMR9Object.Filter_Addto_Graph(VMR9Object.pVMR9, temp);
                break;
            }
        }
        // Keep the moniker iterator in step with the name iterator
        FLObject.itermVR++;
    }
    // Enable the connect filters button
    this->GetDlgItem(IDC_CONNECT_FILTERS)->EnableWindow(1);
}
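The lockstep walk over the name list and the moniker list can be isolated in a small portable sketch. Here std::wstring stands in for BSTR and a plain int stands in for IMoniker*; MonikerHandle and FindMoniker are hypothetical names used only for illustration, and plain string equality stands in for BSTR_Compare(...).

```cpp
#include <list>
#include <string>

// Stand-in for IMoniker*; any handle type works for this sketch.
using MonikerHandle = int;

// Walk both lists in lockstep and return the handle stored at the same
// position as `wanted`. Returns nullptr when the name is not present
// (or when the moniker list runs out before a match is found).
const MonikerHandle* FindMoniker(const std::list<std::wstring>& names,
                                 const std::list<MonikerHandle>& monikers,
                                 const std::wstring& wanted)
{
    auto itName  = names.begin();
    auto itMonik = monikers.begin();
    for (; itName != names.end() && itMonik != monikers.end(); ++itName, ++itMonik)
    {
        if (*itName == wanted)
            return &*itMonik; // same position in both lists
    }
    return nullptr;
}
```

This only works while both lists are filled in the same order, so every insertion has to touch both lists. Storing the name and the moniker together in one struct per list element would be an alternative that removes this requirement.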
void CFilterList::Filter_Read(GUID FILTER_CLSID, ICreateDevEnum* pFilterEnum): One of the most important methods; it reads in the filters whenever the 'Find Filters' button is pressed. This method fills in the STL lists, and part of the code that does that is shown below.
if (SUCCEEDED(hr))
{
    // Check device category
    if (FILTER_CLSID == CLSID_VideoInputDeviceCategory)
    {
        // Store the friendly name and the moniker in the camera STL lists
        listCamFilters.push_front(SysAllocString(varName.bstrVal));
        pListCamFilterMoniker.push_front(pDeviceMonik);
    }
    else
    {
        // Store the friendly name and the moniker in the video renderer STL lists
        listVRFilters.push_front(SysAllocString(varName.bstrVal));
        pListVRFilterMoniker.push_front(pDeviceMonik);
    }
}
else HR_Failed(hr);
CCameraFilter
The CCameraFilter class is the smallest of the classes and is shown below:
class CCameraFilter :
    public CFilter
{
public:
    CCameraFilter(void);
    ~CCameraFilter(void);
    // Cam Filter
    IBaseFilter* pCamFilter;
};
It has a pointer member of type IBaseFilter, which holds the reference to the camera. Since most of the functionality is defined in the CFilter class, nothing else happens here.
CVMR9Filter
The CVMR9Filter class is the trickiest of all. Its listing follows.
class CVMR9Filter :
    public CFilter
{
public:
    CVMR9Filter(void);
    ~CVMR9Filter(void);
    // VMR9 Filter
    IBaseFilter* pVMR9;
    // Set the VMR9 windowless mode
    void Set_Windowess_Mode(HWND hwndApp, LPRECT DrawRect);
    // Render stream for filters, i.e., connect
    void Filter_RenderStream(GUID PIN_TYPE, GUID MEDIA_TYPE, IBaseFilter*);
};
I should have named the class CVRFilter, but my initial intention was to use only VMR9 filters. I ended up making a more general class instead, and just did not have the courage to rename all the variables. So, please remember that it is a more general class, with one big exception: the Set_Windowless_Mode(...) method. This is the beauty of VMR9 that I have used, and thus I still claim that the class name is appropriate. The captured video is rendered in windowless mode at a position defined by the coordinates of the group box I called IDC_VIDEO_FRAME; the position of this group box determines the rendering coordinates. The other interesting method is Filter_RenderStream(...). In general, filters can be connected in a simple way, for which I have written the Filter_Connect() method. This method takes two pins, one from each filter, and connects them. But, in the case of VMR9, we can use the built-in method RenderStream(NULL,NULL,src,NULL,dest);. This method is exposed by the ICaptureGraphBuilder2 interface. It can connect a camera filter 'src' to a video renderer filter 'dest'. I have used this method to connect the camera to the VMR9 filter. The code is shown below:
// Render stream for filters, i.e., connect
void CVMR9Filter::Filter_RenderStream(GUID PIN_TYPE, GUID MEDIA_TYPE, IBaseFilter *pSrcFilter)
{
    // Let the capture graph builder connect the source filter to this renderer
    HRESULT hr = this->pCaptureGraph2->RenderStream(NULL, NULL, pSrcFilter, NULL, this->pFilter);
    if (FAILED(hr))
    {
        HR_Failed(hr);
    }
}
The program uses the Filter_RenderStream(...) method; Filter_Connect() should work equally well.
Where it all happens
Since the program is dialog based, most of the method calls are made from VideoCaptureDlg.cpp; this file contains the logic that calls the methods to instantiate the filters, connect them, and start the video rendering. The main methods of the CVideoCaptureDlg class are as follows:
// Find filters
afx_msg void OnBnClickedFindFilters();
// Connect filters
afx_msg void OnBnClickedConnectFilters();
// Camera list selected
afx_msg void OnCbnSelchangeCamList();
// Video Renderer list selected
afx_msg void OnCbnSelchangeVrList();
// Play/Stop
afx_msg void OnBnClickedPlaystopButton();
Although it is a very lengthy file, the most interesting part is the event handler OnBnClickedConnectFilters(). I'll describe it in steps:
- This method starts by declaring pins that are used as inputs and outputs.
- Camera is searched for a 'capture' pin.
- VMR9 is searched for a 'VMR9 Input0' pin.
- Coordinates of the group box are retrieved.
- Render coordinates are set.
- VMR9's windowless mode is set.
- Filters are connected.
- Finally, the graph is kick started.
Remember the overloaded Find_Pin(...) methods? Well, here is where you can use them, and I have already placed them in the code with comments.
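The "coordinates of the group box are retrieved" and "render coordinates are set" steps come down to simple rectangle arithmetic. The sketch below assumes the group box rectangle is available in screen coordinates (as GetWindowRect would report it) and that the renderer wants a rectangle relative to the client area of the window it draws into, which, as far as I understand, is how the VMR9 windowless control interprets its destination rectangle. Rect, Point, and ScreenToClientRect are hypothetical stand-ins for RECT, POINT, and the Win32 conversion helpers.

```cpp
// Stand-ins for the Win32 RECT and POINT structures.
struct Rect  { long left, top, right, bottom; };
struct Point { long x, y; };

// Convert a rectangle given in screen coordinates into the client coordinates
// of a window whose client-area origin lies at `clientOrigin` on the screen.
Rect ScreenToClientRect(const Rect& screenRect, const Point& clientOrigin)
{
    return Rect{
        screenRect.left   - clientOrigin.x,
        screenRect.top    - clientOrigin.y,
        screenRect.right  - clientOrigin.x,
        screenRect.bottom - clientOrigin.y
    };
}
```

For example, a group box at screen rectangle (110, 220, 430, 460) inside a dialog whose client area starts at (100, 200) maps to the client rectangle (10, 20, 330, 260), which keeps the original 320 by 240 size.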
Steps to run the program
Make sure you have the correct paths to the Windows SDK, and hit Run once presented with the GUI.
- Step 1. Press "Find Filters".
- Step 2. Select the correct cam and then the VMR9 Filter. Be careful at this step.
- Step 3. Click "Connect".
- Step 4. Hopefully you shall see the video.
- Step 5. Try using the "Stop/Play" button.
Screenshots
A screenshot of the video of my dual displays.
Points of interest
The most interesting point? If the reference to the VMR9 filter could not be obtained, the pointer to VMR9 will be NULL, and if you then call RenderStream(...) with a correct reference to the camera, you will still be able to render the video! But not in the group box; instead, your "third party" software would be called. It happened to me, and it took me a week to understand the reason and fix it!
Code bugs
You will find bugs! I did not take care of exceptions, so please keep that in mind.
The code is built with the following
- Windows Server 2008
- Microsoft Visual Studio 2008
- DirectX 10.1
- Microsoft Windows SDK 6.1
History
- First post - 23/03/2009.