Capturing Video from Web-camera on Windows 7 and 8 by using Media Foundation

10 Apr 2013 · CPOL · 5 min read
A simple library for capturing video from a web camera using Media Foundation

Introduction

After switching to Win8-Desktop, I found that some older technologies no longer work well, DirectShow in particular. For instance, capturing live video from a web camera with DirectShow works perfectly on WinXP, Vista, and Win7, and lets me select a specific resolution: from a Microsoft LifeCam Studio web camera, I can get 1080p video. On Win8-Desktop, however, I can get only 640x480 video. The relevant function call, which returns the HRESULT S_OK on Win7, returns a failure code on Win8-Desktop. After reading MSDN, I got the impression that Microsoft intends to stop supporting DirectShow and to promote another technology, Media Foundation. I found some information about capturing video from a web camera with Media Foundation using the required parameters, but it is scattered across MSDN. I thought it would be useful to have a single C++ class that performs all the initialization steps, hides them, and exposes a simple interface. I wrote one and present it in this tip.

Background

I lead an Augmented Reality project and needed simple support for capturing video from a web camera. I used the simple videoInput library from http://muonics.net/school/spring05/videoInput/, which uses DirectShow for this purpose. However, it did not work well on Win8-Desktop: setting the capture resolution failed. I found a solution using Media Foundation, but because my project already used videoInput, I decided to create a new library with the same interface as videoInput that is built on Media Foundation instead. It met my needs, and I think it will be useful to others who face the same problem while developing image recognition programs.

Using the Code

The videoInput library was written in Visual Studio 2012 (source: videoInputVS2012.zip; static library: videoInput-staticlib-VS2012x86.zip) and includes nine classes:

  • videoInput - the interface class. To use the library, it is enough to include videoInput.h and link videoInput.lib in your project. This class is a singleton, which makes resource management easy.
  • Media_Foundation - a singleton class that manages the allocation and release of Media Foundation resources.
  • videoDevices - a singleton class that manages the allocation and release of video devices and access to an individual video device.
  • videoDevice - the class for working with a single video device: capturing, getting raw data, checking for a new frame, getting the supported resolutions, setting the required resolution, and closing the device.
  • ImageGrabberThread - the class that manages the image-grabbing thread.
  • ImageGrabber - the class that initializes grabbing and grabs images from the video device. It controls the grabbing process and stops it.
  • RawImage - a temporary container class for writing and reading one frame.
  • FormatReading - the class that reads information about a supported resolution into the custom MediaType structure.
  • DebugPrintOut - the class for printing text to the console.
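
The singleton arrangement mentioned for videoInput, Media_Foundation, and videoDevices can be sketched as follows. This is only a minimal illustration of the pattern in C++11 style, not the library's actual implementation; the class DeviceManager and its members are hypothetical names invented for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for a resource-owning class; NOT the real videoInput.
class DeviceManager
{
public:
    // Returns the single shared instance, created on first use.
    // Under C++11 rules this initialization happens exactly once;
    // older compilers may not guarantee that across threads.
    static DeviceManager& getInstance()
    {
        static DeviceManager instance;
        return instance;
    }

    // Copying is disabled so that no second instance can appear.
    DeviceManager(const DeviceManager&) = delete;
    DeviceManager& operator=(const DeviceManager&) = delete;

    // Pretend to acquire a device.
    void openDevice() { ++openCount_; }

    // Pretend to release a device; at least one must be open.
    void closeDevice()
    {
        assert(openCount_ > 0);
        --openCount_;
    }

    // Number of devices currently counted as open.
    unsigned int openDevices() const { return openCount_; }

private:
    // Private constructor: instances can only come from getInstance().
    DeviceManager() : openCount_(0) {}

    unsigned int openCount_;
};
```

Every call to DeviceManager::getInstance() yields a reference to the same object, so all resource bookkeeping lives in one place and callers never construct or destroy the manager themselves.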

The file videoInput.h is the only header needed to use the library. Its listing is presented below:

C++
#pragma once

#include <guiddef.h>

struct IMFMediaSource;
 
// Structure for collecting info about types of video,
// which are supported by current video device
struct MediaType
{
    unsigned int MF_MT_FRAME_SIZE;
    unsigned int height;
    unsigned int width;
    unsigned int MF_MT_YUV_MATRIX;
    unsigned int MF_MT_VIDEO_LIGHTING;
    unsigned int MF_MT_DEFAULT_STRIDE;
    unsigned int MF_MT_VIDEO_CHROMA_SITING;
    GUID MF_MT_AM_FORMAT_TYPE;
    wchar_t *pMF_MT_AM_FORMAT_TYPEName;
    unsigned int MF_MT_FIXED_SIZE_SAMPLES;
    unsigned int MF_MT_VIDEO_NOMINAL_RANGE;
    unsigned int MF_MT_FRAME_RATE;
 
    unsigned int MF_MT_FRAME_RATE_low;
    unsigned int MF_MT_PIXEL_ASPECT_RATIO;
 
    unsigned int MF_MT_PIXEL_ASPECT_RATIO_low;
    unsigned int MF_MT_ALL_SAMPLES_INDEPENDENT;
    unsigned int MF_MT_FRAME_RATE_RANGE_MIN;
    unsigned int MF_MT_FRAME_RATE_RANGE_MIN_low;
    unsigned int MF_MT_SAMPLE_SIZE;
    unsigned int MF_MT_VIDEO_PRIMARIES;
    unsigned int MF_MT_INTERLACE_MODE;
    unsigned int MF_MT_FRAME_RATE_RANGE_MAX;
    unsigned int MF_MT_FRAME_RATE_RANGE_MAX_low;
 
    GUID MF_MT_MAJOR_TYPE;
    wchar_t *pMF_MT_MAJOR_TYPEName;
    GUID MF_MT_SUBTYPE;
    wchar_t *pMF_MT_SUBTYPEName;    
 
    MediaType();
    ~MediaType();
    void Clear();
};
 
// Structure for collecting info about one parameter of current video device
struct Parametr
{
    long CurrentValue;
    long Min;
    long Max;
    long Step;
    long Default; 
    long Flag;
    Parametr();
};
 
// Structure for collecting info about 17 parameters of current video device
struct CamParametrs
{
    Parametr Brightness;
    Parametr Contrast;
    Parametr Hue;
    Parametr Saturation;
    Parametr Sharpness;
    Parametr Gamma;
    Parametr ColorEnable;
    Parametr WhiteBalance;
    Parametr BacklightCompensation;
    Parametr Gain;
 
 
    Parametr Pan;
    Parametr Tilt;
    Parametr Roll;
    Parametr Zoom;
    Parametr Exposure;
    Parametr Iris;
    Parametr Focus;
};
 
/// The only visible class for controlling video devices; implemented as a singleton
class videoInput
{
public:
    virtual ~videoInput(void);

    // Returns the static instance of the videoInput class
    static videoInput& getInstance();

    // Closes the video device with deviceID
    void closeDevice(unsigned int deviceID);
    // Sets a callback function for emergency events
    // (for example, removal of the video device with deviceID) with userData
    void setEmergencyStopEvent(unsigned int deviceID, void *userData, void(*func)(int, void *));

    // Closes all devices
    void closeAllDevices();

    // Gets the parameters of the video device with deviceID
    CamParametrs getParametrs(unsigned int deviceID);

    // Sets the parameters of the video device with deviceID
    void setParametrs(unsigned int deviceID, CamParametrs parametrs);

    // Returns the number of available video devices, listing them in the console
    unsigned int listDevices(bool silent = false);

    // Returns the number of formats supported by the video device with deviceID
    unsigned int getCountFormats(unsigned int deviceID);

    // Returns the width of the image captured from the video device with deviceID
    unsigned int getWidth(unsigned int deviceID);

    // Returns the height of the image captured from the video device with deviceID
    unsigned int getHeight(unsigned int deviceID);

    // Returns the name of the video device with deviceID
    wchar_t *getNameVideoDevice(unsigned int deviceID);
    // Returns the Media Foundation MediaSource interface of the video device with deviceID
    IMFMediaSource *getMediaSource(unsigned int deviceID);
    // Returns the format with index id that is supported by the video device with deviceID
    MediaType getFormat(unsigned int deviceID, unsigned int id);

    // Checks whether any suitable video devices exist
    bool isDevicesAcceable();

    // Checks whether the video device with deviceID is set up
    bool isDeviceSetup(unsigned int deviceID);

    // Checks whether the video device with deviceID is used as a MediaSource
    bool isDeviceMediaSource(unsigned int deviceID);
    // Checks whether the video device with deviceID is used as a source of raw pixel data
    bool isDeviceRawDataSource(unsigned int deviceID);

    // Enables or disables printing of information to the console
    void setVerbose(bool state);
    // Initializes the video device with deviceID using the media type with index id
    bool setupDevice(unsigned int deviceID, unsigned int id = 0);

    // Initializes the video device with deviceID using width w, height h, and frame rate idealFramerate
    bool setupDevice(unsigned int deviceID, unsigned int w, 
                     unsigned int h, unsigned int idealFramerate = 30);

    // Checks whether a new frame has been received from the video device with deviceID
    bool isFrameNew(unsigned int deviceID);

    // Writes the raw pixel data from the video device with deviceID into pixels,
    // optionally swapping red and blue (flipRedAndBlue) and flipping vertically (flipImage)
    bool getPixels(unsigned int deviceID, unsigned char * pixels, 
                   bool flipRedAndBlue = false, bool flipImage = false);
    
private: 
 
    bool accessToDevices;
    videoInput(void);
 
    void processPixels(unsigned char * src, unsigned char * dst, unsigned int width, 
         unsigned int height, unsigned int bpp, bool bRGB, bool bFlip);
    void updateListOfDevices();
}; 
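
The flipRedAndBlue and flipImage options of getPixels can be illustrated with a small standalone routine. This is a sketch of the general technique only, assuming tightly packed 24-bit pixels with no row padding; the function name copyPixels is hypothetical, and the code is not taken from the library's processPixels.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies a tightly packed 24-bit image from src to dst.
// swapRedAndBlue exchanges the first and third byte of every pixel;
// flipVertically writes the source rows in reverse order.
// Hypothetical helper for illustration; not the library's processPixels.
void copyPixels(const unsigned char* src, unsigned char* dst,
                unsigned int width, unsigned int height,
                bool swapRedAndBlue, bool flipVertically)
{
    assert(src != nullptr && dst != nullptr && src != dst);

    const std::size_t bytesPerPixel = 3;
    const std::size_t rowSize = static_cast<std::size_t>(width) * bytesPerPixel;

    for (unsigned int y = 0; y < height; ++y)
    {
        // Pick the source row: mirrored when flipping, same row otherwise.
        const unsigned int srcRow = flipVertically ? (height - 1 - y) : y;
        const unsigned char* in = src + srcRow * rowSize;
        unsigned char* out = dst + y * rowSize;

        if (!swapRedAndBlue)
        {
            // No channel swap needed: copy the whole row at once.
            std::memcpy(out, in, rowSize);
            continue;
        }

        // Swap the outer channels of each pixel; the middle one stays.
        for (std::size_t x = 0; x < rowSize; x += bytesPerPixel)
        {
            out[x]     = in[x + 2];
            out[x + 1] = in[x + 1];
            out[x + 2] = in[x];
        }
    }
}
```

Image containers that pad each row to an alignment boundary would need a separate stride parameter; the sketch leaves that out to keep the two operations easy to follow.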

This class can be used in one of two modes: raw data grabbing and MediaSource. If you use only the first mode, there is no need to include the Media Foundation headers or link its libraries. In this case, the method IMFMediaSource *getMediaSource(unsigned int deviceID) returns NULL, and the interface IMFMediaSource is only forward-declared in videoInput.h. In the second mode, you can call this method and use the result in your application as a normal source of media data from the web camera. The example below uses videoInput to get the raw data of each frame and the OpenCV framework to display the live video (code: TestVideoInputVS2012x86.zip, TestVideoInputVS2012x86-noexe.zip). OpenCV has its own function for capturing from a web camera, but it is based on DirectShow and has the problem mentioned above on Win8-Desktop. The example is presented in the next listing:

C++
// TestvideoInput.cpp: defines the entry point for the console application.
//

#include "stdafx.h"
#include "videoInput.h"
#include "highgui.h"

#pragma comment(lib, "lib\\opencv\\Release\\opencv_highgui242.lib")
#pragma comment(lib, "lib\\opencv\\Release\\opencv_core242.lib")
 
#pragma comment(lib, "videoInput.lib")
 
void StopEvent(int deviceID, void *userData)
{
    videoInput *VI = &videoInput::getInstance();
 
    VI->closeDevice(deviceID);
}
 
int _tmain(int argc, _TCHAR* argv[])
{
    videoInput *VI = &videoInput::getInstance();
 
    int i = VI->listDevices();
 
    if(i > 0)
    {
        if(VI->setupDevice(i-1, 640, 480, 60))
        {
            VI->setEmergencyStopEvent(i - 1, NULL, StopEvent);
 
            if(VI->isFrameNew(i-1))
            {
                int countLeftFrames = 0;
 
                cvNamedWindow("VideoTest", CV_WINDOW_AUTOSIZE);
                CvSize size = cvSize(VI->getWidth(i-1), VI->getHeight(i-1));
 
                IplImage* frame;
 
                frame = cvCreateImage(size, 8,3);
 
                while(1)
                {
                    if(VI->isFrameNew(i-1))
                    {
                        VI->getPixels(i - 1, (unsigned char *)frame->imageData);                        
 
                        cvShowImage("VideoTest", frame);
 
                        countLeftFrames = 0;
                    }
                    else
                        countLeftFrames++;
 
                    char c = cvWaitKey(33);
 
                    if(c == 27) // Esc: stop capturing
                        break;
                    
                    if(c == 49) // '1' key: set the brightness to 128
                    {
                        CamParametrs CP = VI->getParametrs(i-1);                        
                        CP.Brightness.CurrentValue = 128; 
                        CP.Brightness.Flag = 1; 
                        VI->setParametrs(i - 1, CP);
                    }
 
                    if(!VI->isDeviceSetup(i - 1))
                    {
                        break;
                    }
 
                    if(countLeftFrames > 60)
                        break;
                }
 
                VI->closeDevice(i - 1);
                
                cvReleaseImage(&frame); // free the buffer allocated by cvCreateImage
                cvDestroyWindow("VideoTest");
            }
        }
    }
 
    // Reuse the same device at a higher resolution; skip if no device was found
    if(i > 0 && VI->setupDevice(i-1, 1920, 1080, 60))
    {
        if(VI->isFrameNew(i-1))
        {
            int countLeftFrames = 0;
 
            cvNamedWindow("VideoTest1", CV_WINDOW_AUTOSIZE);
            CvSize size = cvSize(VI->getWidth(i-1), VI->getHeight(i-1));
 
            IplImage* frame;
 
            frame = cvCreateImage(size, 8,3);
 
            while(1)
            {
                if(VI->isFrameNew(i-1))
                {
                    VI->getPixels(i - 1, (unsigned char *)frame->imageData,false); 
                    cvShowImage("VideoTest1", frame); 
                    countLeftFrames = 0;
                }
                else
                    countLeftFrames++;
                    
                char c = cvWaitKey(33);
 
                if(c == 27) 
                    break;
                    
                if(!VI->isDeviceSetup(i - 1))
                {
                    break;
                }
 
                if(countLeftFrames > 60)
                    break;
            }
 
            VI->closeDevice(i - 1);
                
            cvReleaseImage(&frame); // free the buffer allocated by cvCreateImage
            cvDestroyWindow("VideoTest1");
        }
 
    }
    return 0;
}

In this code, a pointer to the videoInput class is obtained by calling videoInput::getInstance(). Before using a camera, get the list of suitable devices with VI->listDevices(). The device is initialized by calling VI->setupDevice(i-1, 640, 480, 60). There are two overloads of setupDevice: one takes the desired resolution and frame rate, the other takes the index of the required media type. The first overload looks for an existing MediaType with the requested parameters and uses the default type with index 0 if none is found.

Grabbing images from the MediaSource starts with the first call to VI->isFrameNew(i-1). After this call, the raw data can be read with VI->getPixels(i - 1, (unsigned char *)frame->imageData,false). The camera parameters can be read with VI->getParametrs(i-1) and changed with VI->setParametrs(i - 1, CP). The method VI->closeDevice(i - 1) stops the grabbing thread and releases the context of the video device. The example shows the same video device being used, stopped, and reused in quick succession.

The global function StopEvent(int deviceID, void *userData) is registered as a callback with VI->setEmergencyStopEvent(i - 1, NULL, StopEvent). It is called when capturing stops unexpectedly, e.g., when the web camera is unplugged from the USB socket.
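
The media type lookup performed by the resolution-based setupDevice overload can be sketched as a standalone function. The article states only that a matching MediaType is used if one exists and that type 0 is the fallback; the exact matching rule here (exact width and height, then the closest frame rate) is an assumption made for illustration, and the names Format and chooseFormat are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified description of one supported format.
struct Format
{
    unsigned int width;
    unsigned int height;
    unsigned int frameRate; // frames per second
};

// Returns the index of the format whose width and height match exactly and
// whose frame rate is closest to idealFrameRate.
// Returns 0 (the default type) when no format has the requested size.
std::size_t chooseFormat(const std::vector<Format>& formats,
                         unsigned int width, unsigned int height,
                         unsigned int idealFrameRate)
{
    assert(!formats.empty()); // index 0 must exist to serve as the fallback

    std::size_t best = 0;
    unsigned int bestDiff = 0;
    bool found = false;

    for (std::size_t i = 0; i < formats.size(); ++i)
    {
        const Format& f = formats[i];
        if (f.width != width || f.height != height)
            continue; // wrong resolution, not a candidate

        // Absolute difference between the offered and the ideal frame rate.
        const unsigned int diff = f.frameRate > idealFrameRate
                                      ? f.frameRate - idealFrameRate
                                      : idealFrameRate - f.frameRate;

        if (!found || diff < bestDiff)
        {
            best = i;
            bestDiff = diff;
            found = true;
        }
    }
    return best;
}
```

With such a rule, a request for 1920x1080 at 60 fps on a camera that offers that size only at 15 and 30 fps would select the 30 fps entry instead of failing, while a request for an unsupported size would select entry 0.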

The second example is based on the SimpleCapture example from the Windows SDK (code: SimpleCaptureVS2012.zip; application: SimpleCapture-exe.zip). It is too large to list here, but I can describe how it differs from the original. First, I removed all the original web camera code and used the videoInput library instead.

Image 1

Second, I added a second dialog for choosing a suitable resolution from the list of supported Media Types. Note that the IMFMediaSource interface must not be stopped manually. It is released by calling the function closeDevice(unsigned int deviceID).

Image 2
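
To fill such a dialog, each Media Type has to be turned into readable text. Media Foundation stores the frame size and the frame rate as 64-bit attributes that pack two 32-bit values: width and height for MF_MT_FRAME_SIZE, numerator and denominator for MF_MT_FRAME_RATE. The sketch below shows the unpacking without any Windows headers. The helper names unpackPair and formatLabel are hypothetical, and the label layout is only a guess at what a dialog entry might look like.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Splits a 64-bit packed attribute into its upper and lower 32-bit halves.
void unpackPair(std::uint64_t packed, std::uint32_t& high, std::uint32_t& low)
{
    high = static_cast<std::uint32_t>(packed >> 32);
    low  = static_cast<std::uint32_t>(packed & 0xFFFFFFFFu);
}

// Builds a text label from a packed frame size (width in the upper half,
// height in the lower half) and a packed frame rate (numerator in the upper
// half, denominator in the lower half).
// The frame rate is truncated to whole frames per second.
std::string formatLabel(std::uint64_t frameSize, std::uint64_t frameRate)
{
    std::uint32_t width = 0, height = 0, numerator = 0, denominator = 0;
    unpackPair(frameSize, width, height);
    unpackPair(frameRate, numerator, denominator);

    // Guard against a malformed rate with a zero denominator.
    const std::uint32_t fps = denominator != 0 ? numerator / denominator : 0;

    return std::to_string(width) + " x " + std::to_string(height) +
           " @ " + std::to_string(fps) + " fps";
}
```

In a real application the packed values would come from the media type attributes, or from the already unpacked fields of the library's MediaType structure.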

Points of Interest

I spent a lot of time searching the Microsoft developer website for suitable information, and I did not get help from the experts on that site. I was not alone in looking for a solution to this problem. I was surprised that the use of a web camera with Media Foundation had not been covered, and I hope that this tip will be a useful contribution to this site.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
Australia
