Introduction
FrameGrabber allows you to access individual frames in a video file as standard Bitmap objects. It can iterate through the video by timestamp or by frame number.
Background
This DLL is based on the standard MediaDet object from DirectShow (see MSDN). FrameGrabber alleviates two problems with IMediaDet:
- It takes a lot of scaffolding code just to use the IMediaDet interface (at least if you're used to the managed world it seems like a lot)
- IMediaDet gives you access to a raw byte buffer for the image data, not a Bitmap object
FrameGrabber simplifies the process by letting you just specify a source video file, then retrieve bitmaps by frame numbers or timestamps.
Using the Code
Create a FrameGrabber object, specifying the video file path either in the constructor or through the FileName property before calling any methods.
Call one of the accessor methods to get a bitmap (GetImage, GetFrame, this[], GetEnumerator(), etc). Under the covers, every accessor ends up calling the GetImageAtTime() method; they differ only in how they return the image data, so you can pick whichever is most convenient in the calling context.
Every accessor has two versions: one that retrieves a Frame and another that retrieves a Bitmap. A Frame is a simple class exposed by FrameGrabber that bundles the image for a specific frame with its indexing information (frame number, timestamp, etc). This is handy because FrameGrabber implements IEnumerable<Frame>, meaning that even when iterating via a foreach statement, you can still access the indexing information.
Included in the zip is complete MSDN-style documentation of all the methods (docs/index.html).
Points of Interest
FrameGrabber uses the DirectshowNet library, an open-source project that provides a managed wrapper around the standard DirectShow libraries. I prefer this approach to using Microsoft's own Managed DirectX since a) Managed DirectX hasn't been updated in a long time and b) there is little to no documentation for Managed DirectX.
Even though the DirectShow architecture does not fit very well with the managed view of the world, at least it is relatively well-documented and there are plenty of real-world examples which you can inspect.
Here is an example showing how to initialize the MediaDet object with a video file. This is nearly identical to how you would do it in raw C++, minus some of the memory allocation worries. Notice, however, that AMMediaType objects still require manual destruction:
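// Create the MediaDet COM object and point it at the video file.
mediaDet = (IMediaDet)new MediaDet();
DsError.ThrowExceptionForHR(mediaDet.put_Filename(fileName));

// Select the first video stream, stopping once every stream has been checked.
int streamCount;
DsError.ThrowExceptionForHR(mediaDet.get_OutputStreams(out streamCount));
int index = 0;
Guid type = Guid.Empty;
while(type != MediaType.Video && index < streamCount)
{
    mediaDet.put_CurrentStream(index++);
    mediaDet.get_StreamType(out type);
}
if(type != MediaType.Video)
    throw new InvalidOperationException("No video stream found in " + fileName);

mediaDet.get_FrameRate(out frameRate);

// Read the frame dimensions from the stream's format block, then free the
// unmanaged memory owned by the AMMediaType.
mediaType = new AMMediaType();
mediaDet.get_StreamMediaType(mediaType);
videoInfo = (VideoInfoHeader)Marshal.PtrToStructure(mediaType.formatPtr,
    typeof(VideoInfoHeader));
DsUtils.FreeAMMediaType(mediaType);
mediaType = null;
width = videoInfo.BmiHeader.Width;
height = videoInfo.BmiHeader.Height;

// Frame count = frames per second * length in seconds, truncated.
mediaDet.get_StreamLength(out mediaLength);
frameCount = (int)(frameRate * mediaLength);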
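The frame count above comes from multiplying the frame rate by the stream length, and the same relationship converts between frame numbers and timestamps. The helper below is a small sketch of that arithmetic; the FrameMath class and its method names are made up for this example and are not part of FrameGrabber:

```csharp
using System;

// Hypothetical helper illustrating frame/time arithmetic.
public static class FrameMath
{
    // Same calculation as the initialization code: rate * duration, truncated.
    public static int FrameCount(double frameRate, double mediaLengthSeconds)
    {
        return (int)(frameRate * mediaLengthSeconds);
    }

    // Timestamp (in seconds) at which a given frame starts.
    public static double FrameToSeconds(int frameNumber, double frameRate)
    {
        if (frameRate <= 0) throw new ArgumentOutOfRangeException(nameof(frameRate));
        return frameNumber / frameRate;
    }

    // Index of the frame that is on screen at a given timestamp.
    public static int SecondsToFrame(double seconds, double frameRate)
    {
        if (frameRate <= 0) throw new ArgumentOutOfRangeException(nameof(frameRate));
        return (int)Math.Floor(seconds * frameRate);
    }
}
```

For example, a 10-second clip at 25 frames per second yields FrameMath.FrameCount(25.0, 10.0) == 250, and frame 50 starts at 2.0 seconds. Because the cast truncates, a non-integer product such as 29.97 * 10 gives 299 frames rather than 300.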
The following code shows how to retrieve an image from the MediaDet once it has been initialized. The key call is to MediaDet.GetBitmapBits(). Notice, though, that the code calls it twice: the first time with a null pointer (i.e. IntPtr.Zero), and the second time with a pointer to the actual destination buffer. Why is this? When you call GetBitmapBits with a null pointer, the MediaDet object returns the number of bytes the buffer needs to hold via the out bufferSize parameter. The code then allocates a buffer of that size and makes the call again with a pointer to it:
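int bufferSize;
// First call: a null buffer asks MediaDet only for the required size.
mediaDet.GetBitmapBits(seconds, out bufferSize, IntPtr.Zero, width, height);
// Second call: allocate unmanaged memory of that size and fetch the image.
bufferPtr = Marshal.AllocHGlobal(bufferSize);
mediaDet.GetBitmapBits(seconds, out bufferSize, bufferPtr, width, height);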
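The query-then-fill pattern can be wrapped so the unmanaged buffer is always released, even if the second call fails. The following is only a sketch: TwoCallBuffer and the SizeThenFill delegate are hypothetical names invented here, shaped like GetBitmapBits with its time and size arguments already bound, and the helper returns a managed copy rather than the raw pointer FrameGrabber works with:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper for APIs that report a size on the first call
// and fill a caller-supplied buffer on the second.
public static class TwoCallBuffer
{
    // Returns an HRESULT; sets bufferSize; fills buffer when it is non-null.
    public delegate int SizeThenFill(out int bufferSize, IntPtr buffer);

    public static byte[] Read(SizeThenFill call)
    {
        int size;
        // First call: null pointer, so only the required size is reported.
        Marshal.ThrowExceptionForHR(call(out size, IntPtr.Zero));

        IntPtr buffer = Marshal.AllocHGlobal(size);
        try
        {
            // Second call: the buffer is large enough, so the data is written.
            Marshal.ThrowExceptionForHR(call(out size, buffer));

            byte[] managed = new byte[size];
            Marshal.Copy(buffer, managed, 0, size);
            return managed;
        }
        finally
        {
            // Runs on success and on failure, so the allocation never leaks.
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

Assuming the fields from the initialization code, it might be called as TwoCallBuffer.Read((out int n, IntPtr p) => mediaDet.GetBitmapBits(seconds, out n, p, width, height)). The extra managed copy costs some time, which is one reason a direct pointer copy into the Bitmap can still be preferable.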
This last chunk of code copies pixel data from the buffer supplied by the MediaDet object straight into a managed Bitmap object. I chose to copy the buffer using int* rather than byte* since that cuts the number of copy operations by a factor of four. Also, since most machines have a 32-bit word size, a single int copy operation is (theoretically) faster than a single byte copy operation because no work needs to be done to extract individual bytes from the word. Finally, it is necessary to flip the image since DirectShow returns the pixel rows in a different order (bottom row first) than the one the Bitmap class expects. You could achieve this with some clever pointer manipulation when copying the image buffer, but it's cleaner, safer, and clearer to do a straightforward copy, then let the Bitmap class handle the flip:
unsafe
{
    // Destination bitmap, locked so its pixel memory can be written directly.
    returnValue = new Bitmap(width, height, PixelFormat.Format24bppRgb);
    BitmapData imageData = returnValue.LockBits(new Rectangle(0, 0, width, height),
        ImageLockMode.ReadWrite, PixelFormat.Format24bppRgb);
    int* imagePtr = (int*)imageData.Scan0;

    // The pixel data starts right after the bitmap header in the MediaDet buffer.
    int bitmapHeaderSize = Marshal.SizeOf(videoInfo.BmiHeader);
    int* sourcePtr = (int*)((byte*)bufferPtr.ToPointer() + bitmapHeaderSize);

    // Copy four bytes per iteration.
    for(int i = 0; i < (bufferSize - bitmapHeaderSize) / 4; i++)
    {
        *imagePtr = *sourcePtr;
        imagePtr++;
        sourcePtr++;
    }
    returnValue.UnlockBits(imageData);

    // Rotate180FlipX has the same effect as a vertical flip (RotateNoneFlipY).
    returnValue.RotateFlip(RotateFlipType.Rotate180FlipX);
}
Marshal.FreeHGlobal(bufferPtr);
return returnValue;
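For comparison, the alternative of flipping during the copy can be sketched on plain byte arrays. This is not the code FrameGrabber uses. It assumes a bottom-up 24bpp buffer with rows padded to a multiple of four bytes, and DibRows, Stride24, and FlipRows are hypothetical names chosen for this example:

```csharp
using System;

// Hypothetical helper: reverses row order while copying a 24bpp pixel buffer.
public static class DibRows
{
    // Bytes per row: 3 bytes per pixel, padded up to a multiple of 4.
    public static int Stride24(int width)
    {
        return (width * 3 + 3) / 4 * 4;
    }

    // Copies a bottom-up buffer into a new top-down buffer. Bytes within
    // each row keep their order; only the order of the rows is reversed.
    public static byte[] FlipRows(byte[] bottomUp, int width, int height)
    {
        int stride = Stride24(width);
        if (bottomUp.Length < stride * height)
            throw new ArgumentException("Buffer is smaller than stride * height.");

        byte[] topDown = new byte[stride * height];
        for (int row = 0; row < height; row++)
        {
            // Destination row 0 comes from the last source row, and so on.
            int sourceOffset = (height - 1 - row) * stride;
            Buffer.BlockCopy(bottomUp, sourceOffset, topDown, row * stride, stride);
        }
        return topDown;
    }
}
```

The index arithmetic is easy to get wrong once padding is involved, which supports the choice above of a straight copy followed by RotateFlip.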
History
4/9/2008 - initial release