Throughout the past few days, I have received many requests about Kinect color-to-depth pixel mapping. As you probably already know, the Kinect streams are not aligned with each other: the RGB and depth cameras have different resolutions, and their points of view are slightly shifted. As a result, more and more people have been asking me (either in the blog comments or by email) how to properly align the color and depth streams. The most common application they want to build is a cool green-screen effect, just like the following video.
View on YouTube
As you can see, the pretty girl is tracked by the Kinect sensor and the background is totally removed. I can replace the background with a solid color, a gradient fill, or even a random image!
Nice, huh? So, I created a simple project that maps a player’s depth values to the corresponding color pixels. This way, I could remove the background and replace it with something else. The source code is hosted on GitHub as a separate project. It is also part of Vitruvius.
Read the tutorial to understand how Kinect coordinate mapping works and create the application by yourself.
How Background Removal Works
By “background removal”, we mean keeping the pixels that form the user and removing everything that does not belong to the user. The depth camera of the Kinect sensor comes in handy for determining a user’s body. However, we need to display the RGB color values, not the depth distances, so we have to specify which RGB values correspond to the user’s depth values. Confused? Please don’t be.
Using Kinect, each point in space has the following information:
- Color value: Red + Green + Blue
- Depth value: The distance from the sensor
The depth camera gives us the depth value and the RGB camera provides us with the color value. We map those values using the CoordinateMapper. CoordinateMapper is a useful Kinect property that determines which color values correspond to which depth distances (and vice versa).
Please note that the RGB frames (1920×1080) are larger than the depth frames (512×424). As a result, not every color pixel has a corresponding depth value. However, body tracking is performed using the depth sensor, so there is no need to worry about the missing values.
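To make the mapping concrete, here is a minimal sketch that maps a single depth pixel to its position in color space using the SDK’s MapDepthPointToColorSpace method. The pixel coordinates and depth value below are arbitrary illustration values, and a Kinect v2 sensor must be connected:

```csharp
// Map one depth pixel to its position in the 1920x1080 color frame.
KinectSensor sensor = KinectSensor.GetDefault();
CoordinateMapper mapper = sensor.CoordinateMapper;

// Arbitrary example: the center of the 512x424 depth frame,
// with a depth reading of 2000 millimeters.
DepthSpacePoint depthPoint = new DepthSpacePoint { X = 256f, Y = 212f };
ushort depthInMillimeters = 2000;

ColorSpacePoint colorPoint = mapper.MapDepthPointToColorSpace(depthPoint, depthInMillimeters);
// colorPoint.X / colorPoint.Y are sub-pixel coordinates in color space;
// they may fall outside the color frame, so always check the bounds.
```

For a real-time pipeline, the frame-wide MapDepthFrameToColorSpace method is the better fit, since it maps all 512×424 points in a single call.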
In the GitHub project I shared, you can use the following code to remove the background and get the green-screen effect:
void Reader_MultiSourceFrameArrived(object sender, MultiSourceFrameArrivedEventArgs e)
{
    var reference = e.FrameReference.AcquireFrame();

    // Frames must be disposed, otherwise the sensor stops serving new ones.
    using (var colorFrame = reference.ColorFrameReference.AcquireFrame())
    using (var depthFrame = reference.DepthFrameReference.AcquireFrame())
    using (var bodyIndexFrame = reference.BodyIndexFrameReference.AcquireFrame())
    {
        if (colorFrame != null && depthFrame != null && bodyIndexFrame != null)
        {
            camera.Source = _backgroundRemovalTool.GreenScreen(colorFrame, depthFrame, bodyIndexFrame);
        }
    }
}
As you can see, all the magic relies on a single class: BackgroundRemovalTool. To remove the background, we need the color frame data, the depth frame data and, of course, the body index data.
The BackgroundRemovalTool class has the following fields:
- WriteableBitmap _bitmap: The final image with the cropped background
- ushort[] _depthData: The depth values of a depth frame
- byte[] _bodyData: The body index values, indicating which pixels belong to the bodies standing in front of the sensor
- byte[] _colorData: The RGB values of a color frame
- byte[] _displayPixels: The RGB values of the mapped frame
- ColorSpacePoint[] _colorPoints: The color points we need to map
It also uses an image source (WriteableBitmap) for creating the final bitmap image. The CoordinateMapper is passed as a parameter and comes from the connected Kinect sensor.
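Putting the pieces together, the fields and constants of the class might look like the following sketch. The DPI and FORMAT values are assumptions based on a standard 96-DPI, 32-bit BGRA WriteableBitmap; the field names follow the list above:

```csharp
public class BackgroundRemovalTool
{
    // Assumed values for a standard BGRA bitmap.
    private const double DPI = 96.0;
    private static readonly PixelFormat FORMAT = PixelFormats.Bgra32;
    private static readonly int BYTES_PER_PIXEL = (PixelFormats.Bgra32.BitsPerPixel + 7) / 8; // 4: B, G, R, A

    private readonly CoordinateMapper _coordinateMapper;

    private WriteableBitmap _bitmap;        // the final image with the background removed
    private ushort[] _depthData;            // one depth value per depth pixel
    private byte[] _bodyData;               // one body index value per depth pixel
    private byte[] _colorData;              // BGRA bytes of the color frame
    private byte[] _displayPixels;          // BGRA bytes of the mapped output
    private ColorSpacePoint[] _colorPoints; // depth-to-color mapping results

    public BackgroundRemovalTool(CoordinateMapper coordinateMapper)
    {
        _coordinateMapper = coordinateMapper;
    }
}
```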
Let’s head to the GreenScreen method. Firstly, we need to get the dimensions of each frame (remember, the frames have different widths and heights):
int colorWidth = colorFrame.FrameDescription.Width;
int colorHeight = colorFrame.FrameDescription.Height;
int depthWidth = depthFrame.FrameDescription.Width;
int depthHeight = depthFrame.FrameDescription.Height;
int bodyIndexWidth = bodyIndexFrame.FrameDescription.Width;
int bodyIndexHeight = bodyIndexFrame.FrameDescription.Height;
Then, we need to initialize the arrays. Initialization happens only once, so as to avoid allocating memory every time a new frame arrives.
if (_bitmap == null)
{
    _depthData = new ushort[depthWidth * depthHeight];
    _bodyData = new byte[depthWidth * depthHeight];
    _colorData = new byte[colorWidth * colorHeight * BYTES_PER_PIXEL];
    _displayPixels = new byte[depthWidth * depthHeight * BYTES_PER_PIXEL];
    _colorPoints = new ColorSpacePoint[depthWidth * depthHeight];
    _bitmap = new WriteableBitmap(depthWidth, depthHeight, DPI, DPI, FORMAT, null);
}
We now need to populate the arrays with new frame data. Before doing so, we check that the array lengths correspond to the dimensions we found earlier. The depth and body index data are copied directly, while the color data is converted to the BGRA format if it does not already come in it:

if (((depthWidth * depthHeight) == _depthData.Length) &&
    ((colorWidth * colorHeight * BYTES_PER_PIXEL) == _colorData.Length) &&
    ((bodyIndexWidth * bodyIndexHeight) == _bodyData.Length))
{
    depthFrame.CopyFrameDataToArray(_depthData);
    bodyIndexFrame.CopyFrameDataToArray(_bodyData);

    if (colorFrame.RawColorImageFormat == ColorImageFormat.Bgra)
    {
        colorFrame.CopyRawFrameDataToArray(_colorData);
    }
    else
    {
        colorFrame.CopyConvertedFrameDataToArray(_colorData, ColorImageFormat.Bgra);
    }
}
It’s time to use the coordinate mapper now. The coordinate mapper will map the depth values to their corresponding color space points:

_coordinateMapper.MapDepthFrameToColorSpace(_depthData, _colorPoints);

That’s it! The mapping has been done. All we have to do now is specify which pixels belong to human bodies and add them to the _displayPixels array. We first clear _displayPixels, so that the non-body pixels remain transparent, and then loop through the depth values and update the array accordingly:

Array.Clear(_displayPixels, 0, _displayPixels.Length);
for (int y = 0; y < depthHeight; ++y)
{
    for (int x = 0; x < depthWidth; ++x)
    {
        int depthIndex = (y * depthWidth) + x;
        byte player = _bodyData[depthIndex];

        // 0xff means that no body was tracked at this depth pixel.
        if (player != 0xff)
        {
            ColorSpacePoint colorPoint = _colorPoints[depthIndex];

            // Round the sub-pixel color coordinates to the nearest integer.
            int colorX = (int)Math.Floor(colorPoint.X + 0.5);
            int colorY = (int)Math.Floor(colorPoint.Y + 0.5);

            if ((colorX >= 0) && (colorX < colorWidth) &&
                (colorY >= 0) && (colorY < colorHeight))
            {
                int colorIndex = ((colorY * colorWidth) + colorX) * BYTES_PER_PIXEL;
                int displayIndex = depthIndex * BYTES_PER_PIXEL;

                // Copy the BGRA values of the mapped color pixel.
                _displayPixels[displayIndex + 0] = _colorData[colorIndex];     // B
                _displayPixels[displayIndex + 1] = _colorData[colorIndex + 1]; // G
                _displayPixels[displayIndex + 2] = _colorData[colorIndex + 2]; // R
                _displayPixels[displayIndex + 3] = 0xff;                       // A (opaque)
            }
        }
    }
}
This results in a bitmap whose background pixels are transparent and whose body pixels are colored. Finally, here is how the WriteableBitmap is updated (the back buffer must be locked while we write to it):

_bitmap.Lock();
Marshal.Copy(_displayPixels, 0, _bitmap.BackBuffer, _displayPixels.Length);
_bitmap.AddDirtyRect(new Int32Rect(0, 0, depthWidth, depthHeight));
_bitmap.Unlock();
Back in the XAML code, you can change the background of the Grid (or whichever element is behind the Image element) and have the background of your choice. For example, a solid green brush gives the classic green-screen look:
<SolidColorBrush Color="Green" />
<Image Name="camera" />
While this code results in a football stadium background:
<ImageBrush ImageSource="/Soccer.jpg" />
Enjoy and share if you like it!
View the complete source code.
BackgroundRemovalTool is part of Vitruvius, an open-source library that will speed up the development of your Kinect projects. Vitruvius supports both version-1 and version-2 sensors, so you can use it in any kind of Kinect project. Download it and give it a try.
The post Background removal using Kinect 2 (green screen effect) appeared first on Vangos Pterneas.