This is another post I am publishing after getting some good feedback from my blog subscribers. It seems that a lot of people share a common problem when creating Kinect projects: how to properly project data on top of the color and depth streams.
As you probably know, Kinect integrates a few sensors into a single device:
- An RGB color camera – 640×480 in version 1, 1920×1080 in version 2
- A depth sensor – 320×240 in v1, 512×424 in v2
- An infrared sensor – 512×424 in v2
These sensors have different resolutions and are not perfectly aligned, so their view areas differ. It is obvious, for example, that the RGB camera covers a wider area than the depth and infrared cameras. Moreover, elements visible from one camera may not be visible from the others. Here’s how the same area can be viewed by the different sensors:
Watch video here.
Suppose we want to project the human body joints on top of the color image. Body tracking is performed using the depth sensor, so the coordinates (X, Y, Z) of the body points are correctly aligned with the depth frame only. If you try to project the same body joint coordinates on top of the color frame, you’ll find out that the skeleton is totally out of place:
Of course, Microsoft is aware of this, so the SDK comes with a handy utility, named CoordinateMapper.
CoordinateMapper’s job is to identify whether a point from the 3D space corresponds to a point in the color or depth 2D space, and vice versa.
CoordinateMapper is a property of the KinectSensor class, so it is tied to each Kinect sensor instance.
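As a minimal sketch of how the mapper is obtained, assuming the Kinect for Windows SDK 2.0 (Microsoft.Kinect) is referenced: the class name MapperSketch and the Start/Stop methods are only illustrative, not part of the SDK.

```csharp
using Microsoft.Kinect;

// Hypothetical wrapper class, used here only to hold the fields.
public class MapperSketch
{
    private KinectSensor _sensor;
    private CoordinateMapper _mapper;

    public void Start()
    {
        // Get the sensor object for the attached device.
        _sensor = KinectSensor.GetDefault();

        if (_sensor != null)
        {
            _sensor.Open();

            // The mapper belongs to this particular sensor instance.
            _mapper = _sensor.CoordinateMapper;
        }
    }

    public void Stop()
    {
        // Release the sensor when the application is done with it.
        if (_sensor != null && _sensor.IsOpen)
        {
            _sensor.Close();
        }
    }
}
```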
Let’s get back to our example. Here is the C# code that accesses the coordinates of the human joints:
foreach (Joint joint in body.Joints.Values)
{
    // Position holds the joint's 3D location in camera space (meters).
    CameraSpacePoint cameraPoint = joint.Position;
    float x = cameraPoint.X;
    float y = cameraPoint.Y;
    float z = cameraPoint.Z;
}
Note: Please refer to my previous article (Kinect version 2: Overview) about finding the body joints.
The coordinates are 3D points, packed into a CameraSpacePoint struct. Each CameraSpacePoint has X, Y and Z values. These values are measured in meters.
The dimensions of the visual elements are measured in pixels, so we somehow need to convert the real-world 3D values into 2D screen pixels. Kinect SDK provides two additional structs for 2D points: ColorSpacePoint and DepthSpacePoint. Using CoordinateMapper, it is super-easy to convert a CameraSpacePoint into either a ColorSpacePoint or a DepthSpacePoint:
ColorSpacePoint colorPoint = _sensor.CoordinateMapper.MapCameraPointToColorSpace(cameraPoint);
DepthSpacePoint depthPoint = _sensor.CoordinateMapper.MapCameraPointToDepthSpace(cameraPoint);
This way, a 3D point has been mapped into a 2D point, so we can project it on top of the color (1920×1080) and depth (512×424) bitmaps.
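Two practical details are worth a short sketch. First, as far as I know, the mapper can return infinite coordinates for points it cannot map, so it is safer to check before drawing. Second, the color bitmap is often displayed at a size other than 1920×1080, so the mapped pixel needs scaling. The helper below is hypothetical (PointScaling and TryScaleToCanvas are my own names, not SDK members) and uses plain numbers, so it does not depend on the SDK:

```csharp
using System;

// Hypothetical helper, not part of the Kinect SDK.
public static class PointScaling
{
    // Native color frame size of Kinect v2, as stated above.
    public const double ColorWidth = 1920.0;
    public const double ColorHeight = 1080.0;

    // Returns false when the mapped point is unusable (infinite or NaN).
    // Otherwise scales it from color-frame pixels to the displayed size.
    public static bool TryScaleToCanvas(
        float x, float y,
        double canvasWidth, double canvasHeight,
        out double canvasX, out double canvasY)
    {
        canvasX = 0;
        canvasY = 0;

        if (float.IsInfinity(x) || float.IsInfinity(y) ||
            float.IsNaN(x) || float.IsNaN(y))
        {
            return false;
        }

        canvasX = x * canvasWidth / ColorWidth;
        canvasY = y * canvasHeight / ColorHeight;
        return true;
    }
}
```

You would call it with colorPoint.X, colorPoint.Y and the actual width and height of the element that shows the color frame, and skip the joint when it returns false.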
How About Drawing the Joints?
You can draw the joints using a Canvas element, a DrawingImage object or whatever you prefer.
This is how you can draw the joints on a Canvas:
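public void DrawPoint(ColorSpacePoint point)
{
    Ellipse ellipse = new Ellipse
    {
        Width = 20,
        Height = 20,
        Fill = Brushes.Red
    };
    // Center the circle on the mapped point.
    Canvas.SetLeft(ellipse, point.X - ellipse.Width / 2);
    Canvas.SetTop(ellipse, point.Y - ellipse.Height / 2);
    // "canvas" is assumed to be the name of your Canvas element.
    canvas.Children.Add(ellipse);
}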
Similarly, you can draw a DepthSpacePoint on top of the depth frame. You can also draw the bones (lines) between two points. This is the result of a perfect coordinate mapping on top of the color image:
Note: Please refer to my previous article (Kinect v2 color, depth and infrared streams) to learn how you can create the camera bitmaps.
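For the bones mentioned earlier, here is a minimal sketch using a WPF Line between two mapped points. The BoneDrawing class, the DrawLine method and the canvas parameter are hypothetical names chosen for illustration; the color and thickness are arbitrary.

```csharp
using System.Windows.Controls;
using System.Windows.Media;
using System.Windows.Shapes;
using Microsoft.Kinect;

// Hypothetical helper, not part of the Kinect SDK.
public static class BoneDrawing
{
    // Draws a straight line between two joints that have already
    // been mapped to color space.
    public static void DrawLine(Canvas canvas, ColorSpacePoint first, ColorSpacePoint second)
    {
        Line line = new Line
        {
            X1 = first.X,
            Y1 = first.Y,
            X2 = second.X,
            Y2 = second.Y,
            Stroke = Brushes.LightBlue,
            StrokeThickness = 8
        };

        canvas.Children.Add(line);
    }
}
```

Calling it once per pair of connected joints (for example, elbow and wrist) after mapping both positions gives a simple skeleton overlay.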
Download the source code from GitHub and enjoy yourself:
In this tutorial, I used Kinect for Windows version 2 code. However, everything applies to the older sensor and SDK 1.8 as well. Here are the corresponding class and struct names you should be aware of. As you can see, there are some minor changes regarding the naming conventions used, but the core functionality is the same.
|Version 1 ||Version 2 |