Implementing Kinect Gestures

Posted 27 Jan 2014

How to implement Kinect gestures

Original post: http://pterneas.com/2014/01/27/implementing-kinect-gestures/

Gesture recognition is a fundamental element when developing Kinect-based applications (or any other Natural User Interface). Gestures are used for navigation, interaction, or data input. The most common gesture examples include waving, sweeping, zooming, joining hands, and many more. Unfortunately, the current Kinect for Windows SDK does not include a gesture-detection mechanism out of the box. So, you thought that recognizing gestures using Kinect is a pain in the ass? Not anymore. Today, I’ll show you how you can implement your own gestures using some really easy techniques. There is no need to be a Math guru or an Artificial Intelligence Yoda to build a simple gesture-detection mechanism.

What is a Gesture?

Before implementing something, it is always good to define it. Kinect provides you with the position (X, Y and Z) of the users’ joints 30 times (or frames) per second. If some specific joints move to specific relative positions for a given amount of time, then you have a gesture. So, in terms of Kinect, a gesture is the relative position of some joints for a given number of frames. Let’s take the wave gesture as an example. People wave by raising their left or right hand and moving it from side to side. Throughout the gesture, the hand usually remains above the elbow and moves periodically from left to right. (The original post includes a graphical representation of the movement.)

Now that you’ve seen and understood what a gesture is, let’s try to specify its underlying algorithm.

Gesture Segments

In the wave gesture, the hand remains above the elbow and moves periodically from left to right. Each position (left / right) is a discrete part of the gesture. Formally, these parts are called segments.

So, the first segment would contain the conditions “hand above elbow” and “hand right of elbow”:

• `Hand.Position.Y > Elbow.Position.Y AND`
• `Hand.Position.X > Elbow.Position.X`

Similarly, the second segment would contain the conditions “hand above elbow” and “hand left of elbow”:

• `Hand.Position.Y > Elbow.Position.Y AND`
• `Hand.Position.X < Elbow.Position.X`

That’s it. If you notice the above segments repeating consecutively at least three or four times, then the user is waving! In .NET, the source code would be really simple: just two classes representing each segment. Of course, each segment class should implement an `Update` method. The `Update` method determines whether the specified conditions are met for a given skeleton. It returns `Succeeded` if every condition of the segment is met, and `Failed` otherwise.

```
// WaveGestureSegments.cs
using Microsoft.Kinect;

namespace KinectSimpleGesture
{
    public interface IGestureSegment
    {
        GesturePartResult Update(Skeleton skeleton);
    }

    public class WaveSegment1 : IGestureSegment
    {
        public GesturePartResult Update(Skeleton skeleton)
        {
            // Hand above elbow
            if (skeleton.Joints[JointType.HandRight].Position.Y >
                skeleton.Joints[JointType.ElbowRight].Position.Y)
            {
                // Hand right of elbow
                if (skeleton.Joints[JointType.HandRight].Position.X >
                    skeleton.Joints[JointType.ElbowRight].Position.X)
                {
                    return GesturePartResult.Succeeded;
                }
            }

            // Hand dropped
            return GesturePartResult.Failed;
        }
    }

    public class WaveSegment2 : IGestureSegment
    {
        public GesturePartResult Update(Skeleton skeleton)
        {
            // Hand above elbow
            if (skeleton.Joints[JointType.HandRight].Position.Y >
                skeleton.Joints[JointType.ElbowRight].Position.Y)
            {
                // Hand left of elbow
                if (skeleton.Joints[JointType.HandRight].Position.X <
                    skeleton.Joints[JointType.ElbowRight].Position.X)
                {
                    return GesturePartResult.Succeeded;
                }
            }

            // Hand dropped
            return GesturePartResult.Failed;
        }
    }
}
```

The `GesturePartResult` is an `enum` (we could even use boolean values):

```
// GesturePartResult.cs
using System;

namespace KinectSimpleGesture
{
    public enum GesturePartResult
    {
        Failed,
        Succeeded
    }
}
```

Note: For a more advanced example, we could add another `GesturePartResult` value (let’s say `Undetermined`), which would indicate that we are not yet sure about the current gesture result.
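As a rough sketch of what that extension might look like (the `Undetermined` value and the revised segment logic below are illustrative, not part of the article's code):

```
// Hypothetical three-state result for more forgiving gesture matching.
public enum GesturePartResult
{
    Failed,       // The segment's conditions are clearly violated
    Undetermined, // Not enough evidence yet; keep watching the next frames
    Succeeded     // All conditions of the segment are met
}

// A segment could then distinguish "still in progress" from "clearly failed":
public class WaveSegment1 : IGestureSegment
{
    public GesturePartResult Update(Skeleton skeleton)
    {
        // Hand above elbow: the wave may still be in progress
        if (skeleton.Joints[JointType.HandRight].Position.Y >
            skeleton.Joints[JointType.ElbowRight].Position.Y)
        {
            // Hand right of elbow: this segment is complete
            if (skeleton.Joints[JointType.HandRight].Position.X >
                skeleton.Joints[JointType.ElbowRight].Position.X)
            {
                return GesturePartResult.Succeeded;
            }

            // Hand raised but not yet to the right: keep waiting
            return GesturePartResult.Undetermined;
        }

        // Hand dropped below the elbow: the segment clearly failed
        return GesturePartResult.Failed;
    }
}
```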

Updating the Gesture

We now need a way to update and check the gesture every time the sensor provides us with new skeleton data. This check will be done in a separate class and will be called 30 times per second, or at least as many times per second as our Kinect sensor allows. When updating a gesture, we check the current segment and decide whether the movement is complete or whether we need to keep asking for data.

Window Size

The number of consecutive frames within which we expect the gesture to complete is called the window size, and you find the right value by experimenting with your code. For simple gestures that last approximately a second, a window size of 30 to 50 frames will do the job just fine. For the wave gesture, I chose 50.
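A rough rule of thumb for picking a starting value (the numbers below are illustrative): the skeleton stream delivers about 30 frames per second, so the window size is approximately the expected gesture duration multiplied by the frame rate.

```
// Rough sizing of the detection window (illustrative numbers only).
const int FramesPerSecond = 30;            // Kinect v1 skeleton stream rate
const double GestureDurationSeconds = 1.5; // generous allowance for a ~1s wave

// 30 fps * 1.5 s = 45 frames; rounding up gives a window size of about 50.
int windowSize = (int)(FramesPerSecond * GestureDurationSeconds);
```

Start from a value like this and then tune it by testing with real users.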

The Gesture Class

Having decided on the window size parameter, we can now build the `WaveGesture` class. The process is as follows:

• In the constructor, we create the gesture segments and specify their order in the `_segments` array. You can use as many occurrences of each segment as you like!
• In the `Update` method, we keep track of the frame index and check the current segment for success or failure.
• If every segment succeeds, we raise the `GestureRecognized` event and reset the gesture.
• If a segment fails or the window size has been reached, we reset the gesture and start over.

Here is the final class for our wave gesture:

```
// WaveGesture.cs
using Microsoft.Kinect;
using System;

namespace KinectSimpleGesture
{
    public class WaveGesture
    {
        const int WINDOW_SIZE = 50;

        IGestureSegment[] _segments;

        int _currentSegment = 0;
        int _frameCount = 0;

        public event EventHandler GestureRecognized;

        public WaveGesture()
        {
            WaveSegment1 waveSegment1 = new WaveSegment1();
            WaveSegment2 waveSegment2 = new WaveSegment2();

            _segments = new IGestureSegment[]
            {
                waveSegment1,
                waveSegment2,
                waveSegment1,
                waveSegment2,
                waveSegment1,
                waveSegment2
            };
        }

        public void Update(Skeleton skeleton)
        {
            GesturePartResult result = _segments[_currentSegment].Update(skeleton);

            if (result == GesturePartResult.Succeeded)
            {
                if (_currentSegment + 1 < _segments.Length)
                {
                    _currentSegment++;
                    _frameCount = 0;
                }
                else
                {
                    if (GestureRecognized != null)
                    {
                        GestureRecognized(this, new EventArgs());
                    }

                    Reset();
                }
            }
            else if (result == GesturePartResult.Failed || _frameCount == WINDOW_SIZE)
            {
                Reset();
            }
            else
            {
                _frameCount++;
            }
        }

        public void Reset()
        {
            _currentSegment = 0;
            _frameCount = 0;
        }
    }
}
```

Using the Code

Using the code we created is straightforward. Create an instance of the `WaveGesture` class inside your program and subscribe to the `GestureRecognized` event. Remember to call the `Update` method whenever you have a new `Skeleton` frame. Here is a complete Console app example:

```
// Program.cs
using Microsoft.Kinect;
using System;
using System.Linq;

namespace KinectSimpleGesture
{
    class Program
    {
        static WaveGesture _gesture = new WaveGesture();

        static void Main(string[] args)
        {
            var sensor = KinectSensor.KinectSensors.Where(
                s => s.Status == KinectStatus.Connected).FirstOrDefault();

            if (sensor != null)
            {
                sensor.SkeletonStream.Enable();
                sensor.SkeletonFrameReady += Sensor_SkeletonFrameReady;

                _gesture.GestureRecognized += Gesture_GestureRecognized;

                sensor.Start();
            }

            // Keep the console app running until a key is pressed.
            Console.ReadKey();
        }

        static void Sensor_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
        {
            using (var frame = e.OpenSkeletonFrame())
            {
                if (frame != null)
                {
                    Skeleton[] skeletons = new Skeleton[frame.SkeletonArrayLength];

                    frame.CopySkeletonDataTo(skeletons);

                    if (skeletons.Length > 0)
                    {
                        var user = skeletons.Where(
                            u => u.TrackingState ==
                            SkeletonTrackingState.Tracked).FirstOrDefault();

                        if (user != null)
                        {
                            _gesture.Update(user);
                        }
                    }
                }
            }
        }

        static void Gesture_GestureRecognized(object sender, EventArgs e)
        {
            Console.WriteLine("You just waved!");
        }
    }
}
```

That’s it! Now stand in front of your Kinect sensor and wave using your right hand!

Something to Note

Obviously, you cannot expect your users to do everything right. One might wave but not perform the entire movement. Another might perform the movement too quickly or too slowly. When developing a business app targeting the Kinect platform, you have to be aware of all these issues and add conditions to your code. In a common situation, you’ll need to decide whether the user is “almost” performing a gesture. That is, you’ll need to let a number of frames pass before determining the final gesture result. This is why I mentioned the `Undetermined` value earlier.
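One possible way to handle that third state, sketched here with an illustrative threshold and counter (neither is part of the article's code): instead of resetting the gesture the moment a segment stops matching, tolerate a few inconclusive frames before giving up.

```
// Hypothetical extension inside WaveGesture.Update:
// tolerate a few inconclusive frames before resetting.
const int MAX_UNDETERMINED_FRAMES = 10; // illustrative threshold

if (result == GesturePartResult.Undetermined)
{
    _undeterminedCount++; // hypothetical counter field, cleared in Reset()

    if (_undeterminedCount > MAX_UNDETERMINED_FRAMES)
    {
        Reset(); // the user is probably not performing the gesture
    }
}
```

This keeps the detector forgiving toward slow or imprecise movements while still discarding motions that never complete.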

Vitruvius

So, if you want more production-ready gestures right now, consider downloading Vitruvius. Vitruvius is a free and open-source library I built, which provides many utilities for your Kinect applications. It currently supports nine gestures, with more to come. The code is more generic, and you can easily build your own extensions on top of it. Give it a try, enjoy, and even contribute yourself!

About the Author
Vangos Pterneas is a Microsoft Most Valuable Professional in the Kinect technology. He helps companies from all over the world grow their revenue by creating profitable software products. Vangos is the founder of LightBuzz Inc. and author of two technical books.
