Implementing Kinect gestures

Vangos Pterneas, 28 Jan 2014

Original post: http://pterneas.com/2014/01/27/implementing-kinect-gestures/  

Gesture recognition is a fundamental element when developing Kinect-based applications (or any other Natural User Interfaces). Gestures are used for navigation, interaction or data input. The most common gesture examples include waving, sweeping, zooming, joining hands, and much more. Unfortunately, the current Kinect for Windows SDK does not include a gesture-detection mechanism out of the box. So, you thought that recognizing gestures using Kinect is a pain in the ass? Not any more. Today I’ll show you how you can implement your own gestures using some really easy techniques. There is no need to be a Math guru or an Artificial Intelligence Yoda to build a simple gesture detection mechanism.

Prerequisites

What is a gesture?

Before implementing something, it is always good to define it. Kinect provides you with the position (X, Y and Z) of the users’ joints 30 times (or frames) per second. If some specific points move to specific relative positions for a given amount of time, then you have a gesture. So, in terms of Kinect, a gesture is the relative position of some joints for a given number of frames. Let’s take the wave gesture as an example. People wave by raising their left or right hand and moving it from side to side. Throughout the gesture, the hand usually remains above the elbow and moves periodically from left to right. Here is a graphical representation of the movement:

[Figure: Kinect wave gesture]

Now that you’ve seen and understood what a gesture is, let’s try to specify its underlying algorithm.

Gesture segments

In the wave gesture, the hand remains above the elbow and moves periodically from left to right. Each position (left / right) is a discrete part of the gesture. Formally, these parts are called segments.

So, the first segment would contain the conditions “hand above elbow” and “hand right of elbow”:

  • Hand.Position.Y > Elbow.Position.Y AND
  • Hand.Position.X > Elbow.Position.X

Similarly, the second segment would contain the conditions “hand above elbow” and “hand left of elbow”:

  • Hand.Position.Y > Elbow.Position.Y AND
  • Hand.Position.X < Elbow.Position.X

That’s it. If you detect these two segments alternating at least three or four times in a row, then the user is waving! In .NET, the source code is really simple: just two classes, one for each segment. Each segment class implements an Update method, which determines whether the specified conditions are met for a given skeleton body. It returns Succeeded if every condition of the segment is met, and Failed otherwise.

// WaveGestureSegments.cs
using Microsoft.Kinect;

namespace KinectSimpleGesture
{
    public interface IGestureSegment
    {
        GesturePartResult Update(Skeleton skeleton);
    }

    public class WaveSegment1 : IGestureSegment
    {
        public GesturePartResult Update(Skeleton skeleton)
        {
            // Hand above elbow
            if (skeleton.Joints[JointType.HandRight].Position.Y > 
                skeleton.Joints[JointType.ElbowRight].Position.Y)
            {
                // Hand right of elbow
                if (skeleton.Joints[JointType.HandRight].Position.X > 
                    skeleton.Joints[JointType.ElbowRight].Position.X)
                {
                    return GesturePartResult.Succeeded;
                }
            }

            // Hand dropped or on the wrong side of the elbow
            return GesturePartResult.Failed;
        }
    }

    public class WaveSegment2 : IGestureSegment
    {
        public GesturePartResult Update(Skeleton skeleton)
        {
            // Hand above elbow
            if (skeleton.Joints[JointType.HandRight].Position.Y > 
                skeleton.Joints[JointType.ElbowRight].Position.Y)
            {
                // Hand left of elbow
                if (skeleton.Joints[JointType.HandRight].Position.X < 
                    skeleton.Joints[JointType.ElbowRight].Position.X)
                {
                    return GesturePartResult.Succeeded;
                }
            }

            // Hand dropped or on the wrong side of the elbow
            return GesturePartResult.Failed;
        }
    }
}

The GesturePartResult is an enum (we could even use boolean values):

// GesturePartResult.cs
using System;

namespace KinectSimpleGesture
{
    public enum GesturePartResult
    {
        Failed,
        Succeeded
    }
}

Note: For a more advanced example, we could add another GesturePartResult value (let’s say “Undetermined”), which would indicate that we are not yet sure about the current gesture result.
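
To make the note concrete, here is a minimal sketch of what a three-valued result could look like. It is an illustration under my own assumptions, not code from the sample: SegmentResult and WaveSegmentSketch are hypothetical names, and the method takes plain coordinates instead of a Skeleton so that it can be compiled and tried without a sensor. The idea is that a raised hand that has not reached the expected side yet is reported as Undetermined, while a hand below the elbow is a clear failure.

```csharp
using System;

// Hypothetical three-valued result; not part of the article's sample or the Kinect SDK.
public enum SegmentResult
{
    Failed,        // the hand is not above the elbow
    Undetermined,  // the hand is raised, but not on the expected side yet
    Succeeded      // the hand is raised and on the expected side
}

public static class WaveSegmentSketch
{
    // Classifies a single frame from plain coordinates.
    // expectRight = true checks "hand right of elbow",
    // expectRight = false checks "hand left of elbow".
    public static SegmentResult Classify(
        float handX, float handY, float elbowX, float elbowY, bool expectRight)
    {
        // Hand at or below the elbow: the user is no longer waving.
        if (handY <= elbowY)
        {
            return SegmentResult.Failed;
        }

        bool onExpectedSide = expectRight ? handX > elbowX : handX < elbowX;

        // Raised hand on the other side: the user may still be on the way.
        return onExpectedSide ? SegmentResult.Succeeded : SegmentResult.Undetermined;
    }
}
```

For instance, Classify(0.3f, 0.5f, 0.2f, 0.3f, true) yields Succeeded, while the same position checked with expectRight set to false yields Undetermined.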

Updating the gesture

We now need a way to update and check the gesture every time the sensor provides us with new skeleton/body data. This kind of check will be done in a separate class and will be called 30 times per second, or at least as many times as our Kinect sensor allows. When updating a gesture, we check each segment and specify whether the movement is complete or whether we need to continue asking for data.

Window size

The number of frames over which we keep asking for data is called the window size, and you find a good value by experimenting with your code. For simple gestures that last approximately a second, a window size of 30 to 50 frames (roughly 1 to 1.7 seconds at 30 frames per second) will do the job just fine. For the wave gesture, I chose 50.
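
If you prefer to think in seconds rather than frames, the conversion is simple arithmetic. The helper below is a small sketch of it; WindowSizeSketch and its method names are made up for this example, and the default of 30 frames per second is the rate quoted earlier in the article, so adjust it if your sensor delivers frames at a different rate.

```csharp
using System;

// Hypothetical helper for converting between frames and seconds.
public static class WindowSizeSketch
{
    // Number of frames needed to cover the given duration, rounded up.
    public static int FramesFor(double seconds, int framesPerSecond = 30)
    {
        return (int)Math.Ceiling(seconds * framesPerSecond);
    }

    // Duration, in seconds, covered by the given number of frames.
    public static double SecondsFor(int frames, int framesPerSecond = 30)
    {
        return (double)frames / framesPerSecond;
    }
}
```

With these defaults, FramesFor(1.0) gives 30 frames, and SecondsFor(50) gives about 1.67 seconds.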

The gesture class

Having decided on the window size parameter, we can now build the WaveGesture class. Notice the process:

  • In the constructor, we create the gesture parts and specify their order in the _segments array. You can use as many occurrences of each segment as you like!
  • In the Update method, we keep track of the frame index and check the current segment for success or failure.
  • If the last segment succeeds, we raise the GestureRecognized event and reset the gesture.
  • If we fail or the window size has been reached, we reset the gesture and start over.

Here is the final class for our wave gesture:

// WaveGesture.cs
using Microsoft.Kinect;
using System;

namespace KinectSimpleGesture
{
    public class WaveGesture
    {
        readonly int WINDOW_SIZE = 50;

        IGestureSegment[] _segments;

        int _currentSegment = 0;
        int _frameCount = 0;

        public event EventHandler GestureRecognized;

        public WaveGesture()
        {
            WaveSegment1 waveSegment1 = new WaveSegment1();
            WaveSegment2 waveSegment2 = new WaveSegment2();

            _segments = new IGestureSegment[]
            {
                waveSegment1,
                waveSegment2,
                waveSegment1,
                waveSegment2,
                waveSegment1,
                waveSegment2
            };
        }

        public void Update(Skeleton skeleton)
        {
            GesturePartResult result = _segments[_currentSegment].Update(skeleton);

            if (result == GesturePartResult.Succeeded)
            {
                if (_currentSegment + 1 < _segments.Length)
                {
                    _currentSegment++;
                    _frameCount = 0;
                }
                else
                {
                    if (GestureRecognized != null)
                    {
                        GestureRecognized(this, new EventArgs());
                    }

                    // Start over even when nobody is subscribed to the event
                    Reset();
                }
            }
            else if (result == GesturePartResult.Failed || _frameCount == WINDOW_SIZE)
            {
                Reset();
            }
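            else
            {
                // Only reachable if a segment returns a third result, such as the
                // "Undetermined" value suggested in the note above; with just
                // Succeeded and Failed, one of the branches above always runs.
                _frameCount++;
            }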
        }

        public void Reset()
        {
            _currentSegment = 0;
            _frameCount = 0;
        }
    }
}
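
One thing to watch for with a two-valued GesturePartResult: while the hand is still on the right, WaveSegment2 reports Failed, so the gesture is reset on the very next frame after WaveSegment1 succeeds. The sketch below shows how the same state machine behaves once segments can also answer "not yet". It is a sketch under assumptions, not the sample's code: FrameResult and SequenceTracker are hypothetical names, and the tracker is fed per-frame results instead of Skeleton objects so that it can be tried without a sensor.

```csharp
using System;

// Hypothetical three-valued result, following the "Undetermined" note.
public enum FrameResult { Failed, Undetermined, Succeeded }

// Same flow as WaveGesture.Update, but driven by per-frame results.
public class SequenceTracker
{
    readonly int _segmentCount;
    readonly int _windowSize;

    int _currentSegment = 0;
    int _frameCount = 0;

    public event EventHandler GestureRecognized;

    public SequenceTracker(int segmentCount, int windowSize)
    {
        _segmentCount = segmentCount;
        _windowSize = windowSize;
    }

    public int CurrentSegment { get { return _currentSegment; } }

    // result is the outcome of checking the current segment for this frame.
    public void Update(FrameResult result)
    {
        if (result == FrameResult.Succeeded)
        {
            if (_currentSegment + 1 < _segmentCount)
            {
                // Move on to the next segment and restart its window.
                _currentSegment++;
                _frameCount = 0;
            }
            else
            {
                // Last segment matched: notify subscribers and start over.
                if (GestureRecognized != null)
                {
                    GestureRecognized(this, EventArgs.Empty);
                }

                Reset();
            }
        }
        else if (result == FrameResult.Failed || _frameCount >= _windowSize)
        {
            // Clear failure, or the next segment did not arrive in time.
            Reset();
        }
        else
        {
            // Undetermined: keep waiting, but count the frame.
            _frameCount++;
        }
    }

    public void Reset()
    {
        _currentSegment = 0;
        _frameCount = 0;
    }
}
```

With two segments and a window of three frames, the sequence Succeeded, Undetermined, Succeeded raises GestureRecognized once. Note that the frame counter restarts whenever a segment succeeds, so the window limits the wait for the next segment rather than the length of the whole gesture.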

Using the code

Using the code we created is straightforward. Create an instance of the WaveGesture class inside your program and subscribe to the GestureRecognized event. Remember to call the Update method whenever you have a new Skeleton frame. Here is a complete Console app example:

using Microsoft.Kinect;
using System;
using System.Linq;

namespace KinectSimpleGesture
{
    class Program
    {
        static WaveGesture _gesture = new WaveGesture();

        static void Main(string[] args)
        {
            var sensor = KinectSensor.KinectSensors.Where(
                         s => s.Status == KinectStatus.Connected).FirstOrDefault();

            if (sensor != null)
            {
                sensor.SkeletonStream.Enable();
                sensor.SkeletonFrameReady += Sensor_SkeletonFrameReady;

                _gesture.GestureRecognized += Gesture_GestureRecognized;

                sensor.Start();
            }

            Console.ReadKey();
        }

        static void Sensor_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
        {
            using (var frame = e.OpenSkeletonFrame())
            {
                if (frame != null)
                {
                    Skeleton[] skeletons = new Skeleton[frame.SkeletonArrayLength];

                    frame.CopySkeletonDataTo(skeletons);

                    if (skeletons.Length > 0)
                    {
                        var user = skeletons.Where(
                                   u => u.TrackingState == SkeletonTrackingState.Tracked).FirstOrDefault();

                        if (user != null)
                        {
                            _gesture.Update(user);
                        }
                    }
                }
            }
        }

        static void Gesture_GestureRecognized(object sender, EventArgs e)
        {
            Console.WriteLine("You just waved!");
        }
    }
}

That’s it! Now stand in front of your Kinect sensor and wave using your right hand!

Something to note

Obviously, you cannot expect your users to do everything right. One might wave but not perform the entire movement. Another might perform the movement too quickly or too slowly. When developing a business app targeting the Kinect platform, you have to be aware of all these issues and add conditions to your code. In a common situation, you’ll need to detect that the user is “almost” performing a gesture. That is, you’ll need to let a number of frames pass before determining the final gesture result. This is why I mentioned the Undetermined result earlier.

Vitruvius

So, if you want more production-ready gestures right now, consider downloading Vitruvius. Vitruvius is a free and open-source library I built, which provides many utilities for your Kinect applications. It currently supports 9 gestures, with more to come. The code is more generic and you can easily build your own extensions on top of it. Give it a try, enjoy and even contribute yourself!

PS: New Kinect book – 20% off

Well, I am publishing a new ebook about Kinect development in a couple months. It is an in-depth guide about Kinect, using simple language and step-by-step examples. You’ll learn usability tips, performance tricks and best practices for implementing robust Kinect apps. Please meet Kinect Essentials, the essence of my 3 years of teaching, writing and developing for the Kinect platform. Oh, did I mention that you’ll get a 20% discount if you simply subscribe now? Hurry up ;-)

Subscribe here for 20% off

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Vangos Pterneas
Product Manager LightBuzz
United Kingdom
I'm a Software Engineer and Entrepreneur, passionate about motion technology and the way it can affect people’s lives.
 
I have been a Kinect enthusiast since the release of the very first unofficial hacks and have already published some innovative commercial Kinect applications. These applications include complex home automation systems, 3D body scanning programs and motion-enabled product browsers for businesses.
 
I worked as a Windows developer and consultant for Microsoft Innovation Center and I'm now running my own company, LightBuzz Software. LightBuzz has been awarded the first place in Microsoft’s worldwide innovation competition, held in New York, for effectively combining Kinect and smartphone functionality.
 
When I am not coding, I love writing books, speaking and blogging about my favorite technological aspects.

Last Updated 28 Jan 2014
Article Copyright 2014 by Vangos Pterneas