
Sitting posture recognition with Kinect sensor

24 Oct 2011 · CPOL · 6 min read
Recognition of concentrating, non-concentrating, sleeping, and raise-hand postures.

Introduction

Many specialists predict that in the near future, a new revolution in information technologies will occur. This revolution will be connected with new computer abilities to segment, track, and understand human poses, gestures, and emotional expressions. For this, computers must begin to use new types of video sensors that provide 3D video. The Kinect sensor is the first of these new sensors. It has two cameras: a traditional color video camera and an infrared sensor that measures depth, position, and motion. The Kinect sensor started as a sensor for the Xbox 360 game system about a year ago, but almost immediately many software developers began trying to use it for recognition of human poses and gestures. More information can be found at www.kinecthacks.com.

[Figure: the Kinect sensor (KinnectSensor.jpg)]

My article is devoted to research on sitting posture recognition. Sitting posture recognition is based on human skeleton tracking. There are three software packages that can perform human skeleton tracking with the Kinect sensor: the OpenNI/PrimeSense NITE library, the Microsoft Kinect Research SDK, and the libfreenect library. I have used the first two. On their basis, I developed C# WPF applications in which I combined color video streams and skeleton images.

These applications run under Microsoft Windows 7 and .NET Framework 4.0. For their compilation, you need Microsoft Visual Studio 2010. You may find instructions to install the OpenNi/PrimeSense Nite library and the Microsoft Kinect Research SDK at www.kinecthacks.com.

Background

The sitting posture recognition algorithm is based on human skeleton tracking and on obtaining the three coordinates (xs, ys, zs), (xh, yh, zh), and (xk, yk, zk) of the positions of the human Shoulder (denoted as S), Hip (denoted as H), and Knee (denoted as K).

A sitting posture is related to the angle a between the line HK (from hip to knee) and the line HS (from hip to shoulder).

We distinguish the left body-side angle a, between the "center hip to left knee" vector and the "center hip to center shoulder" vector, and the right body-side angle a, between the "center hip to right knee" vector and the "center hip to center shoulder" vector.

From angle a and the hand's position, the human sitting posture can be determined and classified as one of four specified types - sleeping, concentrating, raising hand, and non-concentrating - as given in the table below.

Angle a (degrees)   Hand posture   Sitting posture
0 ~ 40              down           sleeping
40 ~ 80             down           non-concentrating
80 ~ 100            down           concentrating
                    up             raising hand
100 ~ 180           down           non-concentrating
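
The classification above can be sketched in C#: angle a is obtained from the dot product of the hip-to-knee and hip-to-shoulder vectors, and the table is applied to the result. This is a minimal illustration with names of my own choosing (PostureClassifier, AngleBetween, Classify), not the article's actual source; here any hand-up posture is treated as raising a hand, which is one possible reading of the table.

```csharp
using System;

public static class PostureClassifier
{
    // Angle in degrees between two 3D vectors, computed via the dot product.
    public static double AngleBetween(double[] u, double[] v)
    {
        double dot = u[0] * v[0] + u[1] * v[1] + u[2] * v[2];
        double lu = Math.Sqrt(u[0] * u[0] + u[1] * u[1] + u[2] * u[2]);
        double lv = Math.Sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
        return Math.Acos(dot / (lu * lv)) * 180.0 / Math.PI;
    }

    // Maps angle a (hip->knee vs. hip->shoulder) and hand state to a posture
    // label, following the table above.
    public static string Classify(double a, bool handUp)
    {
        if (handUp) return "raising hand";
        if (a < 40) return "sleeping";
        if (a < 80) return "non-concentrating";
        if (a <= 100) return "concentrating";
        return "non-concentrating";
    }
}
```

For example, a roughly upright torso over horizontal thighs gives a near 90 degrees, which classifies as concentrating when the hands are down.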

Using the Code

I had two problems combining a color video stream and a skeleton image.

The first problem was how to place them simply in one control in a window. For this, I used a simple WPF form in both applications, containing a StatusBar control and a Grid panel. The Grid panel contains an Image control and a Canvas control of the same size.

XML
<Window x:Class="RecognitionPose.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="User tracking with Microsoft SDK" Height="600" 
        Width="862" Loaded="Window_Loaded" 
        DataContext="{Binding}">
    <DockPanel LastChildFill="True">
        
        <StatusBar Name="statusBar" 
             MinHeight="40" DockPanel.Dock="Bottom">
            <StatusBarItem>
                <TextBlock Name="textBlock" 
                   Background="LemonChiffon" 
                   FontSize='10'> Ready </TextBlock>
            </StatusBarItem>
        </StatusBar>
        <Grid DockPanel.Dock="Top">
            <Image Name="imgCamera" Width="820" 
               ClipToBounds="True" Margin="10,0" />
            <Canvas Width="820" Height="510" 
               Name="skeleton"   ClipToBounds="True"/>
        </Grid>
    </DockPanel>
</Window>

The second problem of working with both OpenNI/PrimeSense Nite and the Microsoft SDK is that the events of refreshing video frames and skeleton frames occur non-synchronously.

To solve this, in the Microsoft SDK case, I call the main method RecognizePose of my Recognition class in the SkeletonFrameReady event handler, after the imgCamera and skeleton controls are refreshed. The VideoFrameReady event handler synchronizes with it simply by copying the current video frame into a temporary PlanarImage variable:

C#
planarImage = ImageFrame.Image;

and then copying this temp variable to imgCamera.Source in the SkeletonFrameReady event handler:

C#
imgCamera.Source = BitmapSource.Create(planarImage.Width, planarImage.Height,
  194,194,PixelFormats.Bgr32, null, planarImage.Bits, 
  planarImage.Width * planarImage.BytesPerPixel);
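
The underlying pattern is a shared "latest frame" slot: the video handler publishes each new frame, and the skeleton handler reads whatever is current when it fires. The sketch below illustrates this with a generic, lock-protected slot; the type and member names are my own stand-ins (the real code stores the SDK's PlanarImage and runs inside the Kinect event handlers).

```csharp
using System;

// Lock-protected holder for the most recent frame. Publish() is called
// from the video-frame handler; Read() from the skeleton-frame handler.
public sealed class LatestFrameSlot<T> where T : class
{
    private readonly object gate = new object();
    private T latest;

    // Overwrites the slot with the newest frame.
    public void Publish(T frame) { lock (gate) { latest = frame; } }

    // Returns the most recent frame, or null before the first one arrives.
    public T Read() { lock (gate) { return latest; } }
}
```

Because both Kinect events are raised asynchronously, the lock guarantees the skeleton handler never observes a partially updated reference.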

For the OpenNI/PrimeSense NITE case, I use the NuiVision library (http://www.codeproject.com/Articles/169161/Kinect-and-WPF-Complete-body-tracking), written by Vangos Pterneas, to synchronize the video frame and skeleton recognition events. I call the RecognizePose method in the UsersUpdated event handler of this library.

For sitting posture recognition, the main problem was to find the distance and angle of the human relative to the Kinect sensor at which recognition is stable. For this purpose, I added five parameters to the application settings to control the algorithm's behavior:

  • isDebug - if true, shows information about the current human location on the status bar;
  • confidenceAngle - controls the allowed difference between the left and right body-side angles a; if the difference exceeds this level, we assume the recognition isn't stable;
  • standPoseFactor - distinguishes the sitting and standing poses; if the current human height multiplied by this factor is more than the initial human height in the standing pose, we assume the current pose is also a standing pose;
  • isAutomaticChoiceAngle - chooses between automatically taking angle a from the body side nearest to the camera (true) and calculating angle a as the average of the left and right body-side angles (false);
  • shiftAngle - a shift angle subtracted from angle a to remove skeleton recognition error.
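
The way these parameters combine the two body-side angles might look like the following sketch. This is my reconstruction from the descriptions above, not the article's source; the method name Select and the use of depth to pick the side nearest the camera are assumptions.

```csharp
using System;

public static class AngleSelector
{
    // Combines the left and right body-side angles into one angle a,
    // or returns null when recognition is considered unstable.
    public static double? Select(double leftAngle, double rightAngle,
                                 double leftDepth, double rightDepth,
                                 double confidenceAngle,
                                 bool isAutomaticChoiceAngle,
                                 double shiftAngle)
    {
        // confidenceAngle: sides disagreeing by more than this means
        // the recognition isn't stable.
        if (Math.Abs(leftAngle - rightAngle) > confidenceAngle) return null;

        // isAutomaticChoiceAngle: either take the side nearest to the
        // camera (smaller depth) or average the two sides.
        double a = isAutomaticChoiceAngle
            ? (leftDepth <= rightDepth ? leftAngle : rightAngle)
            : (leftAngle + rightAngle) / 2.0;

        // shiftAngle: correction subtracted to remove skeleton
        // recognition error.
        return a - shiftAngle;
    }
}
```

With the recommended settings below (confidenceAngle=50, shiftAngle=20), side angles of 90 and 100 degrees would pass the stability check and yield a = 70 in automatic mode.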

I found that the most stable sitting recognition occurs when these parameters have these values:

  • confidenceAngle=50 degrees;
  • standPoseFactor=1.1;
  • isAutomaticChoiceAngle=true;
  • shiftAngle=20.

The Kinect sensor is located on the floor, the distance between the Kinect sensor and the sitting human is about 2 meters, and the human body is turned at a 45-degree angle relative to the sensor.

[Figure: the application tracking a sitting person (NiteOlga.jpg)]

The advantage of this sitting location is that the Kinect sensor can constantly track the parts of the human body that are necessary for recognition:

  • two knees;
  • one hip;
  • two shoulders;
  • two hands;
  • the head.

For other human locations, this isn't so. For example, in a frontal location, the sensor does not reliably track the hip; in a profile location, the sensor tracks only one side of the body, right or left.

Points of Interest

I made two movies about using these two applications.

From the movies, we can conclude that recognition works well with both software packages. However, the applications could be improved significantly by extending the zone of sitting locations where recognition is stable. For this, we would need not one but two or more Kinect sensors.

I think these applications may be used in any area where it is necessary to monitor human behavior in a sitting pose. For cases when the human state becomes non-concentrating or sleeping, the applications may be enhanced with feedback that sends an alarm, alert, or emergency signal. They could also be used in universities to collect statistics about student activity during seminars and labs: the applications would calculate the average time a student spends concentrating or non-concentrating during a seminar and the number of times they raise their hand, and the professor could take these statistics into account in individual work with the student.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior) Altair
Russian Federation
Ph.D. Image processing, neural nets, C++, C#, OpenCV, ASP.NET MVC, JScript, Qt, SQL, Kinect, Silverlight
