Wednesday, 11 April 2012

Kinect for Windows 101

Introduction

I have been messing about with the Kinect SDK for over a year now. I tried OpenNI (it was horrible to work with), and then I moved over to the official Kinect SDK (beta).
Yesterday, I decided to sit down and upgrade my code to the official Kinect for Windows SDK.
After having a quick look through the new code I found the following things:
  1. They changed the APIs (this is annoying, but not unexpected)
  2. There are a lot more examples available (that's OK)
  3. The Kinect for Windows team cannot write a simple example!
Oh dear!

Let me ask you: if you wanted a program to access the Kinect, get the data, and display it, how much code would you like to read?
One page? Two pages?

Well, the simplest example (in C#) requires:
  • 6 files of code
  • 370 lines of code

Dear Microsoft,
If you want people to use your programs, you must provide a neat, well organised API (which you have done), AND simple examples (which you have not done).

I will not do Microsoft's job for them, but I will show you how you can get most of the work done in two functions and fewer than 100 lines of code.
Oh, and these 100 lines of code have more comments than the 370 lines of Microsoft code too.
Here it is:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.Kinect;
using System.Threading;

namespace ConsoleKinect
{
    class Program
    {
        static void Main(string[] args)
        {
            new Program().Go();
        }

        // main data:
        KinectSensor kinect;
        byte[] pixelData = new byte[640 * 480 * 4]; // 4 bytes/pixel
        short[] depthData=new short[320 * 240];   // 1 short/depth pixel
        Skeleton[] skeletons;

        void Go()
        {
            //////////////////////
            // Setup Code
            //////////////////////

            // simple code: just grabs the first sensor (not recommended)
            //kinect = KinectSensor.KinectSensors[0];
            // more complex code:
            // uses LINQ to get the first connected sensor
            kinect = (from sensorToCheck in KinectSensor.KinectSensors
                        where sensorToCheck.Status == KinectStatus.Connected
                        select sensorToCheck).FirstOrDefault();
            // FirstOrDefault() returns null when no sensor is connected
            if (kinect == null)
            {
                Console.WriteLine("No connected Kinect sensor found.");
                return;
            }

            // you probably should add in a callback function to check for 
            // sensors being added/removed
            //KinectSensor.KinectSensors.StatusChanged+=KinectSensorStatusChange;

            // Initialise the Kinect
            // let's have video:640x480, depth:320x240 and skeleton
            kinect.ColorStream.Enable(ColorImageFormat.RgbResolution640x480Fps30);
            kinect.DepthStream.Enable(DepthImageFormat.Resolution320x240Fps30);
            kinect.SkeletonStream.Enable(); // just enable without worrying about smoothing params
            kinect.Start();
            // we could use polling to access the data, but I'm using an event as it's simpler
            kinect.AllFramesReady += KinectAllFramesReady;  // calls this fn when data is ready

            skeletons = new Skeleton[kinect.SkeletonStream.FrameSkeletonArrayLength];   // max array len

            while(true)
            {
                // should do some stuff here:
                Thread.Sleep(1000);
            }
        }
        // callback for data: called every time data is ready
        private void KinectAllFramesReady(object sender, AllFramesReadyEventArgs e)
        {
            using (SkeletonFrame sf = e.OpenSkeletonFrame())
            {
                if (sf!=null)   // if we have a skeleton
                {
                    sf.CopySkeletonDataTo(skeletons);   // make a copy
                    // TODO: use it
                }
            }
            using (DepthImageFrame dif = e.OpenDepthImageFrame())
            {
                if (dif!=null)  // if we have a depth frame
                {
                    dif.CopyPixelDataTo(depthData); // make a copy
                    // TODO: use it
                }
            }
            using (ColorImageFrame cif = e.OpenColorImageFrame())
            {
                if (cif!=null)  // if we have a colour image
                {
                    cif.CopyPixelDataTo(pixelData); // make a copy
                    // TODO: use it
                }
            }
        }
    }
}

That's it!
This code compiles & runs without issue, but it doesn't actually display anything.
The code just starts up and captures the data and does nothing with it.
You can easily add in your own display code.
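
As one possible way to fill in the depth "TODO: use it": with the depth format used above, each short in depthData is not a plain distance. As far as I know, SDK v1 packs the player index into the low 3 bits and the distance in millimetres into the bits above them. Here is a minimal sketch of unpacking one value. The class name DepthUnpack and its two helpers are made up for this post, and the constant 3 is my assumption mirroring DepthImageFrame.PlayerIndexBitmaskWidth, so check it against your SDK version.

```csharp
using System;

// Hypothetical helper (not part of the Kinect SDK).
static class DepthUnpack
{
    // Assumption: same value as DepthImageFrame.PlayerIndexBitmaskWidth in SDK v1.
    const int PlayerIndexBits = 3;
    const int PlayerIndexMask = (1 << PlayerIndexBits) - 1;

    // Distance in millimetres; 0 means the sensor had no reading for this pixel.
    public static int DepthMm(short raw)
    {
        // Mask to 16 bits first so a negative short does not sign-extend.
        return (raw & 0xFFFF) >> PlayerIndexBits;
    }

    // 0 = no player at this pixel, otherwise the index of the tracked player.
    public static int PlayerIndex(short raw)
    {
        return raw & PlayerIndexMask;
    }
}
```

Inside the depth branch of KinectAllFramesReady you could then call DepthUnpack.DepthMm(depthData[y * 320 + x]) to read the distance at pixel (x, y).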

XNA display

OK, I think I will add a little bit more. Since my tool of choice is XNA, here is a quick summary of how to add the Kinect into XNA.

Rather than give a full explanation, I will just drop in some commented snippets of code. You should be able to figure it out from there.

// Snippets for XNA 4.0:
// you need a texture for display, and an array for the raw data:
Texture2D texture;  // texture
Color[] colorData = new Color[640 * 480]; // raw 640x480 colours

// in your LoadContent you can set it up
// at the same time as you setup the Kinect
texture = new Texture2D(GraphicsDevice, 640, 480);

// when you are processing, you can copy the bytes into colours:
// (assuming pixelData is the raw colour bytes from the Kinect)
for (int i = 0; i < 640 * 480; i++)
{
  int idx = i * 4;
  // The Kinect stores each pixel as B, G, R, (unused), while the
  // XNA Color constructor takes R, G, B, A,
  // hence the reversed order for the pixels
  colorData[i] = new Color(pixelData[idx+2], pixelData[idx + 1], pixelData[idx], 255);
}

// when you display, you will need to put the colours into the texture:
// XNA will not let you write to a texture while it is in use,
// so remove all textures from use first:
GraphicsDevice.Textures[0] = GraphicsDevice.Textures[1] = null;
texture.SetData<Color>(colorData);
// draw it
spriteBatch.Begin();
spriteBatch.Draw(texture, new Rectangle(0, 0, 640, 480), Color.White);
spriteBatch.End();
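
The depth image can be shown in much the same way once it is turned into grey levels. What follows is only a sketch, not anything from the SDK samples: the class DepthPreview and the maxDepthMm parameter are names I made up, and the 3-bit shift assumes the SDK v1 layout where the low 3 bits of each depth short hold the player index.

```csharp
using System;

// Hypothetical helper: maps raw Kinect depth shorts to one grey level per pixel.
static class DepthPreview
{
    // Assumption: low 3 bits of each raw value are the player index (SDK v1).
    const int PlayerIndexBits = 3;

    public static byte[] ToGray(short[] rawDepth, int maxDepthMm)
    {
        var gray = new byte[rawDepth.Length];
        for (int i = 0; i < rawDepth.Length; i++)
        {
            // Mask to 16 bits so negative shorts do not sign-extend.
            int depthMm = (rawDepth[i] & 0xFFFF) >> PlayerIndexBits;

            // No reading, or beyond the chosen range: leave the pixel black.
            if (depthMm <= 0 || depthMm >= maxDepthMm)
            {
                gray[i] = 0;
                continue;
            }

            // Nearer objects come out brighter.
            gray[i] = (byte)(255 - (depthMm * 255 / maxDepthMm));
        }
        return gray;
    }
}
```

In XNA you could then fill a 320x240 Color array with new Color(g, g, g) for each grey level g and push it into a 320x240 texture with SetData, just like the colour image above.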

That's All Folks
Happy Coding,
Mark