Wednesday, March 19, 2014

Mapping between Kinect Color and Depth

Continuing my series of blog posts about the new Kinect v2, I’d like to build on my last post about the HD color stream using the depth frame concepts covered earlier in the series. Unlike the previous version of the Kinect, the Kinect v2 depth stream does not have the same dimensions as the new color stream. This post illustrates how the two are related and how you can map depth data to the color stream using the CoordinateMapper.
This is an early preview of the new Kinect for Windows, so the device, software and documentation are all preliminary and subject to change.
While the Kinect’s color and depth streams are both represented as arrays, you cannot compare x & y coordinates between the two sets of data directly: the bytes of the color frame represent color pixels from the color camera, while the ushort values of the depth frame represent distances, in millimeters, from the depth camera. Fortunately, the Kinect SDK ships with a mapping utility that can convert data between the different “spaces”.
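To make that concrete, here’s a quick look at the two frame descriptions (the numbers are from the preview hardware and are subject to change):

// the two streams differ in both size and meaning
var colorDesc = _sensor.ColorFrameSource.FrameDescription; // 1920 x 1080 color pixels
var depthDesc = _sensor.DepthFrameSource.FrameDescription; // 512 x 424 distances, in millimeters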
For this post I’m going to use the MultiSourceFrameReader to access both the depth and color frames at the same time, and we’ll use the CoordinateMapper.MapDepthFrameToColorSpace method to project the depth data into an array of ColorSpacePoint. Here’s the skeleton for processing frames as they arrive:
private void FrameArrived(object sender, MultiSourceFrameArrivedEventArgs e) 
{ 
    var reference = e.FrameReference; 

    MultiSourceFrame multiSourceFrame = null;
    ColorFrame colorFrame = null; 
    DepthFrame depthFrame = null; 

    try 
    { 
        using (_frameCounter.Increment()) 
        { 
            multiSourceFrame = reference.AcquireFrame(); 
            if (multiSourceFrame == null) 
                return; 

            using (multiSourceFrame) 
            { 
                colorFrame = multiSourceFrame.ColorFrameReference.AcquireFrame();
                depthFrame = multiSourceFrame.DepthFrameReference.AcquireFrame(); 

                if (colorFrame == null || depthFrame == null) 
                    return; 

                // initialize color frame data 
                var colorDesc = colorFrame.FrameDescription; 
                int colorWidth = colorDesc.Width; 
                int colorHeight = colorDesc.Height; 

                if (_colorFrameData == null) 
                { 
                    int size = colorDesc.Width * colorDesc.Height; 
                    _colorFrameData = new byte[size * bytesPerPixel]; 
                } 

                // initialize depth frame data 
                var depthDesc = depthFrame.FrameDescription; 

                if (_depthData == null) 
                { 
                    uint depthSize = depthDesc.LengthInPixels; 
                    _depthData = new ushort[depthSize]; 
                    _colorSpacePoints = new ColorSpacePoint[depthSize]; 
                } 

                // load color frame into byte[] 
                colorFrame.CopyConvertedFrameDataToArray(_colorFrameData, ColorImageFormat.Bgra); 

                // load depth frame into ushort[] 
                depthFrame.CopyFrameDataToArray(_depthData); 

                // map ushort[] to ColorSpacePoint[] 
                _sensor.CoordinateMapper.MapDepthFrameToColorSpace(_depthData, _colorSpacePoints); 

                // TODO: do something interesting with depth frame 

                // render color frame 
                _bmp.WritePixels( 
                    new Int32Rect(0, 0, colorDesc.Width, colorDesc.Height), 
                    _colorFrameData, 
                    colorDesc.Width * bytesPerPixel, 
                    0); 
            } 
        } 
    } 
    catch { } // ignore per-frame errors; another frame will arrive shortly 
    finally 
    { 
        if (colorFrame != null) 
            colorFrame.Dispose(); 

        if (depthFrame != null) 
            depthFrame.Dispose(); 

    }
}
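For reference, the handler above leans on a few fields and some one-time setup that I haven’t shown. Here’s roughly how mine is wired up; treat it as a sketch against the preview SDK (the exact bootstrap calls may differ in your release), and note that _frameCounter is the frame-rate counter from an earlier post in this series:

private KinectSensor _sensor;
private MultiSourceFrameReader _reader;
private WriteableBitmap _bmp;
private byte[] _colorFrameData;   // color-sized, allocated in FrameArrived
private ushort[] _depthData;
private ColorSpacePoint[] _colorSpacePoints;
private byte[] _pixels;           // depth-sized, used later in this post
private readonly int bytesPerPixel = (PixelFormats.Bgra32.BitsPerPixel + 7) / 8;

private void InitializeKinect()
{
    _sensor = KinectSensor.GetDefault();
    _sensor.Open();

    // one reader that delivers color and depth frames together
    _reader = _sensor.OpenMultiSourceFrameReader(
        FrameSourceTypes.Color | FrameSourceTypes.Depth);
    _reader.MultiSourceFrameArrived += FrameArrived;

    // bitmap sized to the color stream for rendering
    var colorDesc = _sensor.ColorFrameSource.FrameDescription;
    _bmp = new WriteableBitmap(
        colorDesc.Width, colorDesc.Height, 96.0, 96.0, PixelFormats.Bgra32, null);
}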
The MapDepthFrameToColorSpace method maps the depth data into an array of ColorSpacePoint, where each element corresponds to the depth pixel at the same index. We can use the X & Y coordinates of the ColorSpacePoint to find the color data, as demonstrated below. There’s one caveat: not all points in the depth array can be mapped to color pixels. Some points might be too close or too far, and some have no depth reading at all because the surface was in shadow or too reflective.
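In my testing with the preview SDK, those unmappable points come back with their X and Y set to negative infinity, so you can screen them out before doing any math:

ColorSpacePoint point = _colorSpacePoints[index];

// unmappable depth pixels are reported as (-Infinity, -Infinity)
bool isMappable = !float.IsNegativeInfinity(point.X) &&
                  !float.IsNegativeInfinity(point.Y);

The bounds check in the next snippet will also quietly discard these points, since negative infinity never lands inside the image.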
The following snippet shows us how to locate the color bytes from the ColorSpacePoint:
// we need a starting point, let's pick 0 for now
int index = 0;

ushort depth = _depthData[index];
ColorSpacePoint point = _colorSpacePoints[index];

// round to the nearest pixel
int colorX = (int)Math.Floor(point.X + 0.5);
int colorY = (int)Math.Floor(point.Y + 0.5);

// make sure the pixel is part of the image
if ((colorX >= 0) && (colorX < colorWidth) && (colorY >= 0) && (colorY < colorHeight))
{

    int colorImageIndex = ((colorWidth * colorY) + colorX) * bytesPerPixel;

    byte b = _colorFrameData[colorImageIndex];
    byte g = _colorFrameData[colorImageIndex + 1];
    byte r = _colorFrameData[colorImageIndex + 2];
    byte a = _colorFrameData[colorImageIndex + 3];

}
If we loop through the depth data and use the above technique, we can draw our depth data on top of our color frame. For this image, I’m drawing the depth data using an intensity technique described in an earlier post. It looks like this:
[Image: depth points rendered as intensity values on top of the color frame]
You may notice that the mapped points are spaced apart and don’t reach all the way to the edges of the image. This makes sense: the depth data has a smaller resolution (512 x 424 compared to 1920 x 1080) and a tighter field of view (70.6° compared to 84°). The mapping also isn’t perfect in this release (remember, this is a developer preview!).
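For the curious, the overlay loop behind that image looks roughly like this. It’s a sketch: I’m wrapping the millimeter distance into a byte, which is a simplification of the intensity technique from the earlier post:

// overlay: stamp an intensity byte into the color image at each mapped point
for (int depthIndex = 0; depthIndex < _depthData.Length; ++depthIndex)
{
    ColorSpacePoint point = _colorSpacePoints[depthIndex];

    int colorX = (int)Math.Floor(point.X + 0.5);
    int colorY = (int)Math.Floor(point.Y + 0.5);

    if ((colorX >= 0) && (colorX < colorWidth) && (colorY >= 0) && (colorY < colorHeight))
    {
        // simplified intensity: wrap the millimeter distance into a byte
        byte intensity = (byte)(_depthData[depthIndex] % 256);

        int colorImageIndex = ((colorWidth * colorY) + colorX) * bytesPerPixel;
        _colorFrameData[colorImageIndex] = intensity;     // blue
        _colorFrameData[colorImageIndex + 1] = intensity; // green
        _colorFrameData[colorImageIndex + 2] = intensity; // red
    }
}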
We can use the same technique to draw each pixel of the depth frame using values from the color frame, like so:
// clear the pixels before we color them
Array.Clear(_pixels, 0, _pixels.Length);

for (int depthIndex = 0; depthIndex < _depthData.Length; ++depthIndex)
{
    ColorSpacePoint point = _colorSpacePoints[depthIndex];

    int colorX = (int)Math.Floor(point.X + 0.5);
    int colorY = (int)Math.Floor(point.Y + 0.5);
    if ((colorX >= 0) && (colorX < colorWidth) && (colorY >= 0) && (colorY < colorHeight))
    {
        int colorImageIndex = ((colorWidth * colorY) + colorX) * bytesPerPixel;
        int depthPixel = depthIndex * bytesPerPixel;

        _pixels[depthPixel] = _colorFrameData[colorImageIndex];
        _pixels[depthPixel + 1] = _colorFrameData[colorImageIndex + 1];
        _pixels[depthPixel + 2] = _colorFrameData[colorImageIndex + 2];
        _pixels[depthPixel + 3] = 255;
    }
}
…which results in the following image:
[Image: color pixels rendered at the depth frame’s resolution]
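One thing worth calling out: the output array here is sized to the depth frame, not the color frame, so it gets written to a second, smaller bitmap. A sketch of that plumbing, where _depthBmp is a hypothetical 512 x 424 WriteableBitmap:

// one BGRA pixel per depth pixel
_pixels = new byte[depthDesc.Width * depthDesc.Height * bytesPerPixel];

// render at the depth resolution (512 x 424), not the color resolution
_depthBmp.WritePixels(
    new Int32Rect(0, 0, depthDesc.Width, depthDesc.Height),
    _pixels,
    depthDesc.Width * bytesPerPixel,
    0);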
So as you can see, we can easily map between the two coordinate spaces. I intend to build upon this further in upcoming posts, so if you haven’t already, add me to your favorite RSS reader.
Happy coding.

9 comments:

  1. Hey Bryan. Great tutorial. Have you done this with the older Kinect? I have been able to map the color info to the depth frame, but am having trouble going the other way.

  2. Yes, this technique is also possible with v1.

    You can map individual depth points (http://msdn.microsoft.com/en-us/library/jj883692.aspx) or the entire Frame (http://msdn.microsoft.com/en-us/library/jj883690.aspx)

  3. Hi Bryan. Good tutorial. Is there any way to get a pixel value on the RGB image in terms of its distance in the camera space, in mm?

    The MapColorFrameToCameraSpace method maps the whole frame. What should I do if I want to map only one pixel?

  4. Hey Bryan, I am trying to follow your steps, but I am not sure where you initialize _colorFrameData. Could you direct me to where and how you initialized this variable?

  5. Good tutorial. I reproduced your example with the current MS SDK 2, but in my mapping, pixels on the rightmost side of the body are more dense and numerous than in your image, which creates the perception of a duplicated body.

    Have you tested with the latest SDK, and do you have any idea how to reduce this effect?

  6. Miguel,

    You might want to start out with the SDK examples on Microsoft's site: https://developer.microsoft.com/en-us/windows/kinect/develop

    These examples are WPF.

  7. Hi Bryan! Thanks for the post.

    What do you recommend to correct the repeat pattern visible on your right body contour?

    cheers,
    Nuno

  8. Hi,

    Thanks for the great explanation! I'm doing something quite similar, but found that when a human body got close to the Kinect, the MapDepthFrameToColorSpace function got really slow. Did you encounter the same problem?

  9. Hi Bryan,

    Thanks for the great tutorial! I'm trying to work on something similar, but found that when the Kinect got close to an object (which means some of the depth values are small), MapDepthFrameToColorSpace got very slow. Did you encounter any similar problem?

    Best,
    Yuhang
