Feb 6, 2013

RGBDtoolkit first thoughts, hints and issues

RGBDtoolkit is an interesting tool for bringing the Kinect's textured-mesh look into audiovisual projects. With it you can calibrate your DSLR against the Kinect, record the DSLR's footage together with the depth data, and later use both to generate a 3D textured mesh of the scene, viewable from any point of view.

The creators of the project, James George, Alexander Porter and Jonathan Minard, recently reached the funding goal of their Kickstarter project Clouds, an interactive documentary to be made with this tool / visual style.

After seeing some of the footage generated with RGBDtoolkit, we wanted to give it a try.
The first thing was to make a mount to attach both devices together so that they don't move relative to each other, either during the calibration process or while recording. We built the mount the easiest way we could: an 80mm length of 60x40mm square tube in 1.5mm aluminium, kindly donated by a local window-fitting business, plus some recycled black 3mm HDPS plastic sheet, bent and drilled to hug the Kinect and attach it to the aluminium tube.

Aluminium / plastic mount detail
Everything went well until the battery ran out... and we couldn't change it, because it was blocked by the mount. Our fault.
Battery blocked by mount
So the mount had to be reworked to solve this:

Mount modified to allow access to the battery
As stated in the instructions, after building the mount you are supposed to get some infrared lights, so that the Kinect's IR camera can capture the calibration checkerboard pattern.

Checker board calibration pattern
The device's IR laser projector should be covered so it doesn't interfere with the pattern by filling it with the projected dots used for depth detection.

IR image with dot pattern interfering
But we had a different idea: why not use the projector itself as an IR source to illuminate the scene? If we could scatter the light dots so that they diffuse and blend together, it would behave just like a regular (and powerful) IR light. How to achieve this? Let's try filtering the projection with... a few layers of cheap, thin, white... PLASTIC BAG!

Kinect projector with "plastic bag filter"
A crazy idea, isn't it? But it works!

IR image after filtering the dot projection with the bag
If you try this yourself, take into account that the Kinect needs a few moments to calibrate itself (probably adjusting the beam power and the IR camera exposure). If you put the plastic bag on before starting the IR stream to the computer, the image will sometimes flicker, as if "trying to guess what is happening". To avoid this, start streaming/capturing without the bag filter for about 30 seconds, and then put it on. Also note that at night, without the indirect IR sunlight coming in through the skylight, the image quality is much worse, and the dots may still be noticeable even with the bag on.

IR image at night, with bag-filtered projection
So we solved the second point of the instructions. The third was to capture the calibration pairs: put the pattern on a stand to ensure it stays still during capture, then take both sets of pictures, the DSLR's (in video mode) and the Kinect's IR. You have to cover about 18 different positions to span the whole field of view. However, we planned to do this differently. Instead of taking the image pairs as still pictures, the idea is to work with live video from both sources, capturing frames with the computer, aiming to later develop an automatic calibration app that shows where to hold the checkerboard and grabs a frame when it detects the right position and senses that the board is still enough to be sharp (using OpenCV and a frame-differencing technique).

Custom made overlay to ease the positioning when holding the board

Captured frames from the IR live video stream
But that is where we got stuck. We never imagined that the 18mm lens on our DSLR wouldn't have a wide enough angle to cover the Kinect's vertical field of view... but it didn't!

Frame captured from the IR camera
Same frame captured from the DSLR's view
It turns out that our DSLR is a Canon 600D, with an APS-C sensor, which covers only a fraction of a full-frame sensor, multiplying the effective focal length of the lens and reducing the field of view.
We also tried with an HD webcam, but the same thing happened: the pattern was cropped.
HD webcam mounted with the Kinect
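The crop-factor arithmetic above is easy to check with the standard angle-of-view formula, FOV = 2·atan(d / 2f), where d is the sensor dimension and f the focal length. A quick sketch (sensor sizes are the usual published figures; published Kinect FOV numbers vary a bit between sources, and the raw IR image is wider than the depth map):

```python
# Angle of view for a given focal length and sensor dimension.
# Sensor heights below are the commonly published figures (assumptions).
import math

def fov_deg(sensor_dim_mm, focal_mm):
    """Angle of view (degrees) along one sensor dimension."""
    return math.degrees(2 * math.atan(sensor_dim_mm / (2 * focal_mm)))

aps_c_height = 14.9   # Canon APS-C sensor height, mm
full_height = 24.0    # full-frame sensor height, mm

print(fov_deg(aps_c_height, 18))  # ~45 deg vertical on the 600D at 18mm
print(fov_deg(full_height, 18))   # ~67 deg vertical on full frame
```

So at 18mm the 600D gets roughly 45 degrees vertically, versus about 67 on full frame: on paper that is close to the commonly quoted Kinect depth FOV, but it leaves no margin at all against the wider raw IR image, which matches what we saw.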
So that's all for now, until we find something with a wide enough angle to match the Kinect's field of view... :(


  1. Hmmm... I've just realised how terrible I look in the pictures... :b

  2. Please try harder! Either just deal with the fact that you can't get RGB video over the full frame, or find a video camera that does go wider. It is well worth continuing because the results are fascinating and worth the effort.

    1. Hehehe... thanks for the encouragement, Robert. I already did, I've just been busy and didn't post anything about it.
      It was just a test. I managed to do the capture and calibration with my Ubuntu GNU/Linux system. Got some issues with the editor, though.
      I'm also in contact with James to help improve the process by controlling the camera remotely via USB:

  3. hello, with my 60D I can achieve the same field of view as the Kinect with a Tokina 11-16 (C format) set at about 14mm.
    What I cannot manage to get is an image of the checkerboard (I mean the actual printed image) if I cover the IR lens. I tried your trick with the plastic bag, nothing; I tried a progressive ND filter, a paper tissue, cloth, an IR 85 filter... nothing, nothing, nothing. Why?

    1. Hi Bruno,

      I guess you mean RGBDtoolkit is not recognising the pattern in the Kinect's IR stream. What kind of illumination are you working under? Have you tried working in sunlight with the IR projector completely covered? Can you distinguish/see the pattern in the actual IR image?