How to align RGB and depth image from Kinect in Matlab

I am trying to align an RGB image and a depth image from a Kinect using Matlab. I am trying to do this using this algorithm.

Here is the code I have written so far:

depth = imread('depth_00500.png');
color = imread('rgb_00500.png');

rotationMat = [ 9.9984628826577793e-01  1.2635359098409581e-03 -1.7487233004436643e-02;
               -1.4779096108364480e-03  9.9992385683542895e-01 -1.2251380107679535e-02;
                1.7470421412464927e-02  1.2275341476520762e-02  9.9977202419716948e-01];

translationMat = [1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02];

%parameters for color matrix
fx_rgb= 5.2921508098293293e+02;
fy_rgb= 5.2556393630057437e+02;
cx_rgb= 3.2894272028759258e+02;
cy_rgb= 2.6748068171871557e+02;
k1_rgb= 2.6451622333009589e-01;
k2_rgb= -8.3990749424620825e-01;
p1_rgb= -1.9922302173693159e-03;
p2_rgb= 1.4371995932897616e-03;
k3_rgb= 9.1192465078713847e-01;

%parameters for depth matrix
fx_d= 5.9421434211923247e+02;
fy_d= 5.9104053696870778e+02;
cx_d= 3.3930780975300314e+02;
cy_d= 2.4273913761751615e+02;
k1_d= -2.6386489753128833e-01;
k2_d= 9.9966832163729757e-01;
p1_d= -7.6275862143610667e-04;
p2_d= 5.0350940090814270e-03;
k3_d= -1.3053628089976321e+00;


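depth = double(depth);   % uint16 arithmetic would saturate negative values to 0
[row_num, col_num] = size(depth);

for row=1:row_num
    for col=1:col_num
        % x runs along the columns and y along the rows of the image
        pixel3D(row,col,1) = (col - cx_d) * depth(row,col) / fx_d;
        pixel3D(row,col,2) = (row - cy_d) * depth(row,col) / fy_d;
        pixel3D(row,col,3) = depth(row,col);
    end
end

% element-wise division (./); the rotation and translation are not applied yet
P2Drgb_x = fx_rgb*pixel3D(:,:,1)./pixel3D(:,:,3)+cx_rgb;
P2Drgb_y = fy_rgb*pixel3D(:,:,2)./pixel3D(:,:,3)+cy_rgb;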


I especially don't understand why the depth pixel value is used to compute the x, y, and z coordinates in 3D space. Shouldn't we instead assign the (x, y, z) coordinates to the depth pixel value?

I mean this part:

P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d,y_d) / fy_d
P3D.z = depth(x_d,y_d)


Also, I'm not sure whether I can represent the 3D space using a matrix. I am trying to do so in my code, but it surely has the wrong size, since multiplication with a 3x3 rotation matrix is not possible.

Thank you for every suggestion and help!



1 answer

This is a pretty tricky topic to explain in a short answer. To me, the code looks correct. Please read about intrinsic and extrinsic camera matrices. Reading about perspective projection will also help you understand the 2D-to-3D projection.
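As a rough illustration of how the intrinsic and extrinsic parameters fit together, here is a minimal sketch of the whole mapping. It stores the 3D points as a 3-by-N matrix (one column per pixel), which is one way to make the multiplication with the 3x3 rotation matrix work. Several things here are assumptions, not facts from the question: the depth values are taken to be in metres, the rotation and translation are taken to map depth-camera coordinates into RGB-camera coordinates, lens distortion is ignored, and a synthetic flat depth map stands in for the real image. The calibration values are rounded from the question.

```matlab
% Intrinsics (rounded from the question's calibration values)
fx_d = 594.21;   fy_d = 591.04;   cx_d = 339.31;   cy_d = 242.74;    % depth camera
fx_rgb = 529.22; fy_rgb = 525.56; cx_rgb = 328.94; cy_rgb = 267.48;  % RGB camera

% Extrinsics (rounded); ASSUMPTION: they map depth-camera coords to RGB-camera coords
R = [ 0.99985  0.00126 -0.01749;
     -0.00148  0.99992 -0.01225;
      0.01747  0.01228  0.99977];
t = [0.01999; -0.00074; -0.01092];

% ASSUMPTION: synthetic stand-in for the real depth image, a flat wall 2 m away
depth = 2 * ones(480, 640);

[rows, cols] = size(depth);
[u, v] = meshgrid(1:cols, 1:rows);   % u = column index (x), v = row index (y)

% Back-project every pixel to a 3D point in the depth-camera frame
Z = double(depth(:))';
X = (u(:)' - cx_d) .* Z / fx_d;
Y = (v(:)' - cy_d) .* Z / fy_d;
P_d = [X; Y; Z];                     % 3-by-N, one column per pixel

% Apply the extrinsics: rotate, then translate every point
P_rgb = R * P_d + repmat(t, 1, size(P_d, 2));

% Project the transformed points into the RGB image plane
u_rgb = fx_rgb * P_rgb(1, :) ./ P_rgb(3, :) + cx_rgb;
v_rgb = fy_rgb * P_rgb(2, :) ./ P_rgb(3, :) + cy_rgb;

% Reshape so that (row, col) of the depth image indexes its RGB coordinates
u_rgb = reshape(u_rgb, rows, cols);
v_rgb = reshape(v_rgb, rows, cols);
```

With this layout, u_rgb(row, col) and v_rgb(row, col) give the (generally non-integer) RGB pixel position for the depth pixel at (row, col). To look up colors you would still need to round or interpolate these positions and skip the ones that fall outside the RGB image or have zero depth.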

P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d


In the above line, depth(x_d, y_d) gives the depth value of the pixel from the depth image. It is multiplied by (x_d - cx_d), which is the offset along the x-axis of the current pixel from the center point of the depth map. Finally, the result is divided by fx_d, which is the focal length of the depth camera.
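One way to see where this formula comes from is to invert the standard pinhole projection. The notation (X, Y, Z) for the 3D point in the depth-camera frame is my own, and the derivation assumes an ideal pinhole camera without lens distortion:

```latex
x_d = f_{x,d}\,\frac{X}{Z} + c_{x,d}
\quad\Longrightarrow\quad
X = \frac{(x_d - c_{x,d})\,Z}{f_{x,d}},
\qquad
y_d = f_{y,d}\,\frac{Y}{Z} + c_{y,d}
\quad\Longrightarrow\quad
Y = \frac{(y_d - c_{y,d})\,Z}{f_{y,d}},
\qquad
Z = \mathrm{depth}(x_d, y_d)
```

So the depth value is not assigned to all three coordinates. It is the Z coordinate itself, and X and Y scale with it because a pixel that is a fixed distance from the image center corresponds to a larger sideways offset the farther away the point is.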

The following two links will help you work through the math if you're interested.


