How to align RGB and depth image from Kinect in Matlab
I am trying to align an RGB image and a depth image from a Kinect using Matlab. I am trying to do this using this algorithm.
Here is the code I have written so far:
depth = imread('depth_00500.png');
color = imread('rgb_00500.png');
rotationMat=[9.9984628826577793e-01 1.2635359098409581e-03 -1.7487233004436643e-02;
-1.4779096108364480e-03 9.9992385683542895e-01 -1.2251380107679535e-02;
1.7470421412464927e-02 1.2275341476520762e-02 9.9977202419716948e-01 ];
translationMat=[1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02 ];
%parameters for color matrix
fx_rgb= 5.2921508098293293e+02;
fy_rgb= 5.2556393630057437e+02;
cx_rgb= 3.2894272028759258e+02;
cy_rgb= 2.6748068171871557e+02;
k1_rgb= 2.6451622333009589e-01;
k2_rgb= -8.3990749424620825e-01;
p1_rgb= -1.9922302173693159e-03;
p2_rgb= 1.4371995932897616e-03;
k3_rgb= 9.1192465078713847e-01;
%parameters for depth matrix
fx_d= 5.9421434211923247e+02;
fy_d= 5.9104053696870778e+02;
cx_d= 3.3930780975300314e+02;
cy_d= 2.4273913761751615e+02;
k1_d= -2.6386489753128833e-01;
k2_d =9.9966832163729757e-01;
p1_d =-7.6275862143610667e-04;
p2_d =5.0350940090814270e-03;
k3_d =-1.3053628089976321e+00;
row_num=480;
col_num=640;
for row=1:row_num
    for col=1:col_num
        pixel3D(row,col,1) = (row - cx_d) * depth(row,col) / fx_d;
        pixel3D(row,col,2) = (col - cy_d) * depth(row,col) / fy_d;
        pixel3D(row,col,3) = depth(row,col);
    end
end
pixel3D(:,:,1)=rotationMat*pixel3D(:,:,1)+translationMat;
pixel3D(:,:,2)=rotationMat*pixel3D(:,:,2)+translationMat;
pixel3D(:,:,3)=rotationMat*pixel3D(:,:,3)+translationMat;
P2Drgb_x = fx_rgb*pixel3D(:,:,1)/pixel3D(:,:,3)+cx_rgb;
P2Drgb_y = fy_rgb*pixel3D(:,:,2)/pixel3D(:,:,3)+cy_rgb;
I especially don't understand why the depth pixel value is used to compute the x, y and z coordinates in 3D space. Shouldn't we instead assign the coordinates (x, y, z) to the depth pixel value?
I mean this part:
P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d,y_d) / fy_d
P3D.z = depth(x_d,y_d)
Also, I'm not sure whether I can represent the 3D points using a matrix. I am trying to do so in my code, but the matrix surely has the wrong size, since multiplying it by the 3x3 rotation matrix is not possible.
Thank you for every suggestion and help!
This is a pretty tricky topic to explain in a short answer. As far as I can tell, the code is correct. Please read about intrinsic and extrinsic camera matrices. Reading about perspective projection will also help you understand the mapping between 2D and 3D.
P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
In the line above, depth(x_d, y_d) gives the depth value at that pixel of the depth image. It is multiplied by (x_d - cx_d), which is the offset along the x-axis between the current pixel and the x-coordinate of the center point of the depth map. Finally, the result is divided by fx_d, which is the focal length of the depth camera.
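The matrix-size problem from the question goes away if the 3D points are stored as a 3-by-N matrix (one column per pixel) rather than as a rows-by-cols-by-3 array. Below is a minimal sketch of the whole pipeline under several assumptions: the calibration values are rounded copies of the ones in the question, a synthetic flat depth map stands in for the real image, depth and translation are both taken to be in meters, lens distortion is ignored, and the transform follows the convention of the formulas quoted in the question. The variable names (u, v, P_d, P_rgb, u_rgb, v_rgb) are illustrative only.

```matlab
% Intrinsics and extrinsics, rounded from the values in the question.
fx_d = 594.214;   fy_d = 591.041;   cx_d = 339.308;   cy_d = 242.739;
fx_rgb = 529.215; fy_rgb = 525.564; cx_rgb = 328.943; cy_rgb = 267.481;
R = [ 0.99985  0.00126 -0.01749;
     -0.00148  0.99992 -0.01225;
      0.01747  0.01228  0.99977];
T = [0.01999; -0.00074; -0.01092];   % column vector, assumed to be in meters

% Synthetic depth map (assumption): a flat wall 2 m from the camera.
% With a real image, convert the raw values to meters first.
depth = 2 * ones(480, 640);

[rows, cols] = size(depth);
[u, v] = meshgrid(1:cols, 1:rows);   % u = column index (x), v = row index (y)

% Back-project each pixel: columns pair with cx/fx, rows with cy/fy.
Z = double(depth);                   % imread returns integers; use doubles
X = (u - cx_d) .* Z / fx_d;
Y = (v - cy_d) .* Z / fy_d;

% One 3D point per column, so the 3x3 rotation can be applied directly.
P_d = [X(:)'; Y(:)'; Z(:)'];
P_rgb = R * P_d + repmat(T, 1, size(P_d, 2));

% Project into the RGB image; note the element-wise division (./).
u_rgb = reshape(fx_rgb * P_rgb(1, :) ./ P_rgb(3, :) + cx_rgb, rows, cols);
v_rgb = reshape(fy_rgb * P_rgb(2, :) ./ P_rgb(3, :) + cy_rgb, rows, cols);
```

Here u_rgb(r, c) and v_rgb(r, c) hold the column and row in the RGB image onto which depth pixel (r, c) falls. Before using them to index the color image, they would need to be rounded and checked against the image bounds, and pixels with zero (invalid) depth should be skipped.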
The following two links explain the mathematics in more detail, if you're interested.