Hello erik,
thank you for taking an interest 🙂 Yeah, I did the multi-cam calibration which gave me a few matrices.
world_to_cam
[[ 0.61411884 0.7890858 -0.01420041 -0.21579439]
[-0.00386166 0.02099733 0.99977207 0.31440464]
[ 0.78920412 -0.61392403 0.01594204 1.05200151]
[ 0. 0. 0. 1. ]]
cam_to_world
[[ 0.61411884 -0.00386166 0.78920412 -0.6965064 ]
[ 0.7890858 0.02099733 -0.61392403 0.80952764]
[-0.01420041 0.99977207 0.01594204 -0.33416839]
[ 0. 0. 0. 1. ]]
trans_vec
[[-0.21579439]
[ 0.31440464]
[ 1.05200151]]
rot_vec
[[-1.43083528]
[-0.71236433]
[-0.70309224]]
intrinsics_mat
[[1546.18994140625, 0.0, 962.3573608398438]
[0.0, 1546.18994140625, 531.37255859375]
[0.0, 0.0, 1.0]]
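Just to make sure I'm reading these outputs right, I put together a quick sanity check (only a sketch, assuming rot_vec is the Rodrigues/axis-angle form of the rotation block and cam_to_world is simply the inverse of world_to_cam):

```python
import numpy as np
import cv2

# world_to_cam as pasted above
world_to_cam = np.array([
    [ 0.61411884,  0.7890858 , -0.01420041, -0.21579439],
    [-0.00386166,  0.02099733,  0.99977207,  0.31440464],
    [ 0.78920412, -0.61392403,  0.01594204,  1.05200151],
    [ 0.0,         0.0,         0.0,         1.0       ]])

# cam_to_world should just be the inverse of world_to_cam
print(np.linalg.inv(world_to_cam).round(6))

# and rot_vec should be the Rodrigues (axis-angle) form of the 3x3 rotation block
rot_vec = np.array([[-1.43083528], [-0.71236433], [-0.70309224]])
R, _ = cv2.Rodrigues(rot_vec)
print(np.allclose(R, world_to_cam[:3, :3], atol=1e-5))
```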
I guess the first step is to create the point cloud from the video material. I recorded the scene with gen2-record-replay. Is gen2-pointcloud a good starting place for that?
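I assume the core of it is back-projecting each depth pixel with the intrinsics above. Something like this is what I have in mind (just a sketch; the dummy depth frame stands in for whatever the replayed depth stream actually gives me, and I'm assuming it's in millimetres):

```python
import numpy as np

# intrinsics from the calibration above
fx = fy = 1546.18994140625
cx, cy = 962.3573608398438, 531.37255859375

# placeholder: in reality this would be a depth frame from the recording (uint16, mm)
depth = np.full((1080, 1920), 1500, dtype=np.uint16).astype(np.float32) / 1000.0  # -> metres

h, w = depth.shape
u, v = np.meshgrid(np.arange(w), np.arange(h))

# back-project every pixel into the camera frame
z = depth
x = (u - cx) * z / fx
y = (v - cy) * z / fy
points_cam = np.stack([x, y, z], axis=-1).reshape(-1, 3)
points_cam = points_cam[points_cam[:, 2] > 0]  # drop pixels with no depth
```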
What would the workflow be to transform the point cloud into world coordinates once I have acquired it?
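My guess at the transform step, assuming cam_to_world above maps camera-frame coordinates into the world frame (please correct me if it's the other way around):

```python
import numpy as np

cam_to_world = np.array([
    [ 0.61411884, -0.00386166,  0.78920412, -0.6965064 ],
    [ 0.7890858 ,  0.02099733, -0.61392403,  0.80952764],
    [-0.01420041,  0.99977207,  0.01594204, -0.33416839],
    [ 0.0,         0.0,         0.0,         1.0       ]])

# placeholder: in practice this would be the (N, 3) cloud from the back-projection above
points_cam = np.array([[0.0, 0.0, 1.0],
                       [0.1, 0.2, 1.5]])

# append 1 to make the points homogeneous, then apply the 4x4 transform
points_h = np.hstack([points_cam, np.ones((points_cam.shape[0], 1))])
points_world = (cam_to_world @ points_h.T).T[:, :3]
print(points_world)
```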
I found this tutorial about geometric camera calibration (Camera Calibration with Example in Python). I presume this is another approach? The math looks a little scary. Given 6 points in the image plane and their corresponding world coordinates, it should be possible to reconstruct the extrinsic matrix. The points have to be independent, but I don't know what that means in this context. I can get the world coordinates for 6 points in the image, but how do I check for independence? I thought there are no more than 3 independent vectors in 3D space?! I'm feeling a little lost here and don't know what to learn first to fill in my blanks. If you or other readers have a pointer, that would be nice!
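In case it clarifies what I mean, this is roughly how I'd expect the 6-point idea to look in OpenCV with solvePnP (the correspondences below are just placeholders; I'd have to measure real ones):

```python
import numpy as np
import cv2

# intrinsics from the calibration above
K = np.array([[1546.18994140625, 0.0, 962.3573608398438],
              [0.0, 1546.18994140625, 531.37255859375],
              [0.0, 0.0, 1.0]])

# placeholder correspondences: 6 world points (metres) and their pixel locations
object_points = np.array([[0.0, 0.0, 0.0],
                          [0.5, 0.0, 0.0],
                          [0.0, 0.5, 0.0],
                          [0.0, 0.0, 0.5],
                          [0.5, 0.5, 0.0],
                          [0.5, 0.0, 0.5]], dtype=np.float64)
image_points = np.array([[ 960.0, 530.0],
                         [1200.0, 540.0],
                         [ 950.0, 300.0],
                         [ 980.0, 520.0],
                         [1190.0, 310.0],
                         [1210.0, 535.0]], dtype=np.float64)

# recover the extrinsics (rotation as a Rodrigues vector plus translation)
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, distCoeffs=None)

# assemble them into a 4x4 world_to_cam matrix like the one above
R, _ = cv2.Rodrigues(rvec)
world_to_cam = np.eye(4)
world_to_cam[:3, :3] = R
world_to_cam[:3, 3] = tvec.ravel()
print(world_to_cam)
```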