OK… I realized that my previous post was a SERIOUS OVERSIMPLIFICATION and was honestly TERRIBLE…
Here is what I added to my engineering notebook just now to remind me (this is actually correct):
Matrix Blunder
Karrson: I just realized...
Matrices are surprisingly like vectors...
You can't take an NxN matrix, put it in an (N+1)x(N+1) matrix and expect it to magically work...
It will not modify the N+1 dimension if that part is identity...
Data doesn't just magically appear, but somehow after learning homogeneous coords I was under the impression that it can xD
Emily: kinda the other way around - vectors are just 1xN matrices
Karrson: I know, but they are similar in that if you multiply a square matrix by a square matrix the extra dimension stays the same if it was originally identity
so dimensionality-wise they are similar
like multiplying 3, 5, 0 by 6, 2, 0 for example
the 0 part stays the same, same if it's a 1
ofc in vectors it's element-wise so that part's different
Karrson: Oh I think I get what you mean - yeah, you don't have to think of vectors differently at all if you know matrices
(other than extra work you might have to do)
Karrson: I guess what I'm saying is that dimensionality, points, vectors, homogeneous coordinates, affine and projective spaces are all like apples and oranges, and a matrix is what gets you from one frame to another. Apples don't multiply with oranges. They are different units.
and really, while homogeneous coordinates may be a hack that allows you to multiply, it's not that hacky
so a 4x4 homogeneous matrix really is a 3x3 (plus a translation column) in disguise, just that it has to be 4x4 to actually work.
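To make the point from the chat concrete, here is a minimal NumPy sketch. The rotation, the point and the translation are values I made up purely for illustration. It shows that dropping a 3x3 into a 4x4 whose extra row and column are identity leaves the fourth coordinate untouched, and that something new only happens once the last column carries a translation.

```python
import numpy as np

# A 3x3 linear map (a 90-degree rotation about Z), chosen only for illustration.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])

# Embed it in a 4x4 matrix whose extra row and column are identity.
M = np.eye(4)
M[:3, :3] = R

p = np.array([3.0, 5.0, 2.0, 1.0])  # homogeneous point with w = 1
q = M @ p

# The first three components are exactly R applied to (x, y, z)...
xyz_matches = np.allclose(q[:3], R @ p[:3])
# ...and the extra coordinate is passed through unchanged.
w_unchanged = q[3] == p[3]

# Only when the last column holds a translation does w = 1 contribute anything.
T = M.copy()
T[:3, 3] = [10.0, 0.0, 0.0]
moved = T @ p

print("embedded only:   ", q)
print("with translation:", moved)
```

So the 4x4 by itself adds no information. The extra dimension is just the slot where the translation lives.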
So yes, I believe I have shown that if you want a 3D pose that actually makes use of the world's Z coordinates, you need a set of 3D points related by an affine transform. A 3x3 projection matrix or homography does NOT give you that, btw, because it's really just homogeneous 2D. So you really do NOT want to use projections for this part. That was my mistake. This WILL also require a real calibration (intrinsics and extrinsics) rather than just estimating it from the image size. Ofc, to go from raw stereo images to 3D pointclouds you still need to work with projections, but after that the entire SLAM pipeline should be affine.
To quote Stack Overflow (https://stackoverflow.com/a/56228667),
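A small sketch of the difference, with made-up matrices and hypothetical helper names (`apply_homography` and `apply_affine_3d` are not from any library): a 3x3 homography only ever sees (x, y, w) and ends in a perspective divide, while a 3D affine map takes and returns a real Z.

```python
import numpy as np

def apply_homography(H, xy):
    """Map a 2D point through a 3x3 homography (homogeneous 2D)."""
    x, y, w = H @ np.array([xy[0], xy[1], 1.0])
    return np.array([x / w, y / w])  # perspective divide; there is no Z anywhere

def apply_affine_3d(A, t, xyz):
    """Map a 3D point through the affine transform x -> A x + t."""
    return A @ np.asarray(xyz, dtype=float) + t

# Illustrative values only.
H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 2.0]])
uv = apply_homography(H, (4.0, 6.0))

A = np.eye(3)
t = np.array([0.0, 0.0, 5.0])
xyz = apply_affine_3d(A, t, (1.0, 2.0, 3.0))

print("homography output (2D):", uv)
print("affine output (3D):    ", xyz)
```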
“First of all, 3 points are too little to recover affine transformation -- you need 4 points. For N-dimensional space there is a simple rule: to unambiguously recover affine transformation you should know images of N+1 points that form a simplex --- triangle for 2D, pyramid for 3D, etc. With 3 points you could only retrieve 2D affine transformation. A good explanation of why this is the case you may find in "Beginner's guide to mapping simplexes affinely".
Regarding some retrieval algorithm. I'm afraid, I don't know Matlab to provide you with appropriate code, but I worked with Python a little bit, maybe this code can help (sorry for bad codestyle -- I'm mathematician, not programmer)
import numpy as np
# input data
ins = [[1, 1, 2], [2, 3, 0], [3, 2, -2], [-2, 2, 3]] # <- points
out = [[0, 2, 1], [1, 2, 2], [-2, -1, 6], [4, 1, -3]] # <- mapped to
# calculations
l = len(ins)
B = np.vstack([np.transpose(ins), np.ones(l)])
D = 1.0 / np.linalg.det(B)
entry = lambda r,d: np.linalg.det(np.delete(np.vstack([r, B]), (d+1), axis=0))
M = [[(-1)**i * D * entry(R, i) for i in range(l)] for R in np.transpose(out)]
A, t = np.hsplit(np.array(M), [l-1])
t = np.transpose(t)[0]
# output
print("Affine transformation matrix:\n", A)
print("Affine transformation translation vector:\n", t)
# unittests
print("TESTING:")
for p, P in zip(np.array(ins), np.array(out)):
    image_p = np.dot(A, p) + t
    result = "[OK]" if np.allclose(image_p, P) else "[ERROR]"
    print(p, " mapped to: ", image_p, " ; expected: ", P, result)
This code demonstrates how to recover affine transformation as matrix and vector and tests that initial points are mapped to where they should. It is based on equation presented in "Beginner's guide to mapping simplexes affinely", matrix recovery is described in section "Recovery of canonical notation". The same authors published "Workbook on mapping simplexes affinely" that contains many practical examples of this kind.”
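The quoted snippet solves the exact case of N+1 points. In a SLAM setting I would expect many more correspondences, and noisy ones, so a least-squares fit is probably more practical. Here is a minimal sketch of that alternative. It is not the method from the quoted answer, and `fit_affine` is a name I made up. It reuses the answer's test points so the result can be checked against them.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine fit dst ~= A @ src + t for N-dimensional points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[1]
    # Append a column of ones so the translation is solved together with A.
    X = np.hstack([src, np.ones((src.shape[0], 1))])
    # Solve X @ P = dst in the least-squares sense; P stacks A^T on top of t.
    P, _, rank, _ = np.linalg.lstsq(X, dst, rcond=None)
    if rank < n + 1:
        raise ValueError("points are degenerate: they do not span a simplex")
    return P[:n].T, P[n]

# Same points as in the quoted answer.
ins = [[1, 1, 2], [2, 3, 0], [3, 2, -2], [-2, 2, 3]]
out = [[0, 2, 1], [1, 2, 2], [-2, -1, 6], [4, 1, -3]]

A, t = fit_affine(ins, out)
print("A:\n", A)
print("t:", t)
print("all points mapped correctly:", np.allclose(np.asarray(ins) @ A.T + t, out))
```

With exactly N+1 non-degenerate points this should agree with the determinant-based recovery. With more points it averages out the noise instead of trusting any single simplex.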
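Once you have a correct and verifiable affine transformation matrix, you should be able to decompose it into the rotation and translation of the camera quite easily: t is the translation, and A is the rotation if the motion is rigid. With noisy points A will not be exactly orthogonal, so take the closest rotation to A instead (e.g. via SVD).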
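Here is a hedged sketch of that decomposition. It assumes the recovered A is a rotation plus a bit of noise (no real scale or shear) and uses the SVD to find the closest proper rotation. `nearest_rotation`, the test matrix and the noise values are all made up for illustration, and the convention p_cam = R @ p_world + t is my assumption, so flip it if your pipeline defines the pose the other way.

```python
import numpy as np

def nearest_rotation(A):
    """Closest proper rotation to A in the Frobenius sense, via SVD."""
    U, _, Vt = np.linalg.svd(A)
    # Flip the last axis if needed so that det(R) = +1 (no reflection).
    d = np.sign(np.linalg.det(U @ Vt))
    return U @ np.diag([1.0, 1.0, d]) @ Vt

# Hypothetical recovered transform: 30 degrees about Z plus small fixed noise.
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
noise = 1e-3 * np.array([[1.0, -2.0,  0.0],
                         [0.0,  1.0,  3.0],
                         [2.0,  0.0, -1.0]])
A = R_true + noise
t = np.array([0.5, -1.0, 2.0])

R = nearest_rotation(A)
# How far A is from being a pure rotation; a large value means scale or shear.
residual = np.linalg.norm(A - R)
# Assumed convention: p_cam = R @ p_world + t, so the camera sits at -R^T t.
camera_position = -R.T @ t

print("R:\n", R)
print("residual:", residual)
print("camera position in world frame:", camera_position)
```

If the residual is not small, the point correspondences are probably not related by a rigid motion and the pose should not be trusted.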