On the OAK-D, given an (x, y) pixel in e.g. a 1024x1024 color video frame (setVideoSize), how do I find the corresponding (x, y) pixel in the stereo frame (same field of view)?
Is there an easy way to do this, or does it have to be done from scratch by factoring in the color camera cropping plus the difference in field of view?
Mapping color and stereo cameras
This article seems to explain it:
https://docs.luxonis.com/projects/api/en/latest/samples/StereoDepth/rgb_depth_aligned/#rgb-depth-alignment
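To sketch what the linked sample does: the StereoDepth node can align its depth output to the RGB camera's viewpoint, so a pixel (x, y) in the color frame corresponds directly to (x, y) in the depth frame. This is a minimal, untested pipeline fragment (no device I/O shown); socket names follow the Gen2 API.

```python
import depthai as dai

pipeline = dai.Pipeline()

cam_rgb = pipeline.create(dai.node.ColorCamera)
cam_rgb.setBoardSocket(dai.CameraBoardSocket.RGB)

mono_left = pipeline.create(dai.node.MonoCamera)
mono_left.setBoardSocket(dai.CameraBoardSocket.LEFT)
mono_right = pipeline.create(dai.node.MonoCamera)
mono_right.setBoardSocket(dai.CameraBoardSocket.RIGHT)

stereo = pipeline.create(dai.node.StereoDepth)
# Key call: warp the depth map into the RGB sensor's perspective,
# so color and depth share the same pixel coordinates.
stereo.setDepthAlign(dai.CameraBoardSocket.RGB)

mono_left.out.link(stereo.left)
mono_right.out.link(stereo.right)

xout_depth = pipeline.create(dai.node.XLinkOut)
xout_depth.setStreamName("depth")
stereo.depth.link(xout_depth.input)
```

With alignment enabled, the cropping/FOV bookkeeping from the question is handled on-device.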
Hi erik
I assume it takes a little time before an ROI update sent through SpatialLocationCalculatorConfig makes it to the calculator output. I read in one post that a way to "id" a config is to make minor changes to depthThresholds and check for the matching values on the output.
I'm doing a detection step on an RGB frame first and then using the results as ROIs in the spatial calculator. That introduces a delay: the left/right frames corresponding to the RGB frame used for detection are gone before the config update arrives, so the spatial info is based on newer frames. Is there a way to compensate for that so the "frame ages" match?
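The depthThresholds tagging trick mentioned above could look roughly like this. This is a hypothetical helper (the function name and the modulo-based tag encoding are my own), not an official API pattern; it just jitters lowerThreshold by a few millimeters so each config can be recognized on the output side.

```python
import depthai as dai

def make_tagged_config(roi, seq_num, base_lower=100):
    """Hypothetical helper: encode the detection frame's sequence number
    into the depth lowerThreshold (mm) so the matching calculator output
    can be identified later. roi is (xmin, ymin, xmax, ymax), normalized."""
    cfg_data = dai.SpatialLocationCalculatorConfigData()
    # The tag: a few-mm jitter on the threshold, harmless for real depth values.
    cfg_data.depthThresholds.lowerThreshold = base_lower + (seq_num % 50)
    cfg_data.depthThresholds.upperThreshold = 10000
    cfg_data.roi = dai.Rect(dai.Point2f(roi[0], roi[1]),
                            dai.Point2f(roi[2], roi[3]))
    cfg = dai.SpatialLocationCalculatorConfig()
    cfg.addROI(cfg_data)
    return cfg

# On the output side, each spatial location carries its config back,
# so the tag can be recovered:
#   for loc in spatial_data.getSpatialLocations():
#       tag = loc.config.depthThresholds.lowerThreshold - base_lower
```

This only identifies which config produced an output; it does not by itself fix the frame-age mismatch described above.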
Hi dexter ,
Since you are doing detection + spatial calc (instead of the Spatial detection node), I assume your model isn't Yolo/SSD and you have to do decoding on the host before you can calculate coordinates? Another option would be to calculate spatial coordinates on the host (demo here). Would that work?
Thanks, Erik
Hi dexter ,
A team member was just looking into this (porting gen2-ocr demo), but we decided we will be decoding it on the host for now. There's no demo for c++ available, but porting it from python should be quite simple, at least from depthai perspective, as API is 1:1 (python and c++) . Thoughts?
Thanks, Erik
Thanks erik
I'll try porting it, just checking to save myself some time (it took a while to port the detection and OCR parts).
Will the sequence numbers be the same for the color and spatial frames, or do they need to be synced some other way?
Perhaps I could even warp the perspective on the spatial frame to get values from rotated rects...
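For the rotated-rect idea, an alternative to warping the depth image is to rotate the pixel coordinates into the rect's own frame and mask-select the pixels. A sketch under that assumption (function name and parameters are mine):

```python
import numpy as np

def rotated_rect_depth(depth, center, size, angle_deg):
    """Median depth inside a rotated rectangle, without warping the image.
    center=(cx, cy) and size=(width, height) in pixels; angle in degrees."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    cx, cy = center
    a = np.deg2rad(angle_deg)
    # Express every pixel in the rect's rotated coordinate frame.
    u = (xs - cx) * np.cos(a) + (ys - cy) * np.sin(a)
    v = -(xs - cx) * np.sin(a) + (ys - cy) * np.cos(a)
    rw, rh = size
    inside = (np.abs(u) <= rw / 2.0) & (np.abs(v) <= rh / 2.0)
    vals = depth[inside & (depth > 0)]
    return float(np.median(vals)) if vals.size else None
```

This avoids resampling artifacts that a perspective warp would introduce into the depth values.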
erik
I'm buffering color frames already for looking up the ones used by detection (as in the python demo).
I was thinking I could buffer the spatial frames as well and try to find the best match for the color frame used. Detection already uses the sequence number for this, so the question is how best to find a spatial result that matches the time of the color frame.
I tried buffering the stereo frames and matching them by timestamp; the time difference seems to be around 20 ms or better. This only works if the StereoDepth node's output timestamp comes from its input frames' timestamps. Does it? (I couldn't find this in the docs.)
Since I already have the code for the SpatialLocationCalculator, I'll try feeding these buffered frames to it before converting the code to do the calculation on the host.
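The timestamp-matching buffer described above can be kept quite small. A minimal sketch, assuming timestamps as float seconds (the real depthai API hands back datetime/timedelta objects, so you would convert first) and the 20 ms tolerance mentioned:

```python
from collections import deque

class FrameMatcher:
    """Ring buffer of (timestamp, frame) pairs with closest-match lookup."""

    def __init__(self, maxlen=30):
        self.buf = deque(maxlen=maxlen)

    def add(self, ts, frame):
        self.buf.append((ts, frame))

    def closest(self, ts, tolerance=0.020):
        """Return the buffered (ts, frame) nearest to ts, or None if
        nothing falls within the tolerance (seconds)."""
        best = min(self.buf, key=lambda e: abs(e[0] - ts), default=None)
        if best is not None and abs(best[0] - ts) <= tolerance:
            return best
        return None
```

A deque with maxlen keeps memory bounded while frames age out automatically.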
Hi dexter ,
"StereoDepth node output time stamp comes from its input frame time stamps, does it?"
Yes, that's correct. I would also suggest checking the SW message syncing docs; they should be quite helpful in your situation.
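The core idea in the message-syncing docs is to collect messages per sequence number until every stream has contributed one. A simplified sketch of that pattern for two streams (class and stream names are mine, not from the docs):

```python
class SeqSync:
    """Pair up messages from two streams by sequence number."""

    def __init__(self, streams=("rgb", "depth")):
        self.streams = set(streams)
        self.msgs = {}  # seq -> {stream_name: message}

    def add(self, stream, seq, msg):
        """Store a message; return the {stream: msg} pair once both
        streams have arrived for this seq, else None."""
        self.msgs.setdefault(seq, {})[stream] = msg
        if self.streams.issubset(self.msgs[seq]):
            pair = self.msgs.pop(seq)
            # Drop stale entries that can never complete anymore.
            for s in list(self.msgs):
                if s < seq:
                    del self.msgs[s]
            return pair
        return None
```

This assumes both streams share a sequence-number domain, which is the case for the color and stereo outputs on one device.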
Thanks, Erik