Hello, I am working on a project using an OAK-D Pro in which I want to track the position of people within a room and eventually trigger events based on that information. I have a lot of it figured out: I know that the upper-left corner of the frame is (0, 0) and that the MobileNet SSD spatial coordinates use the center of the frame as (0, 0, 0). I haven't included calibration in my code just yet, but I know I can use it to transform the detections' spatial coordinates into world coordinates relative to the calibration image's location.
What I haven't been able to get a handle on is how to pull the min and max x,y,z measurements from the depth stream so that I can eventually divide the room into regions for triggers.
Say I have a 10x10 room that I want divided into 4 regions (let's ignore how resolution cropping would change how much of the room the camera can see). I would need at least the minimum and maximum x and z measurements (in camera space), or x and y measurements (in world space), to compare against the coordinates of the tracked people, wouldn't I?
I'm sure I could hard-code the room size and do something like setting a trigger to True if a1 <= detection.spatialCoordinates.x <= a2 and b1 <= detection.spatialCoordinates.z <= b2.
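Here's roughly what I mean, as a minimal sketch (the region names, the bounds, region_for itself, and the assumption that the camera sits centered on one wall looking down the long axis are all placeholders I made up; units are millimeters, since that's what spatialCoordinates uses):

```python
# Hard-coded 4-region split of a hypothetical 10 m room, camera centered
# on one wall. All bounds are placeholder values in millimeters.
REGIONS = {
    "front_left":  {"x": (-5000, 0),  "z": (0, 5000)},
    "front_right": {"x": (0, 5000),   "z": (0, 5000)},
    "back_left":   {"x": (-5000, 0),  "z": (5000, 10000)},
    "back_right":  {"x": (0, 5000),   "z": (5000, 10000)},
}

def region_for(detection):
    # detection.spatialCoordinates comes from the spatial detection network
    x = detection.spatialCoordinates.x
    z = detection.spatialCoordinates.z
    for name, bounds in REGIONS.items():
        (x1, x2), (z1, z2) = bounds["x"], bounds["z"]
        if x1 <= x <= x2 and z1 <= z <= z2:
            return name
    return None  # detection falls outside every region
```

I'd then call region_for(det) on each detection coming out of the spatial detection network and fire the corresponding trigger.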
Ideally, though, the bounds would be determined either on the fly or during the calibration setup, so I could, for example, move the camera into an 8x12 room without rewriting my code. I'm pretty flexible about how I implement this, and I would have access to the rooms ahead of time if your suggestion involves putting something physical in the room.
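The only direction I've thought of so far is to back-project every valid depth pixel through the pinhole model during a setup step and take the min/max of the result. Something like this rough sketch, where depth_extents is just a name I made up, the depth frame is assumed to already be a numpy array in millimeters, and fx/fy/cx/cy would come from the device's readCalibration() data:

```python
import numpy as np

def depth_extents(depth_frame, fx, fy, cx, cy):
    """Back-project every valid depth pixel through the pinhole model and
    return the (min, max) of x, y, z in camera space, in the same mm units
    as the depth frame."""
    h, w = depth_frame.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth_frame.astype(np.float32)
    valid = z > 0            # 0 means "no depth data" in the depth stream
    z = z[valid]
    x = (us[valid] - cx) * z / fx
    y = (vs[valid] - cy) * z / fy
    return (x.min(), x.max()), (y.min(), y.max()), (z.min(), z.max())
```

Is something like that a sensible way to get the extents, or is there a more direct way to pull them from the depth stream?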
Thanks!