Hi Kristoffer,
StereoDepth will give you a depth map for every pixel in the frame.
YoloSpatialDetectionNetwork will give you the bounding box for each detected object. Along with the bbox, you will also get the spatial coordinates (x, y, z), which update as that object moves.
So if you want to track a stationary object, use the depth map. If the object moves, use YoloSpatialDetectionNetwork (Spatial Yolo).
Thanks,
Jaka