For orientation, I've had some success turning the ROTATION_VECTOR quaternion from the IMU into a rotation matrix and applying it to the [y, x, z] coordinates given by YoloSpatialDetectionNetwork. This transforms from the camera's reference frame to a frame where one horizontal axis (x or y) points north and z is aligned with gravity. Using the pyquaternion library, something like:
import numpy as np
from pyquaternion import Quaternion

camera_xyz = np.array([y, x, z])  # perhaps the IMU sensor is on sideways, or I have a minus sign wrong somewhere?
quat = Quaternion(rotation_vector_from_imu)  # pyquaternion expects (w, x, y, z) ordering
rot = quat.rotation_matrix  # 3x3 rotation from the camera frame to the world frame
map_xyz = rot @ camera_xyz
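
Putting that together, here is a rough end-to-end sketch; the rotationVector .real/.i/.j/.k fields on the IMU packet and detection.spatialCoordinates on the detection are my reading of the DepthAI API, and the axis swap and signs may still need adjusting as noted above:

import numpy as np
from pyquaternion import Quaternion

def camera_to_world(rotation_vector, detection):
    # Quaternion from the IMU ROTATION_VECTOR report; pyquaternion wants w first.
    quat = Quaternion(rotation_vector.real, rotation_vector.i,
                      rotation_vector.j, rotation_vector.k)
    # Spatial coordinates from the detection (millimetres, as I understand the API),
    # swapped to [y, x, z] to match the sensor mounting noted above.
    camera_xyz = np.array([detection.spatialCoordinates.y,
                           detection.spatialCoordinates.x,
                           detection.spatialCoordinates.z])
    # Rotate from the camera frame into the north/gravity-referenced frame.
    return quat.rotation_matrix @ camera_xyz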