Hi all,
Total noob here, and I’m trying to vibe-code my way through this. I’m working on a research project where we need to stream a live stereo feed (left + right) from an OAK-D Pro PoE into a host application (Unity, for XR development).
Hardware setup:
Architecture goal:
Use the Raspberry Pi as a bridge (my uni laptop is giving me some WSL / user-rights issues).
Capture the stereo streams (left + right) from the OAK-D Pro PoE.
Forward them to the host machine in a format suitable for real-time consumption in Unity.
No heavy on-device NN required, just reliable stereo video delivery.
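To make the forwarding step concrete, here is a rough sketch of how I imagine the Pi could package one left/right pair before sending it on to the Unity host. The header layout, the `STER` magic value, and the function names are all made up for illustration; none of this is a DepthAI or Unity format, and it assumes 8-bit grayscale frames of equal size.

```python
import struct

# Hypothetical header (not a DepthAI or Unity format):
# magic, sequence number, timestamp in microseconds, width, height.
HEADER = struct.Struct("<4sIQHH")
MAGIC = b"STER"


def pack_stereo(seq, ts_us, width, height, left, right):
    """Pack one synchronized left/right pair into a single message."""
    expected = width * height  # assumes 8-bit grayscale, 1 byte per pixel
    if len(left) != expected or len(right) != expected:
        raise ValueError("each frame must be width * height bytes")
    return HEADER.pack(MAGIC, seq, ts_us, width, height) + left + right


def unpack_stereo(msg):
    """Inverse of pack_stereo; what the receiving side would do."""
    magic, seq, ts_us, width, height = HEADER.unpack_from(msg, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    n = width * height
    if len(msg) != HEADER.size + 2 * n:
        raise ValueError("truncated or oversized message")
    left = msg[HEADER.size:HEADER.size + n]
    right = msg[HEADER.size + n:]
    return seq, ts_us, width, height, left, right
```

Keeping both frames in one message with a shared sequence number and timestamp is meant to stop the left and right images from drifting apart on the receiving side, but I don't know if this is the right approach.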
What I’m trying to understand:
What is the recommended DepthAI pipeline for streaming a stereo pair (left + right) from an RVC2 PoE device to host?
Is it better to use:
MonoCamera nodes for left/right?
StereoDepth node outputs?
ISP/video/preview paths for stereo RGB?
What transport format is recommended for real-time host consumption (XLinkOut raw frames, H264/H265 via VideoEncoder, etc.)?
Any best practices for low-latency stereo streaming from PoE devices?
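For context on the transport question, this is my back-of-envelope estimate of what raw, unencoded stereo frames would cost in bandwidth. The numbers are assumptions on my side: 8-bit grayscale frames, the 640x400 and 1280x800 mono modes, and a nominal 1000 Mbit/s link with no protocol overhead counted.

```python
LINK_MBPS = 1000  # assumed nominal gigabit link, ignoring protocol overhead


def raw_stereo_mbps(width, height, fps, bytes_per_pixel=1, cameras=2):
    """Raw payload bandwidth in Mbit/s for uncompressed frames."""
    bits_per_second = width * height * bytes_per_pixel * 8 * fps * cameras
    return bits_per_second / 1e6


if __name__ == "__main__":
    modes = {"400p": (640, 400), "800p": (1280, 800)}
    for name, (w, h) in modes.items():
        for fps in (30, 60):
            mbps = raw_stereo_mbps(w, h, fps)
            share = 100 * mbps / LINK_MBPS
            print(f"{name} @ {fps} fps: {mbps:.2f} Mbit/s ({share:.0f}% of link)")
```

By this estimate, raw 1280x800 stereo at 30 fps is already about 492 Mbit/s, and 60 fps comes close to saturating the link, which is why I'm wondering whether encoding on the device is the expected route.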
The end goal is a stable, low-latency stereo feed suitable for XR visualization and research experimentation.
Any architectural guidance or example pipelines would be greatly appreciated.
Thanks in advance.