Hi The_Real_Enrico_Pallazzo,
Codec: H.265 compresses video more than H.264 at comparable quality, so it needs less storage, but it requires more processing power to encode and decode. H.264 produces larger files but is cheaper to process. Since you plan to save the video for later training and you have a 2TB SSD, storage is unlikely to be the bottleneck, so it might be better to use H.264 and reduce the processing load on your devices. This should also help real-time performance, as H.264 typically adds less encode and decode latency than H.265.
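To put the storage side into numbers, here is a rough sketch of how many hours of video fit on the SSD at a constant bitrate. The two bitrates are made-up values for illustration, not measurements from the OAK-D; replace them with the bitrate you actually configure on the encoder.

```python
def recording_hours(capacity_tb: float, bitrate_mbps: float) -> float:
    """Hours of video that fit in capacity_tb at a constant bitrate_mbps."""
    capacity_bits = capacity_tb * 1e12 * 8           # decimal terabytes -> bits
    seconds = capacity_bits / (bitrate_mbps * 1e6)   # bits / (bits per second)
    return seconds / 3600


# Hypothetical bitrates, for illustration only.
for label, mbps in [("H.264 at 8 Mbps", 8.0), ("H.265 at 4 Mbps", 4.0)]:
    print(f"{label}: about {recording_hours(2.0, mbps):.0f} hours on 2 TB")
```

Even at the higher of these assumed bitrates, a 2TB drive holds several hundred hours, which is why the extra compression of H.265 may not be worth the extra processing here.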
Resolution: The higher the resolution, the more data needs to be transferred and processed. If real-time performance is crucial, you might want to start with 720p and check whether the performance is acceptable. If it is, you can try increasing the resolution to 1080p.
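As a rough illustration of how much more work 1080p is, this sketch compares the uncompressed data rate of the two resolutions. It assumes 30 fps and 1.5 bytes per pixel (an NV12/YUV420 layout); both are assumptions, so adjust them to your pipeline.

```python
def raw_mbytes_per_second(width: int, height: int, fps: float,
                          bytes_per_pixel: float = 1.5) -> float:
    """Uncompressed data rate in MB/s (1.5 bytes/pixel assumes NV12/YUV420)."""
    return width * height * bytes_per_pixel * fps / 1e6


rate_720p = raw_mbytes_per_second(1280, 720, 30)     # assumed 30 fps
rate_1080p = raw_mbytes_per_second(1920, 1080, 30)   # assumed 30 fps
print(f"720p:  {rate_720p:.1f} MB/s")
print(f"1080p: {rate_1080p:.1f} MB/s")
print(f"1080p carries {rate_1080p / rate_720p:.2f}x the pixels of 720p")
```

The encoded stream is far smaller than this, but the pixel count is what the encoder, the decoder, and any inference on full frames have to work through, so going from 720p to 1080p more than doubles that load.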
Saving the Video: Writing video to disk can be I/O intensive, especially at higher resolutions and frame rates. Make sure your storage can sustain the required write rate. If you find you cannot write video fast enough, you can move the writes to a separate thread so that disk I/O does not block the capture loop.
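A minimal sketch of the multithreading idea, using only the Python standard library: the capture loop puts encoded chunks on a queue, and a background thread writes them to disk. The names writer_loop and start_writer and the queue size are hypothetical choices for illustration, not part of any camera SDK.

```python
import queue
import threading


def writer_loop(path, chunks):
    """Drain encoded chunks from the queue and append them to a file."""
    with open(path, "wb") as f:
        while True:
            chunk = chunks.get()
            if chunk is None:   # sentinel: the producer has finished
                break
            f.write(chunk)


def start_writer(path, max_pending=256):
    """Start the background writer; returns the queue and the thread."""
    chunks = queue.Queue(maxsize=max_pending)   # bounded so memory cannot grow forever
    thread = threading.Thread(target=writer_loop, args=(path, chunks), daemon=True)
    thread.start()
    return chunks, thread
```

In the capture loop you would call chunks.put(data) for each encoded packet, then chunks.put(None) followed by thread.join() when recording stops. Because the queue is bounded, put() blocks once the disk falls too far behind; if you would rather drop data than stall the capture, put_nowait() with a queue.Full handler is one alternative.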
Latency: Minimize the number of intermediate steps and devices between the camera and the Jetson Orin Nano to reduce latency.
A possible setup:
Connect the OAK-D S2 PoE camera and the Nvidia Jetson Orin Nano to the PoE switch.
Configure the OAK-D camera to stream video using H.264 codec at 720p or 1080p resolution.
Run the hand gesture recognition on the OAK-D camera itself.
On the Nvidia Jetson Orin Nano, run a program that receives the video stream, performs the required inference, and saves the video to disk.
Note: I am not aware of a spreadsheet or datasheet that benchmarks the different streaming types and modes against each other. You might have to do some testing yourself to see what works best for your setup.
Thanks,
Jaka