I recently purchased a Luxonis camera for obstacle detection. I'm considering modifying the YOLOv8 architecture so that it takes a depth map as a fourth input channel alongside the three RGB channels, mostly to handle objects that are similarly colored to the background. However, I haven't found much mention of this online. Has anyone heard of this being done before? Is it possible, and would it be worth the effort?

    I believe it is possible, but you'd need to use a NeuralNetwork node instead of the Yolo node. Since the NeuralNetwork node returns raw output tensors, you will also need custom decoding.
    Would it be worth the effort? Sure, if you already have the model and it has shown better accuracy than the standard RGB models.
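
To give a rough idea of what custom decoding could look like on the host, here is a minimal sketch. It assumes the model is an Ultralytics-style YOLOv8 export whose raw output has shape (1, 4 + num_classes, num_anchors), where each column holds cx, cy, w, h in input-image pixels followed by per-class scores that are already sigmoid-activated. The function name `decode_yolov8` and the threshold value are hypothetical, and non-maximum suppression is left out, so check the layout against your own exported model before relying on it.

```python
import numpy as np

def decode_yolov8(raw, conf_thres=0.25):
    """Decode a raw (1, 4 + nc, N) tensor into xyxy boxes, scores and class ids.

    Assumed layout (verify for your export): rows 0..3 are cx, cy, w, h in
    pixels, remaining rows are per-class scores in [0, 1]. No NMS is applied.
    """
    preds = raw[0].T                       # (N, 4 + nc): one row per candidate
    boxes_cxcywh = preds[:, :4]
    class_scores = preds[:, 4:]

    class_ids = class_scores.argmax(axis=1)  # best class per candidate
    scores = class_scores.max(axis=1)        # score of that class
    keep = scores >= conf_thres              # drop low-confidence candidates

    cx, cy, w, h = boxes_cxcywh[keep].T
    # Convert center/size boxes to corner format (x1, y1, x2, y2).
    boxes_xyxy = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
    return boxes_xyxy, scores[keep], class_ids[keep]


# Usage with a synthetic tensor: 2 classes, 3 candidates, one confident hit.
raw = np.zeros((1, 6, 3), dtype=np.float32)
raw[0, :4, 1] = [100.0, 50.0, 20.0, 10.0]  # cx, cy, w, h
raw[0, 5, 1] = 0.9                         # score for class index 1
boxes, scores, class_ids = decode_yolov8(raw)
```

In a real pipeline the `raw` array would come from the NeuralNetwork node's output after reshaping it to the model's output dimensions, and you would run NMS on the surviving boxes.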
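
If you want to test whether depth helps before doing any on-device work, one possible approach (my assumption, not something confirmed in this thread) is to widen the first convolution from 3 to 4 input channels, reuse the pretrained RGB weights, initialize the new depth channel from their mean, and then fine-tune on RGB-D data. The NumPy sketch below only shows the array manipulation involved. The function names, the 10 m depth cap, and the assumption that depth arrives as uint16 millimeters aligned to the RGB frame are all illustrative.

```python
import numpy as np

def widen_first_conv(w_rgb):
    """Extend conv weights from (out, 3, kH, kW) to (out, 4, kH, kW).

    The extra (depth) input channel is initialized with the mean of the
    three RGB filters so the pretrained features are kept as a starting point.
    """
    if w_rgb.ndim != 4 or w_rgb.shape[1] != 3:
        raise ValueError("expected weights of shape (out, 3, kH, kW)")
    depth_w = w_rgb.mean(axis=1, keepdims=True)      # (out, 1, kH, kW)
    return np.concatenate([w_rgb, depth_w], axis=1)  # (out, 4, kH, kW)

def stack_rgbd(rgb, depth_mm, max_depth_mm=10000.0):
    """Build a (4, H, W) float32 input from an RGB frame and a depth map.

    rgb:      (H, W, 3) uint8 image.
    depth_mm: (H, W) depth map in millimeters (assumed aligned to rgb).
    Both are scaled to [0, 1]; depth beyond max_depth_mm is clipped.
    """
    rgb_f = rgb.astype(np.float32) / 255.0
    depth_f = np.clip(depth_mm.astype(np.float32) / max_depth_mm, 0.0, 1.0)
    rgbd = np.concatenate([rgb_f, depth_f[..., None]], axis=2)  # (H, W, 4)
    return rgbd.transpose(2, 0, 1)                              # (4, H, W)


# Usage with placeholder data standing in for real weights and frames.
w_rgb = np.random.rand(16, 3, 3, 3).astype(np.float32)
w_rgbd = widen_first_conv(w_rgb)

rgb = np.full((2, 2, 3), 255, dtype=np.uint8)
depth_mm = np.array([[0, 5000], [10000, 20000]], dtype=np.uint16)
rgbd = stack_rgbd(rgb, depth_mm)
```

Comparing the fine-tuned 4-channel model against the plain RGB baseline on your own obstacle data is what would tell you whether the extra effort pays off.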
