Hello,

I have a question about CNN training. Is it possible to include depth in a CNN directly, so that the network can estimate the size of an object?

For example: imagine I have two objects that look identical except for their size, say a red tennis ball and a red basketball. If it were possible to include depth data in the neural network, we could use the depth information to estimate the object's size and detect what it is. I could then tell whether it is a nearby tennis ball or a distant basketball.

Thank you!

Best regards

Igor

    Hey IgorMasin,

    We don't have an example right now, but something like that would definitely be possible. You could expand a standard object detection model to take an input with 4 channels (RGB-D) instead of 3, though I'm not sure how well this would work in practice. Another possibility would be to add a separate backbone for the depth features and fuse them with the RGB features, and see whether that improves the object detector. There are a few papers focusing on RGB-D datasets.
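    Just to sketch the 4-channel idea: the usual trick is to concatenate an aligned depth map as a fourth input channel, and to expand the first conv layer's pretrained weights from 3 to 4 input channels (e.g. initializing the new depth channel with the mean of the RGB weights so the pretrained features are preserved at the start of training). The snippet below is a framework-free numpy sketch of that weight surgery; the shapes and the 64-filter 7x7 first layer are just illustrative assumptions, not an existing API.

```python
import numpy as np

# Hypothetical image/weight shapes for illustration only; in a real model
# the kernel would come from a pretrained backbone in your framework.
rgb = np.random.rand(480, 640, 3).astype(np.float32)    # RGB image (HWC)
depth = np.random.rand(480, 640, 1).astype(np.float32)  # aligned depth map

# 1) Stack depth as a fourth input channel.
rgbd = np.concatenate([rgb, depth], axis=-1)            # (480, 640, 4)

# 2) Expand a pretrained first-layer conv kernel from 3 to 4 input channels.
#    Common trick: initialize the depth channel with the mean of the RGB
#    weights so the pretrained filters still respond sensibly at step 0.
w_rgb = np.random.rand(64, 3, 7, 7).astype(np.float32)  # (out, in, kh, kw)
w_depth = w_rgb.mean(axis=1, keepdims=True)             # (64, 1, 7, 7)
w_rgbd = np.concatenate([w_rgb, w_depth], axis=1)       # (64, 4, 7, 7)
```

    The rest of the detector can stay unchanged, since only the first layer sees the extra channel.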

    While we don't have any examples yet, it's definitely on our TODO list, but at the moment I can't say how soon we could make one.