Hello Luxonis team!

I'm currently working with the OAK-D Lite and I'm trying to run a neural network on the camera. With this model, I want to use a mono camera (left or right) as the input to my neural network.

My model's input has 3 channels (even with the mono camera I want 3 channels, because the model was trained on both grayscale and RGB images) and expects values between -1 and 1. I apply the normalization during the ONNX-to-IR conversion using --mean_values [127.5,127.5,127.5] and --scale_values [127.5,127.5,127.5], as mentioned in the documentation here.
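To double-check the arithmetic: those mean/scale values map U8 pixels from [0, 255] onto [-1, 1], since the converted model applies (x - mean) / scale at the input. A minimal numpy sketch:

```python
import numpy as np

# U8 pixel values span 0..255
pixels = np.array([0.0, 127.5, 255.0], dtype=np.float32)

# OpenVINO's Model Optimizer bakes (x - mean_values) / scale_values
# into the network input, so this is effectively what happens on-device
normalized = (pixels - 127.5) / 127.5

print(normalized)  # -> [-1.  0.  1.]
```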

When I tested with the mono camera (left/right) as input, the model just output random values. I tested the same blob with the RGB camera and it worked normally. In my case, how can I correctly normalize the image from the mono camera?

Thanks in advance for your support!

    Hi AnhTuNguyen
    Could you add a minimal code sample for the pipeline? Maybe the dimensions are off?

    Did you set the frame type using an ImageManip node?
    # The NN model expects BGR input. By default ImageManip output type would be same as input (gray in this case)
    manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)

    Thanks,
    Jaka

      5 days later

      jakaskerl Hello!

      Sorry for the delay.
      This is my pipeline code:

      def create_mono(pipeline: dai.Pipeline, socket):
          cam = pipeline.createMonoCamera()
          cam.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
          cam.setBoardSocket(socket)

          # ImageManip to resize to the NN input size and to change the frame type
          manip = pipeline.create(dai.node.ImageManip)
          manip.initialConfig.setResize(args.HW[1], args.HW[0])
          manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
          cam.out.link(manip.inputImage)
          return cam, manip
      
      # setup input cameras
      cam_left, processed_left = create_mono(pipeline, dai.CameraBoardSocket.LEFT)
      cam_right, processed_right = create_mono(pipeline, dai.CameraBoardSocket.RIGHT)
      # create left output node
      xout_left = pipeline.createXLinkOut()
      xout_left.setStreamName("left")
      cam_left.out.link(xout_left.input)
      # create right output node
      xout_right = pipeline.createXLinkOut()
      xout_right.setStreamName("right")
      cam_right.out.link(xout_right.input)
      
      # setup neural network node
      model = pipeline.create(dai.node.NeuralNetwork)
      model.setBlob(args.blob_path)
      model.setNumInferenceThreads(2)
      processed_left.out.link(model.inputs["frame1"])
      processed_right.out.link(model.inputs["frame2"])

        Hi AnhTuNguyen
        The code looks fine at first glance. Could you route the ImageManip output directly through XLink and display its values to see if they are correctly scaled?

        Thanks,
        Jaka

          jakaskerl

          Hello!

          I tested it and the values are correctly scaled.

          I think there is a bug in the manip. Here is how I tested:
          - In addition to its predictions, my model also outputs the input image it received, so I can see exactly what the model gets inside the DepthAI pipeline. The input image looked very strange.

          • From left to right: left camera - right camera - the input frame my model received.
          • For reference: I have an OAK-D Lite (RVC2); the DepthAI version is 2.21.2.0.

            Hi AnhTuNguyen
            So can you confirm that the manip output (sent through XLink) and the NN passthrough differ? Maybe draw them both side by side if you can. Also, what are the H and W of the rightmost image?

            Thanks,
            Jaka

              jakaskerl

              Hi!
              That was the figure I sent above: manip left - manip right - NN input. The NN input should be exactly the same as manip left (the first image), but somehow it was corrupted into the strange image in my figure.

              I tried with the color camera and the rightmost image was correct --> I only have the problem with input coming from the mono camera + manip.

              I used 160x160 for the manip output and the model input.

              jakaskerl

              Hi!
              I found the problem. Since I use the mono camera as input to my neural network, I must specify -ip U8 when compiling the OpenVINO IR to a blob. My model works well now.
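              For reference, my compile command now looks roughly like this (the model path and shave/CMX counts are placeholders for my setup):

```shell
# -ip U8 tells the compiler the input tensor arrives as 8-bit pixels,
# so the blob converts U8 -> FP16 internally instead of expecting FP16 data
compile_tool -m model.xml \
    -d MYRIAD \
    -ip U8 \
    -VPU_NUMBER_OF_SHAVES 4 \
    -VPU_NUMBER_OF_CMX_SLICES 4
```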

              Thanks a lot for your time !

                Hi AnhTuNguyen
                Great! I thought that might be a potential issue, but wouldn't the mismatch on mono cause the

                Input tensor 'inputs' (0) exceeds available data range.

                error? I figured you probably didn't get that error, otherwise you would have said so - and the error is pretty vocal ;-).
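                For intuition, here's a rough numpy sketch of the kind of misinterpretation I'd expect (an assumed mechanism, not verified on-device): the U8 bytes get read as FP16, so the values come out as garbage:

```python
import numpy as np

# Eight bytes of U8 pixel data
u8 = np.array([10, 20, 30, 40, 50, 60, 70, 80], dtype=np.uint8)

# Viewing the same bytes as FP16 halves the element count and yields
# values unrelated to the original pixels
as_fp16 = u8.view(np.float16)

print(as_fp16.size)  # -> 4
```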

                Thanks,
                Jaka

                  10 months later

                  AnhTuNguyen Hi!
                  Can you send images/explain how you trained the model with a mono cam?

                  I've collected data using the mono cam, but I don't know how to train the model (YOLOv8) or how to deploy it back to the camera afterwards.
                  Did you use tools.ultralytics to convert it to .blob?

                  Thanks!