I ran into a problem in my project when using the OAK-D Pro.

First, I converted my custom ONNX model to a blob, and then checked the result using 'openvino.inference_engine import IECore' like this:

ie = IECore()
exec_net = ie.import_network("best.blob",device_name='MYRIAD')
input_blob = next(iter(exec_net.input_info))

I got the same result as with the original model.
However, when I load the same blob file ('best.blob') with the DepthAI API 'setBlobPath()', I get a different result (output). To debug this, I checked the input in terms of data type and normalization, but everything looks okay.

How can I fix this problem?


    Hi socome,
    My first guess would also be an incorrect data type, or a conversion to some data type (e.g., U8 -> FP16) that could introduce inaccuracies. I think it would be best if you could create an MRE (minimal reproducible example).
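
    As a quick check (untested sketch, using the same OpenVINO IE API you already use), you could print the precisions and shapes the compiled blob actually expects and produces:

    from openvino.inference_engine import IECore

    ie = IECore()
    exec_net = ie.import_network("best.blob", device_name='MYRIAD')

    # Print what the compiled blob expects and produces
    for name, info in exec_net.input_info.items():
        print("input :", name, info.precision, info.input_data.shape)
    for name, data in exec_net.outputs.items():
        print("output:", name, data.precision, data.shape)
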
    Thanks, Erik

      erik Thank you for your reply. I checked the data type, and it is OK. I would like to make an MRE, but I can't because of my company's policy. However, I can share the baseline model.

      https://github.com/linklist2/PIAFusion_pytorch
      https://github.com/linklist2/PIAFusion_pytorch/blob/master/pretrained/best_cls.pth

      This repo is the baseline. I only use part of it, the illumination-aware network, so you can download the pretrained PyTorch weights from this repo. Then I convert the Torch model to ONNX using this code:

      # -*- coding: utf-8 -*-

      import torch
      import onnx

      from onnxsim import simplify
      from models.cls_model import Illumination_classifier


      def convert_onnx(model, output, opset=12):
          # Dummy input in the model's expected NCHW shape
          shape = (1, 3, 640, 512)
          img = torch.ones(shape, dtype=torch.float32).cuda()

          torch.onnx.export(
              model,
              img,
              output,
              opset_version=opset,
              do_constant_folding=True,
          )

          # Simplify the exported graph and overwrite the file
          onnx_model = onnx.load(output)
          onnx_model, check = simplify(onnx_model)
          # assert check, "Simplified ONNX model could not be validated"
          onnx.save(onnx_model, output)


      if __name__ == '__main__':
          model = Illumination_classifier(input_channels=3)
          weight = torch.load("./best_cls.pth")
          model.load_state_dict(weight)
          model = model.cuda()
          model.eval()

          convert_onnx(model, "./onnx_file/best.onnx", opset=9)
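
      As a sanity check, something like this (untested sketch; assumes onnxruntime is installed and that it runs right after the export above) compares the exported ONNX against the PyTorch output:

      import numpy as np
      import onnxruntime as ort

      x = torch.rand(1, 3, 640, 512)

      with torch.no_grad():
          torch_out = model.cpu()(x).numpy()

      sess = ort.InferenceSession("./onnx_file/best.onnx")
      onnx_out = sess.run(None, {sess.get_inputs()[0].name: x.numpy()})[0]

      print("max abs diff:", np.abs(torch_out - onnx_out).max())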

      After that, I convert the ONNX model into a blob. But as I mentioned, I get a different result.

      Could you check this problem?

      In addition, how can I change the inference code on DepthAI?
      I mean, when I use the code below, I get the same result as with the original model, so if possible I would like to adapt the DepthAI inference to behave like it:

      
      import cv2
      import numpy as np
      from openvino.inference_engine import IECore
      
      shape = (512, 640)  # (width, height) for cv2.resize; the model input is 1x3x640x512
      
      ie = IECore()
      exec_net = ie.import_network("best.blob", device_name='MYRIAD')
      input_blob = next(iter(exec_net.input_info))
      
      # Get video from the computer's webcam
      cam = cv2.VideoCapture(0)
      cam.set(cv2.CAP_PROP_FPS, 30)
      
      while True:
          ret, raw_image = cam.read()
          if not ret:
              continue
          
          image = cv2.resize(raw_image, shape)
          cv2.imshow('USB Camera', image)
      
          image = np.expand_dims(image, axis=0)    # HWC -> NHWC
          image = image.transpose((0, 3, 1, 2))    # NHWC -> NCHW
      
          # Do the inference on the MYRIAD device
          output = exec_net.infer(inputs={input_blob: image})  # blob
      
          print(output)
      
          if cv2.waitKey(1) == ord('q'):
              break

      Thank you..

      Hey @socome,

      So if I understand correctly, both of the scripts above give you the same results as the PyTorch and ONNX models?

      Can you share the inference code that gives you different outputs? And an example of the outputs? Arrays will be flattened so you have to reshape them.

      If the last code you've shared gives you the outputs that you want, this means that the model is exported correctly. It expects NCHW BGR unnormalized (0-255) images. Make sure you are feeding images of the same format to the device.
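
      For example, something like this (untested sketch; qNn is assumed to be the DepthAI output queue of your NN stream, and the output shape is just a placeholder):

      import numpy as np

      nn_msg = qNn.get()                            # NNData message from the NN output queue
      out = np.array(nn_msg.getFirstLayerFp16())    # comes back as a flat FP16 list
      out = out.reshape(1, -1)                      # reshape to the model's actual output shape
      print(out)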

      @Matija Thank you for your reply.

      The last code was written to check the result of the blob file, and it works.
      But I get a different result when I load this blob file using 'setBlobPath()' on DepthAI.

      To export the blob file, I use 'mo_onnx.py' and 'compile_tool' instead of blobconverter for several reasons (e.g., I can't access the AWS server). I use the scale-factor option to normalize the inputs, so the final input format is 'NCHW RGB normalized images'. I try to feed the same format to the device, and I think the inputs are exactly the same (I checked the input images using the passthrough option), but the results show a different tendency.

      I checked other examples and models using the same process, and everything works correctly; only the aforementioned baseline model doesn't. So I don't think this is an input-format issue.

      (I can't share any outputs for debugging, I'm sorry.)

        socome So, images on OAK-D coming from the camera node are BGR 0-255, which means you have to provide the proper flags to Model Optimizer (regardless of whether you use blobconverter or a local install).

        What I mean -- let's say in PyTorch your input is an RGB image and you just scale it to 0-1 (e.g., load with Pillow and then call ToTensor()). This means that you need to provide --scale 255 and --reverse_input_channels to Model Optimizer. You can look at those as preprocessing flags that recover the torch model's input, assuming your input image on OAK-D will be BGR. Since the exported XML IR already contains the effect of those flags, you can verify the export with the OpenVINO API by feeding it BGR images read with OpenCV and checking whether you get the expected output.
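
        In rough Python terms, those two flags bake something like this into the IR, so the device-side input can stay BGR 0-255 (illustrative sketch only):

        import numpy as np

        def baked_in_preprocessing(bgr_u8_nchw):
            # --reverse_input_channels: BGR -> RGB (reverse the channel axis)
            x = bgr_u8_nchw[:, ::-1, :, :].astype(np.float32)
            # --scale 255: map 0-255 to the 0-1 range the torch model was trained on
            return x / 255.0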

        If that works, setBlobPath should work as long as you use the camera node to feed it images. If you are feeding images from the host by sending them to some input queue and then forwarding them to the NN, you need to make sure the images you send are in the right format. OpenVINO uses NCHW by default, so if you read images with OpenCV as [H, W, 3], you need to transpose them to [3, H, W], flatten() them with NumPy, and then send them to the input queue.
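
        For example (untested sketch; assumes an XLinkIn stream named "input", a 512x640 model input, and a hypothetical test image):

        import cv2
        import depthai as dai

        frame = cv2.imread("test.jpg")                    # HWC, BGR, uint8
        frame = cv2.resize(frame, (512, 640))             # cv2.resize takes (width, height)

        img = dai.ImgFrame()
        img.setWidth(512)
        img.setHeight(640)
        img.setData(frame.transpose(2, 0, 1).flatten())   # HWC -> CHW, then flatten
        qInput.send(img)                                  # qInput = device.getInputQueue("input")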

        Let me know if the above works for you.

        If your files and models are confidential, another option would be to sign a mutual NDA and set up some priority support contract, which we could discuss further over email.


          @Matija

          Thank you for your reply. I ran an experiment with the code below: I use the OAK-D and an NCS2 stick to compare the results, and I get different results from the same code and input image. How can I solve this problem? I don't think it is related to the color order.

          #!/usr/bin/env python3
          
          import cv2
          import depthai as dai
          import numpy as np
          
          from openvino.inference_engine import IECore
          
          
          # Create pipeline
          p = dai.Pipeline()
          
          # Define sources and outputs
          camRgb = p.create(dai.node.ColorCamera)
          
          # Properties
          camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
          camRgb.setInterleaved(False)
          camRgb.initialControl.setAutoExposureEnable()
          camRgb.setIspScale(2,3)
          
          # Resize via ImageManip
          downscale_mainip = p.create(dai.node.ImageManip)
          downscale_mainip.initialConfig.setResize(512,640)
          downscale_mainip.initialConfig.setFrameType(dai.RawImgFrame.Type.RGB888p)
          # downscale_mainip.initialConfig.setFrameType(dai.RawImgFrame.Type.BGR888p)
          
          # NN node running the illumination classifier
          nn = p.create(dai.node.NeuralNetwork)
          nn.setBlobPath("./best_xavier.blob")
          nn.input.setBlocking(False)
          
          # Link for Model Inference
          camRgb.isp.link(downscale_mainip.inputImage)
          downscale_mainip.out.link(nn.input)
          
          # Send the NN output from the device to the host via XLink
          nn_xout = p.create(dai.node.XLinkOut)
          nn_xout.setStreamName("nn")
          nn.out.link(nn_xout.input)
          
          rgb_xout = p.create(dai.node.XLinkOut)
          rgb_xout.setStreamName("rgb")
          camRgb.isp.link(rgb_xout.input)
          
          # Connect to device and start pipeline
          with dai.Device(p) as device:
          
              ie = IECore()
              exec_net = ie.import_network("./best_xavier.blob",device_name='MYRIAD')
              input_blob = next(iter(exec_net.input_info))
          
              # Output queues will be used to get the RGB frames and NN results from the outputs defined above
              qNn = device.getOutputQueue(name="nn", maxSize=4, blocking=False)
              qCam = device.getOutputQueue(name="rgb", maxSize=4, blocking=False)
              time_flag = 0
          
              while True:
          
                  if time_flag > 20:
                      time_flag = 0
          
                  Nn_out = qNn.get().getFirstLayerFp16()
                  
                  rgb = qCam.get().getCvFrame()
                  rgb = rgb.astype(np.uint8).copy()
                  image = cv2.resize(rgb, (512, 640))
                  # image = cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR)
                  # image = cv2.cvtColor(rgb, cv2.COLOR_BGR2RGB)
                  image = np.expand_dims(image, axis=0)
                  image = image.transpose((0, 3, 1, 2))

                  # Do the inference on the MYRIAD device
                  output = exec_net.infer(inputs={input_blob: image})  # blob

                  if time_flag == 0:
                      print("OAK-D DAY : ", Nn_out[0], "NCS2 DAY : ", output['36'][0][0],
                            " OAK-D NIGHT : ", Nn_out[1], " NCS2 NIGHT : ", output['36'][0][1])
                  else:
                      time_flag = time_flag + 1

                  cv2.imshow("RGB", rgb)

                  key = cv2.waitKey(1) & 0xFF
                  if key == ord("q"):
                      cv2.destroyAllWindows()
                      break

            socome Ok, I see. I think it could be due to the inputs not actually matching. I would suggest adding another output node linked to the nn.passthrough output of the NeuralNetwork, and then visualizing both the RGB frame and the passthrough frame at the same time. If they don't match, that is likely the cause.

            To get them to match, you can try setting setKeepAspectRatio(False) on ImageManip config.
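
            Roughly like this (untested sketch, added on top of the pipeline you already have):

            # Keep the full frame when ImageManip resizes (stretch instead of crop)
            downscale_mainip.initialConfig.setKeepAspectRatio(False)

            # Expose the exact frames the NN consumed via nn.passthrough
            pass_xout = p.create(dai.node.XLinkOut)
            pass_xout.setStreamName("pass")
            nn.passthrough.link(pass_xout.input)

            # ... inside `with dai.Device(p) as device:` ...
            qPass = device.getOutputQueue(name="pass", maxSize=4, blocking=False)

            # ... inside the loop: show both frames to compare them visually
            cv2.imshow("isp", qCam.get().getCvFrame())
            cv2.imshow("passthrough", qPass.get().getCvFrame())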

            @Matija


            pass = passthrough / isp = direct
            Thank you for your reply. When I checked the images from passthrough and isp, I found that the pixel values of the two images are a little bit different, and that matters for my custom network. There are no other differences, such as in viewpoint (aspect ratio) or color order. So, to work around this, I send the image from the host to the device like this:

            #!/usr/bin/env python3
            
            import cv2
            import depthai as dai
            import numpy as np
            
            from time import monotonic
            
            # Create pipeline
            p = dai.Pipeline()
            
            # Define sources and outputs
            camRgb = p.create(dai.node.ColorCamera)
            
            # Properties
            camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
            camRgb.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)
            camRgb.initialControl.setAutoExposureEnable()
            camRgb.setPreviewKeepAspectRatio(False)
            camRgb.setIspScale(2,3)
            
            # Resize via ImageManip (disabled; resizing is now done on the host)
            # downscale_mainip = p.create(dai.node.ImageManip)
            # downscale_mainip.initialConfig.setResize(512,640)
            # downscale_mainip.initialConfig.setFrameType(dai.RawImgFrame.Type.RGB888p)
            
            # NN node running the illumination classifier
            nn = p.create(dai.node.NeuralNetwork)
            nn.setBlobPath("./best_xavier.blob")
            nn.input.setBlocking(False)
            
            # Link for nn
            nn_xout = p.create(dai.node.XLinkOut)
            nn_xout.setStreamName("nn")
            nn.out.link(nn_xout.input)
            
            # Link for RGB
            rgb_xout = p.create(dai.node.XLinkOut)
            rgb_xout.setStreamName("rgb")
            camRgb.isp.link(rgb_xout.input)
            
            input_xin = p.create(dai.node.XLinkIn)
            input_xin.setStreamName("input")
            
            input_xin.out.link(nn.input)
            
            # Reusable ImgFrame message for sending host frames to the device
            input_image = dai.ImgFrame()
            input_image.setWidth(512)
            input_image.setHeight(640)
            
            # Connect to device and start pipeline
            with dai.Device(p) as device:
            
                # Output queues will be used to get the RGB frames and NN results from the outputs defined above
                qNn = device.getOutputQueue(name="nn", maxSize=1, blocking=False)
                qCam = device.getOutputQueue(name="rgb", maxSize=1, blocking=False)
                qInput = device.getInputQueue(name="input")
            
                while True:
            
                    rgb = qCam.get().getCvFrame()
                    rgb = rgb.astype(np.uint8).copy()
            
                    input_image.setData(cv2.resize(rgb,(512,640)).transpose(2, 0, 1).flatten())
                    qInput.send(input_image)
            
                    nn_out = qNn.tryGet()
            
                    if nn_out is not None : 
                        Nn_out = nn_out.getFirstLayerFp16()
                        print(Nn_out)
            
                    cv2.imshow("RGB",rgb)
            
                    key = cv2.waitKey(1) & 0xFF
                    if key == ord("q"):    
                        cv2.destroyAllWindows()
                        break

            But this is too slow, so I can't use it.

            I'm not sure, but the difference seems to occur when the image format is converted on the OAK-D device. How can I solve this problem?

            Thank you very much.


              Hi socome,

              I found that the pixel values of the two images are a little bit different, and that matters for my custom network

              Note that this might also be due to the ISP image format, YUV420. It uses only 1.5 bytes/pixel (a full-resolution luma plane plus two quarter-resolution chroma planes: 1 + 0.25 + 0.25 = 1.5; see the YUV420 format for more info), so when converting back to RGB (so you can display it) the result might not be 100% identical, just very similar. What exactly was the problem with the script you posted 3 posts above?
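
              You can get a feel for how lossy that round trip is on the host (illustrative sketch with a hypothetical test image; width and height must be even for the I420 conversion):

              import cv2
              import numpy as np

              bgr = cv2.imread("test.jpg")

              # BGR -> YUV420 (I420, 1.5 bytes/pixel) -> back to BGR
              yuv = cv2.cvtColor(bgr, cv2.COLOR_BGR2YUV_I420)
              bgr_back = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR_I420)

              diff = np.abs(bgr.astype(np.int16) - bgr_back.astype(np.int16))
              print("max / mean pixel difference:", diff.max(), diff.mean())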

              Thanks, Erik

              @erik Hi! You can check the problem with the script posted in #7! If you tell me your e-mail address, I can share my example blob file right now.

              Thank you!


                Hi socome

                with the script posted in #7

                What do you mean by this? Sorry, I am not fully up to date with this discussion. My e-mail is erik@luxonis.com; please send a full MRE.
                Thanks, Erik