I ran into a problem in my project when using the OAK-D Pro.

First, I converted my custom ONNX model to a blob, and then checked the result using 'openvino.inference_engine import IECore' like this:

ie = IECore()
exec_net = ie.import_network("best.blob",device_name='MYRIAD')
input_blob = next(iter(exec_net.input_info))

I got the same result as with the original model.
However, when I load the same blob file ('best.blob') with the DepthAI API 'setBlobPath()', I get a different result (output). To debug this, I checked the input in terms of data type and normalization, but everything looks okay.

How can I fix this problem?


    Hi socome,
    My first guess would also be an incorrect data type, or a conversion to some data type (e.g., U8 -> FP16) that could introduce inaccuracies. I think it would be best if you could create an MRE (minimal reproducible example).
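
    As a quick check (untested sketch, using the same OpenVINO IE API you already use), you could print the precisions and shapes the compiled blob actually expects and produces:

    from openvino.inference_engine import IECore

    ie = IECore()
    exec_net = ie.import_network("best.blob", device_name='MYRIAD')

    # Print what the compiled blob expects and produces
    for name, info in exec_net.input_info.items():
        print("input :", name, info.precision, info.input_data.shape)
    for name, data in exec_net.outputs.items():
        print("output:", name, data.precision, data.shape)
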
    Thanks, Erik

      erik Thank you for your reply. I checked the data type, and it is OK. I would like to make an MRE, but I can't because of my company's policy. However, I can share the baseline model.

      https://github.com/linklist2/PIAFusion_pytorch
      https://github.com/linklist2/PIAFusion_pytorch/blob/master/pretrained/best_cls.pth

      This repo is the baseline. I only use part of it, the illumination-aware network, so you can download the pretrained PyTorch weights from this repo. Then I convert the Torch model to ONNX using this code:

      # -*- coding: utf-8 -*-

      import torch
      import onnx

      from onnxsim import simplify
      from models.cls_model import Illumination_classifier


      def convert_onnx(model, output, opset=12):
          # Dummy input in the model's expected NCHW shape
          shape = (1, 3, 640, 512)
          img = torch.ones(shape, dtype=torch.float32).cuda()

          torch.onnx.export(
              model,
              img,
              output,
              opset_version=opset,
              do_constant_folding=True,
          )

          # Simplify the exported graph and overwrite the file
          onnx_model = onnx.load(output)
          onnx_model, check = simplify(onnx_model)
          # assert check, "Simplified ONNX model could not be validated"
          onnx.save(onnx_model, output)


      if __name__ == '__main__':
          model = Illumination_classifier(input_channels=3)
          weight = torch.load("./best_cls.pth")
          model.load_state_dict(weight)
          model = model.cuda()
          model.eval()

          convert_onnx(model, "./onnx_file/best.onnx", opset=9)
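
      As a sanity check, something like this (untested sketch; assumes onnxruntime is installed and that it runs right after the export above) compares the exported ONNX against the PyTorch output:

      import numpy as np
      import onnxruntime as ort

      x = torch.rand(1, 3, 640, 512)

      with torch.no_grad():
          torch_out = model.cpu()(x).numpy()

      sess = ort.InferenceSession("./onnx_file/best.onnx")
      onnx_out = sess.run(None, {sess.get_inputs()[0].name: x.numpy()})[0]

      print("max abs diff:", np.abs(torch_out - onnx_out).max())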

      After that, I convert the ONNX model into a blob. But as I mentioned, I get a different result.

      Could you check this problem?

      In addition, how can I change the inference code on DepthAI?
      I mean, when I use the code below, I get the same result as with the original model, so if possible I would like to adapt the DepthAI inference to behave like it:

      
      import cv2
      import numpy as np
      from openvino.inference_engine import IECore
      
      shape = (512, 640)  # (width, height) for cv2.resize; the model input is 1x3x640x512
      
      ie = IECore()
      exec_net = ie.import_network("best.blob", device_name='MYRIAD')
      input_blob = next(iter(exec_net.input_info))
      
      # Get video from the computer's webcam
      cam = cv2.VideoCapture(0)
      cam.set(cv2.CAP_PROP_FPS, 30)
      
      while True:
          ret, raw_image = cam.read()
          if not ret:
              continue
          
          image = cv2.resize(raw_image, shape)
          cv2.imshow('USB Camera', image)
      
          image = np.expand_dims(image, axis=0)    # HWC -> NHWC
          image = image.transpose((0, 3, 1, 2))    # NHWC -> NCHW
      
          # Do the inference on the MYRIAD device
          output = exec_net.infer(inputs={input_blob: image})  # blob
      
          print(output)
      
          if cv2.waitKey(1) == ord('q'):
              break

      Thank you..

      Hey @socome,

      So if I understand correctly, both of the scripts above give you the same results as the PyTorch and ONNX models?

      Can you share the inference code that gives you different outputs? And an example of the outputs? Arrays will be flattened so you have to reshape them.

      If the last code you've shared gives you the outputs that you want, this means that the model is exported correctly. It expects NCHW BGR unnormalized (0-255) images. Make sure you are feeding images of the same format to the device.
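
      For example, something like this (untested sketch; qNn is assumed to be the DepthAI output queue of your NN stream, and the output shape is just a placeholder):

      import numpy as np

      nn_msg = qNn.get()                            # NNData message from the NN output queue
      out = np.array(nn_msg.getFirstLayerFp16())    # comes back as a flat FP16 list
      out = out.reshape(1, -1)                      # reshape to the model's actual output shape
      print(out)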

      @Matija Thank you for your reply.

      The last code was written to check the result of the blob file, and it works.
      But I get a different result when I load this blob file using 'setBlobPath()' on DepthAI.

      To export the blob file, I use 'mo_onnx.py' and 'compile_tool' instead of blobconverter for several reasons (e.g., I can't access the AWS server). I use the scale-factor option to normalize the inputs, so the final input format is 'NCHW RGB normalized images'. I try to feed the same format to the device, and I think the inputs are exactly the same (I checked the input images using the passthrough option), but the results show a different tendency.

      I checked other examples and models using the same process, and everything works correctly; only the aforementioned baseline model doesn't. So I don't think this is an input-format issue.

      (I can't share any outputs for debugging, I'm sorry.)

        socome So, images on OAK-D coming from the camera node are BGR 0-255, which means you have to provide the proper flags to Model Optimizer (regardless of whether you use blobconverter or a local install).

        What I mean -- let's say in PyTorch your input is an RGB image and you just scale it to 0-1 (e.g., load with Pillow and then call ToTensor()). This means that you need to provide --scale 255 and --reverse_input_channels to Model Optimizer. You can look at those as preprocessing flags that recover the torch model's input, assuming your input image on OAK-D will be BGR. Since the exported XML IR already contains the effect of those flags, you can verify the export with the OpenVINO API by feeding it BGR images read with OpenCV and checking whether you get the expected output.
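
        In rough Python terms, those two flags bake something like this into the IR, so the device-side input can stay BGR 0-255 (illustrative sketch only):

        import numpy as np

        def baked_in_preprocessing(bgr_u8_nchw):
            # --reverse_input_channels: BGR -> RGB (reverse the channel axis)
            x = bgr_u8_nchw[:, ::-1, :, :].astype(np.float32)
            # --scale 255: map 0-255 to the 0-1 range the torch model was trained on
            return x / 255.0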

        If that works, setBlobPath should work as long as you use the camera node to feed it images. If you are feeding images from the host by sending them to some input queue and then forwarding them to the NN, you need to make sure the images you send are in the right format. OpenVINO uses NCHW by default, so if you read images with OpenCV as [H, W, 3], you need to transpose them to [3, H, W], flatten() them with NumPy, and then send them to the input queue.
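
        For example (untested sketch; assumes an XLinkIn stream named "input", a 512x640 model input, and a hypothetical test image):

        import cv2
        import depthai as dai

        frame = cv2.imread("test.jpg")                    # HWC, BGR, uint8
        frame = cv2.resize(frame, (512, 640))             # cv2.resize takes (width, height)

        img = dai.ImgFrame()
        img.setWidth(512)
        img.setHeight(640)
        img.setData(frame.transpose(2, 0, 1).flatten())   # HWC -> CHW, then flatten
        qInput.send(img)                                  # qInput = device.getInputQueue("input")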

        Let me know if the above works for you.

        If your files and models are confidential, another option would be to sign a mutual NDA and set up some priority support contract, which we could discuss further over email.


          @Matija

          Thank you for your reply. I ran an experiment with the code below: I use the OAK-D and an NCS2 stick to compare the results, and I get different results from the same code and input image. How can I solve this problem? I don't think it is related to the color order.

          #!/usr/bin/env python3
          
          import cv2
          import depthai as dai
          import numpy as np
          
          from openvino.inference_engine import IECore
          
          
          # Create pipeline
          p = dai.Pipeline()
          
          # Define sources and outputs
          camRgb = p.create(dai.node.ColorCamera)
          
          # Properties
          camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
          camRgb.setInterleaved(False)
          camRgb.initialControl.setAutoExposureEnable()
          camRgb.setIspScale(2,3)
          
          # Resize via ImageManip
          downscale_mainip = p.create(dai.node.ImageManip)
          downscale_mainip.initialConfig.setResize(512,640)
          downscale_mainip.initialConfig.setFrameType(dai.RawImgFrame.Type.RGB888p)
          # downscale_mainip.initialConfig.setFrameType(dai.RawImgFrame.Type.BGR888p)
          
          # NN node running the illumination classifier
          nn = p.create(dai.node.NeuralNetwork)
          nn.setBlobPath("./best_xavier.blob")
          nn.input.setBlocking(False)
          
          # Link for Model Inference
          camRgb.isp.link(downscale_mainip.inputImage)
          downscale_mainip.out.link(nn.input)
          
          # Send the NN output from the device to the host via XLink
          nn_xout = p.create(dai.node.XLinkOut)
          nn_xout.setStreamName("nn")
          nn.out.link(nn_xout.input)
          
          rgb_xout = p.create(dai.node.XLinkOut)
          rgb_xout.setStreamName("rgb")
          camRgb.isp.link(rgb_xout.input)
          
          # Connect to device and start pipeline
          with dai.Device(p) as device:
          
              ie = IECore()
              exec_net = ie.import_network("./best_xavier.blob",device_name='MYRIAD')
              input_blob = next(iter(exec_net.input_info))
          
              # Output queues will be used to get the RGB frames and NN results from the outputs defined above
              qNn = device.getOutputQueue(name="nn", maxSize=4, blocking=False)
              qCam = device.getOutputQueue(name="rgb", maxSize=4, blocking=False)
              time_flag = 0
          
              while True:
          
                  if time_flag > 20:
                      time_flag = 0
          
                  Nn_out = qNn.get().getFirstLayerFp16()
                  
                  rgb = qCam.get().getCvFrame()
                  rgb = rgb.astype(np.uint8).copy()
                  image = cv2.resize(rgb, (512, 640))
                  # image = cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR)
                  # image = cv2.cvtColor(rgb, cv2.COLOR_BGR2RGB)
                  image = np.expand_dims(image, axis=0)
                  image = image.transpose((0, 3, 1, 2))

                  # Do the inference on the MYRIAD device
                  output = exec_net.infer(inputs={input_blob: image})  # blob

                  if time_flag == 0:
                      print("OAK-D DAY : ", Nn_out[0], "NCS2 DAY : ", output['36'][0][0],
                            " OAK-D NIGHT : ", Nn_out[1], " NCS2 NIGHT : ", output['36'][0][1])
                  else:
                      time_flag = time_flag + 1

                  cv2.imshow("RGB", rgb)

                  key = cv2.waitKey(1) & 0xFF
                  if key == ord("q"):
                      cv2.destroyAllWindows()
                      break

            socome Ok, I see. I think it could be due to the inputs not actually matching. I would suggest adding another output node linked to the nn.passthrough output of the NeuralNetwork, and then visualizing both the RGB frame and the passthrough frame at the same time. If they don't match, that is likely the cause.

            To get them to match, you can try setting setKeepAspectRatio(False) on ImageManip config.
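
            Roughly like this (untested sketch, added on top of the pipeline you already have):

            # Keep the full frame when ImageManip resizes (stretch instead of crop)
            downscale_mainip.initialConfig.setKeepAspectRatio(False)

            # Expose the exact frames the NN consumed via nn.passthrough
            pass_xout = p.create(dai.node.XLinkOut)
            pass_xout.setStreamName("pass")
            nn.passthrough.link(pass_xout.input)

            # ... inside `with dai.Device(p) as device:` ...
            qPass = device.getOutputQueue(name="pass", maxSize=4, blocking=False)

            # ... inside the loop: show both frames to compare them visually
            cv2.imshow("isp", qCam.get().getCvFrame())
            cv2.imshow("passthrough", qPass.get().getCvFrame())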

            @Matija


            pass = passthrough / isp = direct
            Thank you for your reply. When I checked the images from passthrough and isp, I found that the pixel values of the two images are a little bit different, and that matters for my custom network. There are no other differences, such as in viewpoint (aspect ratio) or color order. So, to work around this, I send the image from the host to the device like this:

            #!/usr/bin/env python3
            
            import cv2
            import depthai as dai
            import numpy as np
            
            from time import monotonic
            
            # Create pipeline
            p = dai.Pipeline()
            
            # Define sources and outputs
            camRgb = p.create(dai.node.ColorCamera)
            
            # Properties
            camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
            camRgb.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)
            camRgb.initialControl.setAutoExposureEnable()
            camRgb.setPreviewKeepAspectRatio(False)
            camRgb.setIspScale(2,3)
            
            # Resize via ImageManip (disabled; resizing is now done on the host)
            # downscale_mainip = p.create(dai.node.ImageManip)
            # downscale_mainip.initialConfig.setResize(512,640)
            # downscale_mainip.initialConfig.setFrameType(dai.RawImgFrame.Type.RGB888p)
            
            # NN node running the illumination classifier
            nn = p.create(dai.node.NeuralNetwork)
            nn.setBlobPath("./best_xavier.blob")
            nn.input.setBlocking(False)
            
            # Link for nn
            nn_xout = p.create(dai.node.XLinkOut)
            nn_xout.setStreamName("nn")
            nn.out.link(nn_xout.input)
            
            # Link for RGB
            rgb_xout = p.create(dai.node.XLinkOut)
            rgb_xout.setStreamName("rgb")
            camRgb.isp.link(rgb_xout.input)
            
            input_xin = p.create(dai.node.XLinkIn)
            input_xin.setStreamName("input")
            
            input_xin.out.link(nn.input)
            
            # Reusable ImgFrame message for sending host frames to the device
            input_image = dai.ImgFrame()
            input_image.setWidth(512)
            input_image.setHeight(640)
            
            # Connect to device and start pipeline
            with dai.Device(p) as device:
            
                # Output queues will be used to get the RGB frames and NN results from the outputs defined above
                qNn = device.getOutputQueue(name="nn", maxSize=1, blocking=False)
                qCam = device.getOutputQueue(name="rgb", maxSize=1, blocking=False)
                qInput = device.getInputQueue(name="input")
            
                while True:
            
                    rgb = qCam.get().getCvFrame()
                    rgb = rgb.astype(np.uint8).copy()
            
                    input_image.setData(cv2.resize(rgb,(512,640)).transpose(2, 0, 1).flatten())
                    qInput.send(input_image)
            
                    nn_out = qNn.tryGet()
            
                    if nn_out is not None : 
                        Nn_out = nn_out.getFirstLayerFp16()
                        print(Nn_out)
            
                    cv2.imshow("RGB",rgb)
            
                    key = cv2.waitKey(1) & 0xFF
                    if key == ord("q"):    
                        cv2.destroyAllWindows()
                        break

            But this is too slow, so I can't use it.

            I'm not sure, but the difference seems to occur when the image format is converted on the OAK-D device. How can I solve this problem?

            Thank you very much.


              Hi socome,

              I found that the pixel values of the two images are a little bit different, and that matters for my custom network

              Note that this might also be due to the ISP image format, YUV420. It uses only 1.5 bytes/pixel (a full-resolution luma plane plus two quarter-resolution chroma planes: 1 + 0.25 + 0.25 = 1.5; see the YUV420 format for more info), so when converting back to RGB (so you can display it) the result might not be 100% identical, just very similar. What exactly was the problem with the script you posted 3 posts above?
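
              You can get a feel for how lossy that round trip is on the host (illustrative sketch with a hypothetical test image; width and height must be even for the I420 conversion):

              import cv2
              import numpy as np

              bgr = cv2.imread("test.jpg")

              # BGR -> YUV420 (I420, 1.5 bytes/pixel) -> back to BGR
              yuv = cv2.cvtColor(bgr, cv2.COLOR_BGR2YUV_I420)
              bgr_back = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR_I420)

              diff = np.abs(bgr.astype(np.int16) - bgr_back.astype(np.int16))
              print("max / mean pixel difference:", diff.max(), diff.mean())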

              Thanks, Erik

              @erik Hi! You can check the problem with the script posted in #7! If you tell me your e-mail address, I can share my example blob file right now.

              Thank you!


                Hi socome

                with the script posted in #7

                What do you mean by this? Sorry, I am not fully up to date with this discussion. My e-mail is erik@luxonis.com; please send a full MRE.
                Thanks, Erik