• DepthAI-v2
  • How to actually run inference on a custom model?

TL;DR: I can turn my PyTorch model into a blob, but I cannot run model(image), i.e. the forward pass, once the model is a blob.

Long story: I managed to get the model running on the RGB camera node, so that is pretty neat. However, I am using the PyTorch model to actually run the inference, because I could not figure out a way to do so with the blob version.

This tutorial stops exactly at generating the blob.

Also, this repo is unclear to me. I see that the function get() is being called, but when I do that, it actually freezes my program. What is get() supposed to return?
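
For reference, this is my rough mental model of the queue API (a minimal sketch; the maxSize/blocking arguments are just my guesses at sensible defaults):

q_nn = device.getOutputQueue("nn", maxSize=4, blocking=False)

in_nn = q_nn.get()     # blocks until a message (depthai.NNData) arrives -- this is where my program freezes
in_nn = q_nn.tryGet()  # non-blocking: returns an NNData message, or None if nothing has arrived yet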

My (simplified) setup for this is the following:

Convert model to blob:

import torch
import onnx
import blobconverter
from onnxsim import simplify

# Load model checkpoint
checkpoint = torch.load('checkpoints/checkpoint_ssd300.pt', map_location='cuda')
model = checkpoint['model']
model.eval()

# Export the model
dummy_input = torch.randn(1, 3, 300, 300, device='cuda')
torch.onnx.export(model, dummy_input, "ssd300.onnx", verbose=False)

# Simplify the model
model_simp, check = simplify('ssd300.onnx')
assert check, "Simplified ONNX model could not be validated"
onnx.save(model_simp, 'ssd300-sim.onnx')

# Convert the model to blob
blobconverter.from_onnx(model='ssd300-sim.onnx', output_dir='ssd300-sim.blob',
                        shaves=6, use_cache=True, data_type='FP16')

Link neural net (along with RGB cam, skipped):

import depthai
from pathlib import Path

nn = pipeline.createNeuralNetwork()
xout_nn = pipeline.createXLinkOut()
xout_nn.setStreamName("nn")
nn.out.link(xout_nn.input)

# Load model
nn.setBlobPath(Path('ssd300-sim.blob/ssd300-sim_openvino_2021.4_6shave.blob'))
nn.setNumInferenceThreads(2)
nn.input.setBlocking(False)
nn.setNumPoolFrames(4)

cam_rgb.video.link(nn.input)
cam_rgb.preview.link(nn.input)

Run the pipeline:

with depthai.Device(pipeline) as device:
    q_rgb = device.getOutputQueue("rgb")
    q_nn = device.getOutputQueue("nn")
    frame = None

    # Main loop
    while True:
        # Instead of get (blocking), we use tryGet (nonblocking)
        # which will return the available data or None otherwise
        in_rgb = q_rgb.tryGet()
        in_nn = q_nn.tryGet()   # this returns None

        if in_rgb is not None:
            # Retrieve 'bgr' (opencv format) frame
            frame = in_rgb.getCvFrame()

        if frame is not None:
            # what could "model" be here, aside from a pytorch checkpoint?
            box_location, text_location, text_box_location, det_labels = detect(model, frame, min_score=0.5,
                                                                                max_overlap=0.5, top_k=20) 

Now the detect function expects the PyTorch model, because somewhere in its code I call the forward pass:

def detect(model, frame, ...):
    ...
    ... = model(frame)
    ...

If anyone can help me with this: my question is how I can actually run a forward pass on the blob model itself. I have no idea how to handle this problem.

Thank you very much if you are reading this. For completeness, the whole code is here.

Edit: Even with the link to the RGB cam (which I had forgotten), cam_rgb.preview.link(nn.input), my in_nn (= q_nn.tryGet()) is just a depthai.NNData object, which cannot run a forward pass.

in_nn.getData() does return a list, but its shape cannot be broken into the expected shapes of the outputs that would normally be obtained via inference. Any idea what this could mean?

The following getters do exist, but naturally they output empty lists:

print(in_nn.getLayerFp16('boxes'))
print(in_nn.getLayerFp16('scores'))
print(in_nn.getLayerFp16('labels'))
print(in_nn.getLayerFp16('detection_out'))
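
One check that might help narrow this down (just my guess, based on the getters I can see on the NNData object):

# List the output layer names actually baked into the blob -- if they don't match
# 'boxes' / 'scores' / 'labels' / 'detection_out', the getters above come back empty
print(in_nn.getAllLayerNames())

# getData() is the raw U8 buffer with all output tensors concatenated,
# which is presumably why its length doesn't split into the expected output shapes
print(len(in_nn.getData()))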

    Hi AndreiMoraru ,

    cam_rgb.video.link(nn.input)
    cam_rgb.preview.link(nn.input)

    Only preview should be linked directly, as the model probably expects an RGB image type (not NV12).
    Regarding your question - the depthai.NNData already contains the inference results, so you don't need to call forward and run another model inference on your computer; all AI computation is done on the device itself. You only need to decode the NN results. For Yolo/SSD results we already have that added to the platform (example for yolo); for other models you need to decode the results on the host side with your own business logic (deeplab segmentation example).
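
    Here is a rough sketch of the built-in route for SSD-style models (assuming the blob's outputs are in the standard SSD detection_out format; the blob path, threshold and stream name below are placeholders):

    # Detections are decoded on-device, so no forward pass is needed on the host
    detection_nn = pipeline.createMobileNetDetectionNetwork()
    detection_nn.setBlobPath("ssd300-sim_openvino_2021.4_6shave.blob")
    detection_nn.setConfidenceThreshold(0.5)
    cam_rgb.preview.link(detection_nn.input)

    xout_det = pipeline.createXLinkOut()
    xout_det.setStreamName("detections")
    detection_nn.out.link(xout_det.input)

    # ... and on the host:
    q_det = device.getOutputQueue("detections")
    in_det = q_det.tryGet()
    if in_det is not None:
        for det in in_det.detections:  # already-decoded detections
            print(det.label, det.confidence, det.xmin, det.ymin, det.xmax, det.ymax)
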
    I hope this helps!
    Thanks, Erik

      Hi erik, thanks for the docs.

      I have managed to find a hack using OpenVINO + business logic = spaghetti code. If I manage to do it via blobconverter, I will update this topic so that more people who google (or bing) "custom model depthai" may see it.

        So I found the answer to my problem. The in_nn does indeed already contain the predictions, but if the model outputs multiple tensors from the forward call, one has to make sure to export the output names accordingly to ONNX. In my case, the detector outputs bounding boxes and class scores for a number of priors, so I must export two named outputs:

        import torch

        # Load model checkpoint
        checkpoint = torch.load(model_path, map_location='cuda')
        model = checkpoint['model']
        model.eval()
        
        # Export to ONNX
        input_names = ['input']
        output_names = ['boxes', 'scores']
        dummy_input = torch.randn(1, 3, 300, 300).to('cuda')
        torch.onnx.export(model, dummy_input, new_model+'.onnx', verbose=True,
                          input_names=input_names, output_names=output_names)

        Which I can then get back from the depthai.NNData object using getLayerFp16:

        predicted_locs = in_nn.getLayerFp16("boxes")
        predicted_scores = in_nn.getLayerFp16("scores") 

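        For completeness, this is roughly how the flat FP16 lists get reshaped back into tensors for the existing host-side decoding (the 8732 priors / 21 classes below are the standard SSD300 values; adjust them to your own model):

        n_priors, n_classes = 8732, 21  # standard SSD300 values, adjust to your model

        predicted_locs = torch.tensor(in_nn.getLayerFp16("boxes")).view(1, n_priors, 4)
        predicted_scores = torch.tensor(in_nn.getLayerFp16("scores")).view(1, n_priors, n_classes)

        # from here the original PyTorch decoding (softmax, NMS, top-k) runs unchanged on the host
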
        I documented the whole process of deployment and running, both via OpenVINO and via blob, in my git repo.