Hi folks,

This is a follow-up to this post.

I got amazing help from Matija, but I ran into a dead end.

With Matija's help, I created a Colab notebook where I convert the PyTorch code into a model. The model returns a correct result and the model graph looks good:

import torch
import torch.nn as nn

class Model(nn.Module):
    def forward(self, img1, img2):
        # Calculate the mean of the two input tensors
        mean1 = torch.mean(img1, dim=1, keepdim=True)
        mean2 = torch.mean(img2, dim=1, keepdim=True)

        # Calculate the absolute difference between the two mean tensors
        # (sqrt of the square is equivalent to torch.abs)
        diff = torch.sqrt(torch.pow(mean1 - mean2, 2)).float()
        print('diff', diff, diff.shape)

        threshold = 30.0

        # Create a binary mask where differences are higher than the threshold
        mask = torch.where(diff > threshold, torch.tensor(1.0), torch.tensor(0.0))
        print('mask', mask, mask.shape)

        # Count the number of moving pixels
        movingPx = torch.sum(mask).view(1, 1, 1, 1)  # Ensure the output has the correct dimensions
        print('movingPx', movingPx, movingPx.shape)

        # Calculate the total number of pixels
        totalPx = torch.tensor(mask.shape[2] * mask.shape[3], dtype=torch.float32)
        print('totalPx', totalPx)

        # Calculate the ratio of moving pixels to the total number of pixels
        movingRatio = movingPx / totalPx
        print('Result', movingRatio)

        return movingRatio

model = Model()
torch.onnx.export(
    model,
    (torch.randn(1, 3, 720, 720) * 100, torch.randn(1, 3, 720, 720) * 100),
    "model_diff.onnx",
    opset_version=16,
    input_names=['img1', 'img2'],
    output_names=['movingRatio'],
)

!mo --input_model model_diff.onnx --output_dir /content/out

# Then upload .bin and .xml to blobconverter
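
(As a sanity check before converting, the exported ONNX can be run directly with onnxruntime; this is just a sketch and assumes onnxruntime is installed in the Colab:)

import numpy as np
import onnxruntime as ort

# Run the exported model once and confirm it produces a plausible ratio
sess = ort.InferenceSession("model_diff.onnx")
a = (np.random.randn(1, 3, 720, 720) * 100).astype(np.float32)
b = (np.random.randn(1, 3, 720, 720) * 100).astype(np.float32)
out = sess.run(None, {'img1': a, 'img2': b})
print('movingRatio', out[0], out[0].shape)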

The blob then converts manually without error, but the pipeline gets stuck in the get() function on the device.

To check, I also tried converting a model that is known to work. The conversion succeeded, but running it produced no output.

The code that runs on the OAK has worked with other, similarly structured models.

Installed packages in the Colab (other versions did not work):

blobconverter>=1.2.9
onnx
onnx-simplifier
openvino-dev==2022.3
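
For reference, the matching install cell in the Colab looks roughly like this:

!pip install "blobconverter>=1.2.9" onnx onnx-simplifier "openvino-dev==2022.3"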

Any ideas?

best regards

Lasse

    lasse The conversion succeeded, but running it produced no output.

    Could you paste an MRE of the code you are running? A working model not producing results indicates a pipeline issue.

    Thanks,
    Jaka

    @jakaskerl as always, thanks for the quick response!

    MRE:

    import numpy as np
    import cv2
    import depthai as dai
    
    # Create DepthAI pipeline
    p = dai.Pipeline()
    
    # Create a color camera node
    camRgb = p.create(dai.node.ColorCamera)
    camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
    # camRgb.setFps(60)
    camRgb.setVideoSize(720, 720)  # Set video size
    camRgb.setPreviewSize(720, 720)  # Set preview size
    camRgb.setInterleaved(False)  # Disable interleaving
    
    # NN
    nn = p.create(dai.node.NeuralNetwork)
    nn.setBlobPath("./model_diff-3.blob")
    
    # Create a script node
    pass_data_script = p.create(dai.node.Script)
    # Configure the script node to send frames and threshold, ratio to nn
    pass_data_script.setScript("""
    # Initialize the 'old' variable with the first frame
    old = node.io['in'].get()
    
    # Loop to continuously process frames
    while True:
    
        # Get the current frame
        frame = node.io['in'].get()
    
        # Send the previous frame ('old') to the output
        node.io['script_out_0'].send(old)
    
        # Send the current frame to the output
        #node.io['script_out_1'].send(frame)
    
        # Update the 'old' variable with the current frame for the next iteration
        old = frame
    """)
    # Link the script outputs to the neural network inputs
    pass_data_script.outputs['script_out_0'].link(nn.inputs['img1'])
    pass_data_script.outputs['script_out_1'].link(nn.inputs['img2'])
    # Link a camera output to the script input
    camRgb.preview.link(pass_data_script.inputs['in'])
    
    
    # Send frame diff to the host
    nn_xout = p.create(dai.node.XLinkOut)
    nn_xout.setStreamName("nn")
    nn.out.link(nn_xout.input)
    
    rgb_xout = p.create(dai.node.XLinkOut)
    rgb_xout.setStreamName("rgb")
    camRgb.video.link(rgb_xout.input)
    
    # Pipeline is defined, now we can connect to the device
    with dai.Device(p) as device:
    
        # Get output queues for the neural network and the camera
        qNn = device.getOutputQueue(name="nn", maxSize=1, blocking=False)
        qCam = device.getOutputQueue(name="rgb", maxSize=2, blocking=False)
    
        # Main processing loop
        while True:
    
            # frame = qCam.get().getCvFrame()
            # printing the sum of all pixels where the difference is higher than the set threshold (in blob)
    
    
            # the NN data object contains the output of the neural network
            print('Getting data...')
            nnData = qNn.get() # .tryGet()
            print(nnData.getAllLayerNames())

    Thanks

      Uh, @lasse

      lasse # frame = qCam.get().getCvFrame()

      You have to .get() frames even if you don't read them, otherwise the qCam will fill up and block the pipeline.
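
      Something like this (just a sketch reusing your queues) keeps both queues drained:

      while True:
          # Always consume the camera frame, even if it is not used,
          # so the queue cannot fill up and stall the pipeline
          frame = qCam.tryGet()

          # Non-blocking read of the NN output; returns None if nothing has arrived yet
          nnData = qNn.tryGet()
          if nnData is not None:
              print(nnData.getAllLayerNames())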

      Thanks,
      Jaka

      True!

      But even if I get the frame, it still gets stuck in qNn.get().

      Running:

              # Get the frame from the camera
              print('Getting frame...')
              frame = qCam.get() #.getCvFrame()
      
              # the NN data object contains the output of the neural network
              print('Getting nn data...')
              nnData = qNn.get() # .tryGet()
              print(nnData.getAllLayerNames())

      Output:

      Getting frame...
      Getting nn data...

        @jakaskerl how frustrating… -.-

        This must have been a leftover from earlier debugging.

        Thank you very much!
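
        Edit, for anyone finding this later: in my case the leftover was the commented-out node.io['script_out_1'].send(frame) line in the script node. With nn.inputs['img2'] never fed, the NeuralNetwork node could never run, so qNn.get() blocked forever. The working script, for reference:

        pass_data_script.setScript("""
        # Initialize the 'old' variable with the first frame
        old = node.io['in'].get()

        # Loop to continuously process frames
        while True:

            # Get the current frame
            frame = node.io['in'].get()

            # Send the previous frame and the current frame to the two NN inputs
            node.io['script_out_0'].send(old)
            node.io['script_out_1'].send(frame)

            # Update the 'old' variable with the current frame for the next iteration
            old = frame
        """)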