Hi erik!
I've been working on this topic and came up with the code below, which generates the .blob file that performs the concatenation of two images. Do you have any suggestions on how to process (concatenate, with the code below) two images that are, say, 30 frames apart in time? Should I use some kind of queue to hold the frames?
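To make the question concrete, this is the kind of pairing I have in mind on the host side (just a sketch; frame_source() is a hypothetical stand-in for wherever the frames actually come from):

import collections

history = collections.deque(maxlen=31)  # keep the last 31 frames
for frame in frame_source():            # hypothetical frame generator
    history.append(frame)
    if len(history) == history.maxlen:  # oldest frame is now 30 frames behind the newest
        pair = (history[0], history[-1])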
Thanks once again!
from pathlib import Path
import torch
from torch import nn
import kornia
import onnx
from onnxsim import simplify
import blobconverter
name = 'concat'

class Model(nn.Module):
    def forward(self, img1, img2):
        # Convert both inputs to grayscale for the matching step
        img1bw = kornia.color.rgb_to_grayscale(img1, rgb_weights=torch.tensor([0.299, 0.587, 0.114]))
        img2bw = kornia.color.rgb_to_grayscale(img2, rgb_weights=torch.tensor([0.299, 0.587, 0.114]))
        # Global parameters - screening
        band = 300    # Height of the band to be analyzed in each image (# of pixels)
        int_ref = 20  # Pixel range +/- to be evaluated in the fine analysis
        end = img1bw.shape[2] - band
        # Coarse pass: scan all candidate cut heights at reduced resolution
        val = self.screen(band, 0, end, int_ref, img2bw, img1bw)
        # Fine pass: full resolution, restricted to a window around the coarse result
        # (clamped so the window stays within the valid cut positions)
        val = self.screen(band, max(val - int_ref, 0), min(val + int_ref, end), 1, img2bw, img1bw)
        img1cut = torch.narrow(img1, 2, 0, val)  # keep img1 above the cut height
        imgfin = torch.cat((img1cut, img2), 2)   # append img2 below it
        return imgfin
    def screen(self, band, start, end, imrdc, imgref, imgscr):
        rg = range(start, end)  # Candidate values of y0 - starting row of the band being screened
        diff = []
        # Band of reference: top of the reference image, downscaled by imrdc
        ref_gray = torch.narrow(imgref, 2, 0, band)
        ref_gray = kornia.geometry.transform.rescale(ref_gray, (1 / imrdc, 1 / imrdc))
        for y0 in rg:
            # Band of comparison, at the same scale
            gray = torch.narrow(imgscr, 2, y0, band)
            gray = kornia.geometry.transform.rescale(gray, (1 / imrdc, 1 / imrdc))
            # Mean absolute difference between the two bands
            change = torch.abs(torch.sub(gray, ref_gray))
            diff.append(torch.sum(change) / (change.shape[2] * change.shape[3]))
        val = rg[diff.index(min(diff))]  # Cut height = position with the smallest difference
        if val == 0:
            val = 100  # The .blob is not generated if val == 0 (narrow to zero length fails)
        return val
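
# Optional host-side sanity check before exporting (same dummy 1080x1920 shape
# as used below): the output height should be the chosen cut height of img1
# plus the full height of img2.
with torch.no_grad():
    _out = Model()(torch.rand(1, 3, 1080, 1920), torch.rand(1, 3, 1080, 1920))
    print(f"Sanity check - output shape: {_out.shape}")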

# Define the expected input shape (dummy input)
shape = (1, 3, 1080, 1920)
model = Model()
X = torch.ones(shape, dtype=torch.float32)

path = Path("out/")
path.mkdir(parents=True, exist_ok=True)
onnx_path = str(path / (name + '.onnx'))
print(f"Writing to {onnx_path}")
torch.onnx.export(
    model,
    (X, X),
    onnx_path,
    opset_version=12,
    do_constant_folding=True,
    input_names=['img1', 'img2'],  # pin the input names so they can be referenced later
)
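
# Double-check the exported input names - the DepthAI side will need to link
# against exactly these layer names:
for graph_input in onnx.load(onnx_path).graph.input:
    print("ONNX input:", graph_input.name)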

onnx_simplified_path = str(path / (name + '_simplified.onnx'))
# Use onnx-simplifier to simplify the ONNX model
onnx_model = onnx.load(onnx_path)
model_simp, check = simplify(onnx_model)
assert check, "Simplified ONNX model could not be validated"
onnx.save(model_simp, onnx_simplified_path)

# Use blobconverter to convert ONNX -> OpenVINO IR -> blob
blobconverter.from_onnx(
    model=onnx_simplified_path,
    data_type="FP16",
    shaves=6,
    use_cache=False,
    output_dir=path,
    optimizer_params=[],
)
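
And this is roughly how I picture feeding such a pair into the blob on the device, combining the rolling-buffer idea from above with two XLinkIn queues. It's untested; the input layer names, the blob filename, and the ImgFrame setup are assumptions on my side:

import collections
import cv2
import depthai as dai

W, H, DELAY = 1920, 1080, 30

pipeline = dai.Pipeline()

xin1 = pipeline.create(dai.node.XLinkIn)
xin1.setStreamName("img1")
xin2 = pipeline.create(dai.node.XLinkIn)
xin2.setStreamName("img2")

nn = pipeline.create(dai.node.NeuralNetwork)
nn.setBlobPath("out/concat_simplified.blob")  # adjust to the filename blobconverter produced
xin1.out.link(nn.inputs["img1"])  # names assumed to match the exported ONNX inputs
xin2.out.link(nn.inputs["img2"])

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("concat")
nn.out.link(xout.input)

with dai.Device(pipeline) as device:
    q1 = device.getInputQueue("img1")
    q2 = device.getInputQueue("img2")
    q_out = device.getOutputQueue("concat", maxSize=4, blocking=False)

    history = collections.deque(maxlen=DELAY + 1)  # oldest entry is DELAY frames behind
    cap = cv2.VideoCapture(0)                      # stand-in frame source
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        history.append(cv2.resize(frame, (W, H)))
        if len(history) < history.maxlen:
            continue  # not enough history yet
        for q, f in ((q1, history[0]), (q2, history[-1])):
            msg = dai.ImgFrame()
            msg.setType(dai.ImgFrame.Type.BGR888p)
            msg.setWidth(W)
            msg.setHeight(H)
            msg.setData(f.transpose(2, 0, 1).flatten())  # HWC -> planar CHW
            q.send(msg)
        result = q_out.get()  # NNData holding the concatenated image

Does that look like a sensible approach, or would it be better to handle the delay on the device itself, e.g. with a Script node?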