• DepthAI-v2
  • Neural Network Node (multiple inputs to one node)

Hi, I have an OAK-D-IoT-75 and I am trying to program it to work in standalone mode. My question is about the NeuralNetwork node. I have a trained CRNN model for text recognition which I have converted to a blob file. I take a frame from the camera and then get three bounding boxes from it (this part is hardcoded), and I want to pass these three patches of the frame through the CRNN model. How can this be done? The way I understand it, I would have to create three NeuralNetwork nodes, load the same blob file into each of them, and then link one patch to each node.
Is there any other way, or is this the only way? Loading the same file into three separate nodes seems inefficient. Is there a way I can have only one NN node and pass all three patches through it somehow?

    Thanks Erik, I got the idea. I will try it and see what happens.
    I have another question. I want the OAK-D-IoT-75 to run in standalone mode, so I cannot attach it to any host, and I have to postprocess the outputs of the neural network: for example, first I need to take a softmax, and then I have to run a CTC decoder and beam search to find the characters. I have all of this code written in numpy. Will it work with the OAK-D in standalone mode, or do I need to change anything?
    If I need to make changes, can you suggest how I can do that? I looked into the Script node but could not find a proper sample.

    erik Hello, I am having a little difficulty understanding your suggested solution, can you please elaborate? I looked at the two example codes and they look different. I want a NeuralNetwork node to process three inputs.

      Hello msee19018 ,
      Both of these examples accept multiple inputs - one accepts 3 inputs and the other one 2.
      Regarding your initial question (softmax, CTC decoder) - you would need to create a custom NN (e.g. with PyTorch) in order to run it on the device, see the documentation here. The Script node isn't suited for such a computationally heavy workload.
      After that, you can run it either standalone or not.
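      A minimal sketch of what such a custom postprocessing model could look like - softmax only, exported the same way as any other model before blob conversion. The tensor shape and class count below are placeholders, and the CTC decoder / beam search would also have to be expressed with ops the converter supports:

      import torch
      import torch.nn as nn

      class SoftmaxPost(nn.Module):
          # Hypothetical on-device postprocessing: softmax over the class
          # dimension of the CRNN logits.
          def forward(self, x):
              return torch.softmax(x, dim=-1)

      # Placeholder logits shape, e.g. [timesteps, batch, classes]
      X = torch.ones((24, 1, 37), dtype=torch.float)
      torch.onnx.export(SoftmaxPost(), X, 'softmax_post.onnx',
                        opset_version=12, do_constant_folding=True)
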
      Thanks, Erik

        erik Thanks for the reply Erik. In the frame concatenation example they made a new graph (custom network) which takes three inputs, and all the input nodes have names, so it can accept three inputs. My network is already trained and it can only process a [1,1,32,100] input size. Forgive my ignorance, but I could not work out how the Script node can help me. How can I use it to run inference multiple times?

          Hello msee19018 , I'm sorry, I thought your model expected multiple inputs, not a single [1,1,32,100] input. So from my (new) understanding, your 1st model returns 3 outputs, and you want to link all 3 of these outputs to a single model (that only accepts one input), so all 3 outputs should go to the same input. If that's the case, I would suggest using a Script node to get all 3 outputs and then send them to the second NN one-by-one.
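          A rough sketch of that idea with a Script node - the input/output names here ('in1', 'in2', 'in3', 'out') are placeholders, not something from your pipeline; the first model's three outputs would each be linked to one of the inputs, and 'out' would be linked to the recognition NN's input:

          script = pipeline.create(dai.node.Script)
          script.setScript("""
          while True:
              # Take the three patches and forward them to the same NN input
              # one at a time, so the single recognition NN runs three times.
              for name in ['in1', 'in2', 'in3']:
                  frame = node.io[name].get()
                  node.io['out'].send(frame)
          """)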

            erik Hey Erik, I am trying to write a Script node to convert an NNData object to an ImgFrame. The input to the Script node comes from a normalization network, which outputs an NNData object. I have to convert it to an ImgFrame so that I can feed it to the second NeuralNetwork node, but the output stream of the first network returns an lpb.NNData object. Can you please suggest what I am doing wrong? Or is there any other way of doing this?
            I have a normalization network and its output needs to go into the recognition network. I am using a Script node in between to convert NNData to ImgFrame, as the NeuralNetwork node does not take any other form of input. I have pasted pics for your reference.

              Hello msee19018 ,
              The data variable is the NNData message. To get bytes, you should do data.getData(), so your 55th line should look like:

              frame.setData(data.getData())

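              Put together, the Script body could look roughly like this (an untested sketch - the input/output names are placeholders for whatever you linked in your pipeline):

              while True:
                  data = node.io['nn_in'].get()          # 'nn_in': placeholder input name
                  frame = ImgFrame(len(data.getData()))  # buffer sized to the incoming data
                  frame.setData(data.getData())
                  frame.setWidth(100)
                  frame.setHeight(32)
                  node.io['script_out'].send(frame)      # linked to the second NN's input
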
              Note that I haven't tried this locally, so it may not work.
              Thanks, Erik

                erik Thanks Erik. Another thing: the output of a NeuralNetwork node is bytes, and I can convert it into a 2D array using the setSize() method of ImgFrame, but how can I make it 3D or 4D like [1,1,32,100], since my model takes input of this form?

                  Hello msee19018 ,
                  To the NeuralNetwork node, the [32,100] array works exactly the same way as [1,1,32,100]; it only expects bytes, so the shape doesn't matter.
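                  To illustrate with plain numpy on the host (just a sanity check, not DepthAI code) - both shapes describe the same flat buffer:

                  import numpy as np

                  flat = np.arange(32 * 100, dtype=np.uint8)
                  a = flat.reshape(32, 100)
                  b = flat.reshape(1, 1, 32, 100)
                  assert a.tobytes() == b.tobytes()  # identical bytes, only the metadata differs
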
                  Thanks, Erik

                    erik Thanks Erik. This is my first project with the OAK-D and DepthAI, which is why I am asking a lot of basic questions. I pasted my code below; can you suggest what is wrong with it? My pipeline gets blocked, or I do not know what happens. It keeps running, or so it seems, because it keeps printing warnings that the size of the input and the network do not match, but it does not print the other lines that I am using for debugging, like the print statements in lines 67 and 68. I think it gets stuck in the Script node the second time it goes through it. Can you help me here? The last pic shows what I see on the command line.


                      Hello msee19018 ,
                      It looks like your first NN model (model_norm_gray.blob) doesn't get the 32x100 frame it expects. Could you share the whole code (not in screenshots), so I can also see the logic inside the Manip_Frame() function? I suspect it doesn't resize frames to the needed size/shape.
                      Thanks, Erik

                        erik

                        import cv2
                        import depthai as dai
                        import numpy as np
                        time_bb_cord=[200,400,100,350]
                        local_bb_cord=[300,500,200,350]
                        visit_bb_cord=[300,500,200,350]
                        # Create pipeline
                        def Manip_Frame(pipeline,region_bb_cord):
                        	manip_time = pipeline.create(dai.node.ImageManip)
                        	rgbRr_time = dai.RotatedRect()
                        	rgbRr_time.center.x, rgbRr_time.center.y = (region_bb_cord[0]+region_bb_cord[1]) // 2, (region_bb_cord[2]+region_bb_cord[3]) // 2
                        	rgbRr_time.size.width, rgbRr_time.size.height = region_bb_cord[1]-region_bb_cord[0], region_bb_cord[3]-region_bb_cord[2]
                        	rgbRr_time.angle = 0
                        	manip_time.initialConfig.setCropRotatedRect(rgbRr_time, False)
                        	manip_time.initialConfig.setResize(32,100)
                        	return manip_time
                        def NN_node(pipeline,path):
                        	nn = pipeline.create(dai.node.NeuralNetwork)
                        	nn.setBlobPath(path)
                        	return nn
                        
                        pipeline = dai.Pipeline()
                        path_1='model/model_norm_gray.blob'
                        path_rec='crnn.blob'
                        
                        cam = pipeline.create(dai.node.ColorCamera)
                        cam.setBoardSocket(dai.CameraBoardSocket.RGB)
                        cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
                        cam.setInterleaved(False)
                        cam.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)
                        
                        manip_time=Manip_Frame(pipeline,time_bb_cord)
                        cam.preview.link(manip_time.inputImage)
                        
                        detection = pipeline.createNeuralNetwork()
                        detection.setBlobPath(path_1)
                        manip_time.out.link(detection.input)
                        
                        pass_out = pipeline.createXLinkOut()
                        pass_out.setStreamName('pass_through')
                        detection.passthrough.link(pass_out.input)
                        
                        manip_script = pipeline.create(dai.node.Script)
                        manip_script.inputs['nn_in'].setBlocking(False)
                        manip_script.inputs['nn_in'].setQueueSize(5)
                        # manip_time.out.link(manip_script.inputs['nn_in'])
                        detection.out.link(manip_script.inputs['nn_in'])
                        manip_script.setScript(""" 
                        frame=ImgFrame(6400)
                        data=node.io['nn_in'].get()
                        node.warn(f"Type of data: {len(data.getData())}")
                        frame.setData(data.getData())
                        frame.setWidth(100)
                        frame.setHeight(32)
                        node.io['script_out'].send(frame)
                        """)
                        
                        x_out = pipeline.createXLinkOut()
                        x_out.setStreamName('custom')
                        manip_script.outputs['script_out'].link(x_out.input)
                        with dai.Device(pipeline) as device:
                        	nn_queue = device.getOutputQueue(name="custom", maxSize=5, blocking=False)
                        	pass_queue = device.getOutputQueue(name="pass_through", maxSize=5, blocking=False)
                        	i=0
                        	while True:
                        		pass_frame=pass_queue.get()
                        		print('pass through ',pass_frame.getSequenceNum())
                        		print(i)
                        		frame=nn_queue.get()
                        		if frame is None:
                        			print('here')
                        		else:	
                        			print('size ',frame.getHeight(),frame.getWidth())
                        		# print(type(time_frame))
                        		i=i+1
                        		print(i)
                        		if i>=500:
                        			break

                          Hello msee19018 ,
                          My bad, it's actually the NN that expects 3x100, ImageManip does send 32x100 frames to it. I would double-check the NN architecture/conversion.
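                          One way to double-check what the converted blob actually expects is to inspect it on the host - a sketch, assuming a recent depthai release that exposes dai.OpenVINO.Blob:

                          import depthai as dai

                          blob = dai.OpenVINO.Blob('model/model_norm_gray.blob')
                          for name, tensor in blob.networkInputs.items():
                              print('input ', name, tensor.dims)   # expected input name and dims
                          for name, tensor in blob.networkOutputs.items():
                              print('output', name, tensor.dims)
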
                          Thanks, Erik

                            erik Hi Erik, this is the code that I am using for the model. It converts the image to grayscale and normalizes it.

                            from typing import Tuple
                            
                            import torch
                            import torch.nn as nn
                            
                            class Grayscale(nn.Module):
                                def __init__(self, shape: Tuple[int, int, int, int], dtype=torch.float):
                                    super(Grayscale, self).__init__()
                                    self.shape = shape
                                    self.dtype = dtype
                            
                                def forward(self, x):
                                    y_r = x[ :, :, 0]
                                    y_g = x[ :, :, 1]
                                    y_b = x[ :, :, 2]
                                    g_ = 0.3 * y_r + 0.59 * y_g + 0.11 * y_b
                                    g_=torch.sub(torch.div(g_,127.5),1)
                                    return g_
                            def export_onnx():
                            
                                shape = (32, 100, 3)
                                model = Grayscale(shape=shape, dtype=torch.float)
                                X = torch.ones(shape, dtype=torch.float)
                                torch.onnx.export(
                                    model,
                                    X,
                                    'model2.onnx',
                                    opset_version=12,
                                    do_constant_folding=True
                                )
                            
                            export_onnx()
                            print('done')

                              Hello msee19018 ,
                              Your shape is incorrect - you should use (3,100,32); currently it's expecting 32 channels and 100x3 images.
                              Thanks, Erik

                                erik Thanks Erik for your patience and guidance. I changed the input size to [3,100,32] as you suggested, and now the output looks as in the first pic below. It did not solve the warning, but the warning just changes to [warning] Input image (100x32) does not match NN (32x100). In this case the program at least keeps running, but it just keeps printing the warning and does not show anything else that I am printing to the console.

                                Then I changed the input shape to [3,32,100] and the warning was resolved, but the pipeline gets stuck after the second passthrough, as you can see in the second pic. The ImageManip node is the same in both cases.
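                                For reference, a channels-first sketch of the normalization module with the [3,32,100] ordering that resolved the warning - the channel indexing moves to the first dimension accordingly (this is only a sketch of that change):

                                import torch
                                import torch.nn as nn

                                class GrayscaleCHW(nn.Module):
                                    # Channels-first variant of the Grayscale module above.
                                    def forward(self, x):
                                        y_r, y_g, y_b = x[0], x[1], x[2]   # channel dimension first
                                        g_ = 0.3 * y_r + 0.59 * y_g + 0.11 * y_b
                                        return torch.sub(torch.div(g_, 127.5), 1)

                                X = torch.ones((3, 32, 100), dtype=torch.float)
                                torch.onnx.export(GrayscaleCHW(), X, 'model2_chw.onnx',
                                                  opset_version=12, do_constant_folding=True)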

                                  msee19018 I would suggest removing the Script node at first, and sending & logging the received bytes on the host. The output of the NN is FP16, so I am not sure you will be able to (easily) convert that into the INT8 that the ImgFrame expects.
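                                  A host-side sketch of that debugging step, assuming detection.out is linked straight to an XLinkOut stream named 'nn_raw' instead of the Script node (the stream name and linking are placeholders):

                                  import numpy as np

                                  # inside the `with dai.Device(pipeline) as device:` block from the code above
                                  q = device.getOutputQueue('nn_raw', maxSize=4, blocking=False)
                                  while True:
                                      msg = q.get()                          # depthai.NNData
                                      print('layers:', msg.getAllLayerNames())
                                      print('raw bytes:', len(msg.getData()))
                                      first = msg.getAllLayerNames()[0]
                                      fp16 = np.array(msg.getLayerFp16(first))
                                      print('FP16 values:', fp16.shape, fp16[:5])
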
                                  Thanks, Erik