erik Thanks Erik. Another thing: the output of a NeuralNetwork node is bytes, and I can convert it into a 2D array using the setSize() method of ImgFrame, but how can I make it 3D or 4D, like [1,1,32,100], since my model takes input in this form?
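For reference, a minimal host-side sketch of turning the flat NN output into a 4D array with NumPy; the [1,1,32,100] shape comes from the question above, while nn_queue and the layer-name lookup are assumptions based on the later code in this thread:

import numpy as np

nn_out = nn_queue.get()                    # NNData message from the NN output queue
layer_name = nn_out.getAllLayerNames()[0]  # first (here: only) output layer
flat = np.array(nn_out.getLayerFp16(layer_name), dtype=np.float16)
tensor = flat.reshape(1, 1, 32, 100)       # flat FP16 buffer -> 4D [N, C, H, W]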
Neural Network Node (multiple inputs to one node)
erik Thanks Erik. This is my first project with the OAK-D and DepthAI, which is why I am asking a lot of basic questions. I pasted my code below; can you suggest what is wrong with it? My pipeline gets blocked, or I do not know what happens. It keeps running, or so it seems, because it keeps printing warnings that the input size and the network size do not match, but it does not print the other lines that I am using for debugging, like the print statements on lines 67 and 68. I think it gets stuck in the Script node the second time it goes through it. Can you help me here? The last one is a picture of what I see on the command line.
Hello msee19018,
It looks like your first NN model (model_norm_gray.blob) doesn't get the 32x100 frame it expects. Could you share the whole code (not in screenshots), so I can also see the logic inside the Manip_Frame() function? I suspect it doesn't resize frames to the needed size/shape.
Thanks, Erik
import cv2
import depthai as dai
import numpy as np

# Crop regions as [x1, x2, y1, y2] in pixels
time_bb_cord = [200, 400, 100, 350]
local_bb_cord = [300, 500, 200, 350]
visit_bb_cord = [300, 500, 200, 350]

def Manip_Frame(pipeline, region_bb_cord):
    # ImageManip node that crops the given region and resizes it
    manip_time = pipeline.create(dai.node.ImageManip)
    rgbRr_time = dai.RotatedRect()
    rgbRr_time.center.x, rgbRr_time.center.y = (region_bb_cord[0] + region_bb_cord[1]) // 2, (region_bb_cord[2] + region_bb_cord[3]) // 2
    rgbRr_time.size.width, rgbRr_time.size.height = region_bb_cord[1] - region_bb_cord[0], region_bb_cord[3] - region_bb_cord[2]
    rgbRr_time.angle = 0
    manip_time.initialConfig.setCropRotatedRect(rgbRr_time, False)
    manip_time.initialConfig.setResize(32, 100)
    return manip_time

def NN_node(pipeline, path):
    nn = pipeline.create(dai.node.NeuralNetwork)
    nn.setBlobPath(path)
    return nn

# Create pipeline
pipeline = dai.Pipeline()
path_1 = 'model/model_norm_gray.blob'
path_rec = 'crnn.blob'

cam = pipeline.create(dai.node.ColorCamera)
cam.setBoardSocket(dai.CameraBoardSocket.RGB)
cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
cam.setInterleaved(False)
cam.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)

manip_time = Manip_Frame(pipeline, time_bb_cord)
cam.preview.link(manip_time.inputImage)

detection = pipeline.createNeuralNetwork()
detection.setBlobPath(path_1)
manip_time.out.link(detection.input)

pass_out = pipeline.createXLinkOut()
pass_out.setStreamName('pass_through')
detection.passthrough.link(pass_out.input)

manip_script = pipeline.create(dai.node.Script)
manip_script.inputs['nn_in'].setBlocking(False)
manip_script.inputs['nn_in'].setQueueSize(5)
# manip_time.out.link(manip_script.inputs['nn_in'])
detection.out.link(manip_script.inputs['nn_in'])
manip_script.setScript("""
frame = ImgFrame(6400)  # 6400 bytes = 32*100 FP16 values
data = node.io['nn_in'].get()
node.warn(f"Type of data: {len(data.getData())}")
frame.setData(data.getData())
frame.setWidth(100)
frame.setHeight(32)
node.io['script_out'].send(frame)
""")

x_out = pipeline.createXLinkOut()
x_out.setStreamName('custom')
manip_script.outputs['script_out'].link(x_out.input)

with dai.Device(pipeline) as device:
    nn_queue = device.getOutputQueue(name="custom", maxSize=5, blocking=False)
    pass_queue = device.getOutputQueue(name="pass_through", maxSize=5, blocking=False)
    i = 0
    while True:
        pass_frame = pass_queue.get()
        print('pass through ', pass_frame.getSequenceNum())
        print(i)
        frame = nn_queue.get()
        if frame is None:
            print('here')
        else:
            print('size ', frame.getHeight(), frame.getWidth())
            # print(type(time_frame))
        i = i + 1
        print(i)
        if i >= 500:
            break
erik Hi Erik, this is the code that I am using for the model. It converts the image to grayscale and normalizes it:
from typing import Tuple
import torch
import torch.nn as nn

class Grayscale(nn.Module):
    def __init__(self, shape: Tuple[int, int, int], dtype=torch.float):
        super(Grayscale, self).__init__()
        self.shape = shape
        self.dtype = dtype

    def forward(self, x):
        # Weighted RGB -> grayscale conversion
        y_r = x[:, :, 0]
        y_g = x[:, :, 1]
        y_b = x[:, :, 2]
        g_ = 0.3 * y_r + 0.59 * y_g + 0.11 * y_b
        # Normalize from [0, 255] to [-1, 1]
        g_ = torch.sub(torch.div(g_, 127.5), 1)
        return g_

def export_onnx():
    shape = (32, 100, 3)
    model = Grayscale(shape=shape, dtype=torch.float)
    X = torch.ones(shape, dtype=torch.float)
    torch.onnx.export(
        model,
        X,
        'model2.onnx',
        opset_version=12,
        do_constant_folding=True
    )

export_onnx()
print('done')
msee19018 And here is the ONNX model.
erik Thanks Erik for your patience and guidance. I changed the input size to [3,100,32] as you suggested, and now the output looks like the first pic below. It did not resolve the warning; the warning just changed to [warning] Input image (100x32) does not match NN (32x100). In this case the program at least keeps running, but it just keeps printing the warning and does not show anything else that I am printing to the console.
Then I changed the input shape to [3,32,100] and the warning was resolved, but the pipeline gets stuck after the second passthrough, as you can see in the second pic. The ImageManip node is the same in both cases.
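For reference, a hedged note on the shape mismatch: ImageManip's setResize takes (width, height), while the NN input shape is [C, H, W], so a [3, 32, 100] model input corresponds to a 100x32 (WxH) frame, matching the code further below:

manip_time.initialConfig.setResize(100, 32)  # width=100, height=32 -> matches NN input [3, 32, 100]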
erik Should I make the model INT8 type in OpenVINO, and similarly the blob file?
msee19018 Since you are manipulating an image (so the input is U8), you need to specify the compile_tool argument -ip U8 (to add a conversion layer U8 -> FP16 before the input to the model). What I wanted to say is that the output of the model will be FP16, so converting it back to U8 isn't trivial to do in the Script node, so I would first suggest converting it on the host.
Thanks, Erik
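For reference, a minimal sketch of passing -ip U8 to compile_tool through Luxonis' blobconverter package; the shave count is an assumption, and model2.onnx is the file exported by the script above:

import blobconverter

blob_path = blobconverter.from_onnx(
    model="model2.onnx",        # the grayscale/normalization model exported above
    data_type="FP16",
    shaves=6,
    compile_params=["-ip U8"],  # insert a U8 -> FP16 conversion layer at the model input
)
print(blob_path)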
erik Okay, I got the idea and I will try it and see what happens. But the thing is, I eventually want to run my model in standalone mode, so I won't be able to use the host for anything in that case. How can I manage that? Is there any way I can do normalization on-device without using a Script node? For normalization I just need to divide the whole image by 127.5 and then subtract 1 from it.
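For reference, one common way to do this without a Script node (so it also works in standalone mode) is to bake the normalization into the blob at conversion time via OpenVINO model optimizer mean/scale flags, since (x - 127.5) / 127.5 = x / 127.5 - 1. A hedged sketch, not from the thread; the ONNX file name is hypothetical:

import blobconverter

blob_path = blobconverter.from_onnx(
    model="crnn.onnx",  # hypothetical ONNX export of the recognition model
    data_type="FP16",
    shaves=6,
    optimizer_params=[
        "--mean_values=[127.5]",   # subtract 127.5 ...
        "--scale_values=[127.5]",  # ... then divide by 127.5
    ],
)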
erik Hello Erik, I hope you are doing great. Thanks for your help. I managed to get around the problem I was facing: for now, I removed the normalization code from the model training and retrained my model, so I no longer need a separate NeuralNetwork node for normalization.
I have one other problem, which I also asked about earlier in this thread. I have three patches extracted from the same camera frame using three ImageManip nodes, and I need to pass them through the same neural network one by one. You suggested using a Script node; can you please explain in a little more detail how I can do that? How can I use a Script node in between two network nodes?
I have pasted my code below for your reference. At the moment I am just passing one image patch through the network. I have kept the other two ImageManip nodes in the code so that you can see what I want to do. Right now I am just passing the output of the manip_time node through the network. You might think I am missing the connections of the two ImageManip nodes, but I added them just so you can understand what I want.
import cv2
import depthai as dai
import numpy as np

# Normalized crop rects as [xmin, ymin, xmax, ymax]
time_bb_cord = [0.3617, 0.6175, 0.6414, 0.7887]
local_bb_cord = [0.2750, 0.28, 0.4365, 0.44375]
visit_bb_cord = [0.5846, 0.27625, 0.7435, 0.445]

def Manip_Frame(pipeline, region_bb_cord):
    manip_time = pipeline.create(dai.node.ImageManip)
    manip_time.initialConfig.setCropRect(region_bb_cord[0], region_bb_cord[1], region_bb_cord[2], region_bb_cord[3])
    manip_time.setKeepAspectRatio(False)
    manip_time.initialConfig.setResize(100, 32)
    return manip_time

def NN_node(pipeline, path):
    nn = pipeline.create(dai.node.NeuralNetwork)
    nn.setBlobPath(path)
    return nn

pipeline = dai.Pipeline()
model_path = 'crnn_99_soft_no_norm.blob'

cam = pipeline.create(dai.node.MonoCamera)
cam.setBoardSocket(dai.CameraBoardSocket.RIGHT)
cam.setFps(6)
cam.setResolution(dai.MonoCameraProperties.SensorResolution.THE_800_P)

manip_time = Manip_Frame(pipeline, time_bb_cord)
cam.out.link(manip_time.inputImage)
manip_local = Manip_Frame(pipeline, local_bb_cord)
cam.out.link(manip_local.inputImage)
manip_visit = Manip_Frame(pipeline, visit_bb_cord)
cam.out.link(manip_visit.inputImage)

recognition_nn = NN_node(pipeline, model_path)
manip_time.out.link(recognition_nn.input)

nn_Out = pipeline.create(dai.node.XLinkOut)
nn_Out.setStreamName("rec_time")
recognition_nn.out.link(nn_Out.input)

with dai.Device(pipeline) as device:
    nn_queue = device.getOutputQueue(name="rec_time", maxSize=4, blocking=False)
    while True:
        nn_out = nn_queue.get()
        if nn_out is not None:
            outname = nn_out.getAllLayerNames()[0]
            data = nn_out.getLayerFp16(outname)
            # Greedy (argmax) decoding: 24 timesteps x 13 classes
            raw_preds = []
            for i in range(24):
                log_probs = data[i * 13:i * 13 + 13]
                raw_preds.append(log_probs.index(max(log_probs)))
            # CTC-style collapse: merge repeated labels, then drop blanks (class 0)
            results = []
            previous = None
            for l in raw_preds:
                if l != previous:
                    results.append(l)
                    previous = l
            results = [l for l in results if l != 0]
            print(results)
        else:
            break
Hello msee19018,
What you need is syncing from multiple inputs (from the multiple ImageManip nodes tiling a frame) and then forwarding the frames. The Script node would read all the frames, then sync them based on their sequence numbers. Once it has all 3 subframes/tiles from the same frame, it forwards them to the NN node (3 outputs, as the NN node has 3 inputs). There is a frame forwarding/demux example here, and an example of how to sync multiple frames based on sequence number can be found here.
Thanks, Erik
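For reference, a minimal sketch of such a syncing Script node, combining the two linked examples; the input/output names and the single to_nn output (sending the three tiles one by one, as the question asks) are assumptions, not code from the thread:

script = pipeline.create(dai.node.Script)
manip_time.out.link(script.inputs['time_in'])
manip_local.out.link(script.inputs['local_in'])
manip_visit.out.link(script.inputs['visit_in'])
script.setScript("""
msgs = {}  # sequence number -> {input name: frame}
while True:
    for name in ['time_in', 'local_in', 'visit_in']:
        frame = node.io[name].tryGet()
        if frame is None:
            continue
        seq = frame.getSequenceNum()
        if seq not in msgs:
            msgs[seq] = {}
        msgs[seq][name] = frame
        if len(msgs[seq]) == 3:  # all 3 tiles of the same frame have arrived
            synced = msgs.pop(seq)
            # forward the synced tiles one by one to the recognition NN
            for n in ['time_in', 'local_in', 'visit_in']:
                node.io['to_nn'].send(synced[n])
""")
script.outputs['to_nn'].link(recognition_nn.input)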
erik Erik, I got everything from your last reply except the part where you said the NN can have 3 inputs. How can that be done? Doesn't the NeuralNetwork node have only one input, as described in the docs? I am confused here.
erik One more thing: I need a high-quality grayscale image. How can I convert the ColorCamera image to grayscale without using any neural network? Are there any functions in the DepthAI API that can do this?
Hello msee19018,
I don't think that's supported by default, but you could use Kornia's rgb_to_grayscale function in a custom NN model.
Thanks, Erik
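For reference, a minimal sketch of such a custom model, exported to ONNX the same way as the earlier Grayscale module; kornia.color.rgb_to_grayscale expects an NCHW float tensor, and the (1, 3, 32, 100) input shape here is an assumption:

import torch
import torch.nn as nn
import kornia

class RgbToGray(nn.Module):
    def forward(self, x):
        # (N, 3, H, W) RGB -> (N, 1, H, W) grayscale
        return kornia.color.rgb_to_grayscale(x)

X = torch.ones((1, 3, 32, 100), dtype=torch.float)
torch.onnx.export(RgbToGray(), X, 'rgb_to_gray.onnx', opset_version=12)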