Incorrect output visualization
So, I'm trying to run my DeepLabV3+ model on an image from the host; I am using the DepthAI OAK only as a processing unit here.
I normalize the image by dividing by 255 and then applying normalized_image = (image - mean) / std. I also load and convert the image with image = cv2.imread(img_path2) and image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).
However, I'm still not getting the correct output visualization.
I have used both the TensorFlow and PyTorch DeepLabV3+ models, and I have confirmed they give correct results on my host device using both the ONNX and PyTorch versions. The TensorFlow model also works on frames from the ColorCamera, so I think it is something to do with the way I send my image to the OAK-D.
Tanisha
frame = dai.ImgFrame()
frame.setData(to_planar(img2,(nn_shape,nn_shape)))
frame.setType(dai.RawImgFrame.Type.RGB888p)
frame.setWidth(nn_shape)
frame.setHeight(nn_shape)
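(to_planar is not shown in the snippet above; a common definition, as in the DepthAI examples, is below — this exact implementation is an assumption:)

import cv2

def to_planar(arr, shape):
    # Resize to the target shape, then HWC -> CHW and flatten for ImgFrame.setData()
    return cv2.resize(arr, shape).transpose(2, 0, 1).flatten()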
Try using RGB888i. I think the pipeline expects interleaved input and converts it to planar which messes up the image.
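A rough sketch of that change, reusing img2 and nn_shape from the snippet above (interleaved means sending the resized HWC image as-is, without the transpose to planar; the exact snippet is an assumption):

img_resized = cv2.resize(img2, (nn_shape, nn_shape))  # HWC, uint8

frame = dai.ImgFrame()
frame.setData(img_resized.flatten())
frame.setType(dai.RawImgFrame.Type.RGB888i)
frame.setWidth(nn_shape)
frame.setHeight(nn_shape)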
Thanks,
Jaka
Tried that, but it didn't give the correct visualization. I changed the code a bit and I'm able to see the outlines now, but I'm still not getting the correct output.
import cv2
import numpy as np
import depthai as dai

pipeline = dai.Pipeline()

neural_network = pipeline.create(dai.node.NeuralNetwork)
neural_network.setBlobPath(nn_path)

xin_nn = pipeline.create(dai.node.XLinkIn)
xin_nn.setStreamName("nn_in")
xin_nn.out.link(neural_network.input)

xout_nn = pipeline.create(dai.node.XLinkOut)
xout_nn.setStreamName("nn_out")
neural_network.out.link(xout_nn.input)

# Load the host image, convert BGR -> RGB, resize, and make it planar (CHW)
image = cv2.imread(img_path2)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
img2 = cv2.resize(image, dsize=(nn_shape, nn_shape))
img2 = img2.transpose((2, 0, 1))

with dai.Device(pipeline) as device:
    q_nn = device.getOutputQueue("nn_out")
    q_nn_in = device.getInputQueue("nn_in")

    nn_data = dai.NNData()
    nn_data.setLayer("input_layer_name", img2)
    q_nn_in.send(nn_data)

    oot = q_nn.get()
    layers = oot.getAllLayers()
    for layer_nr, layer in enumerate(layers):
        print(f"Layer {layer_nr}")
        print(f"Name: {layer.name}")
        print(f"Order: {layer.order}")
        print(f"dataType: {layer.dataType}")
        dims = layer.dims  # reversed dimensions
        print(f"dims: {dims}")

    out1 = np.array(oot.getLayerFp16(layers[0].name))
    output = out1.reshape(5, nn_shape, nn_shape).astype(np.uint8)
    output = np.transpose(output, (1, 2, 0))
    test_pred = np.argmax(output, axis=2)
I added the astype() cast and switched to dai.NNData() instead of dai.ImgFrame().
I think it's the way I'm quantizing?
I can see outlines of the leaves ^
Can you try adding the following flags to the model optimizer:
--mean_values [123.675,116.28,103.53] \
--scale_values [58.395,57.12,57.375] \
These are the ImageNet mean and scale values multiplied by 255. If you also add --reverse_input_channels, the model should expect BGR 0-255 images.
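For reference, a quick check of where those numbers come from (the standard ImageNet statistics scaled to the 0-255 range):

import numpy as np

imagenet_mean = np.array([0.485, 0.456, 0.406])
imagenet_std = np.array([0.229, 0.224, 0.225])

print(imagenet_mean * 255)  # [123.675 116.28  103.53]
print(imagenet_std * 255)   # [58.395 57.12  57.375]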
To make sure the normalization is correct, you can try installing openvino-dev==2022.3 first and calling the model optimizer yourself, like:
mo.py \
--input_model model.onnx \
--model_name segmentation_model \
--data_type FP16 \
--output_dir output_dir \
--input_shape [1,3,1088,1088] \
--mean_values [123.675,116.28,103.53] \
--scale_values [58.395,57.12,57.375] \
--reverse_input_channels
This will produce the OpenVINO IR (intermediate representation). You can then run it on the CPU with the Inference Engine:
import cv2
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model=model_xml, weights=model_bin)
input_blob = next(iter(net.input_info))
exec_net = ie.load_network(network=net, device_name='CPU')

img = cv2.imread("img.png")  # make sure the image has the shape the model expects
image = img.astype(np.float32)
image = np.expand_dims(image, axis=0)  # add batch dimension -> NHWC
image = np.moveaxis(image, 3, -3)      # NHWC -> NCHW
output = exec_net.infer(inputs={input_blob: image})
You can then find the output and post-process it in the same manner as you would otherwise. This basically takes the exported model in .xml and .bin (the intermediate representation, before the blob) and runs it on your CPU.
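A minimal sketch of that lookup (the output name and the 5-class shape are assumptions based on your snippets above):

out_blob = next(iter(net.outputs))         # name of the model's output layer
pred = output[out_blob]                    # expected shape: (1, num_classes, H, W)
mask = np.argmax(pred.squeeze(0), axis=0)  # per-pixel class indices, shape (H, W)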
Can you let me know what is the result of the above?
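Once the IR output looks right on the CPU, it can be compiled to a blob, for example with the blobconverter package; a rough sketch, where the paths and shave count are assumptions:

import blobconverter

blob_path = blobconverter.from_openvino(
    xml="output_dir/segmentation_model.xml",
    bin="output_dir/segmentation_model.bin",
    data_type="FP16",
    shaves=6,
)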
Converting to a blob gives me this output:
What preprocessing or post-processing should I do in this case?
Matija
This is how I preprocess:
import cv2
import numpy as np

mean = np.array([0.485, 0.456, 0.406])
scale = np.array([0.229, 0.224, 0.225])

img = cv2.imread(img_path)
img = cv2.resize(img, dsize=(1008, 1008), interpolation=cv2.INTER_CUBIC)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = img.astype(np.float32) / 255.0
normalized_image = (img - mean) / scale

normalized_image2 = np.expand_dims(normalized_image, axis=0)  # add batch dimension -> NHWC
normalized_image2 = np.moveaxis(normalized_image2, 3, -3)     # NHWC -> NCHW
Code for inference on the OAK-D:
import numpy as np
import depthai as dai

pipeline = dai.Pipeline()
pipeline.setOpenVINOVersion(version=dai.OpenVINO.VERSION_2022_1)

neural_network = pipeline.create(dai.node.NeuralNetwork)
neural_network.setBlobPath(nn_path)

xin_nn = pipeline.create(dai.node.XLinkIn)
xin_nn.setStreamName("nn_in")
xin_nn.out.link(neural_network.input)

xout_nn = pipeline.create(dai.node.XLinkOut)
xout_nn.setStreamName("nn_out")
neural_network.out.link(xout_nn.input)

with dai.Device(pipeline) as device:
    q_nn = device.getOutputQueue("nn_out")
    q_nn_in = device.getInputQueue("nn_in")

    nn_data = dai.NNData()
    nn_data.setLayer("input_layer_name", normalized_image2)
    q_nn_in.send(nn_data)

    oot = q_nn.get()
    layers = oot.getAllLayers()
    for layer_nr, layer in enumerate(layers):
        print(f"Layer {layer_nr}")
        print(f"Name: {layer.name}")
        print(f"Order: {layer.order}")
        print(f"dataType: {layer.dataType}")
        dims = layer.dims  # reversed dimensions
        print(f"dims: {dims}")
This is my post-processing:
out1 = np.array(oot.getLayerFp16(layers[0].name))
output = out1.reshape(1, 5, nn_shape, nn_shape)  # (batch, classes, H, W)
output = output.squeeze(0)
output = np.transpose(output, (1, 2, 0))         # CHW -> HWC
test_pred = np.argmax(output, axis=2)            # per-pixel class indices
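To visualize test_pred I map each class index to a colour and display the mask; a minimal sketch, where the 5-class colour map is an assumption:

import cv2
import numpy as np

# Hypothetical colour map: one BGR colour per class index (5 classes assumed, class 0 = background)
colors = np.array([
    [0, 0, 0],
    [0, 255, 0],
    [0, 0, 255],
    [255, 0, 0],
    [0, 255, 255],
], dtype=np.uint8)

mask_bgr = colors[test_pred]  # (H, W) class indices -> (H, W, 3) colour image
cv2.imshow("segmentation", mask_bgr)
cv2.waitKey(0)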
But it still gives me this output: