stitch camera streams

jakaskerl · Aug 7, 2023

Hi liam
You could use a Camera node (instead of ColorCamera node) to automatically undistort the images. Note that the node is in WIP and might cause some problems. https://docs.luxonis.com/projects/api/en/latest/references/python/#depthai.node.Camera

Alternatively you can manually undistort the ColorCamera with warp node (https://docs.luxonis.com/projects/api/en/latest/components/nodes/warp/).

Thanks,
Jaka

Lliam · Sep 17, 2023

Hi, we are currently facing the problem, that we want to record a video of our panorama picture. How can we do that?

jakaskerl · Sep 19, 2023

Hi liam
What problem are you running into? I would assume something to do with videoEncoder? Perhaps the write speed is to slow and the device closes the connections?

Thanks,
Jaka

Lliam · Sep 19, 2023

Hi, we utilize the NeuralNetwork's multiple input feature to create a panorama picture, like in the example shown https://docs.luxonis.com/projects/api/en/latest/samples/NeuralNetwork/concat_multi_input/ But the videoEncoder cannot receive input from a NeuralNetwork

jakaskerl · Sep 19, 2023

Hi liam
Likely, this is the cause https://docs.luxonis.com/projects/api/en/latest/components/nodes/video_encoder/#limitations. You will probably need to run the encoding on host side. You can use cv2 or ffmpeg for that. Keep in mind this is likely to take up a lot of your host's resources.

Thanks,
Jaka

Lliam · Oct 8, 2023

Hi, we would like to perform object detection on the panoramic image. However, the problem is that the panoramic image is in NN data format, which means we cannot directly input it into another neural network for object detection. Additionally, we intend for the program to run in standalone mode later. Is there a way for us to still conduct object detection on the image?

Thank you for your help

jakaskerl · Oct 8, 2023

Hi liam
That would be a 2 stage NN pipeline (like the one here). You need a script node in the middle that will extract the frame from NNData and send it to the next neural network.
You will likely need to scale down the panoramic view to fit the input size of the second network.

Thanks,
Jaka

Lliam · Nov 1, 2023

Hi, thank you for the response. How can we extract the frames from the NN data?

Thanks

jakaskerl · Nov 2, 2023

Hi liam
Do a NNdata.getAllLayers() to get the layers of the neural network, then use the layer which has image data.

ref: https://docs.luxonis.com/projects/api/en/latest/components/messages/nn_data/#nndata

Thanks,
Jaka

Lliam · Nov 8, 2023

Hi, I am still having difficulties extracting frames from nn data. I thought about using the code right here, but of course without using numpy, depthai and opencv, because in script node it is not possible to use them:

with dai.Device(pipe) as device:

rgb = device.getOutputQueue(name="rgb", maxSize=4, blocking=False)

shape = (3, 640, 640)

while True:

inRgb = np.array(rgb.get().getFirstLayerFp16())

frame = inRgb.reshape(shape).astype(np.uint8).transpose(1, 2, 0)

In the script node, I attempted to adapt this code by replacing the line:

inRgb = np.array(rgb.get().getFirstLayerFp16())

with:

frames = node.io["concat"].get().getLayerFp16("output")

„output“ is the layer name of the layer containing the frames.

However, the program does not continue from that point, and the subsequent print statements are not executed. Is there a way to extract frames in the manner I desire, and do you have an idea why the program does not continue?

Thanks for your help.

This is the code:

from pathlib import Path

import sys

import numpy as np

import cv2

import depthai as dai

import torchvision.transforms as transforms

import time

SHAPE = 300

nnPath = str((Path(file).parent / Path('../../../models/concat_openvino_2022.1_6shave.blob')).resolve().absolute())

if len(sys.argv) > 1:

nnPath = sys.argv[1]

if not Path(nnPath).exists():

import sys

raise FileNotFoundError(f'Required file/s not found, please run "{sys.executable} install_requirements.py"')

pipe = dai.Pipeline()

pipe.setOpenVINOVersion(dai.OpenVINO.VERSION_2022_1)

def camA(p, socket):

cam = p.create(dai.node.ColorCamera)

cam.setBoardSocket(socket)

cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_800_P)

cam.setPreviewSize(SHAPE, SHAPE)

cam.setInterleaved(False)

cam.setPreviewKeepAspectRatio(False)

cam.setFps(6)

maxFrameSizeA = cam.getPreviewHeight() * cam.getPreviewWidth() * 3

manip = p.create(dai.node.ImageManip)

manip.initialConfig.setResize(SHAPE, SHAPE)

manip.initialConfig.setFrameType(dai.RawImgFrame.Type.BGR888p)

rrA = dai.RotatedRect()

rrA.center.x, rrA.center.y = cam.getPreviewHeight() // 2, cam.getPreviewHeight() // 2

rrA.size.width, rrA.size.height = cam.getPreviewHeight(), cam.getPreviewWidth()

rrA.angle = 270

manip.initialConfig.setCropRotatedRect(rrA, False)

manip.setMaxOutputFrameSize(maxFrameSizeA)

cam.preview.link(manip.inputImage)

return manip.out

def camB(p, socket):

cam = p.create(dai.node.ColorCamera)

cam.setBoardSocket(socket)

cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_800_P)

cam.setPreviewSize(SHAPE, SHAPE)

cam.setInterleaved(False)

cam.setPreviewKeepAspectRatio(False)

cam.setFps(6)

maxFrameSizeA = cam.getPreviewHeight() * cam.getPreviewWidth() * 3

manip = p.create(dai.node.ImageManip)

manip.initialConfig.setResize(SHAPE, SHAPE)

manip.initialConfig.setFrameType(dai.RawImgFrame.Type.BGR888p)

rrA = dai.RotatedRect()

rrA.center.x, rrA.center.y = cam.getPreviewHeight() // 3.5, cam.getPreviewHeight() // 2

rrA.size.width, rrA.size.height = cam.getPreviewHeight(), cam.getPreviewWidth()

rrA.angle = 180

manip.initialConfig.setCropRotatedRect(rrA, False)

manip.setMaxOutputFrameSize(maxFrameSizeA)

cam.preview.link(manip.inputImage)

return manip.out

def camC(p, socket):

cam = p.create(dai.node.ColorCamera)

cam.setBoardSocket(socket)

cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_800_P)

cam.setPreviewSize(SHAPE, SHAPE)

cam.setInterleaved(False)

cam.setPreviewKeepAspectRatio(False)

cam.setFps(6)

maxFrameSizeA = cam.getPreviewHeight() * cam.getPreviewWidth() * 3

manip = p.create(dai.node.ImageManip)

manip.initialConfig.setResize(SHAPE, SHAPE)

manip.initialConfig.setFrameType(dai.RawImgFrame.Type.BGR888p)

rrA = dai.RotatedRect()

rrA.center.x, rrA.center.y = cam.getPreviewHeight() // 1.6, cam.getPreviewHeight() // 2

rrA.size.width, rrA.size.height = cam.getPreviewHeight(), cam.getPreviewWidth()

rrA.angle = 0

manip.initialConfig.setCropRotatedRect(rrA, False)

manip.setMaxOutputFrameSize(maxFrameSizeA)

cam.preview.link(manip.inputImage)

return manip.out

nn = pipe.createNeuralNetwork()

nn.setBlobPath(nnPath)

nn.setNumInferenceThreads(2)

camA(pipe, dai.CameraBoardSocket.CAM_A).link(nn.inputs['img3'])

camB(pipe, dai.CameraBoardSocket.CAM_B).link(nn.inputs['img1'])

camC(pipe, dai.CameraBoardSocket.CAM_C).link(nn.inputs['img2'])

rgb = pipe.createXLinkOut()

rgb.setStreamName("rgb")

nn.out.link(rgb.input)

script_text = """

shape = (3, 640, 640)

while True:

frames = node.io["concat"].get().getLayerFp16("output")

node.warn("hello")

#node.io["out"].send(frames)

"""

script = pipe.create(dai.node.Script)

script.setProcessor(dai.ProcessorType.LEON_CSS)

script.setScript(script_text)

script.inputs['concat'].setQueueSize(4)

script.inputs['concat'].setBlocking(False)

nn.out.link(script.inputs["concat"])

nn_xout = pipe.createXLinkOut()

nn_xout.setStreamName("nn")

script.outputs['out'].link(nn_xout.input)

with dai.Device(pipe) as device:

rgb = device.getOutputQueue(name="rgb", maxSize=4, blocking=False)

shape = (3, 640, 640)

while True:

inRgb = np.array(rgb.get().getFirstLayerFp16())

frame = inRgb.reshape(shape).astype(np.uint8).transpose(1, 2, 0)

cv2.imshow("frame", frame)

if cv2.waitKey(1) == ord('q'):

break

jakaskerl · Nov 9, 2023

Hi liam
You can do it using NNData (like done here), but it seems to be very slow since you are copying a lot of bytes using the CSS processor. I'll ask the dev team if we can integrate this FW side to speed it up.

Thanks,
Jaka