• DepthAI-v2
  • Capture Image using RPi GPIO, then feed it to a NN in the OAK-1.

Hello everyone,

I am building a system that uses multiple neural networks, where the user decides which network to run.
In this system, image captures will be used rather than a video stream.
The OAK-1 will be used with a Raspberry Pi 4 as the host.

The scenario that I want to implement is as follows:
1) The user presses a button (from RaspberryPi GPIO) to capture an image.
2) Once the image is captured, the user is prompted to choose what inference to run. There are two options: Object detection NN or OCR NN. (Again, the user will choose the desired one through RPi GPIOs)
3) After the user selects a NN, the image is fed to the selected NN and results are printed on screen.
Note that the result should be returned to the user within a short time (1 to 2 seconds max).

Also, the OAK-1 should not be continuously capturing frames. It should only capture a frame once it is asked to do so (to save power).

My questions are:

1) Is it possible to implement such an application using the OAK-1 and RPi? If yes, any tips would be appreciated.
2) When using the STILL property in ColorCamera, how can I keep the camera "idle" while no capture message is sent? (i.e. the OAK should not continuously capture frames; it should only capture one once asked to do so)
3) Must the NN be loaded into the OAK each time the user chooses a NN to use? If so, this would cause a considerable delay.

Your valuable input is much appreciated!
Best,
Hussain


    Hello hussain_allawati!

    1. Yes. You could either load a separate pipeline for each scenario (slower, as you need to restart every time, which takes ~5 sec) or build one larger pipeline that does it all (see the sketch after this list). You would have these nodes:

      • XLinkIn --(still trigger event)--> ColorCamera (still output) --> XLinkOut
      • XLinkIn --(frame)--> ObjectDetection node --> XLinkOut
      • XLinkIn --(frame)--> NN node for OCR --> XLinkOut
        On the host, you would first capture a still image and then send the frame to either the object detection node or the OCR NN node, depending on user input.
    2. If this is a requirement to reduce power consumption, you can use the setStopStreaming()/setStartStreaming() controls. Otherwise, you can simply not connect the other outputs of the ColorCamera (besides still -> XLinkOut) and it won't stream frames anywhere.

    3. If you load separate pipelines, yes; if it's one large pipeline (described in 1), then both the OCR and object detection models get uploaded to the device only once.
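
    A minimal sketch of such a combined pipeline (stream names, node choices, and blob paths below are illustrative, not from this thread):

    import depthai as dai

    pipeline = dai.Pipeline()

    # Still-capture branch: control in --> camera --> still out
    control = pipeline.create(dai.node.XLinkIn)
    control.setStreamName("control")
    cam = pipeline.create(dai.node.ColorCamera)
    control.out.link(cam.inputControl)
    stillOut = pipeline.create(dai.node.XLinkOut)
    stillOut.setStreamName("still")
    cam.still.link(stillOut.input)

    # Object-detection branch: frame in --> detection NN --> results out
    detIn = pipeline.create(dai.node.XLinkIn)
    detIn.setStreamName("detIn")
    detNN = pipeline.create(dai.node.MobileNetDetectionNetwork)
    detNN.setBlobPath("mobilenet-ssd.blob")  # illustrative path
    detIn.out.link(detNN.input)
    detOut = pipeline.create(dai.node.XLinkOut)
    detOut.setStreamName("detOut")
    detNN.out.link(detOut.input)

    # OCR branch: frame in --> generic NN --> results out
    ocrIn = pipeline.create(dai.node.XLinkIn)
    ocrIn.setStreamName("ocrIn")
    ocrNN = pipeline.create(dai.node.NeuralNetwork)
    ocrNN.setBlobPath("text-recognition.blob")  # illustrative path
    ocrIn.out.link(ocrNN.input)
    ocrOut = pipeline.create(dai.node.XLinkOut)
    ocrOut.setStreamName("ocrOut")
    ocrNN.out.link(ocrOut.input)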

    Thanks, Erik

      Erik, thanks a lot for your informative reply.
      I will keep these points in mind while working on the project.

      For (1), do you mean to capture the frame, send it to the host first, and then send it back to the device to either NN? If so, I have tried sending a frame back to the device and then back to the host again to display the inference result; however, the output couldn't be displayed. Instead of an image of shape (300, 300, 3), I got an output of size (270000,). Any thoughts on how to solve this issue?
      The pipeline looks like this. I built it to work around the issue that ImageManip cannot resize STILL images: capture a still image, resize it on the host, and send it back to the device for NN inference.
      xinCaptureCommand --> camRGB --> outStillRGB
      then, after resizing on the host:
      inResizedImage --> outResult

      import cv2
      import imutils
      import depthai as dai

      x = cv2.imread("blue.png")
      pipeline = dai.Pipeline()
      
      # Create input control node to acquire capture command
      xinCaptureCommand = pipeline.create(dai.node.XLinkIn)
      xinCaptureCommand.setStreamName("capture")
      
      # Create Camera node and give its properties
      camRGB = pipeline.create(dai.node.ColorCamera)
      camRGB.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
      camRGB.setStillSize(1080, 1080)
      camRGB.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)
      
      # Create output node for still images
      outStillRGB = pipeline.create(dai.node.XLinkOut)
      outStillRGB.setStreamName("rgbStill")
      
      # Create input node to receive resized image from host
      inResizedImage = pipeline.create(dai.node.XLinkIn)
      inResizedImage.setStreamName("resizedImage")
      
      # Create output node to send the result back to the host
      outResult = pipeline.create(dai.node.XLinkOut)
      outResult.setStreamName("outResult")
      
      # Linking
      xinCaptureCommand.out.link(camRGB.inputControl)
      camRGB.still.link(outStillRGB.input)
      inResizedImage.out.link(outResult.input)
      
      # Connect to device and start the pipeline
      with dai.Device(pipeline) as device:
      
          # Create queues
          stillQueue = device.getOutputQueue(name="rgbStill")
          captureInputQueue = device.getInputQueue("capture")
          sendResizedQueue = device.getInputQueue("resizedImage")
          outResultQ = device.getOutputQueue("outResult")
      
          cv2.imshow("x",x)
      
          while True:
              stillFrame = stillQueue.tryGet()
              if stillFrame is not None:
                  print("still frame:", stillFrame.getHeight(), stillFrame.getWidth(), stillFrame.getType())        
                  frame = stillFrame.getCvFrame()
                  cv2.imshow("frame", frame)
                 
                  resized = imutils.resize(frame, width = 300)
                  print("resized frame:" , resized.shape)
                  sendResized = dai.ImgFrame()
                  sendResized.setData(resized)
                  #sendResized.setData(to_planar(resized, (300, 300)))
                  sendResized.setHeight(300)
                  sendResized.setWidth(300)
                  #sendResized.setType("RGB888i")
                  sendResizedQueue.send(sendResized)
                 
                 
              testFrame = outResultQ.tryGet()
              if testFrame is not None:
                  print("frame W,H & type after receiving back again:", testFrame.getHeight(), testFrame.getWidth(), testFrame.getType())
                  result = testFrame.getCvFrame()
                  cv2.imshow("result", result)
                  print("final result shape:", result.shape)
      
              # Send capture command from host to device
              key = cv2.waitKey(1)
              if key == ord("q"):
                  break
                 
              elif key == ord('c'):
                  ctrl = dai.CameraControl()
                  ctrl.setCaptureStill(True)
                  captureInputQueue.send(ctrl)
                  print("captured")

      For (2), what do you mean by "other outputs"?

      Best,
      Hussain


        Hello hussain_allawati,

        1. What output couldn't be displayed? It doesn't look like you have connected the XLinkIn to a DetectionNetwork / NN node. Also, an ImgFrame assembled on the host needs its frame type set, otherwise getCvFrame() just returns the raw flat buffer; see the sketch below the list.
        2. Other outputs of the ColorCamera node - preview, video, isp.
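
        For reference, a minimal sketch of building such an ImgFrame on the host, with a dummy array standing in for the resized still:

        import numpy as np
        import depthai as dai

        # Dummy 300x300 BGR image standing in for the resized still
        frame = np.zeros((300, 300, 3), dtype=np.uint8)

        msg = dai.ImgFrame()
        msg.setType(dai.ImgFrame.Type.BGR888p)  # planar BGR, so the receiver knows the layout
        msg.setWidth(300)
        msg.setHeight(300)
        msg.setData(frame.transpose(2, 0, 1).flatten())  # HWC -> CHW, then flatten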

          Hey Erik,
          Thank you for your support !
          I was able to do it 😀
          I implemented code that captures an image on a user keypress, sends it to the host for resizing, sends it back to the device for inference, and finally displays the results.

          I am attaching a simplified version of the code here for reference, in case anyone needs it.
          Note that the attached code does NOT contain any NN node. It simply captures an image, resizes it on the host, sends it back to the device (here you can do whatever you want with the resized image), and finally sends the result back to the host.

          #!/usr/bin/env python3
          
          import cv2
          import depthai as dai
          import imutils
          import time
          import numpy as np
          
          xxx = cv2.imread("blue.png")
          
          # Create the pipeline
          pipeline = dai.Pipeline()
          
          
          # HOST --> xinCaptureCommand --> camRGB --> outStillRGB --> HOST
          # Resize image in host
          # HOST --> inResizedImage --> outResizedImage --> HOST
          
          # Create input control node to acquire capture command
          xinCaptureCommand = pipeline.create(dai.node.XLinkIn)
          xinCaptureCommand.setStreamName("capture")
          
          
          # Create Camera node and give its properties
          camRGB = pipeline.create(dai.node.ColorCamera)
          camRGB.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
          camRGB.setStillSize(1080, 1080)
          camRGB.setPreviewSize(1080, 1080)
          camRGB.setVideoSize(1080, 1080)
          # camRGB.setInterleaved(False)
          camRGB.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)
          
          # Create output node for still images
          outStillRGB = pipeline.create(dai.node.XLinkOut)
          outStillRGB.setStreamName("rgbStill")
          
          
          # Create input node to receive resized image from host
          inResizedImage = pipeline.create(dai.node.XLinkIn)
          inResizedImage.setStreamName("inResizedImage")
          
          # Create output node to send the resized image back to the host
          outResizedImage = pipeline.create(dai.node.XLinkOut)
          outResizedImage.setStreamName("outResizedImage")
          
          # Link output of xinCaptureCommand to camera input control
          xinCaptureCommand.out.link(camRGB.inputControl)
          
          # Link output of camera to input of XLinkOut to send the still to the host
          camRGB.still.link(outStillRGB.input)
          
          # Link output of inResizedImage to input of outResizedImage
          inResizedImage.out.link(outResizedImage.input)
          
          
          
          # Connect to device and start the pipeline
          with dai.Device(pipeline) as device:
          
          
              # Create input queue to the device that receives the capture command
              captureInputQueue = device.getInputQueue("capture")
              
              
              # Create output queue that will get RGB frame (Output from device, and input to host)
              stillQueue = device.getOutputQueue(name="rgbStill")
          
          
              # Create input queue to the device that receives the resized image
              sendResizedQueue = device.getInputQueue("inResizedImage")
              
              # Create output queue that will send output images to the host
              outResizedImageQueue = device.getOutputQueue("outResizedImage")
          
              cv2.imshow("xxx",xxx)
              
              
              def to_planar(arr: np.ndarray, shape: tuple) -> np.ndarray:
                  return cv2.resize(arr, shape).transpose(2,0,1).flatten()
                  
                  
              while True:
                    
                  stillFrame = stillQueue.tryGet()
                  if stillFrame is not None:
                      print("Captured!")
                      print(stillFrame.getHeight(), stillFrame.getWidth(), stillFrame.getType())        
                      frame = stillFrame.getCvFrame()
                      cv2.imshow("frame", frame)
                      
                      resized = imutils.resize(frame, width = 300)
                      print(resized.shape)
                      
                      # Create message and send resized frame
                      sendResized = dai.ImgFrame()
                      sendResized.setData(to_planar(resized, (300, 300)))            
                      sendResized.setHeight(300)
                      sendResized.setWidth(300)
                      sendResized.setType(dai.ImgFrame.Type.BGR888p)
                      sendResizedQueue.send(sendResized)
                      
                      
          
          
                  finalFrame = outResizedImageQueue.tryGet()
                  if finalFrame is not None:
                      print(finalFrame.getHeight(), finalFrame.getWidth(), finalFrame.getType())
                      f = finalFrame.getCvFrame()
                      cv2.imshow("finalFrame", f)
                      print(f.shape)
                  
          
                  # Send capture command from host to device
                  key = cv2.waitKey(1)
                  if key == ord("q"):
                      break
                      
                  elif key == ord('c'):
                      ctrl = dai.CameraControl()
                      ctrl.setCaptureStill(True)
                      captureInputQueue.send(ctrl)
          7 days later

          Currently, I have an issue with focusing. I am using the ColorCamera STILL output.
          Each time the user presses a button, a capture command is issued.
          The problem is that for the first 3-4 button presses, the image is out of focus;
          the 5th or 6th press yields an image that is in focus.
          It seems that the autofocus needs several frames to adjust itself.

          1) Any thoughts on how to solve this issue? Is there any property in the ColorCamera node that resolves it?

          2) One solution that came to my mind is to capture several frames (say 6), and do the inference and processing on the 6th frame only (i.e. the first 5 frames would be used for focusing, and the 6th for inference and processing). Do you think this makes sense and would work in the DepthAI environment?

          Best,
          Hussain


            Hello hussain_allawati,
            I believe that's because the AAA algorithms don't kick in until you capture a still frame. One workaround would be to have a Script node that continuously reads e.g. preview frames and just discards them, so the AAA algorithms (auto focus/white balance/exposure) would always run.

            The Script node could be just this:

            while True:
                node.io['preview'].get()
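
            Wiring it up might look like this (a sketch; node and stream names are illustrative):

            import depthai as dai

            pipeline = dai.Pipeline()
            cam = pipeline.create(dai.node.ColorCamera)

            # Script node that consumes preview frames so the 3A algorithms keep running
            script = pipeline.create(dai.node.Script)
            script.setScript("""
                while True:
                    node.io['preview'].get()  # read and discard the frame
            """)
            cam.preview.link(script.inputs['preview'])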

            Thoughts?
            Thanks, Erik

              This should work; however, I think it won't be efficient, as my system is battery-powered.

              How about using the PREVIEW output instead of STILL, but with the setStopStreaming()/setStartStreaming() configurations?
              Once the user presses the button, streaming is started; after 0.1 seconds, a stop-streaming message is sent. At 60 fps, about 6 frames would be sent to the queue. The part I am still missing is how to let the host process only the "latest" frame, i.e. the 6th one. A minimal sketch of what I have in mind is below.
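
              A minimal sketch of what I have in mind (stream names are illustrative; note the camera streams by default, so an initial stop command may be needed in practice):

              import time
              import depthai as dai

              pipeline = dai.Pipeline()

              cam = pipeline.create(dai.node.ColorCamera)
              cam.setPreviewSize(300, 300)

              controlIn = pipeline.create(dai.node.XLinkIn)
              controlIn.setStreamName("control")
              controlIn.out.link(cam.inputControl)

              previewOut = pipeline.create(dai.node.XLinkOut)
              previewOut.setStreamName("preview")
              cam.preview.link(previewOut.input)

              with dai.Device(pipeline) as device:
                  controlQueue = device.getInputQueue("control")
                  # maxSize=1 + blocking=False: older frames are dropped, only the newest is kept
                  previewQueue = device.getOutputQueue(name="preview", maxSize=1, blocking=False)

                  start = dai.CameraControl()
                  start.setStartStreaming()
                  controlQueue.send(start)

                  time.sleep(0.1)  # a handful of frames pass through in the meantime

                  stop = dai.CameraControl()
                  stop.setStopStreaming()
                  controlQueue.send(stop)

                  frame = previewQueue.get().getCvFrame()  # only the latest frame survives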

              Thoughts?


                Hello hussain_allawati, how about capturing a still image e.g. 5 times in a row when the Script node starts, and then moving on to the main program? I am not sure how effective start/stop streaming would be with regard to the AAA algos.
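
                A rough sketch of that startup idea, using a Script node to fire a few still captures before the main loop (names are illustrative):

                import depthai as dai

                pipeline = dai.Pipeline()
                cam = pipeline.create(dai.node.ColorCamera)

                # Script node: trigger a few still captures at startup so the
                # autofocus/exposure can settle before the first real capture
                script = pipeline.create(dai.node.Script)
                script.setScript("""
                    import time
                    for i in range(5):
                        ctrl = CameraControl()
                        ctrl.setCaptureStill(True)
                        node.io['control'].send(ctrl)
                        time.sleep(0.3)
                """)
                script.outputs['control'].link(cam.inputControl)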