• DepthAI-v2
  • Capture Image using RPi GPIO, then feed it to a NN in the OAK-1.

Hello everyone,

I am building a system that uses multiple neural networks, where the user decides which network to run.
In this system, image captures will be used rather than a video stream.
The OAK-1 will be used with a Raspberry Pi 4 as the host.

The scenario that I want to implement is as follows:
1) The user presses a button (from RaspberryPi GPIO) to capture an image.
2) Once the image is captured, the user is prompted to choose what inference to run. There are two options: Object detection NN or OCR NN. (Again, the user will choose the desired one through RPi GPIOs)
3) After the user selects a NN, the image is fed to the selected NN and results are printed on screen.
Note that the result should be returned to the user within a short time (1 to 2 seconds max).

Also, the OAK-1 should not be continuously capturing frames. It should only capture a frame once it is asked to do so (to save power).

My questions are:

1) Is it possible to implement such an application using the OAK-1 and RPi? If yes, any tips would be appreciated.
2) When using the STILL property in ColorCamera, how can I keep the camera "idle" while no capture message is sent? (i.e. the OAK should not continuously capture frames; it should only capture one once asked to do so)
3) Must the NN be loaded into the OAK each time the user chooses a NN to use? If so, this would cause a considerable delay.

Your valuable input is much appreciated!
Best,
Hussain


    Hello hussain_allawati!

    1. Yes. You could either load a separate pipeline for each scenario (slower, as you need to restart every time, which takes ~5 sec) or build one larger pipeline that does it all (see the sketch after this list). You would have these nodes:

      • XLinkIn --(still trigger event)--> ColorCamera (still output) --> XLinkOut
      • XLinkIn --(frame)--> ObjectDetection node --> XLinkOut
      • XLinkIn --(frame)--> NN node for OCR --> XLinkOut
        On the host, you would first capture a still image and then send the frame to either the object detection node or the OCR NN node, depending on user input.
    2. If this is a requirement to reduce power consumption, you can use the setStopStreaming()/setStartStreaming() controls. Otherwise, you can simply not connect the other outputs of the ColorCamera (besides still -> XLinkOut) and it won't stream frames anywhere.

    3. If you load separate pipelines, yes; if it's one large pipeline (described in 1), then both the OCR and object detection models get uploaded to the device only once.
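
    A minimal sketch of such a combined pipeline (stream names, node choices, and blob paths below are illustrative, not from this thread):

    import depthai as dai

    pipeline = dai.Pipeline()

    # Still-capture branch: control in --> camera --> still out
    control = pipeline.create(dai.node.XLinkIn)
    control.setStreamName("control")
    cam = pipeline.create(dai.node.ColorCamera)
    control.out.link(cam.inputControl)
    stillOut = pipeline.create(dai.node.XLinkOut)
    stillOut.setStreamName("still")
    cam.still.link(stillOut.input)

    # Object-detection branch: frame in --> detection NN --> results out
    detIn = pipeline.create(dai.node.XLinkIn)
    detIn.setStreamName("detIn")
    detNN = pipeline.create(dai.node.MobileNetDetectionNetwork)
    detNN.setBlobPath("mobilenet-ssd.blob")  # illustrative path
    detIn.out.link(detNN.input)
    detOut = pipeline.create(dai.node.XLinkOut)
    detOut.setStreamName("detOut")
    detNN.out.link(detOut.input)

    # OCR branch: frame in --> generic NN --> results out
    ocrIn = pipeline.create(dai.node.XLinkIn)
    ocrIn.setStreamName("ocrIn")
    ocrNN = pipeline.create(dai.node.NeuralNetwork)
    ocrNN.setBlobPath("text-recognition.blob")  # illustrative path
    ocrIn.out.link(ocrNN.input)
    ocrOut = pipeline.create(dai.node.XLinkOut)
    ocrOut.setStreamName("ocrOut")
    ocrNN.out.link(ocrOut.input)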

    Thanks, Erik

      Erik, thanks a lot for your informative reply.
      I will keep these points in mind while working on the project.

      For (1), do you mean to capture the frame, send it to the host first, and then send it back to the device to either NN? If so, I have tried sending a frame back to the device and then back to the host again to display the inference result; however, the output couldn't be displayed. Instead of an image of shape (300, 300, 3), I got an output of size (270000,). Any thoughts on how to solve this issue?
      The pipeline looks like this. I built it to work around the issue that ImageManip cannot resize STILL images: capture a still image, resize it on the host, and send it back to the device for NN inference.
      xinCaptureCommand --> camRGB --> outStillRGB
      then, after resizing on the host:
      inResizedImage --> outResult

      import cv2
      import imutils
      import depthai as dai

      x = cv2.imread("blue.png")
      pipeline = dai.Pipeline()
      
      # Create input control node to acquire capture command
      xinCaptureCommand = pipeline.create(dai.node.XLinkIn)
      xinCaptureCommand.setStreamName("capture")
      
      # Create Camera node and give its properties
      camRGB = pipeline.create(dai.node.ColorCamera)
      camRGB.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
      camRGB.setStillSize(1080, 1080)
      camRGB.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)
      
      # Create output node for still images
      outStillRGB = pipeline.create(dai.node.XLinkOut)
      outStillRGB.setStreamName("rgbStill")
      
      # Create input node to receive resized image from host
      inResizedImage = pipeline.create(dai.node.XLinkIn)
      inResizedImage.setStreamName("resizedImage")
      
      # Create output node to send the result back to the host
      outResult = pipeline.create(dai.node.XLinkOut)
      outResult.setStreamName("outResult")
      
      # Linking
      xinCaptureCommand.out.link(camRGB.inputControl)
      camRGB.still.link(outStillRGB.input)
      inResizedImage.out.link(outResult.input)
      
      # Connect to device and start the pipeline
      with dai.Device(pipeline) as device:
      
          # Create queues
          stillQueue = device.getOutputQueue(name="rgbStill")
          captureInputQueue = device.getInputQueue("capture")
          sendResizedQueue = device.getInputQueue("resizedImage")
          outResultQ = device.getOutputQueue("outResult")
      
          cv2.imshow("x",x)
      
          while True:
              stillFrame = stillQueue.tryGet()
              if stillFrame is not None:
                  print("still frame:", stillFrame.getHeight(), stillFrame.getWidth(), stillFrame.getType())        
                  frame = stillFrame.getCvFrame()
                  cv2.imshow("frame", frame)
                 
                  resized = imutils.resize(frame, width = 300)
                  print("resized frame:" , resized.shape)
                  sendResized = dai.ImgFrame()
                  sendResized.setData(resized)
                  #sendResized.setData(to_planar(resized, (300, 300)))
                  sendResized.setHeight(300)
                  sendResized.setWidth(300)
                  #sendResized.setType("RGB888i")
                  sendResizedQueue.send(sendResized)
                 
                 
              testFrame = outResultQ.tryGet()
              if testFrame is not None:
                  print("frame W,H & type after receiving back again:", testFrame.getHeight(), testFrame.getWidth(), testFrame.getType())
                  result = testFrame.getCvFrame()
                  cv2.imshow("result", result)
                  print("final result shape:", result.shape)
      
              # Send capture command from host to device
              key = cv2.waitKey(1)
              if key == ord("q"):
                  break
                 
              elif key == ord('c'):
                  ctrl = dai.CameraControl()
                  ctrl.setCaptureStill(True)
                  captureInputQueue.send(ctrl)
                  print("captured")

      For (2), what do you mean by "other outputs"?

      Best,
      Hussain


        Hello hussain_allawati,

        1. What output couldn't be displayed? It doesn't look like you have connected the XLinkIn to a DetectionNetwork / NN node. Also, an ImgFrame assembled on the host needs its frame type set, otherwise getCvFrame() just returns the raw flat buffer; see the sketch below the list.
        2. Other outputs of the ColorCamera node - preview, video, isp.
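
        For reference, a minimal sketch of building such an ImgFrame on the host, with a dummy array standing in for the resized still:

        import numpy as np
        import depthai as dai

        # Dummy 300x300 BGR image standing in for the resized still
        frame = np.zeros((300, 300, 3), dtype=np.uint8)

        msg = dai.ImgFrame()
        msg.setType(dai.ImgFrame.Type.BGR888p)  # planar BGR, so the receiver knows the layout
        msg.setWidth(300)
        msg.setHeight(300)
        msg.setData(frame.transpose(2, 0, 1).flatten())  # HWC -> CHW, then flatten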

          Hey Erik,
          Thank you for your support !
          I was able to do it 😀
          I implemented code that captures an image on a user keypress, sends it to the host for resizing, sends it back to the device for inference, and finally displays the results.

          I am attaching a simplified version of the code here for reference, in case anyone needs it.
          Note that the attached code does NOT contain any NN node. It simply captures an image, resizes it on the host, sends it back to the device (here you can do whatever you want with the resized image), and finally sends the result back to the host.

          #!/usr/bin/env python3
          
          import cv2
          import depthai as dai
          import imutils
          import time
          import numpy as np
          
          xxx = cv2.imread("blue.png")
          
          # Create the pipeline
          pipeline = dai.Pipeline()
          
          
          # HOST --> xinCaptureCommand --> camRGB --> outStillRGB --> HOST
          # Resize image in host
          # HOST --> inResizedImage --> outResizedImage --> HOST
          
          # Create input control node to acquire capture command
          xinCaptureCommand = pipeline.create(dai.node.XLinkIn)
          xinCaptureCommand.setStreamName("capture")
          
          
          # Create Camera node and give its properties
          camRGB = pipeline.create(dai.node.ColorCamera)
          camRGB.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
          camRGB.setStillSize(1080, 1080)
          camRGB.setPreviewSize(1080, 1080)
          camRGB.setVideoSize(1080, 1080)
          # camRGB.setInterleaved(False)
          camRGB.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)
          
          # Create output node for still images
          outStillRGB = pipeline.create(dai.node.XLinkOut)
          outStillRGB.setStreamName("rgbStill")
          
          
          # Create input node to receive resized image from host
          inResizedImage = pipeline.create(dai.node.XLinkIn)
          inResizedImage.setStreamName("inResizedImage")
          
          # Create output node to send the resized image back to the host
          outResizedImage = pipeline.create(dai.node.XLinkOut)
          outResizedImage.setStreamName("outResizedImage")
          
          # Link output of xinCaptureCommand to camera input control
          xinCaptureCommand.out.link(camRGB.inputControl)
          
          # Link output of camera to input of XLinkOut to send the still to the host
          camRGB.still.link(outStillRGB.input)
          
          # Link output of inResizedImage to input of outResizedImage
          inResizedImage.out.link(outResizedImage.input)
          
          
          
          # Connect to device and start the pipeline
          with dai.Device(pipeline) as device:
          
          
              # Create input queue to the device that receives the capture command
              captureInputQueue = device.getInputQueue("capture")
              
              
              # Create output queue that will get RGB frame (Output from device, and input to host)
              stillQueue = device.getOutputQueue(name="rgbStill")
          
          
              # Create input queue to the device that receives the resized image
              sendResizedQueue = device.getInputQueue("inResizedImage")
              
              # Create output queue that will send output images to the host
              outResizedImageQueue = device.getOutputQueue("outResizedImage")
          
              cv2.imshow("xxx",xxx)
              
              
              def to_planar(arr: np.ndarray, shape: tuple) -> np.ndarray:
                  return cv2.resize(arr, shape).transpose(2,0,1).flatten()
                  
                  
              while True:
                    
                  stillFrame = stillQueue.tryGet()
                  if stillFrame is not None:
                      print("Captured!")
                      print(stillFrame.getHeight(), stillFrame.getWidth(), stillFrame.getType())        
                      frame = stillFrame.getCvFrame()
                      cv2.imshow("frame", frame)
                      
                      resized = imutils.resize(frame, width = 300)
                      print(resized.shape)
                      
                      # Create message and send resized frame
                      sendResized = dai.ImgFrame()
                      sendResized.setData(to_planar(resized, (300, 300)))            
                      sendResized.setHeight(300)
                      sendResized.setWidth(300)
                      sendResized.setType(dai.ImgFrame.Type.BGR888p)
                      sendResizedQueue.send(sendResized)
                      
                      
          
          
                  finalFrame = outResizedImageQueue.tryGet()
                  if finalFrame is not None:
                      print(finalFrame.getHeight(), finalFrame.getWidth(), finalFrame.getType())
                      f = finalFrame.getCvFrame()
                      cv2.imshow("finalFrame", f)
                      print(f.shape)
                  
          
                  # Send capture command from host to device
                  key = cv2.waitKey(1)
                  if key == ord("q"):
                      break
                      
                  elif key == ord('c'):
                      ctrl = dai.CameraControl()
                      ctrl.setCaptureStill(True)
                      captureInputQueue.send(ctrl)
          7 days later

          Currently, I have an issue with focusing. I am using the ColorCamera STILL output.
          Each time the user presses a button, a capture command is issued.
          The problem is that for the first 3-4 button presses, the image is out of focus;
          the 5th or 6th press yields an image that is in focus.
          It seems that the autofocus needs several frames to adjust itself.

          1) Any thoughts on how to solve this issue? Is there any property in the ColorCamera node that resolves it?

          2) One solution that came to my mind is to capture several frames (say 6), and do the inference and processing on the 6th frame only (i.e. the first 5 frames would be used for focusing, and the 6th for inference and processing). Do you think this makes sense and would work in the DepthAI environment?

          Best,
          Hussain


            Hello hussain_allawati,
            I believe that's because the AAA algorithms don't kick in until you capture a still frame. One workaround would be to have a Script node that continuously reads e.g. preview frames and just discards them, so the AAA algorithms (auto focus/white balance/exposure) would always run.

            The Script node could be just this:

            while True:
                node.io['preview'].get()
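
            Wiring it up might look like this (a sketch; node and stream names are illustrative):

            import depthai as dai

            pipeline = dai.Pipeline()
            cam = pipeline.create(dai.node.ColorCamera)

            # Script node that consumes preview frames so the 3A algorithms keep running
            script = pipeline.create(dai.node.Script)
            script.setScript("""
                while True:
                    node.io['preview'].get()  # read and discard the frame
            """)
            cam.preview.link(script.inputs['preview'])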

            Thoughts?
            Thanks, Erik

              This should work; however, I think it won't be efficient, as my system is battery-powered.

              How about using the PREVIEW output instead of STILL, but with the setStopStreaming()/setStartStreaming() configurations?
              Once the user presses the button, streaming is started; after 0.1 seconds, a stop-streaming message is sent. At 60 fps, about 6 frames would be sent to the queue. The part I am still missing is how to let the host process only the "latest" frame, i.e. the 6th one. A minimal sketch of what I have in mind is below.
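
              A minimal sketch of what I have in mind (stream names are illustrative; note the camera streams by default, so an initial stop command may be needed in practice):

              import time
              import depthai as dai

              pipeline = dai.Pipeline()

              cam = pipeline.create(dai.node.ColorCamera)
              cam.setPreviewSize(300, 300)

              controlIn = pipeline.create(dai.node.XLinkIn)
              controlIn.setStreamName("control")
              controlIn.out.link(cam.inputControl)

              previewOut = pipeline.create(dai.node.XLinkOut)
              previewOut.setStreamName("preview")
              cam.preview.link(previewOut.input)

              with dai.Device(pipeline) as device:
                  controlQueue = device.getInputQueue("control")
                  # maxSize=1 + blocking=False: older frames are dropped, only the newest is kept
                  previewQueue = device.getOutputQueue(name="preview", maxSize=1, blocking=False)

                  start = dai.CameraControl()
                  start.setStartStreaming()
                  controlQueue.send(start)

                  time.sleep(0.1)  # a handful of frames pass through in the meantime

                  stop = dai.CameraControl()
                  stop.setStopStreaming()
                  controlQueue.send(stop)

                  frame = previewQueue.get().getCvFrame()  # only the latest frame survives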

              Thoughts?


                Hello hussain_allawati, how about capturing a still image e.g. 5 times in a row when the Script node starts, and then moving on to the main program? I am not sure how effective start/stop streaming would be with regard to the AAA algos.
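
                A rough sketch of that startup idea, using a Script node to fire a few still captures before the main loop (names are illustrative):

                import depthai as dai

                pipeline = dai.Pipeline()
                cam = pipeline.create(dai.node.ColorCamera)

                # Script node: trigger a few still captures at startup so the
                # autofocus/exposure can settle before the first real capture
                script = pipeline.create(dai.node.Script)
                script.setScript("""
                    import time
                    for i in range(5):
                        ctrl = CameraControl()
                        ctrl.setCaptureStill(True)
                        node.io['control'].send(ctrl)
                        time.sleep(0.3)
                """)
                script.outputs['control'].link(cam.inputControl)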