• DepthAI-v2
  • VideoEncoder + ImageManip warpMesh very slow

Hello,

I am trying to unwarp the images using ImageManip and then output an H.264-encoded stream. Doing just the unwarping or just the encoding works fine, but running both at the same time results in low FPS and high latency.

Am I doing something wrong, or am I hitting a hardware limitation?

Thanks!

Below is my code (this is actually this example; I just added the ImageManip node).

...

# Create pipeline
pipeline = dai.Pipeline()

# Define sources and output
camRgb = pipeline.create(dai.node.ColorCamera)
videoEnc = pipeline.create(dai.node.VideoEncoder)
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("enc")

# Properties
camRgb.setBoardSocket(dai.CameraBoardSocket.RGB)
camRgb.setResolution(res_opts[args.resolution])
camRgb.setFps(args.fps)
camRgb.setImageOrientation(dai.CameraImageOrientation.ROTATE_180_DEG)
videoEnc.setDefaultProfilePreset(camRgb.getFps(), enc_opts[args.encode])


manip = pipeline.createImageManip()
mesh, meshWidth, meshHeight = pickle.load(open("mesh.pckl", 'rb'))
manip.setWarpMesh(mesh, meshWidth, meshHeight)
manip.setMaxOutputFrameSize(
    1920 * 1080 * 3  # worst-case buffer: one interleaved RGB frame at 1080p
)

# Linking
camRgb.video.link(manip.inputImage)
manip.out.link(videoEnc.input)
videoEnc.bitstream.link(xout.input)

...

    Hi apirrone
    Run your code with DEPTHAI_LEVEL=DEBUG python3 <script.py> to check resource usage (CMX slices and SHAVE cores).

    Thanks,
    Jaka

    Without ImageManip: [debug output screenshot]

    With ImageManip: [debug output screenshot]

    It seems to be a little more CPU-intensive with the unwarping, but it still looks quite reasonable.

    What could cause this, then?

    Thanks!

      Hi apirrone
      Look at the CMX slice and SHAVE allocation.

      I think they might both be using the same slices, which would cause problems.

      Thanks,
      Jaka

        hi @jakaskerl,

        I am afraid I don't really know what this means. What can I do to make them use different slices?

        Thanks a lot,

        Antoine

        Ah, I think I understand what you mean. Is this what I should look at?

        jakaskerl

        I think they might be both using the same slices which would cause problems.

        What can I do about it?

        Thanks!

        Antoine

          Hi apirrone
          I'll look into it locally. Could you paste the whole code you are using (if it's just one file) so I don't have to recreate it manually?

          Thanks,
          Jaka

          Here is the full script:

          #!/usr/bin/env python3
          
          import depthai as dai
          import subprocess as sp
          from os import name as osName
          import argparse
          import sys
          import pickle
          
          parser = argparse.ArgumentParser()
          parser.add_argument(
              "-u",
              "--unwrap",
              default=False,
              action="store_true",
              help="Activate ImageManip unwarping",
          )
          parser.add_argument(
              "-v",
              "--verbose",
              default=False,
              action="store_true",
              help="Prints latency for the encoded frame data to reach the app",
          )
          args = parser.parse_args()
          
          # Create pipeline
          pipeline = dai.Pipeline()
          
          # Define sources and output
          camRgb = pipeline.create(dai.node.ColorCamera)
          videoEnc = pipeline.create(dai.node.VideoEncoder)
          xout = pipeline.create(dai.node.XLinkOut)
          xout.setStreamName("enc")
          
          # Properties
          camRgb.setBoardSocket(dai.CameraBoardSocket.RGB)
          camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
          camRgb.setFps(30)
          camRgb.setImageOrientation(dai.CameraImageOrientation.ROTATE_180_DEG)
          videoEnc.setDefaultProfilePreset(
              camRgb.getFps(), dai.VideoEncoderProperties.Profile.H264_MAIN
          )
          
          # Unwarping imagemanip
          manip = pipeline.createImageManip()
          mesh, meshWidth, meshHeight = pickle.load(open("mesh.pckl", "rb"))
          manip.setWarpMesh(mesh, meshWidth, meshHeight)
          manip.setMaxOutputFrameSize(1920 * 1080 * 3)
          
          # Linking
          if args.unwrap:
              camRgb.video.link(manip.inputImage)
              manip.out.link(videoEnc.input)
              videoEnc.bitstream.link(xout.input)
          else:
              camRgb.video.link(videoEnc.input)
              videoEnc.bitstream.link(xout.input)
          
          
          width, height = 720, 500
          command = [
              "ffplay",
              "-i",
              "-",
              "-x",
              str(width),
              "-y",
              str(height),
              "-framerate",
              "60",
              "-fflags",
              "nobuffer",
              "-flags",
              "low_delay",
              "-framedrop",
              "-strict",
              "experimental",
          ]
          
          if osName == "nt":  # Running on Windows
              command = ["cmd", "/c"] + command
          
          try:
              proc = sp.Popen(command, stdin=sp.PIPE)  # Start the ffplay process
          except Exception:
              exit("Error: cannot run ffplay!\nTry running: sudo apt install ffmpeg")
          
          # Connect to device and start pipeline
          with dai.Device(pipeline) as device:
              # Output queue will be used to get the encoded data from the output defined above
              q = device.getOutputQueue(name="enc", maxSize=30, blocking=True)
          
              try:
                  while True:
                      pkt = q.get()  # Blocking call, will wait until new data has arrived
                      data = pkt.getData()
                      if args.verbose:
                          latms = (dai.Clock.now() - pkt.getTimestamp()).total_seconds() * 1000
                          # Writing to a different channel (stderr)
                          # Also `ffplay` is printing things, adding a separator
                          print(f"Latency: {latms:.3f} ms === ", file=sys.stderr)
                      proc.stdin.write(data)
              except:
                  # Bare except: exit cleanly on Ctrl+C or when ffplay closes the pipe
                  pass
          
              proc.stdin.close()

          You will also need "mesh.pckl"
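
          For reference, mesh.pckl is just a pickled (mesh, meshWidth, meshHeight) tuple, where mesh is a flat list of (x, y) points. Something like the sketch below writes an identity mesh in the same format (the 9x9 grid over a 1920x1080 frame is a placeholder; my real file holds the undistortion mapping):

          #!/usr/bin/env python3
          # Sketch: write an identity warp mesh as (mesh, meshWidth, meshHeight).
          # The 9x9 grid over a 1920x1080 frame is an arbitrary placeholder.
          import pickle

          meshWidth, meshHeight = 9, 9
          stepX = 1920 / (meshWidth - 1)
          stepY = 1080 / (meshHeight - 1)

          mesh = []
          for y in range(meshHeight):
              for x in range(meshWidth):
                  # Identity mapping; a real mesh applies the undistortion here
                  mesh.append((x * stepX, y * stepY))

          with open("mesh.pckl", "wb") as f:
              pickle.dump((mesh, meshWidth, meshHeight), f)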

          Thanks!

            Hi apirrone
            I tested the warp manip alone, and I am getting basically the same result as with manip + encoder. This means the warp itself is too resource-intensive. Lowering the input resolution considerably improved the latency.

            EDIT: Setting the preview to interleaved = false further improved the performance of the manip node.
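
            For reference, both of those are ColorCamera properties, along these lines (the 1/2 ISP scale is just an example value):

            # Planar (non-interleaved) frames on the preview output
            camRgb.setInterleaved(False)
            # Downscale the ISP output (e.g. 1080p -> 960x540) to lighten the warp
            camRgb.setIspScale(1, 2)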

            Thoughts?
            Jaka

              jakaskerl

              I was pretty sure I had tested the warp manip alone at some point and it was fine. I actually had, but using isp instead of video when linking the camera to the manip node.

              # Fast (no more latency than without manip)
              camRgb.isp.link(manip.inputImage)
              manip.out.link(xout.input)
              
              # Slow
              camRgb.video.link(manip.inputImage)
              manip.out.link(xout.input)

              But I can't do

              camRgb.isp.link(manip.inputImage) # instead of camRgb.video.link(manip.inputImage)
              manip.out.link(videoEnc.input)
              videoEnc.bitstream.link(xout.input)

              I get

              [194430109153DF1200] [1.2] [23.699] [VideoEncoder(1)] [warning] Arrived frame type (2) is not either NV12 or YUV400p (8-bit Gray)

              I don't exactly know what's going on, but I guess it has something to do with the image format (the warning says the encoder only accepts NV12 or YUV400p frames).

              Is there a way to do a magic image-type conversion that would let me get good unwarp performance and video encoding at the same time? Or another way?

              Thanks!

              Antoine

              EDIT: OK, so adding:

              manip.initialConfig.setFrameType(dai.ImgFrame.Type.NV12)

              Seems to make everything work fine 😃

              So I have the following:

              # Unwarping imagemanip
              manip = pipeline.createImageManip()
              mesh, meshWidth, meshHeight = pickle.load(open("mesh.pckl", "rb"))
              manip.setWarpMesh(mesh, meshWidth, meshHeight)
              manip.setMaxOutputFrameSize(1920 * 1080 * 3)
              manip.initialConfig.setFrameType(dai.ImgFrame.Type.NV12)
              
              # Linking
              
              camRgb.isp.link(manip.inputImage)
              manip.out.link(videoEnc.input)
              videoEnc.bitstream.link(xout.input)

              And I get pretty much the same performance as without the ImageManip (a tiny bit more latency).

              Without ImageManip: [debug output screenshot]

              With ImageManip: [debug output screenshot]

              Thank you very much for your help!

              Antoine