Hi chandrian,
I assume you are using something similar to Lossless Zooming. So first you would want to rotate the frame 90deg (so people are upright), do the face detection, crop the original (rotated) 4K image to 1080x1920 (as in the lossless zooming example), then rotate that back to 1920x1080 (1080P), which you can feed into the UVC node. Thoughts?
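Very roughly, that ordering could look like this (a sketch only: node names, the static placeholder crop and the exact configs are assumptions, the detection-driven crop from the Lossless Zooming example is left out, it assumes a depthai build where the UVC node is available, and whether ImageManip handles full 4K rotation within its memory limits would need checking):

        import depthai as dai
        import blobconverter

        pipeline = dai.Pipeline()

        cam = pipeline.create(dai.node.ColorCamera)
        cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_4_K)

        # 1) Rotate the 3840x2160 isp frame by 90deg so faces are upright -> 2160x3840
        rotate = pipeline.create(dai.node.ImageManip)
        rr = dai.RotatedRect()
        rr.center.x, rr.center.y = 3840 // 2, 2160 // 2
        rr.size.width, rr.size.height = 2160, 3840
        rr.angle = 90
        rotate.initialConfig.setCropRotatedRect(rr, False)
        rotate.setMaxOutputFrameSize(3840 * 2160 * 3 // 2)  # YUV420 4K frame
        cam.isp.link(rotate.inputImage)

        # 2) Face detection on a resized copy of the rotated (upright) frame
        nn_in = pipeline.create(dai.node.ImageManip)
        nn_in.initialConfig.setResize(300, 300)
        nn_in.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
        rotate.out.link(nn_in.inputImage)

        nn = pipeline.create(dai.node.MobileNetDetectionNetwork)
        nn.setBlobPath(blobconverter.from_zoo(name="face-detection-retail-0004", shaves=3))
        nn.setConfidenceThreshold(0.5)
        nn_in.out.link(nn.input)

        # 3) Crop the rotated frame to 1080x1920. Static center crop here as a
        #    placeholder; in the Lossless Zooming example a Script node moves this
        #    crop based on nn.out detections.
        crop = pipeline.create(dai.node.ImageManip)
        crop.initialConfig.setCropRect(0.25, 0.25, 0.75, 0.75)
        crop.setMaxOutputFrameSize(1080 * 1920 * 3 // 2)
        rotate.out.link(crop.inputImage)

        # 4) Rotate the 1080x1920 crop back by -90deg -> 1920x1080 for the UVC node
        rotate_back = pipeline.create(dai.node.ImageManip)
        rr2 = dai.RotatedRect()
        rr2.center.x, rr2.center.y = 1080 // 2, 1920 // 2
        rr2.size.width, rr2.size.height = 1920, 1080
        rr2.angle = -90
        rotate_back.initialConfig.setCropRotatedRect(rr2, False)
        rotate_back.setMaxOutputFrameSize(1920 * 1080 * 3 // 2)
        crop.out.link(rotate_back.inputImage)

        # UVC node expects 1920x1080 NV12 frames (a frame-type conversion may be
        # needed depending on the depthai version)
        uvc = pipeline.create(dai.node.UVC)
        rotate_back.out.link(uvc.input)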
Thanks, Erik

Ok, so this wouldn't be in the script then. I realize the script is mostly for changing the pipeline anyway. Yes, that sounds like a plan to me. I will attempt it and let you know. Thanks!!

I will probably need to remove this before the rotate then: cam.setVideoSize(1920, 1080)

  • erik replied to this.

    Hi chandrian, by default you will want to rotate the images by 90deg. So you will likely want 4K, then rotate it by 90deg, then do inference, then crop, then rotate back by -90deg to get to 1920x1080.

    Ok, thanks! Is all of this happening before the script node? Or is that unnecessary?

    And how does the script node work in terms of code path? I see a "while True" in the script with no breaks and a "while True" after the script. Do they run in parallel?

    I tried keeping the same dimensions as my working code and just flipping twice, but I did not get an output stream. Then I tried a zero-degree turn twice and still no stream. Am I messing something up here:

            manipRgb = pipeline.createImageManip()
            rgbRr = dai.RotatedRect()
            rgbRr.center.x, rgbRr.center.y = cam.getPreviewWidth() // 2, cam.getPreviewHeight() // 2
            rgbRr.size.width, rgbRr.size.height = cam.getPreviewHeight(), cam.getPreviewWidth()
            rgbRr.angle = 0
            manipRgb.initialConfig.setCropRotatedRect(rgbRr, False)
            cam.preview.link(manipRgb.inputImage)
    
            manipRgb2 = pipeline.createImageManip()
            manipRgb2.initialConfig.setCropRotatedRect(rgbRr, False)
            manipRgb.out.link(manipRgb2.inputImage)
    
            # Create an UVC (USB Video Class) output node. It needs 1920x1080, NV12 input
            uvc = pipeline.createUVC()
            manipRgb2.out.link(uvc.input)

    I actually can't get cam.video to go through any manipulation node and into the UVC.

    I tried passing cam.video into the manip node and into the UVC. Then I tried setting the preview to 1920x1080 (is that a possible size?) and feeding that into the manip node and into the UVC, and I still could not get that working either.

    • erik replied to this.

      Hi chandrian,
      With the new depthai you can also use cam.video with ImageManip. I believe we plan to update the depthai uvc branch to the latest, so you will be able to achieve this. Regarding the issue, please submit the full MRE.
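      For reference, the linking could look roughly like this (a sketch; it assumes a build where the UVC node is available and that ImageManip can output the 1920x1080 NV12 frames the UVC node expects):

      import depthai as dai

      pipeline = dai.Pipeline()

      cam = pipeline.create(dai.node.ColorCamera)
      cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
      cam.setVideoSize(1920, 1080)

      manip = pipeline.create(dai.node.ImageManip)
      manip.initialConfig.setFrameType(dai.ImgFrame.Type.NV12)  # UVC needs NV12
      manip.setMaxOutputFrameSize(1920 * 1080 * 3 // 2)
      cam.video.link(manip.inputImage)

      uvc = pipeline.create(dai.node.UVC)
      manip.out.link(uvc.input)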
      Thanks, Erik

      Ok, I will try to submit that. I have a deadline soon, so I am not sure that will be done in time. Do you think it would be possible to rotate the facial recognition input so that, if the camera is rotated 90deg, it will still recognize faces? I will try that today, but no luck so far. Actually, I think it's working now... more details to come.
      Thanks,
      Aaron

      edit:
      Facial recognition seems to be working (blue square coming up) but not tracking at this moment.
      edit2:
      I think the blue squares were the Windows camera app tracking the face, not the DepthAI.

      I am not having much success with rotating the input to the facial recognition. Do you think this is possible? If not, do you have another suggestion?

      • erik replied to this.
        7 days later

        Thanks for the reply, Erik.

        I have been trying to get this to work with no luck. It seems that when I add multiple nodes, it does not function well. I am just feeding the isp output into a MobileNet.

        I tried just duplicating the resize ImageManip above to prove functionality, since I got the single resize working, but I cannot get it to pass to the MobileNet and function.

        This is using the UVC demo.

        Thanks!


        Here's the code:

        #!/usr/bin/env python3
        
        import cv2
        import depthai as dai
        import blobconverter
        
        # Create pipeline
        pipeline = dai.Pipeline()
        
        # Define source and output
        camRgb = pipeline.create(dai.node.ColorCamera)
        xoutVideo = pipeline.create(dai.node.XLinkOut)
        
        xoutVideo.setStreamName("video")
        
        # Properties
        camRgb.setBoardSocket(dai.CameraBoardSocket.RGB)
        camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
        camRgb.setVideoSize(1920, 1080)
        
        xoutVideo.input.setBlocking(False)
        xoutVideo.input.setQueueSize(1)
        
        # Create MobileNet detection network
        mobilenet = pipeline.create(dai.node.MobileNetDetectionNetwork)
        mobilenet.setBlobPath(
            blobconverter.from_zoo(name="face-detection-retail-0004", shaves=3)
        )
        mobilenet.setConfidenceThreshold(0.7)
        
        # manipRgb = pipeline.createImageManip()
        # rgbRr = dai.RotatedRect()
        # rgbRr.center.x, rgbRr.center.y = camRgb.getPreviewWidth() // 2, camRgb.getPreviewHeight() // 2
        # rgbRr.size.width, rgbRr.size.height = camRgb.getPreviewHeight(), camRgb.getPreviewWidth()
        # rgbRr.angle = 0
        # manipRgb.initialConfig.setCropRotatedRect(rgbRr, False)
        #
        #
        # camRgb.isp.link(manipRgb.inputImage)
        # manipRgb.out.link(mobilenet.input)
        
        
        
        crop_manip2 = pipeline.create(dai.node.ImageManip)
        crop_manip2.initialConfig.setResize(300, 300)
        crop_manip2.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
        camRgb.isp.link(crop_manip2.inputImage)
        #crop_manip2.out.link(mobilenet.input)
        
        
        crop_manip = pipeline.create(dai.node.ImageManip)
        crop_manip.initialConfig.setResize(300, 300)
        crop_manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
        crop_manip.out.link(crop_manip2.inputImage)
        
        # camRgb.isp.link(crop_manip.inputImage)
        # crop_manip2.out.link(crop_manip.inputImage)
        crop_manip.out.link(mobilenet.input)
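        # Note on the wiring above: the camRgb.isp -> crop_manip link is commented
        # out, so crop_manip has no input at all, while crop_manip2.inputImage is
        # linked from both camRgb.isp (above) and crop_manip.out.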
        
        
        
        
        # Script node
        script = pipeline.create(dai.node.Script)
        mobilenet.out.link(script.inputs["dets"])
        script.outputs["cam_cfg"].link(camRgb.inputConfig)
        script.outputs["cam_ctrl"].link(camRgb.inputControl)
        script.setScript(
            """
            ORIGINAL_SIZE = (5312, 6000) # 48MP with size constraints described on IMX582 luxonis page
            SCENE_SIZE = (1920, 1080) # 1080P
            x_arr = []
            y_arr = []
            AVG_MAX_NUM=7
            limits = [0, 0] # xmin and ymin limits
            limits.append((ORIGINAL_SIZE[0] - SCENE_SIZE[0]) / ORIGINAL_SIZE[0]) # xmax limit
            limits.append((ORIGINAL_SIZE[1] - SCENE_SIZE[1]) / ORIGINAL_SIZE[1]) # ymax limit
            cfg = ImageManipConfig()
            ctrl = CameraControl()
            def average_filter(x, y):
                x_arr.append(x)
                y_arr.append(y)
                if AVG_MAX_NUM < len(x_arr): x_arr.pop(0)
                if AVG_MAX_NUM < len(y_arr): y_arr.pop(0)
                x_avg = 0
                y_avg = 0
                for i in range(len(x_arr)):
                    x_avg += x_arr[i]
                    y_avg += y_arr[i]
                x_avg = x_avg / len(x_arr)
                y_avg = y_avg / len(y_arr)
                if x_avg < limits[0]: x_avg = limits[0]
                if y_avg < limits[1]: y_avg = limits[1]
                if limits[2] < x_avg: x_avg = limits[2]
                if limits[3] < y_avg: y_avg = limits[3]
                return x_avg, y_avg
            while True:
            
        
                dets = node.io['dets'].get().detections
                if len(dets) == 0: continue
                coords = dets[0] # take first
                width = (coords.xmax - coords.xmin) * ORIGINAL_SIZE[0]
                height = (coords.ymax - coords.ymin) * ORIGINAL_SIZE[1]
                x_pixel = int(max(0, coords.xmin * ORIGINAL_SIZE[0]))
                y_pixel = int(max(0, coords.ymin * ORIGINAL_SIZE[1]))
                # ctrl.setAutoFocusRegion(x_pixel, y_pixel, int(width), int(height))
                # ctrl.setAutoExposureRegion(x_pixel, y_pixel, int(width), int(height))
                # Get detection center
                x = (coords.xmin + coords.xmax) / 2
                y = (coords.ymin + coords.ymax) / 2
                x -= SCENE_SIZE[0] / ORIGINAL_SIZE[0] / 2
                y -= SCENE_SIZE[1] / ORIGINAL_SIZE[1] / 2
                # node.warn(f"{x=} {y=}")
                x_avg, y_avg = average_filter(x,y)
                # node.warn(f"{x_avg=} {y_avg=}")
                cfg.setCropRect(x_avg, y_avg, 0, 0)
                node.io['cam_cfg'].send(cfg)
                node.io['cam_ctrl'].send(ctrl)
            """
        )
        
        # Linking
        camRgb.video.link(xoutVideo.input)
        
        # Connect to device and start pipeline
        with dai.Device(pipeline) as device:
        
            video = device.getOutputQueue(name="video", maxSize=1, blocking=False)
        
            while True:
                videoIn = video.get()
                print("Done in seconds")
        
        
                # Get BGR frame from NV12 encoded video frame to show with opencv
                # Visualizing the frame on slower hosts might have overhead
                cv2.imshow("video", videoIn.getCvFrame())
        
                if cv2.waitKey(1) == ord('q'):
                    break
        • erik replied to this.

          Hi Erik,

          What do you mean when you say that? Do I post those things here in the forum post? Or do I submit something like a git issue?

          • erik replied to this.

            Hi chandrian,
            You can submit it here, just make it minimal, as the code above isn't.
            Thanks, Erik

            Ah ok, thanks. I tried to keep it minimal with the images above, but maybe that was too much? Basically I fed the camera.isp output through two image crops to see if it would work and I got an error. Both image crops worked independently, but feeding through both nodes gave me an error.

            What should the general approach be? This is the camera source flowing through two nodes. The crop node works fine, but I am not seeing the face recognition working when I add the rotation:

            # Define source and output
            camRgb = pipeline.create(dai.node.ColorCamera)
            xoutVideo = pipeline.create(dai.node.XLinkOut)
            
            xoutVideo.setStreamName("video")
            
            # Properties
            camRgb.setBoardSocket(dai.CameraBoardSocket.RGB)
            camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
            camRgb.setVideoSize(1920, 1080)
            camRgb.setPreviewSize(300, 300)
            
            xoutVideo.input.setBlocking(False)
            xoutVideo.input.setQueueSize(1)
            
            # Create MobileNet detection network
            mobilenet = pipeline.create(dai.node.MobileNetDetectionNetwork)
            mobilenet.setBlobPath(
                blobconverter.from_zoo(name="face-detection-retail-0004", shaves=3)
            )
            mobilenet.setConfidenceThreshold(0.7)
            
            
            crop_manip = pipeline.createImageManip()
            rgbRr = dai.RotatedRect()
            rgbRr.center.x, rgbRr.center.y = camRgb.getPreviewWidth() // 2, camRgb.getPreviewHeight() // 2
            rgbRr.size.width, rgbRr.size.height = camRgb.getPreviewHeight(), camRgb.getPreviewWidth()
            rgbRr.angle = 90
            crop_manip.initialConfig.setCropRotatedRect(rgbRr, False)
            camRgb.isp.link(crop_manip.inputImage)
            
            crop_manip2 = pipeline.create(dai.node.ImageManip)
            crop_manip2.initialConfig.setResize(300, 300)
            crop_manip2.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
            crop_manip.out.link(crop_manip2.inputImage)
            
            crop_manip2.out.link(mobilenet.input)
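            # Note: the rotated rect above is sized from the 300x300 preview
            # (getPreviewWidth/getPreviewHeight), while crop_manip's input is the
            # full 1920x1080 isp frame.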
            • erik replied to this.

              Hi chandrian, I believe NNs should be run after rotating the frames (90deg), as otherwise you would be running inference on rotated (-90deg) frames, which will have much worse performance.
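              Something along these lines, continuing from the camRgb and mobilenet definitions in your last snippet (a sketch only; the rotated-rect values are pixel coordinates of the 1920x1080 isp frame and are untested assumptions):

              # Rotate the 1920x1080 isp frame by 90deg first -> 1080x1920, faces upright
              rotate = pipeline.create(dai.node.ImageManip)
              rr = dai.RotatedRect()
              rr.center.x, rr.center.y = 1920 // 2, 1080 // 2
              rr.size.width, rr.size.height = 1080, 1920
              rr.angle = 90
              rotate.initialConfig.setCropRotatedRect(rr, False)
              rotate.setMaxOutputFrameSize(1920 * 1080 * 3 // 2)
              camRgb.isp.link(rotate.inputImage)

              # Then resize/convert the upright frames for the NN and run inference
              nn_in = pipeline.create(dai.node.ImageManip)
              nn_in.initialConfig.setResize(300, 300)
              nn_in.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
              rotate.out.link(nn_in.inputImage)
              nn_in.out.link(mobilenet.input)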