Hi erik, Brandon,

Thanks for your reply...

I've used the same code from your example, as below:

	camRgb = pipeline.createColorCamera()
	camRgb.setPreviewSize(640, 400)
	camRgb.setResolution(depthai.ColorCameraProperties.SensorResolution.THE_1080_P)
	camRgb.setInterleaved(False)

	manipRgb = pipeline.createImageManip()
	rgbRr = depthai.RotatedRect()
	rgbRr.center.x, rgbRr.center.y = camRgb.getPreviewWidth() // 2, camRgb.getPreviewHeight() // 2
	rgbRr.size.width, rgbRr.size.height = camRgb.getPreviewHeight(), camRgb.getPreviewWidth()
	rgbRr.angle = 90
	manipRgb.initialConfig.setCropRotatedRect(rgbRr, False)
	camRgb.preview.link(manipRgb.inputImage)

	cropManip = pipeline.createImageManip()
	cropManip.initialConfig.setResize(300, 300)
	manipRgb.out.link(cropManip.inputImage)

	manipRgbOut = pipeline.createXLinkOut()
	manipRgbOut.setStreamName("cam_out")
	cropManip.out.link(manipRgbOut.input)

I still get the same error, as below:
[14442C1021FB92CD00] [140.524] [NeuralNetwork(4)] [warning] Input image (640x400) does not match NN (300x300)

I do get the output frame, but without inference results...

I really need help understanding what I am doing wrong.
Alternatively, could you please provide an updated gen2-fatigue-detection script with a rotate option, using the code suggested for rotation?

I am using this as an example... I want to run inference with the camera placed horizontally.

This is kind of important, thanks in advance for your time and help.

Thanks & Best Regards,
Ram


    ramkunchur yes, that's the correct code. I have created another demo that links 300x300 rotated frames to mobilenet. You will need to place this script into depthai-python/examples, as it requires the mobilenet blob. Unfortunately, I don't have time to update the script you mentioned, but I am sure you will be able to update it yourself with the help of the demo script I have just created - it should be straightforward.
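
    The gist of it is roughly the following (just a sketch, not the exact demo script - the blob path is a placeholder, point it at wherever your mobilenet blob lives):

    import depthai as dai

    pipeline = dai.Pipeline()

    # 300x300 planar BGR preview - what mobilenet-ssd expects
    camRgb = pipeline.createColorCamera()
    camRgb.setPreviewSize(300, 300)
    camRgb.setInterleaved(False)

    # Rotate the square preview by 90 degrees on-device
    manipRgb = pipeline.createImageManip()
    rgbRr = dai.RotatedRect()
    rgbRr.center.x, rgbRr.center.y = camRgb.getPreviewWidth() // 2, camRgb.getPreviewHeight() // 2
    rgbRr.size.width, rgbRr.size.height = camRgb.getPreviewHeight(), camRgb.getPreviewWidth()
    rgbRr.angle = 90
    manipRgb.initialConfig.setCropRotatedRect(rgbRr, False)
    camRgb.preview.link(manipRgb.inputImage)

    # Feed the rotated 300x300 frames straight into the detection network
    nn = pipeline.create(dai.node.MobileNetDetectionNetwork)
    nn.setBlobPath("mobilenet-ssd.blob")  # placeholder path - use your local mobilenet blob
    nn.setConfidenceThreshold(0.5)
    manipRgb.out.link(nn.input)

    # Send the rotated frames and the detections to the host
    xoutRgb = pipeline.createXLinkOut()
    xoutRgb.setStreamName("rgb")
    manipRgb.out.link(xoutRgb.input)

    xoutNn = pipeline.createXLinkOut()
    xoutNn.setStreamName("nn")
    nn.out.link(xoutNn.input)
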
    Thanks, Erik

      Hi erik ...

      Thanks, I'm able to get it right this time.

      However, my full-screen mode doesn't work with this, probably because it needs the output resolution to be in multiples of 16...

      I'm not sure how to resolve this; having full-screen output would have been nice.

      Thanks so much for your time and help... 🙂

      Thanks & Best Regards,
      Ram


        ramkunchur You could just use the cv2.resize() function to upscale the 300x300 frame to the desired size. You could also stream the 1080P video output to the host and display detections on the video frames - not on the 300x300 preview frame. So something similar to this example.
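
        For example, something like this on the host (just a sketch - it assumes the "cam_out" stream from your snippet above and an arbitrary 1080x1080 target size):

        import cv2
        import depthai as dai

        with dai.Device(pipeline) as device:  # pipeline from your snippet above
            qRgb = device.getOutputQueue("cam_out", maxSize=4, blocking=False)
            while True:
                frame = qRgb.get().getCvFrame()        # 300x300 BGR frame from the device
                big = cv2.resize(frame, (1080, 1080))  # upscale to whatever size your display needs
                cv2.imshow("rgb", big)
                if cv2.waitKey(1) == ord("q"):
                    break
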
        Thanks, Erik

        a year later

        Hello All,

        I am trying to rotate my camera, but I am confused by the links and syntax of this API, and I need this done very soon for production. Here is my code:

        def get_pipeline():
            pipeline = dai.Pipeline()
        
            # # Define a source - color camera
            cam = pipeline.createColorCamera()
            cam.setBoardSocket(dai.CameraBoardSocket.RGB)
            # cam.setInterleaved(False)
            cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_48_MP)
            cam.setVideoSize(1920, 1080)
            cam.initialControl.setSceneMode(dai.CameraControl.SceneMode.FACE_PRIORITY)
        
            # Create MobileNet detection network
            mobilenet = pipeline.create(dai.node.MobileNetDetectionNetwork)
            mobilenet.setBlobPath(
                blobconverter.from_zoo(name="face-detection-retail-0004", shaves=3)
            )
            mobilenet.setConfidenceThreshold(0.7)
        
            crop_manip = pipeline.create(dai.node.ImageManip)
            crop_manip.initialConfig.setResize(300, 300)
            crop_manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
            cam.isp.link(crop_manip.inputImage)
            crop_manip.out.link(mobilenet.input)
        
            # Create an UVC (USB Video Class) output node. It needs 1920x1080, NV12 input
            uvc = pipeline.createUVC()
            cam.video.link(uvc.input)

        This is what I tried but I am just guessing.

        def get_pipeline():
            pipeline = dai.Pipeline()

            # # Define a source - color camera
            cam = pipeline.createColorCamera()
            cam.setBoardSocket(dai.CameraBoardSocket.RGB)
            # cam.setInterleaved(False)
            cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_48_MP)
            cam.setVideoSize(1920, 1080)
            cam.initialControl.setSceneMode(dai.CameraControl.SceneMode.FACE_PRIORITY)
        
            # Create MobileNet detection network
            mobilenet = pipeline.create(dai.node.MobileNetDetectionNetwork)
            mobilenet.setBlobPath(
                blobconverter.from_zoo(name="face-detection-retail-0004", shaves=3)
            )
            mobilenet.setConfidenceThreshold(0.7)
        
            #
        
            manipRgb = pipeline.createImageManip()
            rgbRr = dai.RotatedRect()
            rgbRr.center.x, rgbRr.center.y = cam.getPreviewWidth() // 2, cam.getPreviewHeight() // 2
            rgbRr.size.width, rgbRr.size.height = cam.getPreviewHeight(), cam.getPreviewWidth()
            rgbRr.angle = 90
            manipRgb.initialConfig.setCropRotatedRect(rgbRr, False)
            cam.preview.link(manipRgb.inputImage)
        
        
            #
        
            crop_manip = pipeline.create(dai.node.ImageManip)
            crop_manip.initialConfig.setResize(300, 300)
            crop_manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
            manipRgb.out.link(crop_manip.inputImage) #added
            cam.isp.link(crop_manip.inputImage)
            crop_manip.out.link(mobilenet.input)

          Were those guidelines for posting to the forum or for submitting for review?
          We are using UVC and I was trying to flip the image before output, but I think it needs to be 1920x1080, so it is faulting. Is it possible to rotate the image from a script?


            Some general feedback here would be great. I do not know enough to ask the right questions yet. We have a camera using UVC and face detection, but the frame was wider than it was tall (1920x1080), so we wanted to rotate the camera and stream (1080x1920). When we rotate the camera, the face detection is not looking for sideways faces, so I believe I need to rotate the stream before it goes into the detector, but not before the UVC input?
            What is the max camRgb video size? We are using the OAK SOM.

            import os
            import sys
            import time
            
            import blobconverter
            import click
            import depthai as dai
            
            if sys.version_info[0] < 3:
                raise Exception("Doesn't work with Py2")
            
            MJPEG = False
            
            os.environ["DEPTHAI_LEVEL"] = "debug"
            
            progressCalled = False
            # TODO move this under flash(), will need to handle `progressCalled` differently
            def progress(p):
                global progressCalled
                progressCalled = True
                print(f"Flashing progress: {p*100:.1f}%")
            
            
            # Will flash the bootloader if no pipeline is provided as argument
            def flash(pipeline=None):
                (f, bl) = dai.DeviceBootloader.getFirstAvailableDevice()
                bootloader = dai.DeviceBootloader(bl, True)
            
                startTime = time.monotonic()
                if pipeline is None:
                    print("Flashing bootloader...")
                    bootloader.flashBootloader(progress)
                else:
                    print("Flashing application pipeline...")
                    bootloader.flash(progress, pipeline)
            
                if not progressCalled:
                    raise RuntimeError("Flashing failed, please try again")
                elapsedTime = round(time.monotonic() - startTime, 2)
                print("Done in", elapsedTime, "seconds")
            
            
            @click.command()
            @click.option(
                "-fb",
                "--flash-bootloader",
                is_flag=True,
                help="Updates device bootloader prior to running",
            )
            @click.option(
                "-fp",
                "--flash-pipeline",
                is_flag=True,
                help="Flashes pipeline. If bootloader flash is also requested, this will be flashed after",
            )
            @click.option(
                "-gbs",
                "--get-boot-state",
                is_flag=True,
                help="Prints out the boot state of the connected MX"
            )
            def main(flash_bootloader, flash_pipeline, get_boot_state):
                
                def get_pipeline():
                    pipeline = dai.Pipeline()
            
                    # # Define a source - color camera
                    cam = pipeline.createColorCamera()
                    cam.setBoardSocket(dai.CameraBoardSocket.RGB)
                    cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_48_MP)
                    cam.setVideoSize(1920, 1080)
                    cam.initialControl.setSceneMode(dai.CameraControl.SceneMode.FACE_PRIORITY)
            
                    # Create MobileNet detection network
                    mobilenet = pipeline.create(dai.node.MobileNetDetectionNetwork)
                    mobilenet.setBlobPath(
                        blobconverter.from_zoo(name="face-detection-retail-0004", shaves=3)
                    )
                    mobilenet.setConfidenceThreshold(0.7)
            
                    crop_manip = pipeline.create(dai.node.ImageManip)
                    crop_manip.initialConfig.setResize(300, 300)
                    crop_manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
                    cam.isp.link(crop_manip.inputImage)
                    crop_manip.out.link(mobilenet.input)
            
                    # Create an UVC (USB Video Class) output node. It needs 1920x1080, NV12 input
                    uvc = pipeline.createUVC()
                    cam.video.link(uvc.input)
            
                    # Script node
                    script = pipeline.create(dai.node.Script)
                    mobilenet.out.link(script.inputs["dets"])
                    script.outputs["cam_cfg"].link(cam.inputConfig)
                    script.outputs["cam_ctrl"].link(cam.inputControl)
                    script.setScript(
                    """
                    ORIGINAL_SIZE = (5312, 6000) # 48MP with size constraints described on IMX582 luxonis page
                    SCENE_SIZE = (1920, 1080) # 1080P
                    x_arr = []
                    y_arr = []
                    AVG_MAX_NUM=7
                    limits = [0, 0] # xmin and ymin limits
                    limits.append((ORIGINAL_SIZE[0] - SCENE_SIZE[0]) / ORIGINAL_SIZE[0]) # xmax limit
                    limits.append((ORIGINAL_SIZE[1] - SCENE_SIZE[1]) / ORIGINAL_SIZE[1]) # ymax limit
                    cfg = ImageManipConfig()
                    ctrl = CameraControl()
                    def average_filter(x, y):
                        x_arr.append(x)
                        y_arr.append(y)
                        if AVG_MAX_NUM < len(x_arr): x_arr.pop(0)
                        if AVG_MAX_NUM < len(y_arr): y_arr.pop(0)
                        x_avg = 0
                        y_avg = 0
                        for i in range(len(x_arr)):
                            x_avg += x_arr[i]
                            y_avg += y_arr[i]
                        x_avg = x_avg / len(x_arr)
                        y_avg = y_avg / len(y_arr)
                        if x_avg < limits[0]: x_avg = limits[0]
                        if y_avg < limits[1]: y_avg = limits[1]
                        if limits[2] < x_avg: x_avg = limits[2]
                        if limits[3] < y_avg: y_avg = limits[3]
                        return x_avg, y_avg
                    while True:
                        dets = node.io['dets'].get().detections
                        if len(dets) == 0: continue
                        coords = dets[0] # take first
                        # Get detection center
                        x = (coords.xmin + coords.xmax) / 2
                        y = (coords.ymin + coords.ymax) / 2
                        x -= SCENE_SIZE[0] / ORIGINAL_SIZE[0] / 2
                        y -= SCENE_SIZE[1] / ORIGINAL_SIZE[1] / 2
                        # node.warn(f"{x=} {y=}")
                        x_avg, y_avg = average_filter(x,y)
                        
                        # node.warn(f"{x_avg=} {y_avg=}")
                        cfg.setCropRect(x_avg, y_avg, 0, 0)
                        node.io['cam_cfg'].send(cfg)
                        node.io['cam_ctrl'].send(ctrl)
                    """
                    )
                    return pipeline
            
                if flash_bootloader or flash_pipeline:
                    if flash_bootloader: flash()
                    if flash_pipeline: flash(get_pipeline())
                    print("Flashing successful. Please power-cycle the device")
                    quit()
            
                if get_boot_state:
                    (f, bl) = dai.DeviceBootloader.getFirstAvailableDevice()
                    print(f"Device state: {bl.state.name}")
            
            
                # with dai.Device(get_pipeline(), usb2Mode=True) as dev:
                with dai.Device(get_pipeline()) as dev:
                    print(f"Connection speed: {dev.getUsbSpeed()}")
            
                    # Doing nothing here, just keeping the host feeding the watchdog
                    while True:
                        try:
                            time.sleep(0.1)
                        except KeyboardInterrupt:
                            break
            
            
            if __name__ == "__main__":
                try:
                    main()
                except KeyboardInterrupt:
                    sys.exit(0)

            Hi chandrian ,
            For UVC, I believe the current limitation is that frames need to be 720P and in NV12 format, so you would likely need to rotate the image after retrieving it on the host, or use some other option (e.g. streaming via the depthai library, then creating a virtual camera on the host). Would that work for your application?
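
            For the host-side rotation, a minimal sketch (assuming an XLinkOut stream named "video" carrying the camera's video frames) would be:

            import cv2
            import depthai as dai

            with dai.Device(pipeline) as device:
                q = device.getOutputQueue("video", maxSize=4, blocking=False)
                while True:
                    frame = q.get().getCvFrame()                           # BGR frame from the device
                    rotated = cv2.rotate(frame, cv2.ROTATE_90_CLOCKWISE)   # swaps width and height
                    cv2.imshow("rotated", rotated)
                    if cv2.waitKey(1) == ord("q"):
                        break
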
            Thanks, Erik

              erik

              Thanks erik, you've been so helpful on this. I don't think we can flip it after the host has it... I think the idea was to rotate it so that it has more height to work with when analyzing the frame.

              What size is the image coming out?

              What does this crop do?:
              crop_manip = pipeline.create(dai.node.ImageManip)
              crop_manip.initialConfig.setResize(300, 300)
              crop_manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
              cam.isp.link(crop_manip.inputImage)
              crop_manip.out.link(mobilenet.input)

              I think the idea was to rotate it so that it has more height to work with in the frame analyzing.
              That was the wrong assumption above. I think I can just make the face-detection crop taller than it is wide and I'll be OK. It is hard to follow the dimensions.

              Is it possible to crop to a different (smaller) image size for the face tracker? Where in the code does it need to be 1920x1080 - before or after the script runs?


                Hi chandrian ,

                1. The image should be full HD if you are using depthai with UVC pipeline (docs here).
                2. The code snippet resizes the input frame to 300x300 and converts it to 8-bit BGR format.
                3. Yep, that should be possible 🙂 - see the sketch below this list.
                4. Can you please share what exactly you want to achieve?
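
                For point 3, a minimal sketch (reusing pipeline, cam and mobilenet from your code above; the crop rectangle values are just an example of a taller-than-wide region):

                crop_manip = pipeline.create(dai.node.ImageManip)
                # Keep a vertical slice of the ISP frame (normalized xmin, ymin, xmax, ymax)
                crop_manip.initialConfig.setCropRect(0.35, 0.0, 0.65, 1.0)
                crop_manip.initialConfig.setResize(300, 300)  # the face detector still expects 300x300 input
                crop_manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
                cam.isp.link(crop_manip.inputImage)
                crop_manip.out.link(mobilenet.input)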

                Thanks, Erik


                Thanks again for the response Erik. Basically we need to zoom in on the person like this and crop to this more vertical size.


                  Hi chandrian ,
                  with UVC mode this (currently) isn't possible, as the UVC node needs full HD images. You could, however, stream the exact same image but rotated by 90 deg. Thoughts?
                  Thanks, Erik

                  Yes, I attempted that but was not successful. Can you give me general instructions on where to implement that? The problems I faced were that the UVC needed 1920x1080 and when I rotated it, it was 1080x1920, and that the face recognition did not work when the camera was rotated 90 degrees.

                  Thanks,
                  Aaron


                    Hi chandrian ,
                    I assume you are using something similar to Lossless Zooming. So first you would want to rotate the frame 90 deg (so people are upright), do the face detection, crop the original (rotated) 4K image into 1080x1920 (as in the lossless zooming example), then rotate that back to 1080P, which you can feed into the UVC node. Thoughts?
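
                    In pipeline terms it would look roughly like this (only a sketch - the resolutions, the fixed centered crop and the NV12 conversion in the last ImageManip are assumptions to verify; in practice the crop would be moved by a Script node based on the detections, as in the lossless zooming example):

                    import blobconverter
                    import depthai as dai

                    pipeline = dai.Pipeline()

                    cam = pipeline.createColorCamera()
                    cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_4_K)
                    cam.setPreviewSize(300, 300)
                    cam.setInterleaved(False)

                    # 1. Rotate the small preview 90 deg so faces are upright for the detector
                    rotNn = pipeline.create(dai.node.ImageManip)
                    rrNn = dai.RotatedRect()
                    rrNn.center.x, rrNn.center.y = 150, 150
                    rrNn.size.width, rrNn.size.height = 300, 300
                    rrNn.angle = 90
                    rotNn.initialConfig.setCropRotatedRect(rrNn, False)
                    cam.preview.link(rotNn.inputImage)

                    # 2. Face detection on the rotated preview
                    nn = pipeline.create(dai.node.MobileNetDetectionNetwork)
                    nn.setBlobPath(blobconverter.from_zoo(name="face-detection-retail-0004", shaves=3))
                    nn.setConfidenceThreshold(0.7)
                    rotNn.out.link(nn.input)

                    # 3. Crop a portrait 1080x1920 region out of the 4K ISP frame
                    #    (fixed, centered crop here; a Script node fed by nn.out would move it)
                    crop = pipeline.create(dai.node.ImageManip)
                    crop.initialConfig.setCropRect(0.36, 0.06, 0.64, 0.94)  # example values
                    crop.initialConfig.setResize(1080, 1920)
                    crop.setMaxOutputFrameSize(1080 * 1920 * 3)
                    cam.isp.link(crop.inputImage)

                    # 4. Rotate the portrait crop back to 1920x1080 NV12 for the UVC node
                    rotUvc = pipeline.create(dai.node.ImageManip)
                    rrUvc = dai.RotatedRect()
                    rrUvc.center.x, rrUvc.center.y = 1080 // 2, 1920 // 2
                    rrUvc.size.width, rrUvc.size.height = 1920, 1080
                    rrUvc.angle = 90
                    rotUvc.initialConfig.setCropRotatedRect(rrUvc, False)
                    rotUvc.initialConfig.setFrameType(dai.ImgFrame.Type.NV12)  # UVC wants NV12 - verify ImageManip can output it on your version
                    rotUvc.setMaxOutputFrameSize(1920 * 1080 * 3 // 2)
                    crop.out.link(rotUvc.inputImage)

                    uvc = pipeline.createUVC()
                    rotUvc.out.link(uvc.input)
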
                    Thanks, Erik