Hi Luxonis-Adam

Thanks for responding. I set up the debug and here are my test videos.

With image_publisher_ros, it failed 2 times out of 5 runs. Here is the video.

I created a node with only one color camera with detection network. It worked 10 times out of 10 runs. Here is the video.

Also, when I ran the code without pipeline_graph consuming the terminal output, I noticed some red debug messages. They appear regardless of whether the run succeeds or fails, and regardless of whether there are two detection networks or just one.

== FSYNC enabled for cam mask 0x0
CAM ID: 1, width: 1920, height: 1080, orientation: 0
CAM ID: 2, width: 1920, height: 1080, orientation: 0
== SW-SYNC: 0, cam mask 0x6
!!! Master Slave config is: single_master_slave !!!
Starting camera 1
[E] app_guzzi_command_callback():173: command->id:1
[E] app_guzzi_command_callback():193: command "1 1" sent

[18443010D1F3F40800] [3.1] [2.609] [system] [warning] PRINT:LeonCss: [E] iq_debug_create():161: iq_debug address 0x88837680
[E] hai_cm_driver_load_dtp():852: Features for camera IMX214R0 (imx214) are received
[E] set_dtp_ids():396: //VIV HAL: Undefined VCM DTP ID 0
[E] set_dtp_ids():405: //VIV HAL: Undefined NVM DTP ID 0
[E] set_dtp_ids():414: //VIV HAL: Undefined lights DTP ID 0
[18443010D1F3F40800] [3.1] [2.619] [system] [warning] PRINT:LeonCss: [E] camera_control_start():347: Camera_id = 1 started.

[E] hai_cm_sensor_select_mode():164: No suitable sensor mode. Selecting default one - 0 for start 1920x1080 at 0x0 fps min 0.000000 max 30.000000
[E] hai_cm_sensor_select_mode():164: No suitable sensor mode. Selecting default one - 0 f
[18443010D1F3F40800] [3.1] [2.581] [DetectionNetwork(2)] [info] Inference thread count: 1, number of shaves allocated per thread: 6, number of Neural Compute Engines (NCE) allocated per thread: 1
[18443010D1F3F40800] [3.1] [2.630] [system] [warning] PRINT:LeonCss: or start 1920x1080 at 0x0 fps min 0.000000 max 30.000000
[18443010D1F3F40800] [3.1] [2.653] [system] [warning] PRINT:LeonCss: [E] vpipe_conv_config():1465: Exit Ok
[E] callback():123: Camera CB START_DONE event.
Starting camera 2
[E] app_guzzi_command_callback():173: command->id:1
[E] app_guzzi_command_callback():193: command "1 2" sent

[18443010D1F3F40800] [3.1] [2.675] [system] [warning] PRINT:LeonCss: [E] iq_debug_create():161: iq_debug address 0x88375cc0
[18443010D1F3F40800] [3.1] [2.686] [system] [warning] PRINT:LeonCss: [E] hai_cm_driver_load_dtp():852: Features for camera IMX214R0 (imx214) are received
[E] set_dtp_ids():396: //VIV HAL: Undefined VCM DTP ID 0
[E] set_dtp_ids():405: //VIV HAL: Undefined NVM DTP ID 0
[E] set_dtp_ids():414: //VIV HAL: Undefined lights DTP ID 0
[E] camera_control_start():347: Camera_id = 2 started.

[18443010D1F3F40800] [3.1] [2.696] [system] [warning] PRINT:LeonCss: [E] hai_cm_sensor_select_mode():164: No suitable sensor mode. Selecting default one - 0 for start 1920x1080 at 0x0 fps min 0.000000 max 30.000000
[E] hai_cm_sensor_select_mode():164: No suitable sensor mode. Selecting default one - 0 for start 1920x1080 at 0x0 fps min 0.000000 max 30.000000
inc_camera_process set exposure and gain
[E] vpipe_conv_config():1465: Exit Ok
[18443010D1F3F40800] [3.1] [2.707] [system] [warning] PRINT:LeonCss: [E] guzzi_event_send():324: Send: Event ID=20003 no registered recipient
[E] guzzi_event_send():324: Send: Event ID=20004 no registered recipient
[E] guzzi_event_send():324: Send: Event ID=20005 no registered recipient
osDrvImx214Control:514: Start stream
[E] callback():123: Camera CB START_DONE event.
inc_camera_process set exposure and gain
osDrvImx214Control:514: Start stream
[18443010D1F3F40800] [3.1] [2.718] [system] [warning] PRINT:LeonCss: AF_TRIGGER on camera 1
[E] app_guzzi_command_callback():173: command->id:5
[E] camera_control_focus_trigger():591: Focus trigger succeeded camera_id = 1.

[E] app_guzzi_command_callback():218: command "5 1" sent

AF_TRIGGER on camera 2
[E] app_guzzi_command_callback():173: command->id:5
[E] camera_control_focus_trigger():591: Focus trigger succeeded camera_id = 2.

[E] app_guzzi_command_callback():218: command "5 2" sent

Starting Guzzi command handling loop...
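
Since these messages show up in both good and bad runs, one way to look for a line that only appears in failing runs is to normalize two captured logs and diff them. The following is a minimal sketch, not part of the MRE. The function names are hypothetical, and the prefix pattern is only an assumption based on the bracketed prefix visible in the lines above.

```python
import re

# Assumed shape of the host-side prefix seen above, e.g.
# "[18443010D1F3F40800] [3.1] [2.609] [system] [warning] PRINT:LeonCss: "
HOST_PREFIX = re.compile(
    r"^\[[0-9A-F]+\] \[[\d.]+\] \[[\d.]+\] \[[^\]]+\] \[\w+\] (?:PRINT:\w+: )?"
)
# Long hex values such as "0x88837680" are addresses that change between runs.
HEX_ADDR = re.compile(r"0x[0-9a-fA-F]{6,}")


def normalize(line):
    """Strip the run-specific prefix and mask addresses in one log line."""
    line = HOST_PREFIX.sub("", line.strip())
    return HEX_ADDR.sub("0xADDR", line)


def message_set(log_text):
    """Return the set of distinct normalized, non-empty lines in a log."""
    return {normalize(l) for l in log_text.splitlines() if l.strip()}


def only_in_first(log_a, log_b):
    """Return normalized messages present in log_a but absent from log_b."""
    return sorted(message_set(log_a) - message_set(log_b))
```

Calling only_in_first(failed_log, good_log) lists candidates worth a closer look. Lines that were interleaved mid-message (like the truncated "Selecting default one - 0 f" line above) will show up as spurious differences.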

Let me know what you think.

Thanks

Lincoln

5 days later

Hi @lincolnxlw, thanks for the information. To narrow it down further, could you try replicating this setup without ROS, using the bare depthai-core library?

    Hi Luxonis-Adam

    I switched both cameras from IMX477 to IMX214 and ran the same MRE extensively, and I didn't see any blocking issue. I have 10 of the IMX477 modules, and no matter how I pair them up (swapping cables as well), I can always see the issue. So I don't think it is an individual camera problem. What do you think? Is this something on the firmware side or the hardware side? If it is on the hardware side, is it on the Luxonis side or the ArduCam side?

    @erik, @jakaskerl feel free to let me know what you think as well. This issue has been blocking us from releasing our depthai-powered product.

    Thanks

    Lincoln

    Hi Lincoln,
    So to confirm: with any type of pipeline (just streaming the color stream to XLinkOut) using 2x FFC-IMX477 on an FFC-3P and the latest depthai version, there's a chance that one camera will block at startup and won't recover? I'm just trying to understand what the smallest repro would be, so the engineering team can fix the issue.
    Thanks, Erik

      erik

      Yes, one camera will be blocked and won't recover. The tricky part is that the issue does not appear with every IMX477 pipeline. By comparison, with IMX214 there is NO issue in any pipeline.
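
A blocked camera like this can be flagged on the host by tracking when each stream last delivered a frame. The sketch below is plain Python and makes no depthai calls. The stream names, the 2-second timeout, and the function name are all hypothetical and only illustrate the check.

```python
def find_stalled_streams(last_frame_time, now, timeout_s=2.0):
    """Return the names of streams with no frame in the last timeout_s seconds.

    last_frame_time maps a stream name to the timestamp (in seconds) of its
    most recent frame, or to None if the stream never delivered a frame.
    """
    stalled = []
    for name, stamp in last_frame_time.items():
        # None covers the "blocked at the start" case; the age check covers
        # a stream that started and then stopped.
        if stamp is None or now - stamp > timeout_s:
            stalled.append(name)
    return sorted(stalled)
```

In a test loop, the caller would update the dictionary each time a frame arrives (for example with time.monotonic()) and call the function periodically, which avoids having to watch the FPS by eye.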

      In the latest C++ MRE, I have 3 different nodes

      • image_publisher_node: color camera nodes and detection nodes for CAM_B and CAM_C, no ROS.
      • image_publisher_ros_node: color camera nodes and detection nodes for CAM_B and CAM_C, also ROS bridges to publish image and detection to ROS.
      • image_publisher_ros_single_detection_node: color camera nodes for both cameras, only one camera has detection node. ROS publishing image from two cameras and detection from one camera.

      In terms of severity:

      For image_publisher_node, the blocking issue can be seen directly on the screen. It happens in around 1 out of 10 runs.

      For image_publisher_ros_node, we need pipeline_graph to see the FPS. It happens in around 2 out of 5 runs.

      For image_publisher_ros_single_detection_node, we also need pipeline_graph to see the FPS. I have NOT seen the blocking happen yet.

      Also, like I mentioned a while ago, the issue rarely happens with the python MRE even with IMX477 (1 out of 30?). But I did see it happen before.
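
Because these rates come from small run counts, it may help to put an uncertainty range on each before comparing pipelines. The following is a minimal sketch using the standard Wilson score interval. The function name is hypothetical, and it assumes the runs are independent.

```python
import math


def wilson_interval(failures, runs, z=1.96):
    """Return the Wilson score interval (low, high) for a failure rate.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = failures / runs
    denom = 1 + z * z / runs
    centre = (p + z * z / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, centre - half), min(1.0, centre + half)
```

For 2 failures in 5 runs this gives roughly 12% to 77%, and for 1 failure in 10 runs roughly 2% to 40%. The two ranges overlap, so more runs would be needed before concluding that one node really fails more often than the other.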

      So in summary:

      • The inconsistency of the issue across different pipelines with the IMX477 modules makes it look like a firmware-side problem.
      • But the complete absence of the issue with IMX214 in all pipelines makes it look like the IMX477 hardware is to blame.

      Sorry for the mess. Let me know what you think.

      Thanks

      Lincoln

      Hi @lincolnxlw ,
      Thank you for all the details, we will try to reproduce this locally and get back to you asap.

      @lincolnxlw, just to confirm, is the hardware absolutely still when you run these tests, or do you e.g. move it even a tiny bit between runs? It might be an improper connection, which is quite common for FFC cameras and results in the camera not being able to stream.

        Hi erik

        The hardware (an OAK-FFC-3P with two IMX477 cameras) stays still during the runs. The OAK-FFC-3P is powered by the external adapter that came with the product instead of just the USB cable.

        Let me know if you have any other questions.

        Thanks

        Lincoln

        erik

        One more piece of information that may or may not be relevant: even though the Luxonis shop still brands the module as IMX477, Arducam has actually updated the sensor to IMX577 (see the description on their own shop for the same module). The cameras I received and am testing are the updated version with IMX577 sensors. So to reproduce the issue, you may need to make sure you are using the same version.

        Thanks

        Lincoln

        Hi @lincolnxlw ,
        We just ordered the exact same cameras (from Arducam), hopefully they will arrive in the next 2 weeks and we'll try to repro then. We tried with the IMX477 many times and didn't encounter the issue.
        Thanks, Erik

          Hi erik,

          Thanks for trying to repro the issue. I really appreciate it. Just to make sure I did it the right way, what arguments did you provide to launch the container while reproducing the C++ MRE? Here is my command:

          docker container run --net=host -it --rm \
              --privileged \
              -e DISPLAY=$DISPLAY \
              -e QT_X11_NO_MITSHM=1 \
              -v /dev/bus/usb:/dev/bus/usb \
              -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
              -v $HOME/.Xauthority:/root/.Xauthority:rw \
              -v $HOME/data/:/data:rw \
              --device /dev/i2c-1 \
              depthai-ros \
              bash

          Also, I guess there are no issues with any of the three nodes when using IMX477?

          Thanks

          Lincoln

            Hi lincolnxlw
            I was not able to set up a ROS environment since I am running macOS, but I tried it with Python and ran the script 300+ times (automated with a script) and encountered no issues. I will try with the 577s once I receive them. The sensor should be fully supported (https://docs.luxonis.com/projects/hardware/en/latest/pages/articles/supported_sensors/), but was likely not tested to such an extent.
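
To put 300+ clean runs in context against the roughly 1-in-30 rate reported earlier in the thread for the Python MRE, the chance of seeing no failure at all can be estimated. This is a minimal sketch assuming independent runs with a fixed failure rate. The function names are hypothetical.

```python
import math


def prob_zero_failures(rate, runs):
    """Probability that `runs` independent runs all succeed at a given failure rate."""
    return (1.0 - rate) ** runs


def runs_needed(rate, confidence=0.95):
    """Smallest run count that shows at least one failure with the given confidence."""
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - rate))
```

If the true rate on a given bench were near 1 in 30, then 300 clean runs would be very unlikely (well under 0.1%), which suggests something differs between the two setups rather than bad luck. Under the same assumptions, runs_needed gives a rough idea of how many runs a test campaign needs before a clean result means much.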

            Thanks,
            Jaka

              Hi jakaskerl

              Thanks for the extensive test with Python! But see my description in post #21: for some reason the Python MRE is the one with the lowest severity compared to all the other cases. I can rarely produce a failure with it. Did someone on the team try to reproduce with the C++ code?

              Thanks

              Lincoln


                Hi lincolnxlw
                Sorry if I've missed the code somewhere, but afaik the C++ code was for ROS only? There is no C++ code that runs standalone like the Python example, right? Please correct me if I've missed something.
                The Python API is just bindings for the C++ library, so in theory it should have the same fault frequency as the C++ examples, unless there is something else going on.

                Thanks,
                Jaka

                  Hi jakaskerl

                  Sorry, I meant the C++ code for ROS I provided above. Has someone from the team tried running it with IMX477 extensively, like you did with the Python example? I know we are waiting for the IMX577 for further testing, according to @erik. I just want to make sure that so far we can confirm the IMX477 has no issues with any of the three ROS nodes I provided (all the modules I purchased turned out to be IMX577). If that is the case, for the time being I can ask Arducam to ship us IMX477 specifically so we can release our product without further delay (it looks like they have the same PCB dimensions, so our existing plastic mount should still work). Does that make sense?

                  Thanks

                  Lincoln

                    Hi jakaskerl,

                    Got it. Thanks.

                    Hi @DaniloPejovic,

                    Thanks again for testing the C++ MRE! As I mentioned earlier in #27, I want to make sure the arguments I passed to docker run look good to you (the OAK-FFC-3P is connected to a Pi4 running the precompiled OS OAK_CM4_POE_V10_64bit). What do you think?

                    docker container run --net=host -it --rm \
                        --privileged \
                        -e DISPLAY=$DISPLAY \
                        -e QT_X11_NO_MITSHM=1 \
                        -v /dev/bus/usb:/dev/bus/usb \
                        -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
                        -v $HOME/.Xauthority:/root/.Xauthority:rw \
                        -v $HOME/data/:/data:rw \
                        --device /dev/i2c-1 \
                        depthai-ros \
                        bash

                    Thanks

                    Lincoln

                    Hi @jakaskerl, @DaniloPejovic, @erik, @Luxonis-Adam,

                    Sorry for continuously spamming you guys with this issue.

                    To make everyone's life easier, I built a public Docker image (amd64 and arm64) with depthai ROS Noetic and the MRE, and I created a GitHub repo that launches the MRE with pipeline_graph output in one command. To reproduce, it takes just three steps:

                    • Clone this repo by git clone https://github.com/lincolnxlw/depthai_dual_cam_ros_detector.git.

                    • Download my prebuilt docker image by docker pull lincolnxlw/my-depthai-ros.

                    • Run the main script in the repo's root folder ./run.sh.

                    I also provided details of my test results in the README. With IMX577, the failure rate is around 30% on two different hosts (my laptop and CM4). With IMX214, the failure rate is 0%.
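
Counting failures over many launches can be automated with a small harness. The sketch below is hypothetical and is not how the repo's run.sh works. It assumes a wrapper command that exits with 0 when both cameras deliver frames and with a non-zero code otherwise, and it treats a hang past the timeout as a failure.

```python
import subprocess


def run_trials(cmd, trials, timeout_s):
    """Run cmd `trials` times and return how many runs failed.

    A run counts as failed if it exits non-zero or exceeds timeout_s seconds.
    """
    failures = 0
    for _ in range(trials):
        try:
            result = subprocess.run(
                cmd,
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            if result.returncode != 0:
                failures += 1
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this exception.
            failures += 1
    return failures
```

For example, run_trials(["./check_once.sh"], trials=30, timeout_s=60) would return the failure count for 30 launches, where check_once.sh is a placeholder for whatever single-shot check script is available.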

                    @DaniloPejovic, could you reconfirm with the prebuilt docker image that there are no issues with using IMX477? If yes, do you have the model number for the IMX477 you used (should be something like B0346)? I will need to request the exact same model from Arducam and with the Chinese New Year around the corner, I may only have one shot to get a production batch before the New Year.

                    Thanks

                    Lincoln

                      Hi lincolnxlw
                      Using your Docker image, I was able to reproduce the issue on both IMX577 and IMX477. I tested with OV9782 and AR0234 and had no issues. I was unable to reproduce the issue in Python, so I'm assuming the problem lies somewhere in ROS or Docker.
                      Try running this on non-containerized ROS Noetic.

                      Thanks,
                      Jaka