Hello,

I'm currently trying to run 3 models on an OAK-D CM4 PoE. It works perfectly fine at 1920 x 1080, but when I bump it up to, say, 2160 x 2160, I start getting OUT_OF_MEMORY errors.

I've tried adding cam_rgb.setNumFramesPool(2, 2, 2, 2, 2) to the camera node, but that didn't solve the issue.
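
For reference, here is roughly where that call sits (a simplified sketch, not my full pipeline; the five arguments are the raw/isp/preview/video/still pool sizes):

    import depthai as dai

    pipeline = dai.Pipeline()

    cam_rgb = pipeline.create(dai.node.ColorCamera)
    cam_rgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_4_K)
    cam_rgb.setInterleaved(False)

    # Shrink every frame pool (raw, isp, preview, video, still) to 2 frames
    # to reduce DDR usage on the device.
    cam_rgb.setNumFramesPool(2, 2, 2, 2, 2)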

In the documentation, it says that the Raspberry Pi has 4 GB of RAM, while the RVC2 has 512 MiB. I believe my Python script currently only uses the RVC2's memory. Is it possible to offload some of the memory consumption to the Raspberry Pi's 4 GB? I wasn't able to find any posts or documentation on this. Thank you!

Hi @CharlieLiu
It's a neural network issue, not a camera one. Which depthai version are you using? Update to the latest (2.28), and make sure the models you are trying to run are not too large.

Thanks,
Jaka

    jakaskerl

    Hey Jaka, I was originally running depthai 2.22, but I updated it to 2.28 (and depthai-sdk from 1.12.1 to 1.15.0) and I'm still facing the same issue.

    My apologies, I forgot to mention that I would first get this error before the neural network OUT_OF_MEMORY errors:

    ERROR - Exception while reconfiguring the camera: Camera's memory can not fit more than one model with the camera being setup with a resolution higher than 1080_P. Camera will try to recover to the previous setup

    This leads me back to my original question - can we offload some of the camera's memory usage onto the Raspberry Pi's RAM? Would it then be possible to run more than one model with the camera set up at a resolution higher than 1080p?

    Raspberry Pi memory: [screenshot]

    Thanks,

    Charlie

      CharlieLiu
      TBF, this is the first time I am seeing the "Camera's memory can not fit more than one model…" error. Seems like something very old / custom.
      You cannot share resources between the MX SoM (RVC2) and the CM4.

      Can you create an MRE of your code that I can run locally to test?

      Thanks,
      Jaka

        jakaskerl

        Hey Jaka, there's quite a bit of code, so it would be difficult to make an MRE…

        Here's a general rundown. Hopefully you have some insights/ideas based on this?

        Our code prepares the pipeline, configures the camera, and loads all the ML models up front; at inference time it runs against the already-loaded models instead of swapping them in and out.

        We intentionally wrote it this way to make inferencing faster.

        To provide more context based on the code:

        General Flow:
        The code sets up the DepthAI pipeline, configures the camera, and preloads multiple models (e.g., 3 models). This avoids model swapping during inference for speed.

        Pipeline Configuration:
        Uses DepthAI's Pipeline API. The prepare_pipeline function configures camera settings (resolution, FPS) and links the camera preview to each model's detection network (e.g., YOLO). It handles high-resolution setups (e.g., 4K) and ensures all models are ready for inference.
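
        A trimmed-down sketch of what prepare_pipeline does (model paths, input size, and YOLO decoding params are placeholders, not our real values):

            import depthai as dai

            def prepare_pipeline(blob_paths, fps=15):
                """Build one pipeline with a YOLO network per blob, all preloaded."""
                pipeline = dai.Pipeline()

                cam = pipeline.create(dai.node.ColorCamera)
                cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_4_K)
                cam.setPreviewSize(416, 416)  # NN input size (placeholder)
                cam.setInterleaved(False)
                cam.setFps(fps)

                for i, blob in enumerate(blob_paths):
                    nn = pipeline.create(dai.node.YoloDetectionNetwork)
                    nn.setBlobPath(blob)
                    nn.setConfidenceThreshold(0.5)
                    # (plus the usual YOLO decoding params: setNumClasses,
                    # setCoordinateSize, setAnchors, setAnchorMasks, ...)
                    cam.preview.link(nn.input)

                    xout = pipeline.create(dai.node.XLinkOut)
                    xout.setStreamName(f"nn{i}")
                    nn.out.link(xout.input)

                return pipeline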

        Camera Focus:
        The code includes manual and automatic camera focus control with reset_focus(), using DepthAI's CameraControl.
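
        In essence (a fragment, reusing the pipeline/cam objects from the sketch above):

            # In the pipeline: an XLinkIn node feeding the camera's control input.
            ctrl_in = pipeline.create(dai.node.XLinkIn)
            ctrl_in.setStreamName("control")
            ctrl_in.out.link(cam.inputControl)

            # At runtime: trigger one-shot autofocus, or set a manual lens position.
            def reset_focus(device, manual_position=None):
                ctrl = dai.CameraControl()
                if manual_position is not None:
                    ctrl.setManualFocus(manual_position)  # lens position 0..255
                else:
                    ctrl.setAutoFocusMode(dai.CameraControl.AutoFocusMode.AUTO)
                    ctrl.setAutoFocusTrigger()
                device.getInputQueue("control").send(ctrl)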

        Inference Process:
        infer() handles inference using the preloaded model. It processes frames, sends them through the model, and builds a structured message using build_inference_msg(), including object detection polygons and confidence scores. Threshold checks are done before results are logged.
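
        Roughly (the real build_inference_msg produces polygons; this simplified version uses bounding boxes):

            def infer(queue, model_name, threshold=0.5):
                """Pull one detection packet from a preloaded model's output queue."""
                in_det = queue.get()  # blocks until the device sends detections
                results = []
                for det in in_det.detections:
                    if det.confidence < threshold:  # threshold check before logging
                        continue
                    results.append({
                        "label": det.label,
                        "confidence": det.confidence,
                        "bbox": (det.xmin, det.ymin, det.xmax, det.ymax),  # normalized
                    })
                return build_inference_msg(model_name, results)

            def build_inference_msg(model_name, results):
                # Simplified stand-in for our structured message format.
                return {"model": model_name, "detections": results}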

        System Health Monitoring:
        get_device_health() monitors memory (DDR, CMX), CPU load, and subsystem temperatures (e.g., UPA, DSS) via DepthAI's SystemLogger node, and returns a summary of device health.
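
        It boils down to reading the SystemLogger stream, roughly:

            # In the pipeline: a SystemLogger node streaming stats once per second.
            syslog = pipeline.create(dai.node.SystemLogger)
            syslog.setRate(1.0)
            xout_sys = pipeline.create(dai.node.XLinkOut)
            xout_sys.setStreamName("sysinfo")
            syslog.out.link(xout_sys.input)

            def get_device_health(device):
                info = device.getOutputQueue("sysinfo").get()  # dai.SystemInformation
                mib = 1024 * 1024
                return {
                    "ddr_used_mib": info.ddrMemoryUsage.used / mib,
                    "ddr_total_mib": info.ddrMemoryUsage.total / mib,
                    "cmx_used_mib": info.cmxMemoryUsage.used / mib,
                    "css_cpu_pct": info.leonCssCpuUsage.average * 100,
                    "temp_upa_c": info.chipTemperature.upa,
                    "temp_dss_c": info.chipTemperature.dss,
                }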

        Exception Handling:
        Custom exceptions like FrameEncodingError and PipelinePreparationError are used to trace and log errors during frame acquisition, encoding, and pipeline setup.
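
        These are plain Exception subclasses (shown here reusing prepare_pipeline from the sketch above):

            class FrameEncodingError(Exception):
                """Raised when a captured frame cannot be encoded."""

            class PipelinePreparationError(Exception):
                """Raised when pipeline construction or device setup fails."""

            try:
                pipeline = prepare_pipeline(blob_paths)
            except RuntimeError as exc:
                raise PipelinePreparationError(f"pipeline setup failed: {exc}") from exc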

        Queue Linking:
        link_queues() connects the camera to output queues for real-time inference, setting up an output queue for each model. close_camera() and open_camera() handle the device lifecycle.
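
        link_queues and the lifecycle helpers are essentially:

            import depthai as dai

            def open_camera(pipeline):
                return dai.Device(pipeline)  # boots the device, uploads the pipeline

            def link_queues(device, model_count):
                """One non-blocking output queue per preloaded model."""
                return [
                    device.getOutputQueue(name=f"nn{i}", maxSize=4, blocking=False)
                    for i in range(model_count)
                ]

            def close_camera(device):
                device.close()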

        Design Intent:
        All models are loaded upfront to minimize runtime operations during inference, optimizing for speed.

        Thank you!

        Charlie

          CharlieLiu The code sets up the DepthAI pipeline, configures the camera, and preloads multiple models (e.g., 3 models). This avoids model swapping during inference for speed.

          What do you mean by "preloads multiple models"?

          The limitation on the device is RAM, which is where the models are loaded. If the models are large (or there are many of them, as in your case), this will raise memory issues. Depending on the resolution and other pipeline processes, the headroom left for the models might not be enough.
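
          As a rough back-of-the-envelope check (the numbers are illustrative, not exact firmware figures):

              # RVC2 DDR is 512 MiB total; firmware, frame pools and XLink buffers
              # all come out of it before any NN blob is loaded.
              TOTAL_DDR_MIB = 512

              # NV12 is 1.5 bytes/pixel: one 4K frame is ~11.9 MiB vs ~3 MiB at 1080p.
              frame_4k_mib = 3840 * 2160 * 1.5 / 2**20
              frame_1080p_mib = 1920 * 1080 * 1.5 / 2**20

              # A few pools of a few frames each at 4K eats a large chunk of DDR,
              # which is why the same three models fit at 1080p but not above it.
              frames_in_pools = 12  # e.g. ~4 pools x ~3 frames (illustrative)
              print(f"4K frames alone: ~{frames_in_pools * frame_4k_mib:.0f} MiB "
                    f"of {TOTAL_DDR_MIB} MiB")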

          Why use 4K in the first place?

          Thanks,
          Jaka