KlemenSkrlj And you'd link the output of this node to the dai.node.NeuralNetwork input.
Once you have this pinned down and working as expected, you can potentially move it away from dai.node.HostNode to dai.node.ScriptNode, which will reduce latency even further: there is no device->host->device transfer, since the data stays on the device side.
Okay, so I converted everything to just FP16, and now I'm having some issues with the NeuralNetwork node. I'm very unsure of what it wants from me, because the error doesn't say which input is wrong or what type it expected versus what it found:
[1633257918] [192.168.1.124] [1772048869.060] [NeuralNetwork(18)] [error] Input image type is not supported.
[1633257918] [192.168.1.124] [1772048869.060] [NeuralNetwork(18)] [error] Node threw exception Error while preparing the input buffer.
As an aside, this is definitely an example of what I was saying here:
TheHiddenWaffle As far as I know, such things are closed source, and my most common way of debugging complex operations in DepthAI is to just stare at the source code for depthai-core; but that's less helpful when the on-device pipeline starts throwing errors.
Current structure of my model per snpe-dlc-info:
--------------------------------------------------------------------------------
| Input Name   | Dimensions  | Type     | Encoding Info                    |
--------------------------------------------------------------------------------
| rtm_input    | 1,384,288,3 | Float_16 | No encoding info for this tensor |
| depth        | 1,384,288,1 | Float_16 | No encoding info for this tensor |
| camera_K_inv | 1,2,2       | Float_16 | No encoding info for this tensor |
--------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
| Output Name                | Dimensions | Type     | Encoding Info                    |
---------------------------------------------------------------------------------------------
| dbg_px_coords              | 1,133,2    | Float_16 | No encoding info for this tensor |
| dbg_torso_root_center_pred | 1,1,3      | Float_16 | No encoding info for this tensor |
| dbg_kp_pix_confidence      | 1,133,2    | Float_16 | No encoding info for this tensor |
| dbg_z_prior                | 1,133,1    | Float_16 | No encoding info for this tensor |
| dbg_kp_z_pred              | 1,19       | Float_16 | No encoding info for this tensor |
| kps_xyz                    | 1,3,133    | Float_16 | No encoding info for this tensor |
---------------------------------------------------------------------------------------------
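Since the error is about preparing the input buffer, one thing worth checking is raw buffer sizes. This is a little numpy-only sanity check I'm using (the shapes are copied from the `snpe-dlc-info` output above; the helper itself is just my own throwaway code, not a DepthAI API):

```python
import numpy as np

# Expected FP16 input shapes, per snpe-dlc-info
input_shapes = {
    "rtm_input": (1, 384, 288, 3),
    "depth": (1, 384, 288, 1),
    "camera_K_inv": (1, 2, 2),
}

# FP16 = 2 bytes per element; these are the buffer sizes the
# NeuralNetwork node should be receiving for each input tensor.
expected_bytes = {
    name: int(np.prod(shape)) * np.dtype(np.float16).itemsize
    for name, shape in input_shapes.items()
}

print(expected_bytes)
# {'rtm_input': 663552, 'depth': 221184, 'camera_K_inv': 8}
```

If the tensors I send don't match these byte counts exactly, the buffer-prep step is presumably the first place to fail.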
Current config.json:
{
  "config_version": "1.0",
  "model": {
    "metadata": {
      "name": "rtm_ada_fusion_fpdepth",
      "path": "rtm_ada_fusion_fpdepth.dlc",
      "precision": "float16"
    },
    "inputs": [
      {
        "name": "rtm_input",
        "dtype": "float16",
        "input_type": "image",
        "shape": [1, 384, 288, 3],
        "layout": "NHWC",
        "preprocessing": {
          "mean": [0.0, 0.0, 0.0],
          "scale": [1.0, 1.0, 1.0],
          "reverse_channels": false,
          "interleaved_to_planar": true,
          "dai_type": "BGRF16F16F16i"
        }
      },
      {
        "name": "depth",
        "dtype": "float16",
        "input_type": "image",
        "shape": [1, 384, 288, 1],
        "layout": "NHWC",
        "preprocessing": {
          "mean": [0.0],
          "scale": [1.0],
          "reverse_channels": false,
          "interleaved_to_planar": true,
          "dai_type": "GRAYF16"
        }
      },
      {
        "name": "camera_K_inv",
        "dtype": "float16",
        "input_type": "image",
        "shape": [1, 2, 2],
        "layout": "NCD",
        "preprocessing": {
          "mean": [0.0],
          "scale": [1.0],
          "reverse_channels": false,
          "interleaved_to_planar": false,
          "dai_type": "GRAYF16"
        }
      }
    ],
    "outputs": [
      {
        "name": "kps_xyz",
        "dtype": "float16",
        "shape": [1, 133, 3],
        "layout": "NCW"
      },
      {
        "name": "dbg_torso_root_center_pred",
        "dtype": "float16",
        "shape": [1, 1, 3],
        "layout": "NCH"
      },
      {
        "name": "dbg_kp_pix_confidence",
        "dtype": "float16",
        "shape": [1, 133, 2],
        "layout": "NCH"
      },
      {
        "name": "dbg_kp_z_pred",
        "dtype": "float16",
        "shape": [1, 19],
        "layout": "ND"
      },
      {
        "name": "dbg_px_coords",
        "dtype": "float16",
        "shape": [1, 133, 2],
        "layout": "NCD"
      },
      {
        "name": "dbg_z_prior",
        "dtype": "float16",
        "shape": [1, 133, 1],
        "layout": "NCD"
      }
      .........head stuff omitted for brevity
    ]
  }
}
The input image is being fed in as-is, and the preprocessing script just extracts depth.getCvFrame(), converts it to np.float16, and feeds both it and the calibration matrix into the node with type TensorInfo.DataType.FP16.
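Roughly what that preprocessing step does, as a minimal numpy sketch (the zero arrays are dummy stand-ins for the real depth.getCvFrame() and calibration values; the transpose mirrors what I'd expect interleaved_to_planar to do on-device):

```python
import numpy as np

# Dummy stand-ins for the real frames and calibration matrix
depth_raw = np.zeros((384, 288), dtype=np.uint16)  # ~ depth.getCvFrame()
k_inv = np.eye(2, dtype=np.float32)                # ~ inverse camera intrinsics

# Convert to FP16, since every model input is Float_16
depth_fp16 = depth_raw.astype(np.float16).reshape(1, 384, 288, 1)
k_inv_fp16 = k_inv.astype(np.float16).reshape(1, 2, 2)

# Interleaved (HWC) -> planar (CHW), the reordering that
# interleaved_to_planar=true would apply to an image input
rgb_hwc = np.zeros((384, 288, 3), dtype=np.float16)
rgb_planar = rgb_hwc.transpose(2, 0, 1)

print(rgb_planar.shape, depth_fp16.dtype)  # (3, 384, 288) float16
```

Both FP16 tensors then get attached to the NNData message with TensorInfo.DataType.FP16, as described above.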
Any insight into what's causing the error? I've tried a lot of variations on the JSON config params, including modelling them on Hub's config for yolov6, but at the end of the day I'm feeling pretty blind and a bit frustrated 🫠.