I created a NN that subtracts 2 depth frames:

`class DiffImgs(nn.Module):`
`    def forward(self, img1, img2):`
`        if img1.shape != (1, 1, 960, 1632) or img2.shape != (1, 1, 960, 1632):`
`            raise ValueError('Input images must have shape (1, 1, 960, 1632)')`
`        return torch.sub(img1, img2)`

I get the output as a vector by using:

`auto depth_diff_message = queues["depth_diff"]->get();`
`std::vector<float> depth_diff_data = depth_diff_message->getFirstLayerFp16();`

* The input of StereoDepth with subpixel to the model is UINT16
* The model does the diff and outputs at Fp16
* Because I subtracted, the results can be negative or positive
* getFirstLayerFp16 outputs a 1D vector of floats
* In the depthai C++ repo it looks like there needs to be some conversion from NNData to get into a CV frame: https://github.com/luxonis/depthai-core/blob/main/examples/utility/utility.cpp

`cv::Mat fromPlanarFp16(const std::vector<float>& data, int w, int h, float mean, float scale){`
`    cv::Mat frame = cv::Mat(h, w, CV_8UC3);`
`    for(int i = 0; i < w*h; i++) {`
`        auto b = data.data()[i + w*h * 0] * scale + mean;`
`        frame.data[i*3+0] = (uint8_t)b;`
`    }`
`    for(int i = 0; i < w*h; i++) {`
`        auto g = data.data()[i + w*h * 1] * scale + mean;`
`        frame.data[i*3+1] = (uint8_t)g;`
`    }`
`    for(int i = 0; i < w*h; i++) {`
`        auto r = data.data()[i + w*h * 2] * scale + mean;`
`        frame.data[i*3+2] = (uint8_t)r;`
`    }`
`    return frame;`
`}`

**What is the proper way to convert a diff'd StereoDepth frame from a vector into a cv::Mat?**

Hi @"AdamPolak"#p8825 , As it's INT16, you'd need to implement something similar to depth -> pointcloud conversion docs for your "model" as well: https://docs.luxonis.com/en/latest/pages/tutorials/device-pointcloud/#on-device-pointcloud-nn-model Thanks, Erik

@"erik"#p8849 Huge help thank you. Will dig into this.

@"erik"#p8849 I have updated the model to take in the conversion of a depth map from U8 to Fp16 [upl-image-preview url=https://discuss.luxonis.com/assets/files/2023-06-03/1685810734-453629-image.png] I can't figure out how to "get back" to the depth frame from the resulting NNData.

**1. If what I pull out of NNData is a vector of floats that are fp16, does that mean I need to reduce it again to U8? How is the conversion made, so I know how to undo it?**

**2. If I set sub-pixel = true, the data type changes to UINT16. How would that change what I need to do?**

Hi @"AdamPolak"#p8870 , 1. I don't think that's possible - afaik NN can only output FP16. 2. It's the same, you are expecting depth (INT16) anyways, not disparity (where it changes from INT8 to INT16 when subpixel is enabled).

@"erik"#p8872 I am close: I am getting depth values that make sense in fp16 from the model. The issue is that I do not know how to convert those floats back into a cv::Mat. I tried adding the floats to cv::Mat.data one by one, but it seems to be encoded differently. How does the 1D vector of floats compare to a 2D 1-channel image? Should I be alternating every float or something for height/width? This seems to be my last step.

@"AdamPolak"#p8873 please check how it's done on other demos: https://github.com/luxonis/depthai-experiments/blob/master/gen2-custom-models/concat.py#L58

```python
inNn = np.array(qNn.get().getData())
frame = inNn.view(np.float16).reshape(shape).transpose(1, 2, 0).astype(np.uint8)
cv2.imshow("Concat", frame)
```
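For a single-channel output such as a depth diff, the same reshape idea can be sketched as below. This is a minimal sketch, assuming the flat vector is row-major (element `i` belongs to row `i // width`, column `i % width`); `flat_to_frame` is a hypothetical helper, not part of depthai.

```python
import numpy as np

def flat_to_frame(flat, width, height):
    """Reshape a flat, row-major vector of floats into a (height, width) image."""
    data = np.asarray(flat, dtype=np.float32)
    if data.size != width * height:
        raise ValueError(f"expected {width * height} values, got {data.size}")
    # Row-major layout: element i lands at row i // width, column i % width.
    return data.reshape(height, width)

# Toy example: 12 values become a 3-row, 4-column image.
frame = flat_to_frame(range(12), width=4, height=3)
```

With one channel there is no planar-to-interleaved step, so no transpose is needed; only the (height, width) order of the reshape matters.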

@"erik"#p8875 I have been staring at that example so hard I think I have it memorized lol.

The example uses 3 channels and uses numpy to reshape. The C++ version uses this utility function to shape the 3 channels: https://github.com/luxonis/depthai-core/blob/main/examples/utility/utility.cpp https://github.com/luxonis/depthai-core/blob/main/examples/NeuralNetwork/concat_multi_input.cpp

So it doesn't have to deal with translating from 0-65535 to 0-255 in a way that can be displayed.

Interpreting it as a CV_32FC1 frame and then using this approach: https://github.com/luxonis/depthai-core/blob/125feb8c2e16ee4bf71b7873a7b990f1c5f17b18/examples/StereoDepth/depth_preview.cpp#LL54C43-L54C43

`frame.convertTo(frame, CV_8UC1, 255 / depth->initialConfig.getMaxDisparity());`

leaves it scrambled as well. I can't figure out what the heck I am doing wrong.

@"AdamPolak"#p8886 have you consulted GPT4 already? Usually it's fairly smart about such python->cpp conversions.

@"erik"#p8887 Like you wouldn't believe. It seems like since it got nerfed in the latest update it can't do hardcore things anymore. It got the order of how pytorch interprets columns/rows wrong and threw me off for half a day lol.

@"erik"#p8887

1. I subtract the depth frame from 0's in the model: [upl-image-preview url=https://discuss.luxonis.com/assets/files/2023-06-03/1685826452-32483-image.png] So it is just outputting the same depth frame values.
2. I get the NNData from the queue and check that the size is the same as the expected 1632x960, and it is: [upl-image-preview url=https://discuss.luxonis.com/assets/files/2023-06-03/1685826590-213004-image.png]
3. I create a cv::Mat, then iterate through the floats, normalize to 0-255 and save as uint8_t: [upl-image-preview url=https://discuss.luxonis.com/assets/files/2023-06-03/1685826679-718537-image.png]
4. I then normalize and display: [upl-image-preview url=https://discuss.luxonis.com/assets/files/2023-06-03/1685826749-381884-image.png]

And what shows up: [upl-image-preview url=https://discuss.luxonis.com/assets/files/2023-06-03/1685826904-727368-image.png]

It is maddening.
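The normalize-to-0-255 step described above can be sketched in NumPy as follows. This is only an illustration under assumptions: it uses plain min-max scaling (one of several possible choices), and `diff_to_uint8` is a hypothetical helper name, not the code from the screenshots.

```python
import numpy as np

def diff_to_uint8(diff):
    """Min-max normalize a signed float image to uint8 for display."""
    diff = np.asarray(diff, dtype=np.float32)
    lo, hi = float(diff.min()), float(diff.max())
    if hi == lo:
        # Constant image: avoid dividing by zero.
        return np.zeros(diff.shape, dtype=np.uint8)
    scaled = (diff - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)

# Negative and positive differences both end up inside 0..255.
vis = diff_to_uint8([[-100.0, 0.0], [50.0, 100.0]])
```

Because a diff can be negative, casting straight to `uint8` without shifting the range first would wrap or clip values; scaling from the observed minimum avoids that.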

If I subtract 2 StereoDepth frames from each other how to output in OpenCV

AdamPolak

jeremie_m

Yes I have.

Turns out the issue was that the model I created was expecting the input I entered for the StereoDepth.preview size.

But instead the depth frames output at the resolution you provide for the depth.
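One way a resolution mismatch can show up as a scrambled image is sketched below. This is a toy illustration, not a reproduction of the exact failure in this thread: a buffer laid out for one width is reshaped with another width that happens to have the same element count.

```python
import numpy as np

height, width = 4, 6
# Image with one bright vertical line at column 2.
image = np.zeros((height, width), dtype=np.float32)
image[:, 2] = 1.0
flat = image.ravel()  # stand-in for a flat NN output buffer

correct = flat.reshape(height, width)
wrong = flat.reshape(6, 4)  # same element count, wrong width

# The line survives only when the reshape uses the real width.
line_ok_correct = bool(np.all(correct[:, 2] == 1.0))
line_ok_wrong = any(bool(np.all(wrong[:, c] == 1.0)) for c in range(4))
```

A size check alone does not catch this case, since both shapes hold 24 values; the width used on the host has to match the resolution the depth frames are actually produced at.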

Let me know if you have any questions; I know it pretty well now.

jeremie_m

AdamPolak

Thank you Adam, I have questions about the model 'diff_images_simplified_openvino_2021.4_4shave.blob', is it generated by the pytorch code here?

AdamPolak This is the pytorch code.

Is it still using the dummy input or the depth input here is the result of the subtraction?

AdamPolak def forward(self, depth):

jeremie_m

AdamPolak

Any suggestions? 😃

AdamPolak

jeremie_m

This is the "final" version to do a diff between 2 depth map images:

#! /usr/bin/env python3

from pathlib import Path
import torch
from torch import nn
import blobconverter
import onnx
from onnxsim import simplify
import sys

# Define the model
class DiffImgs(nn.Module):
    def forward(self, img1, img2):
        # The UINT16 depth frame is read by the model as UINT8 bytes (two per pixel),
        # so recombine each pair: value = 256 * high byte + low byte
        img1DepthFP16 = 256.0 * img1[:,:,:,1::2] + img1[:,:,:,::2]
        img2DepthFP16 = 256.0 * img2[:,:,:,1::2] + img2[:,:,:,::2]

        # Create binary masks for each image
        # A pixel in the mask is 1 if the corresponding pixel in the image is 0, otherwise it's 0
        img1Mask = (img1DepthFP16 == 0)
        img2Mask = (img2DepthFP16 == 0)

        # If a pixel is 0 in either image, set the corresponding pixel in both images to 0
        img1DepthFP16 = img1DepthFP16 * (~img1Mask & ~img2Mask)
        img2DepthFP16 = img2DepthFP16 * (~img1Mask & ~img2Mask)

        # Compute the difference between the two images
        diff = torch.sub(img1DepthFP16, img2DepthFP16)

        # Square the difference
        # square_diff = torch.square(diff)

        # # Compute the square root of the square difference
        # sqrt_diff = torch.sqrt(square_diff)

        # sqrt_diff[sqrt_diff < 1500] = 0

        return diff

# Instantiate the model
model = DiffImgs()

# Create dummy input for the ONNX export
input1 = torch.randn(1, 1, 320, 544 * 2, dtype=torch.float16)
input2 = torch.randn(1, 1, 320, 544 * 2, dtype=torch.float16)

onnx_file = "diff_images.onnx"

# Export the model
torch.onnx.export(model,               # model being run
                  (input1, input2),    # model input (or a tuple for multiple inputs)
                  onnx_file,        # where to save the model (can be a file or file-like object)
                  opset_version=12,    # the ONNX version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names = ['input1', 'input2'],   # the model's input names
                  output_names = ['output'])

# Simplify the model
onnx_model = onnx.load(onnx_file)
onnx_simplified, check = simplify(onnx_model)
assert check, "Simplified ONNX model could not be validated"
onnx.save(onnx_simplified, "diff_images_simplified.onnx")

# Use blobconverter to convert onnx->IR->blob
blobconverter.from_onnx(
    model="diff_images_simplified.onnx",
    data_type="FP16",
    shaves=4,
    use_cache=False,
    output_dir="../",
    optimizer_params=[],
    compile_params=['-ip U8'],    
)
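The masking step in `forward` can be checked on the host with small arrays. The following is a NumPy sketch of the same logic (zero out a pixel in both images if it is 0 in either, then subtract); `masked_diff` is a hypothetical name used only for this illustration.

```python
import numpy as np

def masked_diff(depth1, depth2):
    """Subtract depth2 from depth1, forcing 0 wherever either input is 0."""
    d1 = np.asarray(depth1, dtype=np.float32)
    d2 = np.asarray(depth2, dtype=np.float32)
    valid = (d1 != 0) & (d2 != 0)  # True only where both pixels have depth
    return (d1 * valid) - (d2 * valid)

# Middle pixel is invalid in the first frame, last pixel in the second.
result = masked_diff([[1000, 0, 1500]], [[900, 1200, 0]])
```

Without the mask, a pixel that is 0 (no depth) in only one frame would produce a large spurious difference.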

Important to note! This does not take in dynamic image sizes. It must be a certain size. For some reason dynamic dimensions are not supported. So these 2 lines:

# Create dummy input for the ONNX export

input1 = torch.randn(1, 1, 320, 544 * 2, dtype=torch.float16)

input2 = torch.randn(1, 1, 320, 544 * 2, dtype=torch.float16)

define the size of the incoming depth images. Change 320 (height) and 544 (width) to your actual depth image size. The width is multiplied by 2 because each U16 depth pixel arrives as two U8 bytes.

These lines are what changes the depth input from U8 (1 byte) to U16 (2 bytes):

depthFP16 = 256.0 * depth[:,:,:,1::2] + depth[:,:,:,::2]

The reason is that the depth image arrives at the model as U16 (2 bytes per pixel), but this compile option tells the NN to read its input as U8 (1 byte per element): compile_params=['-ip U8']

So the input has twice as many elements along the width: each U16 depth value shows up as two U8 values. The operation above is a little trick to undo that split. It multiplies the odd-indexed values (the high bytes) by 256 and adds the even-indexed values (the low bytes), which rebuilds the original U16 depth value as FP16, the type required by the NN.
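The byte trick can be verified on the host with NumPy. This is a minimal sketch, assuming little-endian UINT16 storage (low byte first), which is what the `1::2` / `::2` indexing in the model implies; the sample values are made up.

```python
import numpy as np

# A tiny 2x3 "depth frame" stored as little-endian UINT16.
depth_u16 = np.array([[0, 255, 256], [1000, 4660, 65535]], dtype="<u2")

# Reinterpreting the same buffer as bytes doubles the width: (2, 3) -> (2, 6).
as_bytes = depth_u16.view(np.uint8)

# Low byte sits at even columns, high byte at odd columns.
low = as_bytes[:, ::2].astype(np.float32)
high = as_bytes[:, 1::2].astype(np.float32)
rebuilt = 256.0 * high + low
```

The sketch uses float32 so the round trip is exact. On the device the arithmetic is FP16, which cannot represent every integer above 2048, so large depth values may come back slightly rounded.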

What is your use case? Do you also want to diff a "control depth" from new depth, or something else?

jeremie_m

AdamPolak depthFP16 = 256.0 * depth[:,:,:,1::2] + depth[:,:,:,::2]

Is the subtraction executed here?

jeremie_m

AdamPolak

Thank you Adam, that helps a lot!

I thought the image size is always the same once the camera config is fixed.

And the conversion from U16 to U8, then back to FP16, seems tricky.

I will try to understand the dynamic dimensions and the conversion procedure.

In fact, my case is just like your 'control depth': I want to subtract 2 successive depth frames to find the moving pixels. But I'm not very experienced with NN models, and the subtraction must be done by an NN model on the device.

Adam, you help a lot 😃

jeremie_m

AdamPolak (input1, input2), # model input (or a tuple for multiple inputs)

And this single input doesn't match the inputs here if there's no change?

AdamPolak stereo.disparity.link(nn.inputs["input1"])

Maybe I should look into this example for more information?

https://github.com/luxonis/depthai-experiments/tree/master/gen2-custom-models#difference-between-2-frames

Thanks again Adam.

jeremie_m

AdamPolak

Thanks for reply.

I got images like this with the diff process above:

AdamPolak This is the "final" version to do a diff between 2 depth map images:

And I added time_diff code:

timestamp = dai.Clock.now();

with dai.Device(p) as device:

…

    time_diff = depthDiff.getTimestamp() - timestamp
    print('time_diff = ', time_diff)
    timestamp = depthDiff.getTimestamp()

The output of which is always 0.0.

I'm confused now.

AdamPolak

jeremie_m

You are right, the image is the same size once it is fixed. I just meant that if all of a sudden you wanted to increase/decrease resolution on your depth frame to improve, you would need to create a new model.

Heads up, you need to have quite a lot of depth filters enabled to make this diff work, the original depth frames are too noisy without post processing.

And when you do basically any type of depth processing, like MedianFilter, it slows down the depth FPS to ~9-11.

But it will take your diff from this (no processing):
(2 identical frames, nothing moved in the scene)

to this (median filter 7x7 and high_density):

to this (a lot of processing):

jeremie_m

AdamPolak quite a lot of depth filters enabled to make this diff work

Thanks, Adam. 9-11 FPS may be enough for me; if the rate is too low, I will try to run the filters on the host.

Is the depth filter config set as in the code you mentioned here, or is it more complex than the parameters here?

AdamPolak This is the depthai code

jeremie_m

AdamPolak

Hello Adam, I've added the following processing for the inputs, but something seems not right.

Could you let me know how to adapt the depthai code?

Thank you Adam.

script.setScript("""
old = node.io['in'].get()
while True:
    frame = node.io['in'].get()
    node.io['img1'].send(old)
    node.io['img2'].send(frame)
    old = frame
""")
script.outputs['img1'].link(nn.inputs['input1'])
script.outputs['img2'].link(nn.inputs['input2'])

jeremie_m

@erik

Hello erik, do you have tutorials or examples in processing depth frames?

jakaskerl

Hi jeremie_m
What kind of depth frame processing are you thinking of?

Thanks,
Jaka

jeremie_m

jakaskerl

Thank you for reply.

1. The subtraction of depth frames, as we mentioned here.
2. Selection of the farthest or nearest area of pixels in a depth frame.
3. Masking a specific shape from or to a depth frame.
4. (Some NN models use both depth and RGB image maybe.)

etc.

Some thoughts for now, thank you 😃

erik

jeremie_m does the code Adam provided above not work? Besides the tutorials we have on documentation / depthai-experiments, we don't have any additional ones.

jeremie_m

erik

Thank you erik.

I'm not sure if the first code works, because it doesn't match the inputs of the second code.

And I'm looking into how to generate the shave.blob; I'm not clear on how to process the NN model.

jakaskerl

Hi jeremie_m
Here are some guides on how to create NN models for image processing.

Hope it helps,
Jaka

jeremie_m

jakaskerl thank you 😃

AdamPolak

jeremie_m

Hey it seems like you are updating the "old" frame each time.

Which means you are basically subtracting 2 immediate frames from each other, is that what you are trying to do?

If you want a "control" frame, then update to remove old = frame from your code.

Also, put in a sleep at the top of the while loop or you get unexpected behavior.
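The difference between the two modes can be simulated on the host in plain Python. This is a sketch of the pairing logic only, not Script node code: `pair_frames` and its `keep_control` flag are hypothetical names, and the strings stand in for frames.

```python
def pair_frames(frames, keep_control=False):
    """Yield (reference, current) pairs the way the Script loop would send them."""
    it = iter(frames)
    try:
        old = next(it)  # first frame received becomes the initial reference
    except StopIteration:
        return
    for frame in it:
        yield old, frame
        if not keep_control:
            old = frame  # successive mode: reference advances every frame

successive = list(pair_frames(["f0", "f1", "f2", "f3"]))
control = list(pair_frames(["f0", "f1", "f2", "f3"], keep_control=True))
```

With `old = frame` in the loop, each frame is compared against its immediate predecessor; without it, every frame is compared against the first one received.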

jeremie_m

AdamPolak

Thank you Adam.

I've tried what you said, but there is still a problem.

May I get your email address, I'd like to give you more details.

Thank you again Adam.

AdamPolak

jeremie_m

Hey, not really looking to publicly post my email address. What is the issue that is happening?

jeremie_m

AdamPolak

All I want to do, is just to subtract two frames in sequence, just use the older frame as the 'control' frame.
