• Community
  • On-device Pointcloud NN model with C++

Hello,
I'm trying to convert a depth map to a point cloud in C++, based on this Python example: https://docs.luxonis.com/en/latest/pages/tutorials/device-pointcloud/ and this code: https://github.com/luxonis/depthai-experiments/tree/master/gen2-pointcloud
The Python example contains these lines of code (https://github.com/luxonis/depthai-experiments/blob/master/gen2-pointcloud/device-pointcloud/main.py):

# Create xyz data and send it to the device - to the pointcloud generation model (NeuralNetwork node)
xyz = create_xyz(resolution[0], resolution[1], np.array(M_right).reshape(3,3))
matrix = np.array([xyz], dtype=np.float16).view(np.int8)
buff = dai.Buffer()
buff.setData(matrix)
device.getInputQueue("xyz_in").send(buff)

The shape of the matrix variable is 1 x 400 x 640 x 6. I think it should be possible to build a similar structure in C++, but the dai::Buffer::setData function only takes a std::vector<std::uint8_t>. I am not an NN specialist, but I suppose this means the model cannot be used directly from C++? Am I right? Is there any way to achieve this?
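(For reference: the .view(np.int8) step in Python is only a byte-level reinterpretation of the FP16 data, no conversion happens. Assuming the FP16 values are already available as raw 16-bit patterns, the equivalent flat byte vector can be built in C++ with memcpy; a sketch with hypothetical naming, not from the thread:)

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Build the flat byte vector expected by dai::Buffer::setData() from a
// contiguous vector of 16-bit values (e.g. FP16 bit patterns).
// This mirrors numpy's arr.view(np.int8) on a contiguous float16 array:
// no value conversion, just a reinterpretation of the underlying bytes.
std::vector<std::uint8_t> as_bytes(const std::vector<std::uint16_t>& fp16_words)
{
    std::vector<std::uint8_t> bytes(fp16_words.size() * sizeof(std::uint16_t));
    std::memcpy(bytes.data(), fp16_words.data(), bytes.size());
    return bytes;
}
```

On a little-endian host the low byte of each FP16 word comes first, which matches numpy's in-memory layout on the same machine.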

If someone could give me some advice on how to calculate a point cloud from depth on a device with an NN model, that would be great!

Thanks,
Fred


    Hi FredericGauthier ,
    The .view(np.int8) call views the created matrix (xyz) as an int8 array, so it can be inserted correctly into the dai.Buffer() message. I have no experience in C++, but it looks like ChatGPT has some ideas, perhaps useful. Thoughts?
    Thanks, Erik

      erik I tested different solutions, but none of them work. I'm probably doing something wrong, but I don't understand how to order the data in a flat std::vector<std::uint8_t> when the Python data structure is more complex (1x400x640x3, or 1x400x640x6 if each FP16 is split into two int8 values). We're going to run some more tests, as we really need this to work, but we are not very confident.
      Thanks


        Hi FredericGauthier ,
        The n-dimensional array actually gets "flattened" when you pass it to buffer.setData(), so it shouldn't be much more complex. That said, for trackability, could you create an issue on depthai-core for on-device pointcloud support? We do have it on the roadmap.
        Thanks, Erik
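For what it's worth, numpy flattens in C order (row-major: the last axis varies fastest), so for an array of shape (H, W, 3) the three channel values of one pixel end up adjacent in the flat buffer. A small index helper (hypothetical naming) illustrating that layout:

```cpp
#include <cassert>
#include <cstddef>

// Flat offset of element (h, w, c) in a C-order (row-major) array of
// shape (H, W, C): the channel axis varies fastest, then columns, then rows.
std::size_t flat_index(std::size_t h, std::size_t w, std::size_t c,
                       std::size_t W, std::size_t C)
{
    return (h * W + w) * C + c;
}
```

For a 2x2 image with 3 channels this puts X11,Y11,Z11,X12,Y12,Z12,... in the buffer, i.e. channels interleaved per pixel rather than planar X...Y...Z.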

          Hi erik I created a new issue: https://github.com/luxonis/depthai-core/issues/756

          I agree with you, but I'm not very familiar with this, and for me the question is: how is the data "flattened"?
          Let's say we have X, Y and Z coordinates of shape (2x2): [[X11,X12],[X21,X22]], [[Y11,Y12],[Y21,Y22]], [[Z11,Z12],[Z21,Z22]], and we want to build a (1x12) vector to send to the buffer. Is the data flattened like this:
          [X11,X12,X21,X22,Y11,Y12,Y21,Y22,Z11,Z12,Z21,Z22]?
          or like this: [X11,Y11,Z11,X12,Y12,Z12,X21,Y21,Z21,X22,Y22,Z22]?
          or some other way?
          In addition, since X, Y and Z are FP16, each value has to be split into 2 uint8 bytes before being passed to the model. So if X is represented by X_1 and X_2, is the flattened data:
          [X11_1,X11_2,X12_1,X12_2,X21_1,X21_2,X22_1,X22_2,Y11_1,Y11_2,.......,Z21_1,Z21_2,Z22_1,Z22_2]? Or something else?

          Thanks


            Hi FredericGauthier ,
            By "flatten", my guess is that it just copies the array as it's currently laid out in RAM. It would probably be easiest to examine what xyz looks like in terms of bytes (debug/print).
            Thanks, Erik

              Hi erik ,
              I finally got it to work. Here is my code:

              The create_xyz function creates the std::vector<std::uint8_t> to use with the buffer.setData() function:

              #include <cstdint>
              #include <cstring>
              #include <vector>
              #include <opencv2/core.hpp>
              
              std::vector<uint8_t> convert_fp32_to_uint8_array(float value); // defined below
              
              std::vector<std::uint8_t> create_xyz(int width, int height, std::vector<std::vector<float>> camera_matrix)
              {
                  // Pixel coordinate ranges
                  cv::Range xs = cv::Range(0, width - 1);
                  cv::Range ys = cv::Range(0, height - 1);
              
                  std::vector<int> t_x, t_y;
                  for (int i = xs.start; i <= xs.end; i++) t_x.push_back(i);
                  for (int i = ys.start; i <= ys.end; i++) t_y.push_back(i);
              
                  // Intrinsics from the 3x3 camera matrix
                  float fx = camera_matrix.at(0).at(0);
                  float fy = camera_matrix.at(1).at(1);
                  float cx = camera_matrix.at(0).at(2);
                  float cy = camera_matrix.at(1).at(2);
              
                  // Normalized ray direction for each pixel: ((u-cx)/fx, (v-cy)/fy, 1)
                  cv::Mat y_coord(height, width, CV_32FC1);
                  cv::Mat x_coord(height, width, CV_32FC1);
                  for (int row_index = 0; row_index < height; row_index++)
                  {
                      for (int col_index = 0; col_index < width; col_index++)
                      {
                          float v = t_y.at(row_index);
                          y_coord.at<float>(row_index, col_index) = (v - cy) / fy;
              
                          float u = t_x.at(col_index);
                          x_coord.at<float>(row_index, col_index) = (u - cx) / fx;
                      }
                  }
              
                  // Flatten in C order: the two FP16 bytes of X, Y and Z are interleaved per pixel
                  std::vector<uint8_t> result;
                  result.reserve(static_cast<size_t>(width) * height * 6);
              
                  for (int row_index = 0; row_index < height; row_index++)
                  {
                      for (int col_index = 0; col_index < width; col_index++)
                      {
                          float X = x_coord.at<float>(row_index, col_index);
                          float Y = y_coord.at<float>(row_index, col_index);
                          float Z = 1.0f;
              
                          std::vector<uint8_t> X_int = convert_fp32_to_uint8_array(X);
                          std::vector<uint8_t> Y_int = convert_fp32_to_uint8_array(Y);
                          std::vector<uint8_t> Z_int = convert_fp32_to_uint8_array(Z);
              
                          result.push_back(X_int[0]);
                          result.push_back(X_int[1]);
                          result.push_back(Y_int[0]);
                          result.push_back(Y_int[1]);
                          result.push_back(Z_int[0]);
                          result.push_back(Z_int[1]);
                      }
                  }
                  return result;
              }
              
              // Convert an FP32 value to FP16 and return its two bytes (little-endian).
              // The bit-twiddling truncates the mantissa and ignores inf/NaN/subnormals,
              // which is fine for the small coordinate values used here.
              std::vector<uint8_t> convert_fp32_to_uint8_array(float value)
              {
                  uint32_t fltInt32;
                  std::memcpy(&fltInt32, &value, sizeof(fltInt32)); // avoids strict-aliasing UB
                  unsigned short fltInt16;
              
                  fltInt16 = (fltInt32 >> 31) << 5;
                  unsigned short tmp = (fltInt32 >> 23) & 0xff;
                  tmp = (tmp - 0x70) & ((unsigned int)((int)(0x70 - tmp) >> 4) >> 27);
                  fltInt16 = (fltInt16 | tmp) << 10;
                  fltInt16 |= (fltInt32 >> 13) & 0x3ff;
              
                  std::vector<uint8_t> res(2);
                  std::memcpy(res.data(), &fltInt16, 2);
              
                  return res;
              }

              Then, the point cloud data are stored in the std::vector<float> pcl_data like this (planar, row-major):
              
              • X data from 0 to pcl_data.size()/3-1
              • Y data from pcl_data.size()/3 to pcl_data.size()/3*2-1
              • Z data from pcl_data.size()/3*2 to pcl_data.size()-1
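Reading a single point back out of that planar layout can be sketched like this (get_point is my naming; pcl_data is the vector described above):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Extract point i from a planar XYZ buffer: all X values first, then all
// Y values, then all Z values, each plane being n = pcl_data.size()/3 long.
std::array<float, 3> get_point(const std::vector<float>& pcl_data, std::size_t i)
{
    const std::size_t n = pcl_data.size() / 3;
    return {pcl_data[i], pcl_data[n + i], pcl_data[2 * n + i]};
}
```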

              The code is not optimised, but it works. Hope it can help!
              Thanks