Background Subtraction running on the VPU of OAK-1

mfocka

Hi,

I have a follow-up to a question I saw in an older thread (https://discuss.luxonis.com/d/168-art-project-background-removal-and-motion-estimation/4#:~:text=quick%20question%20about%20this).

Very simply put, I would like to use the camera's VPU to run my background subtraction algorithm (currently OpenCV's MOG2) so that the CPU of the host machine (an embedded Linux device) is used as little as possible.
Using PyTorch, I have tried to build a model that captures the essence of this algorithm and convert it to a blob, but I can't find a way to update the background model's history and keep that history across frames.

For context, if I can get this to work, a future extension would be attaching a classifier that labels every object detected by MOG2 as Foreign or Non-Foreign.

Does anyone know if this is at all possible? Are there ways to do this using only DepthAI nodes?

Thanks in advance!

jakaskerl

Hi @mfocka
If you manage to create a model that produces both the results and the history, you could pipe its output into a Script node, split the two layers, then route the history back into the model while routing the results wherever you need them.
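
To make the idea concrete, here is a minimal PyTorch sketch of a model with that shape: it takes the frame and the previous background as two inputs and returns the mask and the updated background as two outputs, so no state lives inside the network. Note that this is a plain running-average subtractor, not MOG2, and the class name, `learning_rate`, and `threshold` values are placeholders I picked for illustration.

```python
import torch
import torch.nn as nn


class RunningAverageBackground(nn.Module):
    """Stateless background subtractor: the state goes in and comes back out."""

    def __init__(self, learning_rate: float = 0.05, threshold: float = 0.1):
        super().__init__()
        self.learning_rate = learning_rate  # hypothetical default
        self.threshold = threshold          # hypothetical default

    def forward(self, frame: torch.Tensor, background: torch.Tensor):
        # Per-pixel absolute difference, averaged over the channel axis.
        diff = (frame - background).abs().mean(dim=1, keepdim=True)
        # 1.0 where the pixel differs enough from the background, else 0.0.
        fg_mask = (diff > self.threshold).to(frame.dtype)
        # Exponential moving average pulls the background toward the frame.
        updated_background = (
            (1.0 - self.learning_rate) * background
            + self.learning_rate * frame
        )
        return fg_mask, updated_background


model = RunningAverageBackground().eval()

# Toy data in NCHW layout: an empty background and a frame with one bright square.
background = torch.zeros(1, 3, 64, 64)
frame = torch.zeros(1, 3, 64, 64)
frame[:, :, 16:32, 16:32] = 1.0

with torch.no_grad():
    # The second output is what would be routed back in for the next frame.
    fg_mask, background = model(frame, background)
```

Exporting something like this with named inputs and outputs (for example via `torch.onnx.export`) and compiling it to a blob would be the next step; I have not verified which ops the VPU toolchain accepts, so treat that part as untested.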

Thoughts?

mfocka

jakaskerl, This is what I thought as well, but then I'm unsure about the correct way to reintegrate the updated background model with a new preview frame. Would this process require an additional script node to merge the updated background model back into the pipeline? To clarify, I've outlined the proposed model structure below, incorporating two new script nodes into the DepthAI pipeline:

In that diagram, fg_mask and updated_background_model leave the model as a single output, which scriptnode_out splits. It then sends updated_background_model to scriptnode_in, which combines it with the new preview frame and sends the result to the model, where the combined input is split again.
Is this what you thought as well? Or am I missing something simpler?

Additionally, to address the challenge of a cold start, these models typically initialize with a background-only image and utilize a different learning rate. How can we effectively implement this? My thought is to modify the model to accept a variable learning rate or to enable the DepthAI pipeline to detect whether it has already performed a "hot start." If not, it would repeatedly send the background frame through the model for $n$ iterations, mimicking the hot-start process. Would this approach work, and if so, how can we best implement it?

Looking forward to your insights and suggestions.

jakaskerl

mfocka Is this what you thought as well? Or am I missing something simpler?

You can link the frame directly from the camera node, while the background model is linked from the Script node.

mfocka Additionally, to address the challenge of a cold start, these models typically initialize with a background-only image and utilize a different learning rate. How can we effectively implement this? My thought is to modify the model to accept a variable learning rate or to enable the DepthAI pipeline to detect whether it has already performed a "hot start."

TBH I'm not sure. The mock input at the start is probably the easiest way of doing it. Until the script node receives an input, it should try to output a starting background.
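
A rough host-side simulation of that seeding logic, assuming the same running-average update as a stand-in for the real model. The helper names, the warm-up length, and the two learning rates are made up for illustration; on the device this logic would live in the Script node and the model.

```python
import numpy as np


def update_background(background, frame, learning_rate):
    # Stand-in for the model: exponential moving average toward the frame.
    return (1.0 - learning_rate) * background + learning_rate * frame


def next_background(model_output, seed_background):
    # Send the seed until the model has produced its first output,
    # then keep feeding the model's own output back in.
    return seed_background if model_output is None else model_output


seed = np.zeros((4, 4), dtype=np.float32)         # mock starting background
scene = np.full((4, 4), 0.8, dtype=np.float32)    # background-only frame

model_output = None
warmup_iterations = 30  # hypothetical warm-up length
for i in range(100):
    # Faster learning rate during warm-up, slower one afterwards.
    learning_rate = 0.5 if i < warmup_iterations else 0.05
    background = next_background(model_output, seed)
    model_output = update_background(background, scene, learning_rate)

# How far the learned background still is from the true scene.
residual = float(np.abs(model_output - scene).max())
```

With a fixed rate, each pass shrinks the gap to the scene by a factor of (1 - learning_rate), so a short burst at a high rate gets the background close before switching to the slow rate. Whether the rate is baked into the blob or passed as an extra input is a separate design choice.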