• DepthAI-v2
  • OAK-D PoE S2: Multi-model + Standalone mode (TCP Streaming)

Hello everyone,

I'm currently working on implementing a pipeline with two models, MobileNetSSD and FireDetection, running in standalone mode on my OAK-D S2.

While the person detection part is working as expected, I'm encountering some difficulties with the fire detection model. I'm struggling to understand how to manage and decode the detections generated by the fire detection neural network within the script that needs to be flashed onto the camera.

Below, you can find the schematic representation of my pipeline:

However, my primary challenge lies in decoding the fire detections on the device and forwarding them to the host. I'm unsure whether this is even possible, and I'd greatly appreciate any guidance on how to address this issue.

Thank you in advance!

In my attempt to follow the FireDetection example, I'm currently uncertain about how to go from the "lpb.NNData" object, which holds the results obtained from the fire_detection neural network, to a tensor result. Specifically, I'm looking to access the correct layer, "final_result", inside the Script node that will be flashed onto the device.

script.setScript("""
import socket
import time

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("0.0.0.0", 5000))
server.listen()
node.warn("Server up")

labelMap_SSD = ["background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow",
                "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]

label_fire = ["fire", "normal", "smoke"]

while True:
    conn, client = server.accept()
    node.warn(f"Connected to client IP: {client}")
    try:
        while True:
            pck = node.io["frame"].get()
            data = pck.getData()
            ts = pck.getTimestamp()
            
            # --------MobilenetSSD
            detections_ssd = node.io["detSSD"].tryGet()
            if detections_ssd:
                dets = detections_ssd.detections
                
                data_ssd = []
                for det in dets:
                    label = labelMap_SSD[det.label]
                    if label == "person":
                        det_bb = [det.label, det.xmin, det.ymin, det.xmax, det.ymax]
                        data_ssd.append(det_bb)

            # --------FireDetection
            data_fire = node.io["detFire"].tryGet()
            # node.warn(f"data_fire: {data_fire}")
            # TODO: extract tensor data ???


            # now to send data we need to encode it (whole header is 256 characters long)
            header = f"ABCDE " + str(ts.total_seconds()).ljust(18) + str(len(data)).ljust(8) + str(data_ssd).ljust(224)
            conn.send(bytes(header, encoding='ascii'))
            conn.send(data)

    except Exception as e:
        node.warn(f"Error oak: {e}")
        node.warn("Client disconnected")
""")
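For reference, a minimal host-side sketch of parsing the fixed-width 256-byte header the script above sends (field widths are taken from the script itself; the socket handling is assumed and omitted, and `parse_header` is a hypothetical helper, not part of the original code):

```python
import ast

HEADER_LEN = 256  # "ABCDE " (6) + timestamp (18) + frame length (8) + detections (224)

def parse_header(header: bytes):
    """Split the fixed-width ASCII header produced by the on-device script."""
    text = header.decode("ascii")
    magic = text[0:6]                               # "ABCDE "
    ts = float(text[6:24])                          # timestamp in seconds
    frame_len = int(text[24:32])                    # number of frame bytes that follow
    dets = ast.literal_eval(text[32:256].strip())   # [[label, xmin, ymin, xmax, ymax], ...]
    return magic, ts, frame_len, dets

# Example: build a header exactly the way the script does, then parse it back
data_ssd = [[15, 0.1, 0.2, 0.5, 0.9]]
header = ("ABCDE " + str(1.5).ljust(18) + str(1024).ljust(8)
          + str(data_ssd).ljust(224)).encode("ascii")
magic, ts, frame_len, dets = parse_header(header)
```

On the host you would first `recv` exactly `HEADER_LEN` bytes, parse them, and then read `frame_len` bytes of frame data.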

Any help will be appreciated.
Irena

Hello everyone!

I wanted to share that I've made some progress with my project. After carefully reviewing the documentation, I found the solution I was searching for.

It turns out, all I needed to do was utilize the ".getLayerFp16("final_result")" method to extract the results and work with the tensors.

Consider my problem solved! 🙂
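To make the solution concrete, here is a minimal sketch of turning the FP16 layer output into a label. Inside the Script node the scores come from `data_fire.getLayerFp16("final_result")`; the `decode_fire` helper below is hypothetical and assumes the model outputs one score per class in the order of `label_fire` (a plain argmax, no softmax needed for picking the winner):

```python
label_fire = ["fire", "normal", "smoke"]

def decode_fire(scores, labels=label_fire):
    """Pick the highest-scoring class from a flat list of per-class scores."""
    best = max(range(len(labels)), key=lambda i: scores[i])
    return labels[best], scores[best]

# Inside the on-device Script node this would look like:
#   data_fire = node.io["detFire"].tryGet()
#   if data_fire:
#       label, score = decode_fire(data_fire.getLayerFp16("final_result"))
#       node.warn(f"fire-net says: {label} ({score:.2f})")

print(decode_fire([0.05, 0.85, 0.10]))  # ('normal', 0.85)
```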

Hi @jakaskerl !

I have some questions regarding the available space on my OAK-D S2 device. When attempting to flash my pipeline (multi-model) onto the device, I encountered the following error message:

Found 1 devices
Start flashing SW_version_MobilenetSSD_FireDet_0.0.1 on device: DeviceInfo(name=10.1.1.101, mxid=1844301001A2970F00, X_LINK_BOOTLOADER, X_LINK_TCP_IP, X_LINK_MYRIAD_X, X_LINK_SUCCESS)
Flashing progress: 0.0%
Flash ERROR: Not enough space: 8388608->41156634B

My pipeline is relatively simple, and I'm not sure how to optimize it or reduce its size. I would greatly appreciate it if you could provide some tips or guide me on how to identify the key points to resolve this issue.

Lastly, I'm curious if there's a way to access the file system on the device. Any information on this would be very helpful.

Thanks in advance!
Irena

    Hello @jakaskerl !

    Thank you for your quick response.

    Indeed, when attempting to flash the pipeline, I'm using the parameter "compress=True". I'm not using the SDK in my pipeline; instead, I'm using the API.

    I've also checked the available memory on my device. However, I'm unsure how to determine what is consuming space in my pipeline. Is there any script to check that?

    Appreciate your help in advance!
    Irena

      Hi Irena
      Well, what are your numbers for NOR and EMMC storage?

      Irena However, I'm unsure how to determine what is consuming space in my pipeline. Is there any script to check that?

      To my knowledge, there is no script to check specific node sizes. Only the whole pipeline.

      Thanks,
      Jaka
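      For reference, these numbers can be read out through the bootloader API. Below is a sketch, assuming depthai v2 and a reachable device (the import is local to the function so the helper can be defined without the library installed; exact field names like `available`, `size`, `info` follow the MemoryInfo struct as I understand it):

```python
def print_flash_info():
    """Query bootloader memory sizes on the first reachable OAK device."""
    import depthai as dai  # local import: only needed when actually querying

    found, info = dai.DeviceBootloader.getFirstAvailableDevice()
    if not found:
        print("No device found")
        return
    print(f"Found device with name: {info.name}")

    bl = dai.DeviceBootloader(info)
    try:
        print(f"Version: {bl.getVersion()}")
        for mem in (dai.DeviceBootloader.Memory.FLASH,
                    dai.DeviceBootloader.Memory.EMMC):
            mem_info = bl.getMemoryInfo(mem)
            if mem_info.available:
                print(f"Memory '{mem}' size: {mem_info.size}, info: {mem_info.info}")
            else:
                print(f"Memory '{mem}' not available...")
    finally:
        bl.close()
```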

        Hi jakaskerl !

        Result for OAK-D S2 PoE FF:

        Found device with name: 10.1.1.102
        Version: 0.0.22
        NETWORK Bootloader, is User Bootloader: False
        Memory 'Memory.FLASH' size: 33554432, info: JEDEC ID: 01 02 19
        Memory 'EMMC' not available...

        Here you can check the script which I'm trying to flash on the device. I would greatly appreciate any assistance.

        Thank you in advance, Jaka!
        Irena

        4 days later

        Hi @jakaskerl !

        I was wondering if you have had a chance to conduct any testing with the information I provided?

        Thank you a lot in advance.
        Irena

          Hi Irena
          I'm confused as to why the values are so strange. The storage space is different from one script to the next, so I'm just wondering if there is a bug with the model, not the actual storage.

          The model is also not that large:

          8388608->41156634B # left side is smaller..

          Thanks,
          Jaka

          Hi @jakaskerl !

          Which of the two models are you referring to? I have a hunch it may be related to the fire and smoke detection, but I'm not entirely sure.

          I'm planning to conduct some tests, and I'll reach out to you again with the test data.

          Thank you very much in advance.
          Irena

            Hi Irena
            Talked to our firmware dev, it's likely there is just not enough space. The bootloader + the firmware + the pipeline + the model seem to take up more storage space than you have available. The fw logs don't make sense either, since there is information missing, but I was told the standalone is "semi-deprecated", so likely this won't be fixed in the future. Your only hope right now is to use a device with enough NOR or EMMC memory to support running the apps - has to be in GB domain, not MB.

            Thanks,
            Jaka

            Hi @jakaskerl !!

            Thanks for your quick response.

            I tried using other models, and it turns out they don't work either. This is a major issue for us because our priority is that the cameras can work in standalone mode.

            In response to your feedback and our tests, I have a series of questions regarding possible solutions and implementation on our devices.

            As you mentioned, we are now trying to avoid standalone mode. With that in mind, and considering our goal of operating 12 cameras with a single PC, we've noticed that when we execute the script (see oak_ssd_yolov5.py) on the host, it launches 8 threads per camera: 7 of these threads correspond to the created nodes, plus the main script.

            Logs:

            I understand it's a complex issue, but is there any way to reduce or encapsulate this behavior?

            Secondly, can we run a single script for all our cameras? We are thinking about the scalability of our development, so it was important to us that the cameras can work in standalone mode. Since that's not possible, our question is whether we can execute a single .py to all of our devices.

            Lastly, looking ahead to our next projects, do you think this device, the OAK-1 PoE FF, fits our aims (custom multi-model standalone mode + TCP streaming)? Also, is it possible to incorporate an M12 connector into this device? If so, who can I speak to about it?

            I appreciate all the help and look forward to your response.
            Irena

              Hi Irena,
              I am confused as to what is actually happening here. The script you have sent should execute completely on the device, on the LEON CSS processor. This is because anything created within the pipeline will get uploaded to the device. The host should have access to anything except the XLink messages (and the internet - sockets). It could be that something else is using the threads.

              Irena Secondly, can we run a single script for all our cameras?

              Sure. Use this example to connect to multiple devices and upload pipeline to each one.
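              As a sketch of that pattern (assuming depthai v2; `build_pipeline` is a placeholder for the pipeline you already create, and the "det" stream name is an assumption), each device gets its own worker thread:

```python
import threading

def discover_devices():
    """Return DeviceInfo objects for all reachable OAK devices ([] if depthai is missing)."""
    try:
        import depthai as dai
        return dai.Device.getAllAvailableDevices()
    except ImportError:
        return []

def run_camera(dev_info, build_pipeline):
    """Worker: connect to one device, upload the pipeline, drain its detection queue."""
    import depthai as dai
    with dai.Device(build_pipeline(), dev_info) as device:
        q = device.getOutputQueue("det", maxSize=4, blocking=False)
        while True:
            msg = q.tryGet()
            if msg is not None:
                print(dev_info.name, len(msg.detections), "detections")

def run_all(build_pipeline):
    """Start one daemon thread per discovered device; returns the thread handles."""
    threads = [threading.Thread(target=run_camera, args=(d, build_pipeline), daemon=True)
               for d in discover_devices()]
    for t in threads:
        t.start()
    return threads
```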

              Irena Lastly, looking ahead to our next projects, do you think this device, the OAK-1 PoE FF, fits our aims (custom multi-model standalone mode + TCP streaming)? Also, is it possible to incorporate an M12 connector into this device? If so, who can I speak to about it?

              cc. @erik for this one.

              Thanks,
              Jaka

                Hi @Irena ,
                Yep, that would work: you can run multiple models in standalone mode and also add TCP streaming to your application. An M12 connection would require a complete redesign of the hardware & enclosure, and we likely won't be working on that, especially because our next gen of devices will all have M12 + M8 + USB-C, including the (name TBD) "OAK4-1": a single cam with RVC4 and an M12 connector, which would perfectly suit your requirements. Thoughts?
                Thanks, Erik

                  Hello jakaskerl !!

                  I appreciate your prompt response. It's confirmed that the pipeline runs on the device. My main focus has been on the host's performance behavior when launching the script. Upon reviewing my logs, I notice several different threads at script launch, presumably corresponding to each node in the pipeline.

                  Thanks a lot for the reference you provided on managing all the cameras with a single script; it has proven very useful to me.

                  A new question arises as I delve into the following example gen2-yolo-device-decoding. Given that we cannot operate in multi-model standalone mode, we are aiming to offload most processes to the device to ease the load on the host, which is handling 12 cameras. The example suggests that the device can handle the decoding of the neural network output using the YoloDetectionNetwork node. If we have a custom model, can we perform a similar decoding of the custom model's output on the device?

                  Thank you once again for your assistance.
                  Irena

                    Hi erik !!

                    Thank you for your response. I'm eager to know if there's any information available regarding the release date and potential cost of these devices. This information holds significant relevance for our upcoming projects.

                    Thanks once again!
                    Irena

                    Hi @Irena ,
                    Planned release is June 2024, prices vary depending on the model/variation. MSRPs will likely range from $400 for OAK-1-PoE equivalent to $800 for the OAK-D-LR equivalent. All models will offer both POE and USB connectivity.

                      Irena If we have a custom model, can we perform a similar decoding of the custom model's output on the device?

                      Could you elaborate a bit on what kind of model you are using? Maybe try the model (if Yolo) with the YoloDetectionNetwork node. It essentially does what host-side decoding would do, but is customized to work for Yolo models only and runs on-device.
                      Though I am not sure whether the blob will run out-of-the-box with the Yolo node; we usually suggest training the models with our training notebooks (https://github.com/luxonis/depthai-ml-training/tree/master/colab-notebooks).

                      Thanks,
                      Jaka
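                      As a rough sketch of that on-device decoding (the anchor/mask values below are the tiny-Yolo defaults used in the Luxonis examples; the blob path and class count are placeholders you would replace with your model's values):

```python
def build_yolo_pipeline(blob_path, num_classes):
    """Build a pipeline where Yolo output is decoded fully on-device."""
    import depthai as dai
    pipeline = dai.Pipeline()

    cam = pipeline.create(dai.node.ColorCamera)
    cam.setPreviewSize(416, 416)     # must match the model's input size
    cam.setInterleaved(False)

    nn = pipeline.create(dai.node.YoloDetectionNetwork)
    nn.setBlobPath(blob_path)
    nn.setNumClasses(num_classes)
    nn.setCoordinateSize(4)
    nn.setConfidenceThreshold(0.5)
    nn.setIouThreshold(0.5)
    # tiny-Yolo anchors/masks; replace with the values your model was trained with
    nn.setAnchors([10, 14, 23, 27, 37, 58, 81, 82, 135, 169, 344, 319])
    nn.setAnchorMasks({"side26": [1, 2, 3], "side13": [3, 4, 5]})
    cam.preview.link(nn.input)

    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("det")        # decoded ImgDetections, ready to use on the host
    nn.out.link(xout.input)
    return pipeline
```

                      The "det" queue then yields ready-made detections (label, confidence, normalized bbox) with no tensor parsing on the host.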

                        Hi erik !!

                        That's fantastic news. Thank you for your prompt response, and we will be watching for your next releases.

                        Thank you!
                        Irena