Hi @jakaskerl

I tried my own model with the Mono & MobilenetSSD example, but it doesn't detect correctly; the screen is full of wrong bounding boxes and I don't know why it doesn't work. The same model works in the old example (the old example doesn't use MobilenetSSD; it is rather like the one at the bottom of https://www.oakchina.cn/2023/02/24/yolov8-blob/).

These are the changes I made to the Mono & MobilenetSSD example:

nnPath = str((Path(__file__).parent / Path('C:/egg.blob')).resolve().absolute())
labelMap = ["egg"]
manip.initialConfig.setResize(320, 320)

There's nothing wrong with using the DepthAI Tools to convert a YOLOv8 model, right?

Thanks,

LI

Hi @li_you_chen
If you train with YOLOv8, it won't work with MobileNetSSD. Here is a more up-to-date YOLOv8 network notebook which you can use directly.
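
As a minimal sketch (the blob path and class count below are placeholders, not your actual values), a YOLO blob exported with the DepthAI Tools is normally wired to dai.node.YoloDetectionNetwork rather than dai.node.MobileNetDetectionNetwork:

import depthai as dai

pipeline = dai.Pipeline()

camRgb = pipeline.create(dai.node.ColorCamera)
camRgb.setPreviewSize(320, 320)   # must match the input size the blob was exported for
camRgb.setInterleaved(False)

# YOLO blobs need the YOLO decoder, not the MobileNet-SSD one
detectionNetwork = pipeline.create(dai.node.YoloDetectionNetwork)
detectionNetwork.setBlobPath("egg.blob")   # placeholder path
detectionNetwork.setNumClasses(1)          # e.g. just "egg"
detectionNetwork.setConfidenceThreshold(0.5)
detectionNetwork.setIouThreshold(0.5)

camRgb.preview.link(detectionNetwork.input)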

Thanks,
Jaka

Hi @jakaskerl

The link uses the same DepthAI Tools to convert the model as I did; it's just that I'm not using Colab but my laptop, so I think there should be no difference.

I referred to the old yoloDetect example and modified it like this:

import cv2
import depthai as dai
import argparse

# Specify the path to your own model weights file
model_weights_path = "C:/catdog320.blob"

parser = argparse.ArgumentParser()
parser.add_argument('nnPath', nargs='?', help="Path to mobilenet detection network blob", default=model_weights_path)
parser.add_argument('-ff', '--full_frame', action="store_true", help="Perform tracking on full RGB frame", default=False)
args = parser.parse_args()

fullFrameTracking = args.full_frame

# Load model weights using dai.OpenVINO.Blob
custom_model_blob = dai.OpenVINO.Blob(model_weights_path)

numClasses = 80
dim = next(iter(custom_model_blob.networkInputs.values())).dims
output_name, output_tenser = next(iter(custom_model_blob.networkOutputs.items()))
numClasses = output_tenser.dims[2] - 5

labelMap = ["class_%s" % i for i in range(numClasses)]

# Create a pipeline
pipeline = dai.Pipeline()

monoL = pipeline.create(dai.node.MonoCamera)
manip = pipeline.create(dai.node.ImageManip)
detectionNetwork = pipeline.create(dai.node.YoloDetectionNetwork)
objectTracker = pipeline.create(dai.node.ObjectTracker)
manipOut = pipeline.create(dai.node.XLinkOut)
trackerOut = pipeline.create(dai.node.XLinkOut)

manipOut.setStreamName('flood-left')
trackerOut.setStreamName("tracklets")

monoL.setNumFramesPool(24)
monoL.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)

script = pipeline.create(dai.node.Script)
script.setProcessor(dai.ProcessorType.LEON_CSS)
script.setScript("""
    floodBright = 0.1
    node.warn(f'IR drivers detected: {str(Device.getIrDrivers())}')
    while True:
        event = node.io['event'].get()
        Device.setIrFloodLightIntensity(floodBright)
        frameL = node.io['frameL'].get()
        node.io['floodL'].send(frameL)
""")

# Model-specific settings
detectionNetwork.setBlob(custom_model_blob)
detectionNetwork.setConfidenceThreshold(0.7)
detectionNetwork.input.setBlocking(False)

objectTracker.setDetectionLabelsToTrack([])  # all
objectTracker.setTrackerType(dai.TrackerType.ZERO_TERM_COLOR_HISTOGRAM)
objectTracker.setTrackerIdAssignmentPolicy(dai.TrackerIdAssignmentPolicy.SMALLEST_ID)

manip.initialConfig.setResize(320, 320)
manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)

# Linking
monoL.out.link(manip.inputImage)
manip.out.link(detectionNetwork.input)
monoL.frameEvent.link(script.inputs['event'])
monoL.out.link(script.inputs['frameL'])
script.outputs['floodL'].link(manipOut.input)

if fullFrameTracking:
    manip.video.link(objectTracker.inputTrackerFrame)
else:
    detectionNetwork.passthrough.link(objectTracker.inputTrackerFrame)

detectionNetwork.passthrough.link(objectTracker.inputDetectionFrame)
detectionNetwork.out.link(objectTracker.inputDetections)
objectTracker.out.link(trackerOut.input)

# Connect to the device and start the pipeline
with dai.Device(pipeline) as device:
    preview = device.getOutputQueue("flood-left", 4, False)
    tracklets = device.getOutputQueue(name="tracklets", maxSize=4, blocking=False)

    while True:
        imgFrame = preview.get()
        track = tracklets.get()
        print('track', track)

        color = (255, 0, 0)
        frame = imgFrame.getCvFrame()
        trackletsData = track.tracklets
        print('trackdata', trackletsData)

        for t in trackletsData:
            print('t', t)
            roi = t.roi.denormalize(frame.shape[1], frame.shape[0])
            x1 = int(roi.topLeft().x)
            y1 = int(roi.topLeft().y)
            x2 = int(roi.bottomRight().x)
            y2 = int(roi.bottomRight().y)

            try:
                label = labelMap[t.label]
            except:
                label = t.label

            cv2.putText(frame, str(label), (x1 + 10, y1 + 20), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
            cv2.putText(frame, f"ID: {[t.id]}", (x1 + 10, y1 + 35), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
            cv2.putText(frame, t.status.name, (x1 + 10, y1 + 50), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
            cv2.rectangle(frame, (x1, y1), (x2, y2), color, cv2.FONT_HERSHEY_SIMPLEX)

        cv2.imshow("tracker", frame)
        if cv2.waitKey(1) == ord('q'):
            break

The program runs but doesn't recognize anything. If I add detectionNetwork.out.link(trackerOut.input), it shows "trackletsData = track.tracklets AttributeError: 'depthai.ImgDetections' object has no attribute 'tracklets'". Can you help me debug?

Hi @li_you_chen
Hmm, could you send the model over, along with an MRE, so we can check please?
Try changing the IOU threshold and the confidence to see if you get any improvements in detections (see the sketch below).
Also important: make sure you update to the latest DepthAI version (2.25.1).
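
As a small sketch of where those two thresholds live on the YoloDetectionNetwork node (the values are only examples to experiment with):

detectionNetwork.setConfidenceThreshold(0.3)  # lower it to see more (possibly noisier) detections
detectionNetwork.setIouThreshold(0.5)         # NMS overlap threshold used by the on-device YOLO decoder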

Thanks,
Jaka

Hi @li_you_chen
I tested with the model and I am getting no detections (on pictures of cats and dogs). It looks like a detection problem.

Hi @jakaskerl

I don't know how you tested my model, but I'm pretty sure it works with this YoloDetectionNetwork. The only thing puzzling me is why the same model, in blob format, doesn't work with the MobileNet detection network.

I have always used the DepthAI Tools to convert the model from .pt to .blob.

Thanks,

Li

Hi @li_you_chen
I tested it on images of cats and dogs, with the swapped model inside this example: https://docs.luxonis.com/projects/api/en/latest/samples/Yolo/tiny_yolo/

Could you post the output of detections:

inDet = qDet.tryGet()

if inDet is not None:
    detections = inDet.detections
    print(detections)

You can't interchange YOLO and MobileNet blobs; they have completely different decoding functions.

Thanks,
Jaka

Hi @jakaskerl

After I changed the model, it works, and print(detections) shows [<depthai.ImgDetection object at 0x000002276FD7ECF0>, <depthai.ImgDetection object at 0x000002276FCD8CB0>, …]

It works, but I had to delete two lines (that is not a problem):

#cv2.putText(frame, labelMap[detection.label]….
#cv2.putText(frame, f"{int(detection.confidence * 100)}%…….

I modified these three places:

# change my model
nnPathDefault = str((Path(__file__).parent / Path('C:\catdog320.blob')).resolve().absolute())

# MobileNetDetectionNetwork -> YoloDetectionNetwork
detectionNetwork = pipeline.create(dai.node.YoloDetectionNetwork)
detectionNetwork.setBlobPath(nnPathDefault)

It shows nothing on the frame.

Do you mean this trained model's format is MobileNet-SSD?

Hi @li_you_chen
No, it's YOLO. I'm just saying you cannot use a YOLO blob inside MobileNetDetectionNetwork.

#!/usr/bin/env python3

"""
The code is the same as for Tiny Yolo V3 and V4, the only difference is the blob file
- Tiny YOLOv3: https://github.com/david8862/keras-YOLOv3-model-set
- Tiny YOLOv4: https://github.com/TNTWEN/OpenVINO-YOLOV4
"""

from pathlib import Path
import sys
import cv2
import depthai as dai
import numpy as np
import time

# label texts for the custom cat/dog model
labelMap = [
    "cat", "dog"
]

syncNN = True

# Create pipeline
pipeline = dai.Pipeline()

# Define sources and outputs
camRgb = pipeline.create(dai.node.ColorCamera)
detectionNetwork = pipeline.create(dai.node.YoloDetectionNetwork)
xoutRgb = pipeline.create(dai.node.XLinkOut)
nnOut = pipeline.create(dai.node.XLinkOut)

xoutRgb.setStreamName("rgb")
nnOut.setStreamName("nn")

# Properties
camRgb.setPreviewSize(320, 320)
camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
camRgb.setInterleaved(False)
camRgb.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)
camRgb.setFps(40)

# Network specific settings
detectionNetwork.setConfidenceThreshold(0.5)
detectionNetwork.setNumClasses(2)
detectionNetwork.setIouThreshold(.5)
detectionNetwork.setBlobPath("catdog320.blob")
detectionNetwork.setNumInferenceThreads(2)
detectionNetwork.input.setBlocking(False)

# Linking
camRgb.preview.link(detectionNetwork.input)
if syncNN:
    detectionNetwork.passthrough.link(xoutRgb.input)
else:
    camRgb.preview.link(xoutRgb.input)

detectionNetwork.out.link(nnOut.input)

# Connect to device and start pipeline
with dai.Device(pipeline) as device:

    # Output queues will be used to get the rgb frames and nn data from the outputs defined above
    qRgb = device.getOutputQueue(name="rgb", maxSize=4, blocking=False)
    qDet = device.getOutputQueue(name="nn", maxSize=4, blocking=False)

    frame = None
    detections = []
    startTime = time.monotonic()
    counter = 0
    color2 = (255, 255, 255)

    # nn data, being the bounding box locations, are in <0..1> range - they need to be normalized with frame width/height
    def frameNorm(frame, bbox):
        normVals = np.full(len(bbox), frame.shape[0])
        normVals[::2] = frame.shape[1]
        return (np.clip(np.array(bbox), 0, 1) * normVals).astype(int)

    def displayFrame(name, frame):
        color = (255, 0, 0)
        for detection in detections:
            bbox = frameNorm(frame, (detection.xmin, detection.ymin, detection.xmax, detection.ymax))
            cv2.putText(frame, labelMap[detection.label], (bbox[0] + 10, bbox[1] + 20), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
            cv2.putText(frame, f"{int(detection.confidence * 100)}%", (bbox[0] + 10, bbox[1] + 40), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
            cv2.rectangle(frame, (bbox[0], bbox[1]), (bbox[2], bbox[3]), color, 2)
        # Show the frame
        cv2.imshow(name, frame)

    while True:
        if syncNN:
            inRgb = qRgb.get()
            inDet = qDet.get()
        else:
            inRgb = qRgb.tryGet()
            inDet = qDet.tryGet()

        if inRgb is not None:
            frame = inRgb.getCvFrame()
            cv2.putText(frame, "NN fps: {:.2f}".format(counter / (time.monotonic() - startTime)),
                        (2, frame.shape[0] - 4), cv2.FONT_HERSHEY_TRIPLEX, 0.4, color2)

        if inDet is not None:
            detections = inDet.detections
            counter += 1

        if frame is not None:
            displayFrame("rgb", frame)

        if cv2.waitKey(1) == ord('q'):
            break

Does this work for you? Because it doesn't for me, and I think the model is not created correctly.

Thanks,
Jaka

Hi @li_you_chen
This is the exact same script as tiny_yolo in the examples. I only changed the model to catdog and the label names accordingly. Can you recheck please?

Thanks,
Jaka

Hi @jakaskerl

It runs, but it can't detect correctly; the frame is full of wrong bounding boxes, even if I raise the threshold.

Can I do object tracking with YoloDetectionNetwork instead of MobileNet-SSD?

Hi @li_you_chen

    li_you_chen: full of wrong bounding boxes, even if I raise the threshold

This suggests a problem with the model. Did you test it outside of DepthAI?

Thanks,
Jaka

Hi @jakaskerl

Can I do object tracking with YoloDetectionNetwork instead of MobileNet-SSD? Or, if I train my own MobileNet-SSD model, can I run this example?

Thanks,

Li

Hi @li_you_chen
Yes, of course you can. The model may be different, but the on-device decoding is done in a way that ensures the output structure is the same, so they can both be linked to an object tracker.

If you train your own MobileNet model, the example you sent should work without any changes (well, except for the model path).
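
As a rough sketch (assuming a pipeline where camRgb, trackerOut, etc. already exist, and with a placeholder blob path and class count), a YoloDetectionNetwork feeds the ObjectTracker through exactly the same links as a MobileNetDetectionNetwork would:

detectionNetwork = pipeline.create(dai.node.YoloDetectionNetwork)
detectionNetwork.setBlobPath("catdog320.blob")   # placeholder path
detectionNetwork.setNumClasses(2)
detectionNetwork.setConfidenceThreshold(0.5)
detectionNetwork.setIouThreshold(0.5)

objectTracker = pipeline.create(dai.node.ObjectTracker)
objectTracker.setTrackerType(dai.TrackerType.ZERO_TERM_COLOR_HISTOGRAM)
objectTracker.setTrackerIdAssignmentPolicy(dai.TrackerIdAssignmentPolicy.SMALLEST_ID)

camRgb.preview.link(detectionNetwork.input)
# the same three links work regardless of the detection network type
detectionNetwork.passthrough.link(objectTracker.inputTrackerFrame)
detectionNetwork.passthrough.link(objectTracker.inputDetectionFrame)
detectionNetwork.out.link(objectTracker.inputDetections)
objectTracker.out.link(trackerOut.input)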

Thanks,
Jaka

Hi @li_you_chen
We have some here, but we ditched MBNet a while ago and mostly use YOLO now. If you wish, you can train YOLO as well.

Thanks,
Jaka

Hi @jakaskerl

Yes, I can train a YOLO model! It's just that the object tracking example uses MBNet, but I want to use YOLO.