I'm curious if anyone has explored integrating ByteTrack as an object tracker or has any insights into integrating custom trackers. I came across an example in the depthai-experiments repository for integrating DeepSORT, but I'm keen to learn more about integrating ByteTrack specifically. Any guidance or pointers would be greatly appreciated!

Hey @AmanMantena @jakaskerl

I have written a short snippet for integrating ByteTrack with DepthAI:

@classmethod
def from_depthai(cls, depthai_results) -> Detections:
    # Pack a single DepthAI detection (normalized xmin/ymin/xmax/ymax,
    # confidence, label) into one row: [x1, y1, x2, y2, conf, class_id].
    depthai_detections_predictions = np.array([[
        depthai_results.xmin,
        depthai_results.ymin,
        depthai_results.xmax,
        depthai_results.ymax,
        depthai_results.confidence,
        depthai_results.label,
    ]])
    return cls(
        xyxy=depthai_detections_predictions[:, :4],
        confidence=depthai_detections_predictions[:, 4],
        class_id=depthai_detections_predictions[:, 5].astype(int),
    )

Paste this code into venv/lib/python3.11/site-packages/supervision/detection/core.py, as a method of the Detections class.
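
If you would rather not edit the installed package, here is a minimal sketch of the same conversion written as a standalone helper and attached to Detections at runtime. The helper name detections_from_depthai is mine, and it assumes the standard supervision Detections constructor (xyxy, confidence, class_id):

import numpy as np
import supervision as sv

def detections_from_depthai(depthai_det) -> sv.Detections:
    # Same packing as above: one row of [x1, y1, x2, y2, conf, class_id].
    row = np.array([[
        depthai_det.xmin,
        depthai_det.ymin,
        depthai_det.xmax,
        depthai_det.ymax,
        depthai_det.confidence,
        depthai_det.label,
    ]])
    return sv.Detections(
        xyxy=row[:, :4],
        confidence=row[:, 4],
        class_id=row[:, 5].astype(int),
    )

# Optionally expose it under the same name used below, without touching site-packages.
sv.Detections.from_depthai = classmethod(lambda cls, det: detections_from_depthai(det))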

Here is the sample code for using this function:

import supervision as sv

# Initialize the ByteTrack tracker
tracker = sv.ByteTrack()

# After getting DepthAI detections for a frame,
# loop through the detections
for detection in detections:
    # Convert the DepthAI detection into a supervision Detections object
    # using the "from_depthai" method added to detection/core.py above
    yolo_format = sv.Detections.from_depthai(detection)
    # Apply the tracker (https://supervision.roboflow.com/trackers/)
    tracks = tracker.update_with_detections(yolo_format)
    # Apply NMS
    tracks = tracks.with_nms(threshold=0.5, class_agnostic=False)
    # Loop through the tracks to read the box, confidence, class and tracker ID
    for track in tracks:
        xyxy = track[0]
        conf = track[2]
        class_id = track[3]
        ids = track[4]
        bbox = frameNorm(frame, xyxy)  # frameNorm: helper from the DepthAI examples, scales normalized coords to pixels
        print(track)
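
One note on the loop above: ByteTrack associates boxes across frames, so it generally works better when update_with_detections is called once per frame with all of that frame's detections rather than once per detection. Below is a rough sketch of that per-frame variant, assuming the same from_depthai conversion and using supervision's Detections.merge to batch the converted detections; treat it as a starting point, not a drop-in replacement:

import supervision as sv

tracker = sv.ByteTrack()  # created once, outside the per-frame loop

# "detections" is the list of DepthAI detections for the current frame,
# "frame" and "frameNorm" come from the surrounding DepthAI example code.
frame_detections = sv.Detections.merge(
    [sv.Detections.from_depthai(d) for d in detections]
)
tracks = tracker.update_with_detections(frame_detections)
for track in tracks:
    xyxy, conf, class_id, tracker_id = track[0], track[2], track[3], track[4]
    bbox = frameNorm(frame, xyxy)
    print(tracker_id, class_id, conf, bbox)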

Hello,

Thank you for this.

Please correct me if I'm wrong, but this does not appear to take advantage of depth information. Would this be the same ByteTrack process applied to 2D images?

Thanks!