I'm developing a retail people-counting system. The camera is mounted above the shop entrance (at a height of 3 meters), looking almost vertically down at the floor, and I need to count people entering and leaving the area.

I know there are examples of this use case, but my results are not very good. I've built my implementation based on this example: https://docs.luxonis.com/projects/api/en/latest/samples/ObjectTracker/object_tracker/

With the following changes:

  • Based on the maximize FOV page, I've used letterboxing (so there is a black stripe at the top and bottom of the frame).
  • I've used yolov6t_coco_416x416_openvino_2021.4_5shave for detections.
  • I tried to tweak confidence_threshold, but saw no significant improvement.

As a result, the system runs at 14 FPS.

The problem is that there are a lot of tracking IDs that have only 1 detection, and some tracks are discontinuous (they start in the middle of the screen and are lost again soon after).

I was thinking that this approach might not be optimal:

  1. The detection network accepts a square input (416x416), so not the whole NN input is used (because of letterboxing). When I instead scale the image to fill the input (stretching it vertically), detections are affected. Are there NN models that are not square-shaped?
  2. It seems a bit overkill for this application to detect people (everyone who enters the shop is a human or a kid :-) ). What about using edge detection or depth data only to detect and track moving objects? Are there any ready-to-go NNs that can be used to detect moving "blobs" from depth data?
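
To put a rough number on point 1, here is a small sketch of how much of the NN input a letterboxed frame actually covers. It assumes a 1920x1080 (16:9) source frame, which is not stated above, and `letterbox_usage` is just an illustrative helper, not part of any SDK.

```python
def letterbox_usage(src_w, src_h, nn_w, nn_h):
    """Return (scaled_w, scaled_h, used_fraction) when a src_w x src_h frame
    is scaled to fit inside an nn_w x nn_h input with its aspect ratio kept."""
    scale = min(nn_w / src_w, nn_h / src_h)
    scaled_w = round(src_w * scale)
    scaled_h = round(src_h * scale)
    used = (scaled_w * scaled_h) / (nn_w * nn_h)
    return scaled_w, scaled_h, used


# Assumed 1920x1080 source frame; compare a square and a wide NN input.
for nn_w, nn_h in [(416, 416), (640, 352)]:
    w, h, used = letterbox_usage(1920, 1080, nn_w, nn_h)
    print(f"{nn_w}x{nn_h}: image occupies {w}x{h}, {used:.0%} of the input")
```

Under that assumption the image fills only about 56% of a 416x416 input, while a 640x352 input would be almost fully used.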

Thank you for your thoughts!

  • erik replied to this.

    Hi austy246,
    Cool project! I think for the tracking filtering, you could use something similar to what we implemented in the people-tracking demo.
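
As a rough illustration of that kind of filtering (a minimal sketch, not the demo's actual code): ignore tracks with too few detections and count a track only once it has ended and crossed a counting line. The per-frame input format (`{track_id: centroid_y}` with normalized y), the line position, and the names `CrossingCounter`, `min_hits`, and `max_missed` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Track:
    first_y: float   # centroid y where the track was first seen
    last_y: float    # most recent centroid y
    hits: int = 1    # number of frames with a detection
    missed: int = 0  # consecutive frames without a detection


class CrossingCounter:
    def __init__(self, line_y=0.5, min_hits=5, max_missed=10):
        self.line_y = line_y          # hypothetical horizontal counting line
        self.min_hits = min_hits      # shorter tracks are treated as noise
        self.max_missed = max_missed  # frames to wait before closing a track
        self.tracks = {}
        self.entered = 0
        self.left = 0

    def update(self, detections):
        """detections: {track_id: centroid_y} for the current frame."""
        for tid, y in detections.items():
            track = self.tracks.get(tid)
            if track is None:
                self.tracks[tid] = Track(first_y=y, last_y=y)
            else:
                track.last_y = y
                track.hits += 1
                track.missed = 0
        for tid in list(self.tracks):
            if tid not in detections:
                track = self.tracks[tid]
                track.missed += 1
                if track.missed > self.max_missed:
                    self._finish(self.tracks.pop(tid))

    def flush(self):
        """Close all remaining tracks, e.g. at the end of a recording."""
        for tid in list(self.tracks):
            self._finish(self.tracks.pop(tid))

    def _finish(self, track):
        if track.hits < self.min_hits:
            return  # one-off or very short track: ignore it
        if track.first_y < self.line_y <= track.last_y:
            self.entered += 1
        elif track.last_y < self.line_y <= track.first_y:
            self.left += 1


counter = CrossingCounter(min_hits=3, max_missed=2)
frames = [
    {1: 0.2, 7: 0.5},  # id 7 is a spurious single-frame track
    {1: 0.4},
    {1: 0.6},
    {1: 0.8},
]
for frame in frames:
    counter.update(frame)
counter.flush()
print(counter.entered, counter.left)  # prints "1 0"
```

Which direction counts as "entered" depends on how the camera is mounted, so the two counters may need to be swapped.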

    1. I have great news for you: this week we added models that are closer to a 16:9 aspect ratio (640x352) to our model zoo and SDK (PR here). More great news: I have just created a demo with one of these models that tracks people and counts how many went left/right, see PR here. To use it, you need the latest depthai-sdk.
    2. Great question. In theory it's possible, but there aren't any pre-existing models that do this (I have looked a bit in the past). Usually, blob detection is done using traditional CV, not NNs, e.g. using OpenCV.
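
To illustrate the traditional-CV route from point 2, here is a dependency-free sketch of blob detection on a depth frame: threshold everything that stands well above the floor, then label connected regions. The depth format (a 2D list of millimetres, 0 = invalid), the thresholds, and the name `find_blobs` are assumptions for this example; in practice OpenCV's `cv2.connectedComponentsWithStats` on a thresholded depth image does the same labeling much faster.

```python
from collections import deque


def find_blobs(depth_mm, floor_mm, min_height_mm=1000, min_area=4):
    """Find 4-connected regions at least min_height_mm above the floor.

    depth_mm: 2D list of distances from the camera in millimetres (0 = invalid).
    Returns a list of dicts with the pixel area and (row, col) centroid.
    """
    rows, cols = len(depth_mm), len(depth_mm[0])
    # Foreground mask: valid pixels clearly closer to the camera than the floor.
    mask = [[0 < d <= floor_mm - min_height_mm for d in row] for row in depth_mm]
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected region.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:  # drop tiny regions as noise
                cy = sum(p[0] for p in pixels) / len(pixels)
                cx = sum(p[1] for p in pixels) / len(pixels)
                blobs.append({"area": len(pixels), "centroid": (cy, cx)})
    return blobs


FLOOR = 3000  # camera mounted 3 m above the floor
depth = [[FLOOR] * 8 for _ in range(6)]
for r in (1, 2):
    for c in (1, 2):
        depth[r][c] = 1300  # a "person" about 1.7 m tall
depth[4][6] = 1500  # single noisy pixel, removed by min_area
blobs = find_blobs(depth, FLOOR)
print(blobs)  # one blob with area 4 and centroid (1.5, 1.5)
```

The blob centroids could then be fed frame by frame into whatever tracker you use instead of NN detections.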

    Thoughts?
    Thanks, Erik