• DepthAICommunity
  • Need Advice: I’m an artist and a techy working on a piece for my thesis

Need advice: I am a newbie with the OAK and have been excited to create with this unit since I read about it. I am using a Pi 4 and coding in Python. I am trying to use the data from OAK-D (particularly the x coordinate data on a person) to activate an image on a screen for a piece I programmed.

The idea is... as a person moves from right to left (or left to right) the x coordinate will trigger each corresponding quadrant to display an image creating a series of images that play as a person moves through space.

My main concern is making sure it locks onto only one person. I need help tying my code into the data stream. Which data stream should I tap into to make this happen? If anything else stands out that could point me in the right direction, that would be helpful too. Anything helps! TY and happy coding 😀
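To picture the quadrant idea, here is a minimal sketch (the function name and quadrant count are made up for illustration) that maps a normalized x position, such as the 0.0–1.0 coordinates DepthAI detections report, to one of four screen quadrants:

```python
# Hypothetical sketch: map a person's normalized x position (0.0-1.0)
# to one of four screen quadrants, each of which triggers a different image.
NUM_QUADRANTS = 4

def quadrant_for_x(x_norm: float, num_quadrants: int = NUM_QUADRANTS) -> int:
    """Return the quadrant index (0..num_quadrants-1) for a normalized x."""
    x_norm = min(max(x_norm, 0.0), 1.0)   # clamp to the valid range
    index = int(x_norm * num_quadrants)
    return min(index, num_quadrants - 1)  # x_norm == 1.0 maps to the last quadrant
```

Each frame, you would feed the tracked person's x into this and show the image for whichever quadrant comes back.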

    @Brandon Would you be willing to point me in the right direction? Anything helps.

    Thank you for an amazing product. The best ideas get created when trying to solve real problems. The possibilities are endless with tech. I can’t wait to get this down and create more art with this. I feel art, science, and engineering often intersect in beautiful ways.

    Hi @pHubb ,

    Sounds awesome! This is exactly the sort of thing we are hoping to enable! So more than happy to help out here (and thanks for the kind words!) and sorry about the delay. Just have been behind recently.

    I'm reading through and thinking about what would be best here. And will circle back shortly. Also if you haven't already, please feel free to join our Discord: https://discord.gg/EPsZHkg9Nx.

    Will circle back shortly.

    -Brandon

    Hi again, pHubb,

    So here is what I'm thinking:

    https://github.com/luxonis/depthai-experiments/tree/master/gen2-pedestrian-reidentification

    See how this works, tracking and assigning IDs to you (and others) as you walk by. I think what you can do is store whatever ID is assigned to the first person seen, and then track them while they have that ID, ignoring any big jumps in the position of that ID.
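    To make that "store the first ID and ignore big jumps" idea concrete, here is a hedged sketch (the class name and jump threshold are hypothetical, not part of the example) of a filter you could wrap around the demo's per-frame results:

```python
# Hypothetical sketch: lock onto the first person ID the reidentification
# demo assigns, and only accept position updates for that ID that have not
# jumped implausibly far since the last frame.
class SingleTargetLock:
    def __init__(self, max_jump: float = 0.25):
        self.max_jump = max_jump   # max allowed frame-to-frame move (normalized units)
        self.target_id = None
        self.last_x = None

    def update(self, person_id, x):
        """Return x if this detection belongs to the locked target, else None."""
        if self.target_id is None:        # lock onto the first person seen
            self.target_id = person_id
            self.last_x = x
            return x
        if person_id != self.target_id:
            return None                   # ignore other people
        if self.last_x is not None and abs(x - self.last_x) > self.max_jump:
            return None                   # ignore implausible jumps of the same ID
        self.last_x = x
        return x
```

    You would call `update()` once per detection per frame and act only on non-`None` results.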

    So the model used for IDing each person and tracking is not perfect, and seems to work best at a decent stand-off distance. I have seen more sophisticated techniques use a Kalman filter, and that performs way better (e.g. here, just from Googling, as I couldn't remember the example I had seen).

    But I'd say try the example above first and see if it works OK, just by watching/recording-and-watching as you or others walk by. If that looks OK, then I'd say move on to integrating the tracked person coordinates for controlling your artwork. I am probably not the best to help here, but can try. @Luxonis-Lukasz, who wrote this example, is better placed to advise on which variable to pull for the control you are mentioning.

    @Luxonis-Lukasz - would you mind sharing here? Just the example of taking an active/existing detected person and tracking them as they move.

    Thoughts?

    Thanks,
    Brandon

    (Sorry, we had a big launch today of the first official release of our Gen2 API (here), which is why Lukasz didn't get back to this today. Always a lot more loose ends to tie up. I'm actually asking Karolina to help as well.)

    Thanks,
    Brandon

    @Brandon I’m happy you all are willing to lend a hand. I have a few weeks before my demo has to be ready. Once again, congratulations on such a cool product and the many great opportunities coming your way. Well deserved.

      Hi pHubb ,

      I agree with Brandon's approach: store the first ID in the scene and track them while they have that ID. This should help lock onto one person. There are a couple of variables that may be useful in your case. You can find this piece of code (line 207) in the GitHub link [https://github.com/luxonis/depthai-experiments/blob/master/gen2-pedestrian-reidentification/main.py]

          for person_id in results:
              dist = cos_dist(reid_result, results[person_id])
              if dist > 0.7:
                  # close enough to a stored embedding: reuse that person's ID
                  result_id = person_id
                  results[person_id] = reid_result
                  break
          else:
              # for/else: no stored embedding matched, so assign a fresh ID
              result_id = next_id
              results[result_id] = reid_result
              results_path[result_id] = []
              next_id += 1

      and the variable result_id is the ID printed on each person in the video. You can link result_id into your code so that it tracks only one person.

      Then, about the x-coordinates: you can use the variable raw_bbox (line 202), which holds the raw det.xmin, det.ymin, det.xmax, det.ymax values. Alternatively, find cv2.rectangle(frame, (bbox[0], bbox[1]), (bbox[2], bbox[3]), (10, 245, 10), 2) (line 221); (bbox[0], bbox[1]) and (bbox[2], bbox[3]) are the top-left and bottom-right coordinates of the box. You can also use that data stream to trigger your code.
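      As a small illustration of turning those corners into a single trigger value (the function name and normalization are my own, assuming pixel-coordinate corners like the ones the cv2.rectangle call draws):

```python
# Sketch: derive one x position from the bounding-box corners mentioned above.
# bbox is assumed to be (xmin, ymin, xmax, ymax) in pixel coordinates.
def center_x(bbox, frame_width):
    """Normalized horizontal center (0.0-1.0) of a bounding box."""
    xmin, _, xmax, _ = bbox
    return ((xmin + xmax) / 2) / frame_width
```

      That normalized center is what you would compare against your quadrant boundaries.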

      Hope this helps! Please feel free to let us know how it goes, pHubb.

      Please feel free to correct me if you find any misleading info above, @Luxonis-Lukasz and @Brandon.

      Best,
      Steven

        Steven @pHubb

        I see a couple of options to implement this:

        1. Use just person detection network and host-side tracker

          The first approach would be to use a network like person_detection_retail_0013 to detect people, and then a host-side centroid tracker to track the bounding boxes of people between frames. Later on, you can either just take one ID that you'd follow, or change the tracker implementation to store a single tracked object instead of a dict of trackers.

          This implementation is fairly easy to understand, which makes adjusting the code easier, but it may not be very precise, as it relies on the distance between bounding boxes rather than actually identifying any person.
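          A minimal sketch of that single-object variant (the class name and distance threshold are hypothetical, not taken from the experiment's code): instead of a dict of trackers, keep one centroid and match it to the nearest detection each frame.

```python
# Hypothetical single-object centroid tracker: adopt the first detection,
# then follow the nearest detection each frame, ignoring matches that are
# too far away to plausibly be the same person.
import math

class SingleCentroidTracker:
    def __init__(self, max_distance=100.0):
        self.centroid = None
        self.max_distance = max_distance  # pixels; beyond this, keep the old position

    def update(self, detections):
        """detections: list of (cx, cy) centroids; return the tracked centroid."""
        if not detections:
            return self.centroid
        if self.centroid is None:         # adopt the first detection we ever see
            self.centroid = detections[0]
            return self.centroid
        nearest = min(detections, key=lambda c: math.dist(c, self.centroid))
        if math.dist(nearest, self.centroid) <= self.max_distance:
            self.centroid = nearest       # follow the nearest plausible match
        return self.centroid
```

          Because there is only ever one stored centroid, the tracker can never switch to following a second person by accident; at worst it stalls until the locked person reappears nearby.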

        2. Use person detection network and device-side tracker

          This approach would run both the detection network and the object tracker on DepthAI (example here).
          It would increase performance, but you won't be able to modify the tracking code itself; and since it also relies on bounding-box distance, reliability may still be an issue.

        3. Use person detection network and reidentification network

          In the last approach, the most complicated of the three, a reidentification network is responsible for assigning the IDs.
          This is more reliable, as we get rid of the bounding-box distance measuring method, but it comes with its own performance drawbacks, since two networks need to produce results before we know where the target is.
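          For intuition, here is a sketch of the matching step such a reidentification pipeline performs: compare a new appearance embedding against stored ones and reuse the ID of the first sufficiently similar match (the 0.7 threshold mirrors the check in the pedestrian-reidentification example; the function names and dict layout are illustrative).

```python
# Sketch: reidentification matching via cosine similarity of embeddings.
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def match_id(embedding, known, threshold=0.7):
    """known: dict of person_id -> stored embedding. Return a matching id or None."""
    for person_id, stored in known.items():
        if cosine_similarity(embedding, stored) > threshold:
            return person_id
    return None   # caller assigns a fresh ID, as the example's for/else does
```

          The key property is that the match depends on appearance, not position, which is why this approach survives occlusions and crossings better than centroid distance.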

        Regarding compatibility: if you'd like to implement this application in Gen2 (version 2.0.0.0), there is no device-side tracker yet (so approach 2 is not yet doable in Gen2).
        All approaches are doable in Gen1 though (version 1.0.0.0), so while we're implementing the features in the new API, you can still use this one to implement your example.

        Regarding models, I would suggest using either person_detection_retail_0013 or pedestrian_detection_adas_0002; you can easily compile and download them using our online blob converter.

        Personally, I would use the first approach, as in this case the actual person ID is not needed; we just want to limit the number of followed persons to one (I may have misunderstood, feel free to correct me, pHubb).
        For this, you can use the people-tracker experiment, where most of the code should be relevant in this case (it uses person_detection_retail_0013 and a host-side tracker).

        In Gen2, the most related example is the pedestrian-reidentification one, but I can prepare an example implementation of these approaches if needed, so that they are more tailored to your use case.

        Hope it helps

        25 days later

        I was wondering if the social-distancing example in Gen1 could be used here? Does it track people? I haven't yet explored the core of it. It has fairly accurate results and is really easy to alter.

        I tried the collision-avoidance example and it's really unstable on the Raspberry Pi 4. It's reporting false positives, and the output is distorted for now, similar to what the corona-mask example looked like. Any suggestions, @Luxonis-Lukasz?

          dhruvsheth-ai The social-distancing example does not include a tracker. There is one in the people-tracker example that relies on bounding-box centroids to track. Also, there is the option to experiment with the Gen2 device-side tracker.

          The collision-avoidance experiment can definitely be updated, as it was released early to show the idea behind collision avoidance, and was also a starting point for tools like depthai-mock.

            Luxonis-Lukasz The Intel Object Tracker looks best suited, I suppose. It will be easy for pHubb to understand and alter if needed in the long run (I'm helping him out for now). I'll wait for a stable release; I asked about the depthai version being used here - https://github.com/luxonis/depthai-python/pull/173#issuecomment-809396768. It probably doesn't work with 2.0.0.1 or 2.1.0.0 for now, until there's an update.

            Thanks Lukasz