
It Works! Working Prototype of Commute Guardian.

Hey guys and gals!

So the 'why' of the DepthAI (that satisfyingly rhymes) is that we're actually shooting for a final product which we hope will save the lives of people who ride bikes, and help make bike commuting possible again for many.

After all the computer vision and machine learning work I've done over the past year, it struck me (no pun intended) that there had to be a way to leverage technology to help solve one of the nastiest technology problems of our generation:

A small accident as a result of texting while driving, a 'fender bender', often ends up killing or severely injuring/degrading the quality of life of a bicyclist on the other end.

This happened to too many of my friends and colleagues last year. And even my business partner on this has been hit once, and his girlfriend twice.

These accidents, which would have been 'minor' had they been between two cars, resulted in shattered hips, broken femurs, shattered discs - and then protracted legal battles with the insurance companies to pay for any medical bills.

So how do we do something about it?

Leverage computer vision and machine learning to automatically detect when a driver is on a collision course with you - and stuff that functionality into a rear-facing bike light. So the final product is a multi-camera, rear-facing bike light, which uses object detection and depth mapping to track the cars behind you - and know their x, y, and z position relative to you (and the x, y, and z of their edges, which is important).

What does it do with this information?

It's normally blinking/patterning like a normal bike light, and if it detects danger, it takes one or more of the following actions:

  1. Starts flashing an ultra-bright strobe - to try to get the driver's attention
  2. Initiates haptic feedback - to let the person riding the bike know that they're in danger
  3. Honks a car horn. This is if all else fails, and only happens if you are in certain danger.

So cases 1 and 2 occur when a car is on your trajectory but still has plenty of time/distance to respond. An example is rounding a corner, where the car's arc intersects with yours but it's still at a distance. The system will start flashing to make sure the driver is aware of you, and it will let you, the person riding the bike, know that you may be in danger and may need to act soon.

And if the ultra-bright strobes and haptic feedback don't work, then the system will sound the car horn with enough time/distance left (based on relative speed) for the driver to respond. The car horn is key here, as it's one of not-that-many 'highly sensory compatible inputs' for a driving situation. What does that mean? It nets the fastest possible average response/reaction time.
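
To make that escalation concrete, here's a minimal sketch of the kind of time-to-collision logic involved, assuming the depth pipeline already gives us the car's distance and closing speed (the thresholds and names are illustrative, not the actual firmware):

```python
# Illustrative escalation logic - a sketch, not the actual firmware.
# Assumes distance (m) and closing speed (m/s) come from the depth/tracking pipeline.

DRIVER_REACTION_S = 1.5   # assumed average driver reaction time
RIDER_REACTION_S = 0.75   # assumed time the rider needs to react to the strobe/haptics

def time_to_collision(distance_m: float, closing_speed_mps: float) -> float:
    """Seconds until impact if nothing changes; infinite if the car isn't closing."""
    if closing_speed_mps <= 0:
        return float("inf")
    return distance_m / closing_speed_mps

def choose_action(distance_m: float, closing_speed_mps: float) -> str:
    ttc = time_to_collision(distance_m, closing_speed_mps)
    if ttc > 6.0:
        return "NORMAL"    # keep blinking like a regular bike light
    if ttc > DRIVER_REACTION_S + RIDER_REACTION_S:
        return "WARNING"   # ultra-bright strobe + haptic feedback
    return "DANGER"        # sound the car horn while the driver can still react

# Example: a car 20 m back closing at 8 m/s -> TTC = 2.5 s -> "WARNING"
print(choose_action(20.0, 8.0))
```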

So with all that background out of the way, we're happy to say that last weekend we actually proved out that this is possible, using slow/inefficient hardware (with a poor data path):

What does this show? It shows that you can differentiate between a near-miss and a glancing impact.

What does it not show? Me getting run over... heh. That's why we ran at the car instead of the car running at us. And believe it or not, this is harder to track. So it was great it worked well.

Long story short, the idea works!

This only runs at 3 FPS, so you can see that in one instance it's a little delayed. The custom board we'll be making fixes this by offloading the workload from the Pi to the Movidius X, bringing the frame rate above 30 FPS. (And note we intentionally disabled the horn, as it's just WAY too loud inside.)

You can see below what the system sees. It's running depth calculation, point-cloud projection (to get x,y,z for every pixel), and object detection. It's then combining the object detection and point cloud projection to know the trajectory of salient objects, and the edges of those objects.
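
As a rough illustration of that fusion step (a hedged sketch, not our actual code - the camera intrinsics and helper names here are assumptions), each depth pixel is re-projected through a pinhole camera model to get its x, y, z, and the detector's bounding box then pulls out the 3D position and extents of each car:

```python
import numpy as np

# Assumed camera intrinsics (focal lengths and principal point, in pixels).
FX, FY, CX, CY = 580.0, 580.0, 320.0, 200.0

def depth_to_pointcloud(depth_m: np.ndarray) -> np.ndarray:
    """Project an HxW depth map (meters) to an HxWx3 array of (x, y, z) points."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return np.dstack([x, y, depth_m])

def object_extent(points: np.ndarray, bbox: tuple) -> dict:
    """Combine a detector bounding box (x0, y0, x1, y1, in pixels) with the
    point cloud to get the object's 3D centroid and edges."""
    x0, y0, x1, y1 = bbox
    roi = points[y0:y1, x0:x1].reshape(-1, 3)
    roi = roi[roi[:, 2] > 0]  # drop pixels with no valid depth
    return {
        "centroid": roi.mean(axis=0),
        "left_edge_x": roi[:, 0].min(),
        "right_edge_x": roi[:, 0].max(),
        "nearest_z": roi[:, 2].min(),
    }
```

Tracking those centroids and edges frame-to-frame is what gives the trajectory.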




And this is the car under test:

We used that for two reasons:

  1. It was snowing outside, and it was the only one that was parked in such a way indoors that I could run at it.
  2. It proves that the system handles corner cases pretty well. It's a 'not very typical' car. ;-)

And here's the hardware used:

And the next big thing is to make a custom board that integrates everything together and solves a major data-path issue, which is the main limiter of performance when using the NCS with a smaller Linux system like the Pi.

So we realized that a ton of other people are having the same data-path issue with the Pi, and lack access to the other capabilities of the Myriad X, so we're actually planning on building and selling our intermediate board, which is a carrier for the Myriad X and the Raspberry Pi Compute Module.

So the DepthAI will probably go to CrowdSupply so others can build such a bike solution, or all sorts of real-time spatial AI solutions off of it.


[UPDATE 29 July 2020] You can now back this solution, the OpenCV AI Kit (OAK), on Kickstarter below:


Oh, and here are some initial renderings/ideas for the product and user interface:


We'd also have a bit of an audible warning for the rider in the 'WARNING' state, so that they know they're at some risk before the 'DANGER' state - even if they're not using the app.

Thoughts?

    Comments (28)


      3 months later
      a month later

      This project is very impressive and your prototype is brilliant.

      I’m building something very similar and would love to chat.

        Thanks Rememberlenny! So feel free to shoot me an e-mail at brandon at luxonis dot com.

        I'd email you, but I actually don't know if our forum software shows me users' emails. Heh. :-)

        25 days later
        9 days later

        As an update, we now have depth that is close to the quality of the D435, but at 30 FPS with the Raspberry Pi instead of 3 FPS, running on our own Myriad X platform, here.

        21 days later
        3 months later
        6 months later

        Been a long time since I updated the effort here. We're getting WAY closer to having the embedded platform to actually build a productized version of Commute Guardian.

        See some tantalizing pictures below of the DepthAI Onboard Camera Edition, here:

        And one of our users (thanks, Martin!) even made a mount for the Raspberry Pi Compute Module Edition here with a battery holder such that the entire solution can be mounted to a bike post, like below. And lights/horns could be mounted separately.

        We're getting very close with DepthAI to being able to productize it into a smart bike light like the one described above.

        One last thing we need to do is hard-sync between depth and AI results, to allow quite-fast-moving objects to still be tracked accurately in 3D space, particularly when there is extreme lateral motion (i.e. a side impact instead of one from behind). But we're quite close.
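
        The gist of that sync problem is pairing each neural-network result with the depth frame captured closest to it in time, rather than just the most recent one. A simplified sketch of that pairing (names illustrative, not our actual code):

        ```python
        from collections import deque

        # Keep a short history of (timestamp, depth_frame) pairs.
        depth_history = deque(maxlen=30)

        def on_depth_frame(timestamp: float, depth_frame) -> None:
            depth_history.append((timestamp, depth_frame))

        def depth_for_detection(detection_timestamp: float):
            """Return the depth frame whose capture time is closest to the detection's."""
            if not depth_history:
                return None
            return min(depth_history, key=lambda td: abs(td[0] - detection_timestamp))[1]
        ```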

        See below to see how quickly this now tracks (in this case, faces, instead of cars, but it's similar):

        You can buy the DepthAI platform from our store, on CrowdSupply, and on Mouser.

        And if you want to build something off of it, do it! We've open sourced all hardware and software:
        Hardware: https://github.com/luxonis/depthai-hardware
        Software:

        And we have documentation on how to use all the software here:

        And even our documentation is open-source, so if you find an error you can do a PR w/ the fix!


        We even have open-source (and free) training tutorials, here:

        The Tutorials
        The below tutorials are based on MobileNetv2-SSD, which is a decent-performance, decent-framerate object detector which natively runs on DepthAI. A bunch of other object detectors could be trained/supported on Colab and run on DepthAI, so if you have a request for a different object detector/network backend, please feel free to make a GitHub Issue!

        Easy Object Detector Training Open In Colab
        The tutorial notebook Easy_Object_Detection_With_Custom_Data_Demo_Training.ipynb shows how to quickly train an object detector based on the MobileNet SSDv2 network.

        After training is complete, it also converts the model to a .blob file that runs on our DepthAI platform and modules. First the model is converted to a format usable by OpenVINO called Intermediate Representation, or IR. The IR model is then compiled to a .blob file using a server we set up for that purpose. (The IR model can also be converted locally to a blob.)
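
        As a side note, here's a hedged sketch of that final IR-to-blob step done from Python with the blobconverter package (which, as far as I know, talks to that same compilation server); the exact arguments may differ for your model:

        ```python
        # Sketch of the IR -> .blob step, assuming `pip install blobconverter`.
        # Argument values here are placeholders for your own model.
        import blobconverter

        blob_path = blobconverter.from_openvino(
            xml="frozen_inference_graph.xml",  # IR produced by the OpenVINO model optimizer
            bin="frozen_inference_graph.bin",
            data_type="FP16",                  # the Myriad X runs FP16
            shaves=6,                          # number of SHAVE cores to compile for
        )
        print("Compiled blob at:", blob_path)
        ```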

        And that's it: in less than a couple of hours, a fairly advanced proof-of-concept object detector can run on DepthAI to detect objects of your choice and their associated spatial information (i.e. xyz location). For example, this notebook was used to train DepthAI to locate strawberries in 3D space, see below:

        Real-time 3D Strawberry Detector

        COVID-19 Mask/No-Mask Training Open In Colab
        The Medical Mask Detection Demo Training.ipynb training notebook shows another example of a more complex object detector. The training data set consists of people wearing or not wearing masks for viral protection. There are almost 700 pictures with approximately 3,600 bounding-box annotations. The images are complex: they vary quite a lot in scale and composition. Nonetheless, the object detector does quite a good job with this relatively small dataset for such a task. Training again takes around 2 hours: depending on which GPU the Colab lottery assigns to the notebook instance, 10k steps can take anywhere from 1.5 to 2.5 hours. Either way, that's a short period for such a good-quality proof of concept on such a difficult task.
        We then performed the steps above for converting to blob and then running it on our DepthAI module.

        Below is a quick test of the model produced with this notebook on Luxonis megaAI:

        COVID19 Mask Detector

        Cheers,
        Brandon and the Luxonis Team!

        6 days later

        Looking good. Very cool to have the hardware and software open source, too. Looking forward to watching this project evolve.

        As a quick update, we just got object tracking initially working, which is what allows calculating individual trajectories in 3D space to know whether a vehicle is on a collision course with you.
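
        For the curious, the basic idea behind that check is to fit a velocity to the last few 3D positions of each tracked vehicle and see whether its extrapolated path passes close to the rider within the next few seconds. A rough sketch (illustrative only, not our tracker code):

        ```python
        import numpy as np

        def on_collision_course(track_xyz: np.ndarray, track_t: np.ndarray,
                                horizon_s: float = 4.0, danger_radius_m: float = 1.0) -> bool:
            """track_xyz: Nx3 recent positions of a tracked vehicle relative to the bike (m).
            track_t: N timestamps (s). True if the extrapolated path comes within
            danger_radius_m of the rider (the origin) inside horizon_s."""
            if len(track_xyz) < 2:
                return False
            # Least-squares fit of position vs. time -> constant-velocity estimate.
            A = np.vstack([track_t, np.ones_like(track_t)]).T
            velocity, position0 = np.linalg.lstsq(A, track_xyz, rcond=None)[0]
            for dt in np.linspace(0.0, horizon_s, 40):
                predicted = position0 + velocity * (track_t[-1] + dt)
                if np.linalg.norm(predicted) < danger_radius_m:
                    return True
            return False
        ```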

        20 days later
        9 days later

        So we just got intro'ed with lumenus.com, and their stuff is great. Exactly in-line with the problem we're trying to solve here... keeping people who ride bikes safe.

        If you haven't checked them out yet - I'd highly recommend watching the video on their website. It has great stats too. Reproducing below for convenience:

        Speaking of which, I meant to mention that I backed the Lumos Ultra:

        It looks quite awesome... and for an early-bird of $79, it's a no-brainer!

        6 months later
        3 months later

        As another update, we now have many teams (as many as 10x) working on this solution as part of our OpenCV Competition using the Luxonis DepthAI OAK-D:
        https://opencv.org/opencv-ai-competition-2021/

        Thanks to Microsoft and Intel for sponsoring this competition, and to OpenCV for helping coordinate and publicize it!

        2 months later

        The Gen2 pipeline builder is making this more and more tractable in DepthAI. Some quick experiments with Gen2 below:


        We'll be updating this as we, and several others, start to build this on the platform.
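
        To give a flavor of what Gen2 looks like, here's a minimal sketch of a spatial-detection pipeline in Python (based on the public depthai API; the blob path and thresholds are placeholders):

        ```python
        import depthai as dai

        pipeline = dai.Pipeline()

        # Color camera feeds the neural network.
        cam = pipeline.createColorCamera()
        cam.setPreviewSize(300, 300)
        cam.setInterleaved(False)

        # Stereo pair produces the depth map.
        left = pipeline.createMonoCamera()
        left.setBoardSocket(dai.CameraBoardSocket.LEFT)
        right = pipeline.createMonoCamera()
        right.setBoardSocket(dai.CameraBoardSocket.RIGHT)
        stereo = pipeline.createStereoDepth()
        left.out.link(stereo.left)
        right.out.link(stereo.right)

        # MobileNet-SSD detections fused with depth -> x, y, z per detection.
        nn = pipeline.createMobileNetSpatialDetectionNetwork()
        nn.setBlobPath("mobilenet-ssd.blob")  # placeholder path
        nn.setConfidenceThreshold(0.5)
        cam.preview.link(nn.input)
        stereo.depth.link(nn.inputDepth)

        xout = pipeline.createXLinkOut()
        xout.setStreamName("detections")
        nn.out.link(xout.input)

        with dai.Device(pipeline) as device:
            q = device.getOutputQueue("detections", maxSize=4, blocking=False)
            while True:
                for det in q.get().detections:
                    c = det.spatialCoordinates  # millimeters, relative to the camera
                    print(det.label, c.x, c.y, c.z)
        ```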

        11 days later

        Here is 2x DepthAI OAK-D on-bike:

        And here is what the system sees:

        DepthAI is now at the point where this bike-safety product can be made with it. And it can then be made smaller - just the cameras on the front/back, with nothing else needed.

        2 months later

        It's cool to see the other safety products being built off of this. For example, BlueBox Labs recently released their collision-deterring (among many other features) spatial AI dash camera, based on DepthAI, on Kickstarter:

        So this could work inside the vehicle to even detect and alert if a (distracted) driver is going to hit a person riding a bike or a pedestrian.

        More safety products are in the works. Very satisfying to see.

        Cheers,
        Brandon