Luxonis DepthAI and megaAI | Overview and Status

The Myriad X is a vision processor capable of real-time object detection and stereo depth at over 30FPS.

Let's unleash this power!

How?
We're making a Myriad X System on Module (SoM) which allows embedding the power of the Myriad X into your own products, with firmware support for 3D object detection/location.

And we're making a carrier board that includes all the cameras, the Myriad X, and the Raspberry Pi Compute Module all together to allow you to get up and running in seconds.

This allows:

  1. The video data path to skip the Pi, eliminating that additional latency and bottlenecking
  2. Stereo depth capability of the Myriad X for 3D object localization
  3. Significant reduction in the CPU load on the Raspberry Pi

So you, the Python programmer, now have real-time 3D position of all the objects around - on an embedded platform - and backed by the power of the Raspberry Pi Community!

For the full back story, let's start with the why:

  • There’s an epidemic in the US of injuries and deaths of people who ride bikes
  • The majority of these cases involve distracted driving caused by smartphones (social media, texting, e-mailing, etc.)
  • We set out to try to make people safer on bicycles in the US
    • We’re technologists
    • Focused on AI/ML/Embedded
    • So we’re seeing if we can make a technology solution

Making People Who Ride Bikes Safer


(If you'd like to read more about CommuteGuardian, see here)

DepthAI Platform

  • In prototyping the Commute Guardian, we realized how powerful the combination of Depth and AI is.
  • And we realized that no such embedded platform existed
  • So our milestone on the path to CommuteGuardian is to build this platform – and sell it as a standard product.
  • We’re building it for the Raspberry Pi (Compute Module)
    • Human-level perception on the world’s most popular platform
    • Adrian’s PyImageSearch Raspberry Pi Computer Vision Kickstarter sold out in 10 seconds – validating demand for Computer Vision on the Pi (that, and validating that Adrian is AWESOME!)

So below is a rendering of our first prototype of DepthAI for Raspberry Pi.

The key difference between this and, say, using the Raspberry Pi with an NCS2 is the data path. With the NCS2 approach, all the video/image data has to flow through (and be resized by, etc.) the host, whereas in this system the video data goes directly to the Myriad X, as below, unburdening the host from these tasks - which increases frame-rate and drastically reduces latency:
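
To make that concrete, here's a minimal host-side sketch of the idea using the depthai Python package (the API names are from the later Gen2 Python API and the blob path is a placeholder, so treat it as illustrative rather than exactly what this early firmware exposes). The camera feeds the neural network entirely on the Myriad X; the Pi only pulls the small detection results off a queue:

    import depthai as dai

    pipeline = dai.Pipeline()

    # Color camera and the object detector both run on the Myriad X
    cam = pipeline.create(dai.node.ColorCamera)
    cam.setPreviewSize(300, 300)   # sized for the NN on-device; no resizing on the host
    cam.setInterleaved(False)

    nn = pipeline.create(dai.node.MobileNetDetectionNetwork)
    nn.setBlobPath("mobilenet-ssd.blob")  # placeholder blob path
    nn.setConfidenceThreshold(0.5)
    cam.preview.link(nn.input)

    # Only the tiny detection metadata is sent back to the host
    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("detections")
    nn.out.link(xout.input)

    with dai.Device(pipeline) as device:
        q = device.getOutputQueue("detections", maxSize=4, blocking=False)
        while True:
            for det in q.get().detections:
                print(det.label, det.confidence, det.xmin, det.ymin, det.xmax, det.ymax)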

Development Steps

The first thing we made was a dev board for ourselves. The Myriad X is a complicated chip, with a ton of useful functionality... so we wanted a board where we could explore this easily, try out different image sensors, etc. Here's what that looks like:

BW0235

We made the board with modular camera boards so we could easily test out new image sensors w/out the complexity of spinning a new board. So we'll continue to use this as we try out new image sensors and camera modules.

While waiting on our development boards to be fabricated, populated, etc., we brainstormed how to keep costs down (working w/ fine-pitch BGAs that necessitate laser vias means prototypes are EXPENSIVE) while still allowing easy experimentation w/ various form-factors, on/off-board cameras, etc. We landed on making ourselves a Myriad X System on Module, which is the board w/ all the crazy laser vias, stacked vias, and overall High-Density Interconnect (HDI) stuff that makes boards expensive. This way, we figure, we can use it as the core of any Myriad X designs we do, without having to constantly prototype w/ expensive boards.

BW1099

We exposed all that we needed for our end-goal of 3D object detection (i.e. MobileNet-SSD object detection + 3D reprojection off of stereo depth data). So that meant exposing a single 4-lane MIPI for handling high-res (e.g. 12MP) color camera sensors and 2x 2-lane MIPI for cameras such as 1MP global-shutter image sensors for depth.

And we threw a couple of other interfaces, boot methods, etc. on there for good measure, which are de-populated by default to save cost when not needed, and can be populated if needed.

So of course in making a module, you also need to make a board on which to test the module. So in parallel to making the SoM, we started attacking a basic breakout carrier board:

It's basic, but pulls out all the important interfaces, and works with the same modular camera-board system as our development board. So it's to some degree our 'development board lite'.

And once we got both of these ordered, we turned our attention to what we set out to build for you: the DepthAI for Raspberry Pi system. And here it is, in all its Altium-rendered glory:

So what does this thing do? The key bit is that it's completely self-contained. If you need to give something autonomy, you don't need anything more than this. It has the vision accelerator (Myriad X), all the cameras, and all the connectivity and interfaces you need - and the Raspberry Pi Compute Module on-board.

So it's a self-contained system, allowing you to write some simple Python to solve problems that 3 years ago were not yet solvable by humanity! And now you can do it all on this one system.

To visualize what this gives you, see below, noting that DepthAI will give over 30FPS instead of the 3FPS in this example:

And while we are still working to integrate object detection and depth data together directly on the Myriad X, as well as tweaking our depth filtering, here's an example of depth alone running on our platform at 30FPS on DepthAI for Raspberry Pi (see here for more details):

Cheers,
The Luxonis Team

18 days later

Hi everyone,

So the first prototypes of the DepthAI for Raspberry Pi just finished population late last night, and shipped to us this morning, likely to arrive this week!

Some images below:

It's going to be a busy week of testing, as we're getting these, and also the Myriad X modules one day apart.

On this board our test plan is to exercise all the standard Raspberry Pi features and IO (as it's supposed to act as a standard Raspberry Pi, minus WiFi for this revision), take notes on any errors/etc. for fixing in Altium, and then if all is OK-enough, connect the Myriad X module to it, and see if the whole thing runs together (with the code we have working so far).

Any comments or questions? Feel free to drop them here or shoot an email to brandon at luxonis dot com.

Best,
The Luxonis Team

Hi again!

So our first Myriad X modules finally shipped! So we're expecting to have these in-hand tomorrow, Tuesday August 20th.

So back-story on these is that we ordered them on June 26th w/ 3-week turn from MacroFab (who we really like, and have used a lot), and the order was unlucky enough to fall in w/ 2 other orders that were subject to a bug in MacroFab's automation (which is super-impressive, by the way).

So what was the bug? (You may ask!)

Well, their front-end and part of the backend were successfully initiating all the correct actions (e.g. component orders, bare PCB order, scheduling of assembly) at the correct times. However, the second half of the backend was apparently piping these commands straight to /dev/null, meaning that despite the system showing and thinking that all the right things were being done, nothing was actually happening.

So on July 15th, when the order was supposed to ship, and despite our every-2-day prodding up until then, it was finally discovered that the automation had done nothing, at all. So then this was debugged, the actual status was discovered, and the boards were actually started around July 22nd.

Fast-forward to now, and this 3-week order is now an 8-week order - which should arrive tomorrow!

Unfortunately, the only photo we got of the units was one confirming that the JTAG connector was populated in the right orientation (and it was), so here's a reminder of what the module looks like, rendered in Altium:

And for good measure, the only photo we have of the boards so far, which is of the JTAG connector:

So hopefully tomorrow we'll have working modules! And either way we'll have photos to share.

Cheers!

The Luxonis Team

15 days later

Woefully behind on updates. We got the BW1099 in, and they work, and we got the BW1097 in, and they work! Pictures only, because we're out of time these days with all the exciting hardware to play with and write code for!


2 months later

Good job! Please, which accelerator on the Myriad X do you use for the Stereo Depth implementation? Do you use the new "Stereo depth block" mentioned on this webpage: https://www.movidius.com/myriadx, or do you use the SHAVE cores only?
Thank you!
Jan

    Hi JanT

    Thanks!

    Yes we're using the stereo depth block and not the SHAVEs for the stereo implementation. This leaves the SHAVEs free for neural processing and also filtering on the depth, etc.:

    A bit more information is here:
    https://github.com/Luxonis-Brandon/DepthAI

    Thanks again,
    Brandon

    Hi Brandon! This project looks amazing. What is the frame rate to be expected for depth camera + AI usage? In the sample video it looked like the framerate was around 2 or 3 fps. My project would need a higher rate.

      GOB Thanks and good question! So we're expecting 25 frames per second of depth and AI operating at the same time. We were expecting to have a demo of this for the launch, but we don't have both stacks working correctly yet - so we used our slower prototype video for now, which is, yes, slower at 2-3 frames per second.


        Brandon Thanks for the update! If it's okay, can I email you with some more questions I have?

        Yes for sure. Brandon at luxonis dot com

        16 days later

        Thank you for your answer, Brandon!
        I have an additional question. I'm sorry if it's already explained somewhere else.
        Which libraries do you use for programming the Stereo block and the SHAVE cores? You mention you are working on some stack to put together the Depth processing and AI detection. I would expect there is some stack or framework delivered by Intel... I know there is OpenVINO, but it's for the AI block only.
        So, to summarize my question: do you use enablement (drivers, libs, stacks) from Intel, or do you need to write everything from scratch yourself?

        Thank you!
        Jan

        Hi JanT,

        We're writing our own system, based on an architecture we put together and have been implementing for a while now. It is composed of a custom binary (that we make) that runs on the Myriad X, and a library (again, that we make) that runs on the host OS, works cleanly with OpenVINO, and has additional open-source Python code which exposes functionality that does not yet exist in OpenVINO.

        We are also planning on working with OpenCV and OpenVINO to integrate this functionality directly into both.

        Expect some announcements soon (and I think some are already out, from the President of OpenCV for example).

        Thoughts?

        Thanks,
        Brandon

        Hi Brandon,
        I just saw the announcement from the President of OpenCV yesterday. That's great news! You are doing great work! I'm happy to see the Myriad X is much more capable than what the NCS2 can deliver. Thank you for putting effort into it.

        Best regards,
        Jan

        Hey JanT,

        Thanks a ton! Really looking forward to working with OpenCV on this to make it easy to use and wicked useful.

        Thanks again,
        Brandon

        3 months later

        Hi DepthAI fans,

        So we've done SO MUCH since we last updated here. The only thing we haven't done is keep this post active.

        So what have we done:

        • We delivered our Crowd Supply on time! Backers are happily using DepthAI now, and are discussing ideas on our luxonis-community.slack.com public slack group.
        • We got our first set of documentation out. https://docs.luxonis.com/
        • We made a couple of new models which are available now (at https://shop.luxonis.com/), and we will have these on Crowd Supply soon.
        • We are in the process of making a power-over-Ethernet version of DepthAI.
        • Our MVP Python API is running (and super fun to play with)

        New Models
        On to the new hardware models since the Crowd Supply campaign started. These include a USB3 Edition with onboard cameras and a tiny, single-camera USB3 Edition (which we're calling μAI):

        USB3C with Onboard Cameras (BW1098OBC):

        μAI (BW1093):

        Upcoming Model


        This is our first engineering-development version of our PoE version of DepthAI. Some interesting new features include:

        1. A new module (the BW1099) with:
           • Built-in 128GB eMMC
           • SD-Card interface for base-board SD-Card support
           • PCIE support for Ethernet

        2. A reference-design carrier board with:
           • PoE (100/1000)
           • SD-Card
           • 4-lane MIPI for 12MP camera (BG0249)

        MVP Functionality

        So the core functionality gives 3D object localization as the output from the DepthAI - with all processing done on the Myriad X - and no other hardware required. The Raspberry Pi here is used purely to show the results.

        So what you can see is the person - a little person at that - at 1.903 meters away from the camera, 0.427 meters below the camera, and 0.248 meters to the right of that camera.

        And you can also see the chair, which is 0.607 meters to the left of the camera, 0.45 meters below the camera, and 2.135 meters away from the camera.
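
        To show where numbers like these come from, here's a rough sketch of the reprojection math behind them (the intrinsics fx, fy, cx, cy and the pixel coordinates are made-up illustration values; the real ones come from the camera calibration). Given the depth at the center of a detection's bounding box, the XYZ position follows from the pinhole camera model:

            # Hypothetical pinhole-model reprojection; all numbers are illustrative.
            def reproject(u, v, z, fx, fy, cx, cy):
                """Map pixel (u, v) with depth z (meters) to camera-centered X, Y, Z (meters).
                X is to the right of the camera, Y is below it (image convention), Z is straight out."""
                x = (u - cx) * z / fx
                y = (v - cy) * z / fy
                return x, y, z

            # e.g. a detection centered at pixel (750, 620) whose depth reads 1.9 meters
            x, y, z = reproject(750, 620, 1.9, fx=860.0, fy=860.0, cx=640.0, cy=400.0)
            print(f"x={x:.3f} m, y={y:.3f} m, z={z:.3f} m")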

        And for good measure, here is our test subject walking over to the chair:

        The results are returned in real time. And the video is optional. We even have a special version that outputs results over SPI, for using DepthAI with microcontrollers like the MSP430. (Contact us at support@luxonis.com if this is of interest.)

        Cheers,
        Brandon & the Luxonis Team

        And here's a video view of the MVP:

        a month later

        Our intern went ahead and got DepthAI working natively on Mac OS X:

        We’ll be writing up instructions soon. Almost all of the work is actually just setting a Mac up for Python development using Homebrew... so if your Mac is already set up for that it pretty much ‘just works’.

        We meant to share this a while ago. So we now have our initial online custom training flow for DepthAI live on Colab.

        https://colab.research.google.com/drive/1Eg-Pv7Amgc3THB6ZbnSaDJm0JAr0QPPU

        So there are two notable limitations currently:

        1. DepthAI currently supports OpenVINO 2019 R3, which itself requires older versions of TensorFlow and so on. So this flow has all those old versions, which causes a lot of additional steps in Colab... a lot of uninstalling current versions of stuff and installing old versions. We are currently in the process of upgrading our DepthAI codebase to support OpenVINO 2020.1, see here. We'll release an updated training flow when that's done.
        2. The final conversion for DepthAI (to .blob) for some reason will not run on Google Colab, so it requires a local machine (a rough sketch of that local step is shown after this list). We're planning on just making our own server for this purpose that Google Colab can talk to in order to do the conversion.
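
        For reference, the local conversion step looks roughly like the sketch below. The OpenVINO install path, the file names, and the bare-bones flags are assumptions based on a typical OpenVINO 2019 R3 layout (and myriad_compile has additional options for tuning SHAVE/CMX allocation), so adjust for your setup:

            # Rough sketch: compile OpenVINO IR into a DepthAI .blob on a local machine.
            # Paths and file names are illustrative assumptions, not exact values.
            import subprocess
            from pathlib import Path

            openvino_dir = Path("/opt/intel/openvino")  # assumed install location
            myriad_compile = openvino_dir / "deployment_tools/inference_engine/lib/intel64/myriad_compile"

            subprocess.run(
                [str(myriad_compile),
                 "-m", "frozen_inference_graph.xml",    # IR produced by the Model Optimizer
                 "-o", "frozen_inference_graph.blob"],  # blob consumed by DepthAI
                check=True,
            )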

        To test the custom training, we took some images of apples and oranges, did a terrible job labeling them, and then trained and converted the network and ran it on DepthAI. It's easy to get WAY better accuracy and detection rates by using something like basic.ai to generate a larger dataset.

        Cheers,
        Brandon

        7 days later
        8 days later

        We now have a more complete training flow:
        https://docs.luxonis.com/tutorials/object_det_mnssv2_training/

        And we used it to train a mask/no-mask model for DepthAI with a quick effort at it over the weekend:

        More images of validation/testing on Google Colab here:
        https://photos.app.goo.gl/FhhUCLTsm6tqBgqL8

        And here's the Google Colab used to train DepthAI on mask/no-mask face detection:
        https://colab.research.google.com/drive/1uY5vekGK7S6uD88d28G861SIRh9yYbjJ

        Hi DepthAI Fans,

        As promised, we have open sourced the DepthAI hardware!

        All the carrier boards for the DepthAI System on Module (SoM), including the Altium design files and all supporting information are below:

        https://github.com/luxonis/depthai-hardware


        So now you can integrate the power of DepthAI into your custom prototypes and products at the board level using the DepthAI System on Module (SoM).

        We can't wait to see what you build with it (and we've already seen some really cool things!).

        Cheers, Brandon & the Luxonis Team

        The Power over Ethernet (PoE) variant of DepthAI is starting to trickle in (after COVID-19 delays)...

        We now have the baseboard (which actually implements the PoE portion):

        So now you can deploy DepthAI all over the place, with up to 328 feet (100 meters) of cable between you and the device! The power of DepthAI, with the convenience of power-over-Ethernet deployment.

        6 days later

        DepthAI on the Jetson TX2. We followed the same build instructions used on Mac OS X (here), and it built w/out even a single complaint and worked on the first try:

        14 days later

        The PoE boards work great! We tested 1000FDX over PoE (from a UniFi switch) and they work exactly as intended.

        Here's DepthAI running on our Power over Ethernet prototypes:

        8 days later

        We launched megaAI on Crowd Supply today!


        4K Video at 30FPS on a Pi, while running object detection in parallel!

        Get yours now before the early bird and roadrunner specials sell out! Only 14 left!

        https://www.crowdsupply.com/luxonis/megaai

        17 days later

        Hi DepthAI (and megaAI) fans!

        So we have a couple customers who are interested in IR-only variants of the global-shutter cameras used for Depth, so we made a quick variant of DepthAI with these.

        We actually just made adapter boards which plug directly into the BW1097 (here) by unplugging the existing onboard cameras. We tested with this IR flashlight here.


        It's a bit hard to see, but you can tell the room is relatively dark in visible light, and the IR cameras pick up the IR light quite well.

        Cheers,

        The Luxonis Team

        More great news coming at you! We've accomplished so much so fast recently that it's hard to keep up with the updates.

        Over the weekend we wrote a driver for the IMX477 used in the Raspberry Pi HQ Camera.

        So now you can use the awesome new Raspberry Pi HQ camera with DepthAI FFC (here). Below are some videos of it working right after we wrote the driver this weekend.

        Notice that it even worked w/ an extra long FFC cable!

        More details on how to use it are here. And remember, DepthAI is open source, so you can even make your own adapter (or other DepthAI boards) from our GitHub here.

        And you can buy the adapter here: https://shop.luxonis.com/products/rpi-hq-camera-imx477-adapter-kit

        Cheers,

        Brandon & the Luxonis team

        8 days later

        We have a super-interesting feature-set coming to DepthAI:

        • 3D feature localization (e.g. finding facial features) in physical space
        • Parallel-inference-based 3D object localization
        • Two-stage neural inference support

        And all of these are initially working (in this PR, here).

        So, on to the details of how this works:

        We are actually implementing a feature that allows you to run neural inference on either or both of the grayscale cameras.

        This sort of flow is ideal for finding the 3D location of small objects, shiny objects, or objects for which disparity depth might struggle to resolve the distance (the z-dimension needed to get the 3D position, XYZ). So this now means DepthAI can be used in two modalities:

        1. As it's used now: the disparity depth results within the region of the object detector's bounding box are used to re-project the XYZ location of the center of the object.
        2. Run the neural network in parallel on both left/right grayscale cameras, and the results are used to triangulate the location of features.

        An example where modality 2 is extremely useful is finding the XYZ positions of facial landmarks, such as eyes, nose, and the corners of the mouth.

        Why is this useful for facial features? For small features like these, the risk of disparity depth having a hole at that location goes up, and even worse, for faces with glasses, the reflection off the glasses may throw the disparity depth calculation off (in fact it might 'properly' give the depth result for the reflected object).

        When running the neural network in parallel, none of these issues exist: the network finds the eyes, nose, and mouth corners in each image, the disparity (in pixels) between where these land in the right and left streams gives the z-dimension (depth = focal_length × baseline / disparity, i.e. proportional to 1/disparity), and this is then reprojected through the optics of the camera to get the full XYZ position of all of these features.
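
        As a worked sketch of that triangulation (the focal length, baseline, and pixel coordinates below are made-up illustration values; the real ones come from the stereo calibration):

            # Hypothetical numbers throughout; real values come from calibration.
            focal_px = 860.0    # focal length in pixels (assumed)
            baseline_m = 0.075  # spacing between the two grayscale cameras (assumed)

            # The same facial landmark as found by the network in the left and right images:
            u_left, v_left = 742.0, 388.0
            u_right = 701.0

            disparity_px = u_left - u_right           # 41 px
            z = focal_px * baseline_m / disparity_px  # depth, roughly 1.57 m here

            # Reproject through the camera intrinsics to get the full XYZ position:
            cx, cy = 640.0, 400.0
            x = (u_left - cx) * z / focal_px
            y = (v_left - cy) * z / focal_px
            print(f"landmark at x={x:.2f} m, y={y:.2f} m, z={z:.2f} m")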

        And as you can see below, it works fine even w/ my quite-reflective anti-glare glasses:

        Thoughts?

        Cheers,
        Brandon and the Luxonis Team

        15 days later

        ​Hi DepthAI Backers and Fans,

        So we've proof-of-concepted an SPI-only interface for DepthAI and it's working well (proof of concept done with MSP430 and Raspberry Pi over SPI).

        So to make it easier for engineers to leverage this power (and also for us internally to develop it), we're making a complete hardware and software/AI reference design for the ESP32, with the primary interface between DepthAI and the ESP32 being SPI.

        The design will still have USB3C for DepthAI, which will allow you to see live high-bandwidth results/etc. on a computer while integrating/debugging communication to your ESP32 code (both running in parallel, which will be nice for debugging). Similarly, the ESP32 will have an onboard UART-USB converter and micro-USB connector for programming/interfacing w/ the ESP32 for easy development/debug.

        For details on the effort and to see progress, see here.

        For details and progress on the hardware effort see here, and to check out the SPI support enhancement in the DepthAI API see here.

        In short here's the concept:

        And here's a first cut at the placement:

        And please let us know if you have any thoughts/comments/questions on this design!

        Best,
        Brandon & The Luxonis Team

        7 days later

        Our production run for the megaAI Crowd Supply campaign is complete and the boards are now shipping to us:

        We had 97% yield on the first round of testing and 99% yield after rework of the 3% that had issues in the first testing.

        5 days later

        Today, our team is excited to release to you the OpenCV AI Kit (OAK): a modular, open-source ecosystem composed of MIT-licensed hardware, software, and AI training that allows you to embed Spatial AI and CV superpowers into your product.

        And best of all, you can buy this complete solution today and integrate it into your product tomorrow.

        Back our campaign today!