• DepthAI
  • Neural Processors and Hosts We've Found (So Far)

Hi folks,

So in down-selecting our architecture we did a TON of work surveying what's available out there to do embedded machine learning. We figured we'd share this here, in case it's helpful to others who are trying to do embedded machine vision. All of the following is public information - it's just sometimes hard to find in the fray. 😀

The highlights are the Kendryte K210 on the lowest end, the Gyrfalcon 2803 in terms of highest performance/W, and the Intel Myriad X, which takes the cake in terms of raw video-processing power (it can do 3 pairs of stereo cameras doing stereo depth simultaneously, in addition to doing MobileNet-SSD at a nice framerate).

This GitHub seems to be the goldmine of everything that's going on, here

Neural Processors:

  • RockChip RK3399Pro, which is finally out, here!
  • SOPHON BM1880, here, eval board order here ($129), documentation here
  • Renesas, here
  • Edge TPU here 4 TOPS
  • NVIDIA Jetson Nano here
    • 0.472 TOPS but leveraging a very efficient software/firmware stack which makes this 'go a lot further'
    • Think the Apple days pre-Intel when they talked about how their CPUs do more per MHz. I.e. NVIDIA has done a lot of other optimization which makes the 0.472 TOPS quite effective/usable.
  • Qualcomm 2 TOPS, Development Kit for sale here
    • Reference design kits for sale here
  • Samsung - Exynos wiki
    • Exynos 9810 - has “deep learning” capability - supposedly available (Q1, 2018). Used in Samsung Galaxy S9/S9+
    • Exynos 9820 - has “Neural Processing Unit” - not out until Q1, 2019?
  • HiSilicon HI3559A, here
    • It’s in this tracking camera, here
    • It seems to be the only thing on par with Apple’s A12 processor, with both of them at 5 TOPS, making it the fastest thing you can actually buy for this application.
    • Looks to have built-in stereo-depth hardware
  • MediaTek P90 here
    • One of the top performers in the Android space, but isn't fully out yet.
  • Gyrfalcon 2803, here
    • 24 TOPS at 1W.
    • Definitely the best TOPS/W of anything available.
    • The only downside is that it can only run specific types of inference because of its novel/low-power inference architecture
  • Inuitive NU4000AI, here
    • 2 TOPS neural, 0.5 TOPS vision processing.
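
To make the numbers above easier to compare, here is a minimal Python sketch that ranks the TOPS figures quoted in this list and computes TOPS/W where the post gives a power figure (only the Gyrfalcon 2803 has one). The names `QUOTED`, `tops_per_watt`, `ranked_by_tops`, and `format_table` are made up for illustration. Vendor TOPS ratings are not measured the same way, so treat the ranking as a rough guide only; as noted for the Jetson Nano, a low rating with a good software stack can go a long way.

```python
# TOPS figures exactly as quoted in the list above.
# watts is None wherever the post does not give a power number.
QUOTED = [
    ("Edge TPU", 4.0, None),
    ("NVIDIA Jetson Nano", 0.472, None),
    ("Qualcomm", 2.0, None),
    ("HiSilicon HI3559A", 5.0, None),
    ("Gyrfalcon 2803", 24.0, 1.0),
    ("Inuitive NU4000AI (neural only)", 2.0, None),
]


def tops_per_watt(tops, watts):
    """Return TOPS/W, or None when no usable power figure is available."""
    if watts is None or watts <= 0:
        return None
    return tops / watts


def ranked_by_tops(entries):
    """Return the entries sorted from highest to lowest quoted TOPS."""
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def format_table(entries):
    """Build a plain-text table, one line per chip, highest TOPS first."""
    lines = []
    for name, tops, watts in ranked_by_tops(entries):
        efficiency = tops_per_watt(tops, watts)
        if efficiency is None:
            efficiency_text = "power not quoted"
        else:
            efficiency_text = f"{efficiency:.1f} TOPS/W"
        lines.append(f"{name:<34}{tops:>7.3f} TOPS  {efficiency_text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table(QUOTED))
```

Adding a wattage for another chip only requires replacing its `None` with a number from the vendor's datasheet.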

New Special Category: Fully Embedded

  • Apollo3 from Ambiq, here, which is in the SparkFun Edge (here) and can run voice recognition with TensorFlow Lite (see Pete Warden's talk, here)
  • Kendryte K210 here
    • PCB solutions start at $9, here
    • Lowest-end and incredible value/price IF it works well
    • But sketchy as to what is actually supported
  • The GreenWave GAP8 solution here looks really neat and is super low power. Can do low-res video at low power and presumably audio/sensor processing at low power. Check out their store here
    • They have an example doing video-based object avoidance for a drone. Really cool!
    • It's the PULP-DroneNet, here and it's so clever and effective. I bet a ton of tiny drones will be using this for automatic and cheap obstacle avoidance.
    • Check out their videos here and here

Hosts:

  • Allwinner solutions - super inexpensive (see Hackaday article here), but the downside is they're relatively power hungry and a bit physically large (which is great for hobbyists, as they're easy to hand-solder).
  • NXP i.MX 7 series, here - great/low power use, really small physical size. More on the expensive side (presumably better if/when in volume).
  • Nationalchip GX6605S SoC, devboard here, Linux images here

Resources:

  • An Android AI chip benchmarking site, here
  • Plot comparison in terms of TOPS/W vs. W here
  • en.wikichip.org - Cool resource with all kinds of semiconductor mfr/device information

Comments
So the only thing that really 'competes' with the Myriad X currently is the HI3559A (EDIT: and the NU4000AI), but as far as I can tell the HI3559A won't be applicable to US markets because of legal restrictions. Does anyone know more about this?
It sure would be awesome for Intel to release a Myriad X with say a built-in RISC-V or similar, so that say Debian can be run on it directly, and it could then act as its own host. It's the most capable of the bunch, but it sure would be nice to not have to have a host processor alongside it in order to run Linux/etc.

The BM1880 is a version of this, but doesn't have the video processing hardware capabilities of the Myriad X. The RK3399Pro is another version of this, but it doesn't seem to be out yet.

The integrated host solution, particularly if it could run Debian, would dominate the market of embedded computer vision/machine learning, I think. Thoughts on that, everyone?

On the low end, the Kendryte K210 seems like it's the winner. Its 'big' development board is $50 (here), including screen and camera. The smaller development board with camera and display is $20, and a PCB module is $9, including WiFi, here. It's got a pretty fast-growing community and set of hardware modules available for it.

Brandon changed the title to Neural Processors and Hosts We've Found (So Far) .

I've encountered the Inuitive NU3000/NU4000 http://www.inuitive-tech.com/product/nu4000/

Ambarella also came out with the CV22/CV25 https://www.ambarella.com/products/vr-cameras/vr-camera-products

My Google Pixel 3XL phone also has a pretty fast VPU that is currently used to improve picture quality. https://www.fastcompany.com/90247454/the-pixel-3-puts-googles-extraordinary-ai-in-your-pocket
https://en.wikipedia.org/wiki/Pixel_Visual_Core

ARM has also come out with its AI IP. https://developer.arm.com/products/processors/machine-learning/arm-ml-processor

    I'm thinking it's not all about speed and efficiency of the chip. If the Myriad X is anything like the USB NCS 2 then, thanks to the abundance of 'out of the box' solutions, by the time your soldering iron has cooled down you'll be detecting your neighbour's sheep?

      hanooi Thanks! I had seen the ARM stuff but hadn't seen the Inuitive and Ambarella solutions. There's SO much out there right now.

      So I'm thinking a Wiki-type solution might be helpful here for keeping track? That or I can just manually add all the notes/etc. back up to the top/etc.

      Thoughts?

      6 days later

      hanooi

      Just realized that Occipital's new Structure Core uses the NU3000 that you mentioned:

      I ran into the owner of Occipital here in Boulder back in 2012, climbing, actually. At the time they were doing panoramic apps for iOS devices (before iOS had this built in). It's cool to see how much they've expanded.

      Here's more on their product:
      Buy Structure Core

      @hanooi - do you happen to know if the NU4000AI can host its own Linux install?

      Their block diagram on their reference design makes it seem like you need to use it with an external host. But can't the Arm Cortex A5 in there run Linux?

      My understanding of the Inuitive NU3000 and NU4000 chips is that they are very similar architecturally to the Movidius Myriad. Instead of SPARC cores, they use an A5.

      http://www.inuitive-tech.com/product/nu4000/

      I don't believe they can run Linux. Never seen one that did. I think only the Ambarella chips can run Linux since they come equipped with quad core ARM A53's, making them similar to Raspberry Pi's but at a smaller process node and vision built in.

      I tend to view the NU3000/NU4000 as a poor man's Movidius Myriad 2/X. The Structure Core looks like it was built off of the Inuitive Veronica reference platform:
      http://www.inuitive-tech.com/product/veronica/

      Thanks for the quick reply!

      Yes, I was thinking exactly the same. I saw that and thought, "oh, they used the reference design and built off of it, clever!" Or something to that effect.

      Anyways, my concern with Ambarella is that it'll be impossible to get their attention, given that they're in like every camera. Thoughts? Otherwise they look pretty awesome.

      Thanks again,
      Brandon

      Ambarella don't talk to the small folk like us. Serves them right. Now their business is declining since they never refilled their opportunity pipeline now that GoPro is past its prime.

        hanooi Ha. Well said. So I'm going to update the main post to include what you responded with, to make it easier for others to find stuff.

        Just wanted to give you a heads up so you don't think I'm stealing your finds. ;-)

        5 days later
        11 days later

        Just added the new SparkFun Edge to the top post. Along with the Kendryte K210, it's one of the first really low-cost, low-complexity neural inference options.

        And that board looks interesting hanooi . Sorry for the delayed response... I had missed the post and only saw it when coming here to update about the SparkFun part.

        Do you see if the Snapdragon solution gives a TOPS (or GOPS) rating on the NN part? I didn't see it on there.

        Got the K210 SiSPEED modules in today.

        Also got these in the mail yesterday. Been too busy with the Myriad X stuff to play with them, but we've heard great things from those who have. MobileNet SSD at 80FPS!




          2 months later
          a month later

          Gyrfalcon, in partnership with SolidRun, seems to have released a really compelling module:
          https://www.notebookcheck.net/SolidRun-announces-the-i-MX-8M-Mini-SOM-Dev-Board-with-Gyrfalcon-AI-acceleration-for-next-gen-Edge-AI-applications.423039.0.html

          It's an i.MX 8M Mini coupled with their 16 TOPS (gasp!) neural inference chip. So you get the host and the inference accelerator, eerily similar to the Edge TPU module (and I think it even uses the same connectors: