#!/usr/bin/env python3
from pathlib import Path
import cv2
import depthai as dai
import numpy as np
import time
# Path to the RVC2 NN archive (a fine-tuned YOLOv6n checkpoint)
NNPath = 'YOLOv6n_finetune_V9_best_ckpt_2-5-25.rvc2.tar.xz'
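# A minimal guard (an addition, not in the original flow): fail fast on a bad
# archive path instead of letting dai.NNArchive raise a less obvious error later
if not Path(NNPath).exists():
    raise FileNotFoundError(f"NN archive not found: {NNPath}")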
# Create pipeline
with dai.Pipeline() as pipeline:
    # Disable XLink chunking so frames are sent in one transfer (lower latency)
    pipeline.setXLinkChunkSize(0)

    # Define the colour camera
    ColourCam = pipeline.create(dai.node.Camera).build(dai.CameraBoardSocket.CAM_A, (2024, 1520), 30)
    # Crop the stream down to the 640x640 input the NN expects
    ColourCamNN = ColourCam.requestOutput((640, 640), dai.ImgFrame.Type.BGR888p, dai.ImgResizeMode.CROP, 30)
    # Define the object detector
    NNArchive = dai.NNArchive(NNPath)
    YOLOv6Network = pipeline.create(dai.node.DetectionNetwork).build(ColourCamNN, NNArchive, 0.8)  # 0.8 confidence threshold
    # Run inference on 2 threads, each with 1 NCE and 6 SHAVE cores
    YOLOv6Network.setNumNCEPerInferenceThread(1)
    YOLOv6Network.setNumInferenceThreads(2)
    YOLOv6Network.setNumShavesPerInferenceThread(6)
    YOLOv6Network.input.setBlocking(False)  # drop frames instead of stalling the camera
    labelMap = YOLOv6Network.getClasses()
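    # Optional sanity check (an addition, not required by the API): print the
    # class labels parsed from the archive so a model/label mismatch is obvious
    print(f"Loaded {len(labelMap)} classes: {labelMap}")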
    # Output queues
    qRgb = YOLOv6Network.passthrough.createOutputQueue()
    qDet = YOLOv6Network.out.createOutputQueue()
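    # Note (an assumption about the v3 overload, not used above): if the host
    # falls behind, bounded non-blocking queues such as
    # createOutputQueue(maxSize=4, blocking=False) keep only the freshest frames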
    pipeline.start()

    frame = None
    detections = []
    startTime = time.monotonic()
    counter = 0
    color2 = (255, 255, 255)
    # Detections carry bounding-box coordinates normalized to the <0..1> range -
    # scale them by the frame width/height to get pixel coordinates
    def frameNorm(frame, bbox):
        normVals = np.full(len(bbox), frame.shape[0])
        normVals[::2] = frame.shape[1]  # even indices are x values, scaled by width
        return (np.clip(np.array(bbox), 0, 1) * normVals).astype(int)
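    # Worked example: on a 640x640 frame, a normalized box (0.25, 0.25, 0.75, 0.75)
    # maps to the pixel box (160, 160, 480, 480).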
    def displayFrame(name, frame):
        color = (255, 0, 0)
        for detection in detections:
            bbox = frameNorm(frame, (detection.xmin, detection.ymin, detection.xmax, detection.ymax))
            cv2.putText(frame, labelMap[detection.label], (bbox[0] + 10, bbox[1] + 20),
                        cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
            cv2.putText(frame, f"{int(detection.confidence * 100)}%", (bbox[0] + 10, bbox[1] + 40),
                        cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
            cv2.rectangle(frame, (bbox[0], bbox[1]), (bbox[2], bbox[3]), color, 2)
        # Show the frame
        cv2.imshow(name, frame)
    while pipeline.isRunning():
        # Non-blocking reads; either queue may return None on any given pass
        inRgb = qRgb.tryGet()
        inDet = qDet.tryGet()
        if inRgb is not None:
            frame = inRgb.getCvFrame()
            cv2.putText(frame, "NN fps: {:.2f}".format(counter / (time.monotonic() - startTime)),
                        (2, frame.shape[0] - 4), cv2.FONT_HERSHEY_TRIPLEX, 0.4, color2)
        if inDet is not None:
            detections = inDet.detections
            counter += 1
        if frame is not None:
            displayFrame("rgb", frame)
            print("FPS: {:.2f}".format(counter / (time.monotonic() - startTime)))
        if cv2.waitKey(1) == ord("q"):
            pipeline.stop()
            break
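    # Close the preview window once the pipeline has stopped
    cv2.destroyAllWindows()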