That's awesome to hear hussain_allawati!
So on line 77 (with dai.Device) what happens is that the pipeline gets serialized and sent to the OAK camera (over USB, Ethernet, ...), where the pipeline is actually constructed in the firmware. Then you add your queues that send/receive data to/from the device, and usually a while loop to continuously run the operations (get data, display frames, ...).
Thanks, Erik
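
For reference, a minimal sketch of that flow, assuming a ColorCamera preview stream and the DepthAI v2 Python API (the stream name and sizes below are just placeholders):

import cv2
import depthai as dai

# Define the pipeline on the host; nothing runs on the device yet
pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(300, 300)

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("preview")
cam.preview.link(xout.input)

# Entering the context serializes the pipeline, uploads it (USB/Ethernet),
# and the firmware constructs and starts it on the OAK
with dai.Device(pipeline) as device:
    q = device.getOutputQueue(name="preview", maxSize=4, blocking=False)
    while True:
        frame = q.get().getCvFrame()   # blocking get of the next frame
        cv2.imshow("preview", frame)
        if cv2.waitKey(1) == ord("q"):
            break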

    erik Is there documentation or a course that explains the pipeline and nodes in detail? I went through this but still have some doubts regarding the issue.

    Thanks again,


      erik Currently I can't pin down my exact doubts. I will go through the documentation you suggested, test some code, and keep you updated.

      Thanks again,

        10 days later

        hussain_allawati my pipeline has several functions, such as: Object Recognition, Face Recognition, Text to Speech, and I/O buttons.

        Your "program" will have all of that, but "pipeline" is the term for what happens inside the OAK device. So your pipeline will consist of camera node(s) connected to a face-recognition NN node AND an object-recognition NN node, with all of those going to output nodes.
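
        A rough sketch of such a pipeline, assuming two already-converted blobs (the blob file names are placeholders, and both models are assumed to accept the same 300x300 preview):

        import depthai as dai

        pipeline = dai.Pipeline()

        cam = pipeline.create(dai.node.ColorCamera)
        cam.setPreviewSize(300, 300)   # must match the NN input resolution
        cam.setInterleaved(False)

        # Two NN nodes fed by the same camera preview
        face_nn = pipeline.create(dai.node.NeuralNetwork)
        face_nn.setBlobPath("face_recognition.blob")      # placeholder path
        cam.preview.link(face_nn.input)

        obj_nn = pipeline.create(dai.node.NeuralNetwork)
        obj_nn.setBlobPath("object_recognition.blob")     # placeholder path
        cam.preview.link(obj_nn.input)

        # Output (XLinkOut) nodes stream each NN's results back to the host
        for name, nn in (("face", face_nn), ("object", obj_nn)):
            xout = pipeline.create(dai.node.XLinkOut)
            xout.setStreamName(name)
            nn.out.link(xout.input)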

        5 days later

        erik

        Erik, I made myself familiar with the API, pipeline, and nodes. I also ran the examples available.

        For now, I would like to use pretrained models to perform object detection and scene classification.

        I am thinking of using the MS COCO dataset for objects, and the Places365 dataset for scene classification.
        Several pretrained models exist for these datasets. They use various architectures.
        For example, the Places365 dataset has models trained using the VGG-16, GoogLeNet, ResNet, and AlexNet architectures.

        I downloaded the VGG-16 Places365 model and successfully converted it to a .blob using the online converter tool.

        Now my questions are:

        1) How do I use the converted model and decode its output?
        2) Does using a converted model differ depending on the original model? In other words, if I convert several models originating from different frameworks (Caffe, TensorFlow, etc.) or from different architectures (VGG-16, ResNet, etc.), is the way to use them and decode their results the same, or does it differ from one to the other?
        3) Which architecture is preferred on OAK devices (i.e. the one the hardware is optimized for)?


          Hello hussain_allawati ,
          1) See the EfficientDet demo here. Decoding really depends on the model; I would use the decoding code that is usually present in the model's repo (e.g. in the evaluation steps). A rough decoding sketch for a classification model follows below.
          2) You can run any AI model on the OAK (as long as all of its layers are supported). Since models have different outputs, decoding also differs.
          3) I don't think the Myriad X VPU was really optimized for any specific model, but you can see performance numbers below (or here).

          Thanks, Erik
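
          Regarding point 1), here is a rough idea of what host-side decoding might look like for a classification model such as Places365; the softmax step and the labels argument are assumptions, not something taken from this thread:

          import numpy as np

          def decode_classification(nn_data, labels):
              # nn_data: depthai.NNData message read from the NN node's output queue
              # labels: list of class names, loaded from the model's repo
              scores = np.array(nn_data.getFirstLayerFp16())
              # apply softmax in case the blob outputs raw logits
              probs = np.exp(scores) / np.sum(np.exp(scores))
              top = int(np.argmax(probs))
              return labels[top], float(probs[top])

          # usage inside the host loop, e.g.:
          # label, confidence = decode_classification(q_nn.get(), labels)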

            erik

            Thank you for your informative reply.
            The EfficientDet demo you referred to does object detection.
            Could you please point me to something similar that does image classification?


              erik Erik, unfortunately, I am lost!
              I spent an entire day trying to use the Places365 VGG-16 model (after converting it to a blob).
              The issue is that I don't have much knowledge about NNs, and hence got stuck trying to understand how to use the model. I know this discussion might not be directly related to DepthAI, but I would be glad if you could guide me!

              Thanks,

                6 days later

                erik Erik, I looked at the examples you suggested.

                Currently, I want to perform the following transformation to match the model requirements:

                from torchvision import transforms as trn
                centre_crop = trn.Compose([
                    trn.Resize((256, 256)),
                    trn.CenterCrop(224),
                    trn.ToTensor(),
                    trn.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                ])

                How can I apply such a transformation on DepthAI?