• DepthAI
  • Yolo v4 Implementation Problem

Hello! I trained Yolo v4 to detect 3 classes of medical mask wearing (with mask, without mask, and incorrectly worn mask).
I followed all the steps described in the Colab tutorial (https://colab.research.google.com/github/luxonis/depthai-ml-training/blob/master/colab-notebooks/Easy_TinyYOLOv4_Object_Detector_Training_on_Custom_Data.ipynb) locally and created blob files for different resolutions (416x416 and 608x608), compiled for 6 shaves. But I can't run these blob files on my OAK-D. I tried three ways to do it:

  1. Using the gen1 pipeline terminates execution; some of the errors are shown in the screenshots below:

  2. I tried to use the gen2 pipeline and it seems to be OK, but I need to correctly decode the yolov4 output; the result is in the screenshot.

  3. Looking for a configuration to decode the NN output, I found one of your solutions for YOLO with the gen2 pipeline (https://docs.luxonis.com/projects/api/en/gen2_develop/samples/22_tiny_tolo_v3_decoding_on_device/?highlight=yolo), but it produces a device error/misconfiguration.

I'm a bit confused about how to get YOLO running; what can I do here? Thanks

  • erik replied to this.

    Hello asidsunrise,
    could you share the code of main2.py where you tried to use yolo decoding on the device? Did you only change labels and blob path?
    Thanks, Erik

      erik You are right, I changed only the labels and the blob path. The only difference between the version of the gen2 pipeline code that works and the one that doesn't is using pipeline.createNeuralNetwork() instead of pipeline.createYoloDetectionNetwork().

      main2.py code provided offline.

      erik I added the YOLO blob file with resolution 608x608 (6 shaves) to Google Drive; it may be easier to find the problem this way, or to advise me what to do. Thanks!

      Well, I found my mistake in the code: the value passed to detectionNetwork.setNumClasses() had remained 80 instead of 3. After correcting it I was able to get the preview window running, but there are far too many predictions. I think there were mistakes in compiling the blob, because the PyCharm output says the input dimension of the blob's NN is 416x416.
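
      For what it's worth, here is a minimal sketch of the gen2 on-device decoding setup, assuming a 416x416 tiny-YOLO blob; the blob path below is a placeholder, and the anchors/masks are the tiny-YOLO defaults from the Luxonis examples, so they have to be adjusted if the training .cfg used different values:

      import depthai as dai

      pipeline = dai.Pipeline()

      # On-device YOLO decoding node (instead of a plain createNeuralNetwork())
      detectionNetwork = pipeline.createYoloDetectionNetwork()
      detectionNetwork.setBlobPath("yolov4_tiny_mask_416.blob")  # placeholder path

      # These values must match the trained model, not the 80-class COCO defaults
      detectionNetwork.setNumClasses(3)
      detectionNetwork.setCoordinateSize(4)
      detectionNetwork.setAnchors([10, 14, 23, 27, 37, 58, 81, 82, 135, 169, 344, 319])
      detectionNetwork.setAnchorMasks({"side26": [1, 2, 3], "side13": [3, 4, 5]})
      detectionNetwork.setIouThreshold(0.5)
      detectionNetwork.setConfidenceThreshold(0.5)

      (Leaving setNumClasses() at the COCO default of 80 with a 3-class blob is exactly the mismatch described above.)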

        asidsunrise So the RGB preview output is actually configurable, and right now the RGB output fed to the neural network input is required to match what the network is expecting. Our default script for tiny YOLO uses 416x416. So it's possible that was left at the default while your model is actually a different resolution, in which case weird predictions would make sense (see the sketch below for matching the two).

        Are you running your 416 or 608 model in this case?

        And CC: @erik

        Thanks,
        Brandon
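
        For reference, a minimal gen2 sketch of tying the color camera preview to the network input, assuming a 416x416 blob; the preview size and blob path are placeholders and have to match the actual model:

        import depthai as dai

        pipeline = dai.Pipeline()

        camRgb = pipeline.createColorCamera()
        # The preview size must equal the network's input resolution (416x416 or 608x608)
        camRgb.setPreviewSize(416, 416)
        camRgb.setInterleaved(False)
        camRgb.setColorOrder(dai.ColorCameraProperties.ColorOrder.BGR)

        detectionNetwork = pipeline.createYoloDetectionNetwork()
        detectionNetwork.setBlobPath("yolov4_tiny_mask_416.blob")  # placeholder path

        # Feed camera frames straight into the network and stream detections to the host
        camRgb.preview.link(detectionNetwork.input)
        xoutNN = pipeline.createXLinkOut()
        xoutNN.setStreamName("detections")
        detectionNetwork.out.link(xoutNN.input)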

          Brandon, the config file for Darknet training was made for 608 resolution, and the best-weights file of the trained model was for 608x608. But after converting the model to TF format and optimizing it with the OpenVINO Model Optimizer, it became 416x416, as I can see in the generated .xml file. Setting the RGB camera resolution to 608x608 in the gen2 pipeline project also produces a warning that the network input is 416x416.

          I tried to compile a blob file for 608 resolution, but with no success, as I don't have a deep understanding of the OpenVINO optimization process. Manually modifying the shapes in the .xml file produces a model that the myriad_compile tool cannot compile.
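
          For reference, this is roughly how the input shape of the generated IR can be checked with the Inference Engine Python API before compiling the blob (file names below are placeholders):

          from openvino.inference_engine import IECore

          ie = IECore()
          # Placeholder names for the IR produced by the Model Optimizer
          net = ie.read_network(model="frozen_darknet_yolov4_model.xml",
                                weights="frozen_darknet_yolov4_model.bin")

          for name, info in net.input_info.items():
              # Expect [1, 3, 608, 608] if the 608x608 configuration survived conversion
              print(name, info.input_data.shape)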

            Hi asidsunrise,

            Got it. I'm not 100% sure here either, so I think @GergelySzabolcs could likely help. In the meantime, here is a script that one of our community members (geax, Pierre Mangeot) wrote for converting the input resolution of models in OpenVINO. It doesn't work for all models, so I'm not sure whether it will work here:

            His notes/examples:

            Example:

            python reshape_openvino_model.py -m models/human-pose-estimation-0001.xml -r 128x228
            Loading network files:
                    models/human-pose-estimation-0001.xml
                    models/human-pose-estimation-0001.bin
            reshape_openvino_model.py:27: DeprecationWarning: Reading network using constructor is deprecated. Please, use IECore.read_network() method instead
              net = IENetwork(model=model_xml, weights=model_bin)
            Input blob: data - shape: [1, 3, 256, 456]
            Output blob: Mconv7_stage2_L1 - shape: [1, 38, 32, 57]
            Output blob: Mconv7_stage2_L2 - shape: [1, 19, 32, 57]
            Reshapping to 128x224
            Input blob: data - new shape: [1, 3, 128, 224]
            Output blob: Mconv7_stage2_L1 - new shape: [1, 38, 16, 28]
            Output blob: Mconv7_stage2_L2 - new shape: [1, 19, 16, 28]
            Saving reshaped model in models/human-pose-estimation-0001_128x224.xml and models/human-pose-estimation-0001_128x224.bin

            The script should probably be adapted for models other than human-pose-estimation-0001, in particular for models that have more than one input 😀
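
            The core of his approach, as far as I understand it, is an Inference Engine reshape plus serialize, roughly like the sketch below (assuming a single input named 'data' as in the log above; untested here):

            from openvino.inference_engine import IECore

            ie = IECore()
            net = ie.read_network(model="models/human-pose-estimation-0001.xml",
                                  weights="models/human-pose-estimation-0001.bin")

            # Assume a single input; 'data' in the log above
            input_name = next(iter(net.input_info))
            n, c, h, w = net.input_info[input_name].input_data.shape

            # Target resolution (the log above shows 128x228 being adjusted to 128x224)
            net.reshape({input_name: [n, c, 128, 224]})

            # Write the reshaped IR back to disk
            net.serialize("models/human-pose-estimation-0001_128x224.xml",
                          "models/human-pose-estimation-0001_128x224.bin")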

            @GergelySzabolcs will likely have more pertinent advice than me here though. :-)

            -Brandon

            GergelySzabolcs Thanks, correcting the shapes in the .pb file helped me compile a blob file with a resolution of 608. But the network doesn't detect anything. I trained the YOLOv4-tiny model again for a resolution of 416 and compiled the .blob, and it works, but with the same false detections as in the last screenshot of this discussion. I'll try to play with the conversion parameters and will report back if I succeed.

            Thanks @asidsunrise. I'm not sure of the root cause of the false positives. Usually it's a missing normalization step or some other preprocessing issue. In our Discord community there are some members who are talented at this and may be able to help debug, if you'd like: https://discord.gg/EPsZHkg9Nx

            https://docs.luxonis.com/en/latest/pages/faq/#my-model-requires-pre-processing-normalization-for-example-how-do-i-do-that-in-depthai

            I haven't personally done enough model conversion to advise though. That said, both Roboflow and Super Annotate have done tutorials on training and deploying. So here is Super Annotate in case it's helpful:

            Thoughts?

            Thanks,
            Brandon

            It works! As I understand it, the problem was in the OpenVINO and dependency versions. I installed the latest OpenVINO 2021.2.185 (previously openvino_2020.3.341), updated the TF version to 1.15.2, and everything compiled as planned with the same input weights file. No artifacts or false positive predictions on the OAK; thanks for your support.
            *If it is possible, please delete my message with the main2.py code; it is very large and badly formatted. Thanks!

            5 days later

            Hi @asidsunrise,

            Awesome! Thanks for circling back, and sorry about the delay; I meant to follow up earlier. Just deleted the main2.py post as well.

            Excited that this is working for you - and thanks again for updating with which versions etc. worked! (This will probably help others in the future.)
            -Brandon