I am looking to detect and perform OCR on small moving objects that appear unpredictably. Both the lighting conditions and the motion would be highly variable. My idea is to use object detection to determine that the object is present, and then adjust camera exposure, resolution, etc. specifically to make the object readable, disregarding every other part of the frame. For instance, if a person wearing a badge were standing in front of a bright light at night, the badge would be detected (but unreadable), and the camera and processing would then tune the settings so that the badge becomes readable, even if that leaves the rest of the frame, including the bright light, badly exposed. Those settings would be held for as many frames as needed to capture the information on the badge.

I am aware this is not trivial, but if this is possible on this platform, I would be grateful for any information on what would be required to get started.

  • Hi vital,

    I would recommend looking at this example. It is currently configured so that you can control the auto-exposure ROI either manually or using a NN.
    The first step would be to train a neural network model (or perhaps use an existing one) that is capable of detecting the small objects (a badge, for example) and returning the ROI of each detection. You could then instruct the camera to apply auto exposure to that ROI.
    The image inside the ROI would then be passed to a second neural network that performs something like text recognition. In the end, the structure should be pretty similar to our face recognition example.
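To make the "apply auto exposure to that ROI" step a bit more concrete, here is a minimal sketch in plain Python that turns a normalized detection box into a pixel region. The helper name `bbox_to_ae_region`, the `Region` tuple, and the `min_size` floor are my own illustrative choices, not part of any library. As far as I know, DepthAI's `CameraControl.setAutoExposureRegion(startX, startY, width, height)` takes values of this shape, but please check the docs for which frame size the region is expected to be relative to.

```python
from typing import NamedTuple, Sequence


class Region(NamedTuple):
    """Pixel rectangle: top-left corner plus size (hypothetical helper type)."""
    x: int
    y: int
    width: int
    height: int


def bbox_to_ae_region(bbox: Sequence[float], frame_w: int, frame_h: int,
                      min_size: int = 16) -> Region:
    """Convert a normalized (xmin, ymin, xmax, ymax) box to a pixel region.

    min_size is an assumed floor so that a very small detection still gives
    the exposure algorithm a few pixels to meter on.
    """
    # Clamp the normalized coordinates to [0, 1]; detectors can overshoot.
    xmin, ymin, xmax, ymax = (min(max(v, 0.0), 1.0) for v in bbox)

    x0, y0 = round(xmin * frame_w), round(ymin * frame_h)
    x1, y1 = round(xmax * frame_w), round(ymax * frame_h)

    # Enforce the minimum size without exceeding the frame itself.
    w = min(max(x1 - x0, min_size), frame_w)
    h = min(max(y1 - y0, min_size), frame_h)

    # Shift the region back inside the frame if it now overhangs an edge.
    x = max(0, min(x0, frame_w - w))
    y = max(0, min(y0, frame_h - h))
    return Region(x, y, w, h)


region = bbox_to_ae_region((0.5, 0.5, 0.6, 0.6), 1920, 1080)
print(region)
```

The four values in `region` are what you would hand to the camera control message each time the detector reports a new box. Whether to update on every frame or only when the box moves noticeably is a tuning decision I have not tested.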
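For the second stage, the crop handed to the text-recognition model usually benefits from a little margin around the detection. The sketch below is only an illustration under my own assumptions: `padded_crop_box`, `crop`, and the `pad` fraction are hypothetical names and values, and the frame is modelled as a plain list of pixel rows so the example runs without any dependencies. On the device itself the crop would normally be done by the pipeline rather than in host Python.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels, x1/y1 exclusive


def padded_crop_box(bbox: Sequence[float], frame_w: int, frame_h: int,
                    pad: float = 0.15) -> Box:
    """Grow a normalized (xmin, ymin, xmax, ymax) box by `pad` of its own
    size on each side, then convert to pixel coordinates clipped to the frame."""
    xmin, ymin, xmax, ymax = bbox
    pad_x = (xmax - xmin) * pad
    pad_y = (ymax - ymin) * pad
    x0 = max(0, round((xmin - pad_x) * frame_w))
    y0 = max(0, round((ymin - pad_y) * frame_h))
    x1 = min(frame_w, round((xmax + pad_x) * frame_w))
    y1 = min(frame_h, round((ymax + pad_y) * frame_h))
    return x0, y0, x1, y1


def crop(frame: List[list], box: Box) -> List[list]:
    """Cut the box out of a frame stored as a list of rows."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in frame[y0:y1]]


# Tiny synthetic 10x10 "frame" where each pixel value encodes its position.
frame = [[y * 10 + x for x in range(10)] for y in range(10)]
box = padded_crop_box((0.2, 0.2, 0.6, 0.6), 10, 10, pad=0.25)
badge = crop(frame, box)
print(box, len(badge), len(badge[0]))
```

With a NumPy image the same box can be applied as `frame[y0:y1, x0:x1]`. The right amount of padding depends on the recognition model, so treat 0.15 as a starting guess only.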

    Hope this helps,
    Jaka

    Thank you very much. Would an OAK-1 or OAK-1 Lite with a Pi 4 or similar be able to handle that?

    EDIT: To clarify, I mean for the edge device doing inference.

      Hi vital
      The whole pipeline runs on the OAK device. It will be able to handle the pipeline, but the FPS depends heavily on the input resolution you choose and on the complexity of the NN models.

      Hope this helps 🙂
      Jaka