CUSTOM OCR TRAINING

dhunjoshi · Jan 11, 2021

I want to train a custom OCR model. Can I get help with this? I would prefer to train on my local system.

Brandon · Jan 11, 2021

So there are two parts to our OCR system:

1. Text detection. We used EAST, as it allows text to be oriented at odd angles. I am pinging our engineer to ask whether he did any retraining. In any case, here is one example of retraining it in TensorFlow, which should then be compatible with our platform, since OpenVINO supports TensorFlow.
2. OCR. This takes the regions found by the text detection and runs the actual OCR on them. For this network, we use the OCR model from Intel: https://docs.openvinotoolkit.org/2019_R1/_text_recognition_0012_description_text_recognition_0012.html

I do not currently know whether Intel provides a reference for retraining that model. They do provide retraining instructions for some of their networks, but from a quick look I am not seeing any for this one.

The notes on it describe a "VGG16-like backbone and bidirectional LSTM encoder-decoder". So most likely any similar network could be used instead, as long as it is built on a similar backbone (or one of these), or uses neural operations that OpenVINO supports for the VPU (the VPU here being the one on the OAK-D), see here.

dhunjoshi · Jan 12, 2021

Will OCR work on the attached image, or do I have to retrain with new images?

demoacct01 · Jan 12, 2021

I tried EAST, as Brandon suggested, and got the result shown.

By the way, Brandon, are there any OCR examples?

Brandon · Jan 12, 2021

Thanks @demoacct01. To your question: we do have an in-progress example, but it has been crashing on some of the warp/reshape steps when taking the EAST bounding boxes and feeding them through WARP/deWARP on the Myriad X.

We just fixed the crash (got the message that it was fixed just now, actually). So the last step is the OCR output decoding. But the whole pipeline is actually now running.
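For readers unfamiliar with the warp step: a rotated text box is usually turned into four corner points before it is cropped. The following is only a geometry sketch, not the code from the demo. The function name, the corner ordering, and the angle convention (degrees, in image coordinates with y pointing down) are assumptions for illustration.

```python
import math

def rotated_box_corners(cx, cy, width, height, angle_deg):
    """Return the four corners of a box centered at (cx, cy), rotated by angle_deg.

    Corners are listed starting from the one that is top-left when angle_deg is 0,
    then proceeding around the box. Hypothetical helper, for illustration only.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = width / 2.0, height / 2.0
    offsets = ((-half_w, -half_h), (half_w, -half_h), (half_w, half_h), (-half_w, half_h))
    corners = []
    for dx, dy in offsets:
        # Standard 2D rotation of the offset, followed by translation to the center.
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners
```

The resulting points are what a perspective or affine warp would take as its source quadrilateral when straightening a detected text region.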

Here's the Github issue:
https://github.com/luxonis/depthai/issues/124

And here's the WIP PR for the whole demo flow:
https://github.com/luxonis/depthai-experiments/pull/26

We will likely update the PR shortly with the whole flow working, except for the host-side decoding of the OCR output. We're discussing the timing on that internally.

Thoughts?

Thanks again,
Brandon

demoacct01 · Jan 13, 2021

Hi Brandon, I checked out the gen2_ocr branch and gave it a test run. It runs slowly ... :-(

I tried moving the paper (with text on it) that I was holding in front of the camera. It was as if it never got past the first few frames. The output window just froze, with bounding boxes on some of the text on the paper I was holding.

When I did pip install -r requirements.txt, I got this error.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
depthai-gui 1.0.7 requires depthai==0.0.2.1+d436ec6b629c09b92c58d869e80aac52367a3aa9, but you have depthai 0.0.2.1+9430403bc960388da512c6c8936c27f8d1fa8b2d which is incompatible.

Could this cause the "freeze" ?
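A quick way to see which depthai build is actually installed, and whether it matches a pinned version, is to compare the two directly. The sketch below is a minimal check, assuming Python 3.8+ (for `importlib.metadata`) and a requirements file that pins with `==`. The helper names are hypothetical, and the string comparison is only a rough test, since pip may normalize version strings.

```python
from importlib import metadata

def pinned_version(requirements_text, package):
    """Return the version pinned with '==' for package, or None if not pinned."""
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue  # skips blank lines and options such as --extra-index-url
        name, version = line.split("==", 1)
        if name.strip().lower() == package.lower():
            return version.strip()
    return None

def installed_version(package):
    """Return the installed version of package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def check_pin(requirements_text, package):
    """Return (pinned, installed, matches) for a package."""
    wanted = pinned_version(requirements_text, package)
    found = installed_version(package)
    return wanted, found, wanted is not None and wanted == found
```

For example, `check_pin(open("requirements.txt").read(), "depthai")` would show whether the installed depthai matches the pin in the example's requirements file.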

Brandon · Jan 13, 2021

Hi demoacct01,

Sorry about the trouble. The OCR is likely the slow part. The detection itself is a lot faster:

This example is still a work in progress. Luxonis-Alex is finishing it up now. We actually lost the engineer who was working on this (another one to Amazon!) so it got delayed as a result.

This PR is what will allow for the flow to be on DepthAI/OAK directly:
https://github.com/luxonis/depthai-shared/pull/16

It's what took a lot of time/debugging here. So now that this is out of the way, hopefully we will have the whole flow working well soon.

Thanks and sorry about the delay.
-Brandon

demoacct01 · Jan 13, 2021

No problem, Brandon ... I am working on an OCR project at the moment, so I was eager to see a working sample from your end.

I hope I didn't misunderstand your response. Are you saying that if I "git checkout gen2_ocr" and run main.py I should see a speed improvement? I tested it and it was slightly faster, but not as fast as what was shown in the YouTube video. I realised the YouTube video is from last year, so I am a bit confused now about whether the fix is complete or not.

Luxonis-Alex · Jan 13, 2021

Hi @demoacct01,
The demo in the YouTube video runs only the text detection (bounding boxes) network, whereas the Gen2 one https://github.com/luxonis/depthai-experiments/pull/26 runs both text detection and text recognition, passing each cropped bounding box to the recognition model. So depending on the number of detections, it may be slower.

I'm working now on this Gen2 example, moving the rotated cropping from host (OpenCV warp) to run on device (with the recent rotated cropping/rescale added in ImageManip node), and will add a decoder for the recognition model outputs as well: https://docs.openvinotoolkit.org/2020.1/_models_intel_text_recognition_0012_description_text_recognition_0012.html
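For anyone wondering what such a decoder involves: models of this kind are typically decoded with greedy CTC decoding, which takes the best symbol at each time step, collapses repeats, and drops blanks. The sketch below is not the decoder from the PR. It assumes the alphabet `0123456789abcdefghijklmnopqrstuvwxyz#` with `#` as the CTC blank and one score vector per time step, which is how I recall the linked model description; please verify against that page.

```python
# Assumed symbol order for text-recognition-0012; '#' is taken to be the CTC blank.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz#"
BLANK = ALPHABET.index("#")

def ctc_greedy_decode(logits):
    """Greedy CTC decode.

    logits: a sequence of time steps, each a sequence of len(ALPHABET) scores.
    Returns the decoded string.
    """
    decoded = []
    previous = BLANK
    for step in logits:
        # Index of the highest-scoring symbol at this time step.
        best = max(range(len(step)), key=lambda i: step[i])
        # Emit only on a change of symbol, and never emit the blank.
        if best != BLANK and best != previous:
            decoded.append(ALPHABET[best])
        previous = best
    return "".join(decoded)
```

If the network output arrives as a flat buffer, it would first need to be reshaped into one row of scores per time step before being passed to this function.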

asidsunrise · Jan 28, 2021

Hello! I tried to use the gen2-ocr project from GitHub
https://github.com/luxonis/depthai-experiments/tree/master/gen2-ocr
and got several errors. Using the depthai version from the requirements produced this warning:
"DeprecationWarning: setCamId() is deprecated, use setBoardSocket() instead."

Switching between depthai versions produced other errors. Here are some of them:

- Creating depthai.Pipeline() fails: it has no constructor
- nn has no nn.passthrough (line 25)
- Silent crashes while compiling

I am using an OAK-D with Windows 10. What am I doing wrong?
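On the deprecation warning specifically: it is only a warning, and the replacement call is named in the message itself. A small compatibility shim like the one below would use `setBoardSocket()` when the installed depthai provides it and fall back to `setCamId()` otherwise. This is a sketch with a hypothetical helper name, and the socket value to pass (something like `dai.CameraBoardSocket.RGB`) is an assumption to check against the installed depthai version.

```python
def select_camera(cam, socket, legacy_id=0):
    """Select the camera on a node, preferring the non-deprecated call.

    cam: a camera node object (for example the color camera node in main.py).
    socket: the value to pass to setBoardSocket(), supplied by the caller.
    legacy_id: the value to pass to the deprecated setCamId() as a fallback.
    Returns the name of the method that was used.
    """
    if hasattr(cam, "setBoardSocket"):
        cam.setBoardSocket(socket)
        return "setBoardSocket"
    cam.setCamId(legacy_id)
    return "setCamId"
```

This does not address the other errors listed above, which look like a mismatch between the example and the installed API rather than a deprecation.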

Brandon · Feb 3, 2021

Hi asidsunrise,

Sorry about the delay (been inundated with Brexit/pandemic shipping/logistics problems in Europe). This is quite odd. So just to confirm, when trying out that example, you ran:

python -m pip install -r requirements.txt

And then:

python3 main.py

And it is still giving those errors?

If so, is the installation of requirements throwing errors? As from what you shared, it is seeming that the API being used is not compatible with the example.

Thoughts?

Thanks,
Brandon

asidsunrise · Feb 3, 2021

Brandon If so, is the installation of requirements throwing errors? As from what you shared, it is seeming that the API being used is not compatible with the example.

You are right, I get errors with the requirements installed. I think the problem is that the wrong API is being used, or that the right API was modified. I tried other versions of depthai too, and they didn't work either. Screenshots are attached.

Brandon · Feb 4, 2021

Strange. Let me see if I can reproduce what you are seeing here. Will circle back shortly.

Brandon · Feb 4, 2021

I can't seem to replicate the version error here. For me it is just running:

So here are the commands I did to get it to run:

cd ~/depthai-experiments/gen2-ocr
git checkout master
git pull
python3 -m pip install -r requirements.txt
python3 main.py

Could you try running these in sequence? And to make sure that no other version of depthai is still installed, could you run this sequence instead:

python3 -m pip uninstall depthai
cd ~/depthai-experiments/gen2-ocr
git checkout master
git pull
python3 -m pip install -r requirements.txt
python3 main.py

I just repeated that whole sequence to be sure, and it works. I'm wondering if something has locked an incorrect version of depthai on the system. If this doesn't work, would it be OK to do a TeamViewer session to debug?

If so, please shoot us an email at support@luxonis.com to coordinate the session.

Thanks - and sorry about the trouble.
-Brandon

asidsunrise · Feb 4, 2021

I ran all the commands from your last message and it didn't work on either of my Windows PCs. I'll try to run it on my MacBook Air and get in touch via email. Thanks for your support!

Brandon · Feb 4, 2021

asidsunrise Thanks. Very strange. Perhaps our CI/CD didn't properly build for Windows for this example. Will be curious to find out.

asidsunrise · Feb 5, 2021

Brandon Well, I tried to run it on my MacBook and got the same error. Reinstalling depthai didn't help. If it would help find the problem, a TeamViewer session is OK with me, but I'm not sure it will be possible in the next 2-3 days.

Will · Feb 6, 2021

Hi, same issue on my iMac under Catalina (macOS 10.15.7)...

Brandon · Feb 6, 2021

Thanks and sorry about the delay, both. Could you please send an email to support@luxonis.com so we can set up a TeamViewer session and see what is happening here?

I'll try this again now on my iMac as well.

Thanks and sorry again about the delay (so far behind on shipping issues in Europe),
Brandon

Brandon · Feb 6, 2021

Hi @Will,

So I was able to replicate this with the version that was checked out on my iMac. Steps:

cd ~/depthai-experiments/gen2-ocr 
python3 -m pip install -r requirements.txt

Which produced:

Looking in indexes: https://pypi.org/simple, https://artifacts.luxonis.com/artifactory/luxonis-python-snapshot-local
Requirement already satisfied: numpy==1.19.5 in /Users/leeroy/Library/Python/3.9/lib/python/site-packages (from -r requirements.txt (line 1)) (1.19.5)
Requirement already satisfied: opencv-python==4.5.1.48 in /Users/leeroy/Library/Python/3.9/lib/python/site-packages (from -r requirements.txt (line 2)) (4.5.1.48)
Collecting depthai==0.0.2.1+e88c98c8a7681dd16cd754a832df1165380ac305
  Using cached https://artifacts.luxonis.com/artifactory/luxonis-python-snapshot-local/depthai/depthai-0.0.2.1%2Be88c98c8a7681dd16cd754a832df1165380ac305-cp39-cp39-macosx_10_9_x86_64.whl (4.3 MB)
Installing collected packages: depthai
  Attempting uninstall: depthai
    Found existing installation: depthai 0.4.1.1
    Uninstalling depthai-0.4.1.1:
      Successfully uninstalled depthai-0.4.1.1
Successfully installed depthai-0.0.2.1+e88c98c8a7681dd16cd754a832df1165380ac305

And then I ran:

python3 main.py

Which resulted in:

/Users/leeroy/depthai-experiments/gen2-ocr/main.py:15: DeprecationWarning: setCamId() is deprecated, use setBoardSocket() instead.
  colorCam.setCamId(0)
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: XLink write error, error message: X_LINK_ERROR
zsh: abort      python3 main.py

I am now updating the repo with git pull on the master branch to check if this improves it.