Hello DepthAI experts,

I am trying to put together a Python script that performs the following tasks on an OAK-D Lite (a 3-stage NN pipeline):

-- Face Detection
|__ Age / Gender Detection
|__ Expression Detection

The camera is rotated -90 degrees.

I have created a Python script based on the gen2 age/gender/expression demo, along with modifications to MultiMsgSync.py, as per the code described in the links below:

Detection script code:

MultiMsgSyncV2.py code:

The problems:

  1. Sometimes it throws a "Processing failed, potentially unsupported config" error.

  2. If multiple faces are in front of the camera, msgs = sync.get_msgs() returns None and the script stops working from then on, i.e. it gets stuck on the last frame.

  3. The Python script runs on the OAK-D, but on the OAK-D Lite it freezes after the first 4-5 frames.
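To show what I mean in problem 2, here is a host-side sketch of how a two-stage sync typically works (a hypothetical simplification for illustration only, not the actual MultiMsgSync.py code):

```python
# Hypothetical, simplified sketch of two-stage NN message syncing:
# messages are grouped by sequence number, and a group is complete only
# when it has a frame, a detection message, and one recognition result
# per detected face.

class TwoStageSync:
    def __init__(self):
        # seq -> {"color": frame, "detection": [faces], "recognitions": [...]}
        self.msgs = {}

    def add_msg(self, msg, name, seq):
        group = self.msgs.setdefault(seq, {"recognitions": []})
        if name == "recognition":
            group["recognitions"].append(msg)
        else:
            group[name] = msg

    def get_msgs(self):
        for seq in sorted(self.msgs):
            group = self.msgs[seq]
            dets = group.get("detection")
            if "color" in group and dets is not None \
                    and len(group["recognitions"]) == len(dets):
                # Complete group: drop it and all older (stale) groups
                for old in [s for s in self.msgs if s <= seq]:
                    del self.msgs[old]
                return group
        return None  # nothing complete yet; the caller must keep polling
```

If the device drops even one recognition result for a sequence number (more likely with several faces in view), that group can never complete, get_msgs() returns None on every call, and a display loop that only advances on a non-None result appears stuck on the last frame. Treating None as "keep polling" and discarding groups older than some threshold would at least keep the preview alive.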

I have the following PR installed, which has a fix for the edge problem on the -90 degree rotated camera:

python3 -m pip install -U --prefer-binary --user --extra-index-url https://artifacts.luxonis.com/artifactory/luxonis-python-snapshot-local "depthai==2.15.4.0.dev+7ce594fa6a2b8437be7fef010a26b1eb5eb1fdf8"

I would highly appreciate it if an expert could shed some light on where the problem in the code could be.

thanks
rex

I suppose the Luxonis team/experts answering other questions posted after mine suggests it's time to move on and find an alternative to the OAK-D for my project.

It's a bit disappointing when someone without huge AI experience is trying hard to put together a real-world solution and isn't getting enough support from the Luxonis ecosystem.

time to move on....


    Hi rexn8r,
    I haven't been able to reply over the weekend, as there's no easy solution (and it was the weekend). It's quite complex: we are working on an SDK to abstract these complex settings and to fix the 2-stage NN freezing, which is most likely due to blocking recognition NN behavior. As mentioned, it's quite complex, so debugging will take some time, but if you figure it out, let us know.

    Regarding the potentially unsupported config error - it's likely because the bounding box is out of bounds (below 0.0 or above 1.0). I would suggest checking out the image_manip_refactor branch and trying that approach. Thoughts?
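    As a reference point, the clamping itself can be a tiny host-side helper along these lines (a sketch of the idea only, not taken from your script):

```python
def correct_bb(xmin, ymin, xmax, ymax):
    """Clamp a normalized bounding box to [0.0, 1.0] so the crop
    config sent to ImageManip is never out of bounds."""
    return (max(0.0, xmin), max(0.0, ymin),
            min(1.0, xmax), min(1.0, ymax))
```

    A detector can legitimately return coordinates slightly below 0.0 or above 1.0 for faces at the frame edge, which is exactly when this error tends to appear.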

    Thanks, Erik

    Hi,

    As you said, it's a complex solution to achieve a 3-stage NN. What I, as a user of the SDK, was expecting from a Luxonis expert is to at least vet the code and let me know whether I am on the right path, so that I can try to debug further.

    For example:

    1. In the script node code, I am cropping the image to 62x62 and 64x64 and passing the crops back to the respective NN inputs. Is this the correct way?

    2. cam.preview.link(image_manip_script.inputs['preview']) - this line passes the camera frame to the script node. When I output that frame in a window, it is still in landscape mode, not rotated -90 degrees (I thought cam.preview.link(face_det_manip.inputImage) would pass the rotated frame to the script node, but it doesn't). That could be the reason the inference was off the mark. I then replaced cam.preview.link(image_manip_script.inputs['preview']) with manipRgb.out.link(image_manip_script.inputs['preview']), and that seems to have fixed the inference.

    3. Also, regarding MultiMsgSync.py: I have added the expression part to it. Is there any problem in that code which could be causing the errors listed in my post?

    4. "potentially unsupported config error" - if you look at my script node code, the correct_bb function is already used which i suppose take care of out of bounds problem. and as i mentioned in my original post, i am using following PR which was given to me in following post on github by szabi-luxonis

    https://github.com/luxonis/depthai-experiments/issues/326
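    To illustrate what I mean in question 2: if the script node sees the unrotated landscape frame, every crop it computes lands on the wrong region of the rotated view. A normalized bounding box moves through a 90-degree rotation roughly like this (hypothetical helper for illustration only; I am assuming a clockwise convention here):

```python
def rotate_bbox_90cw(xmin, ymin, xmax, ymax):
    """Map a normalized bounding box from the landscape frame into a
    frame rotated 90 degrees clockwise.

    A point (x, y) lands at (1 - y, x) after the rotation, so the box
    corners are remapped and re-sorted into min/max order.
    """
    return (1.0 - ymax, xmin, 1.0 - ymin, xmax)
```

    So unless the frame fed to the script node and the coordinates used for cropping agree on the orientation, the crops (and therefore the gender inference) will be off the mark.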

    So in my first post I also asked for help checking the code and suggesting modifications/tweaks, so that I could do further debugging myself and help get such a live example up and running. That would benefit not only me as a potential customer, but also the Luxonis community.

    I hope this clarifies my ranting.

    If you have time to vet the pastebin script code and suggest modifications, I will try to debug further and see if that resolves the issue.

    thanks


      Hi rexn8r,
      I took the time and checked your code. I reduced it to an MRE (minimal reproducible example) and it works as expected. I used the latest depthai (2.17).

      Regarding your questions:

      1. Yes, that's the way I would do it as well.
      2. The cam preview output is in landscape mode - face_det_manip rotates the frame. Could you elaborate on the question?
      3. These are usually ImageManip issues - but since we updated the depthai library to 2.17 yesterday, this shouldn't be the case anymore.

      Thoughts?
      Thanks, Erik

      Hi @erik,

      Thanks for your response.

      I downloaded your MRE code and ran it on my PC running Ubuntu 22.04. I uninstalled depthai and reinstalled 2.17.0.0.

      I am getting the following error after half a minute:

      I then downloaded the gen2 age and gender demo files from GitHub and ran main.py, and it just got stuck on initialization.

      Let me elaborate on the second question in my previous post regarding the -90 degree rotation.

      The question is about wrong gender inference when the script node is used with a -90 degree rotation.

      After the script node is initialized, i.e. image_manip_script.setScript, the following line of code is executed:

      cam.preview.link(image_manip_script.inputs['preview']) <--- (this line is present in your gen2 age-gender demo)

      The gender inference was coming out wrong, so I wanted to check the preview input frame used by the script node. I added the following line inside the script node:

      node.io['manip_frame'].send(frame)

      Then I added the following XLinkOut node to check the frame used by the script node:

      # XLinkOut node to stream the Script node's input frame back to the host
      frame_xout = pipeline.create(dai.node.XLinkOut)
      frame_xout.setStreamName("frame_xout")
      image_manip_script.outputs['manip_frame'].link(frame_xout.input)


      When I did imshow on the frame_xout stream, I found that the frame was not rotated -90 degrees and the inference was off the mark, i.e. wrong gender result.

      If you check my first post on pastebin, these lines are present but commented out.

      Then I replaced cam.preview.link(image_manip_script.inputs['preview']) with the following line:

      manipRgb.out.link(image_manip_script.inputs['preview'])

      That showed the rotated image, and the gender inference was also correct.

      So my question is: is it correct to link manipRgb.out instead of cam.preview to the script node?

      Sorry to be a pain, but we are at a stage of DepthAI integration with our digital signage software where time is running out for us, and the success of our solution depends on the accuracy and stability of the DepthAI integration before we can proceed to a POC. So any help would be highly appreciated.

      We are testing 2 versions of the depthai implementation:

      1. Option 1 - use the script node, which we are discussing in this post.
      2. Option 2 - without the script node, using dai.NNData() on the host for each detection to run age/gender and expression inference, which seems to be slower than option 1.

      Any guidance or help would be highly appreciated.

      thanks
      rex