Hi martin181818,
For each face in the image, you are running one additional inference (besides the face-detection one). So with 5 faces in a frame, you are doing 1 (detection) + 5 (recognition) = 6 inferences for each frame. Perhaps try the SDK implementation - I believe it should skip inferencing for some frames if it can't keep up (too many faces). Thoughts?
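
To make the arithmetic concrete, here is a rough sketch in plain Python. It is not the SDK's actual code: `should_process` and the `budget` parameter (how many inferences per frame the device can sustain) are hypothetical names I made up to illustrate one possible frame-skipping policy.

```python
def inferences_per_frame(num_faces: int) -> int:
    """One detection pass plus one recognition pass per detected face."""
    return 1 + num_faces


def should_process(frame_index: int, num_faces: int, budget: int) -> bool:
    """Hypothetical skip policy: only process every stride-th frame.

    budget is an assumed number of inferences per frame the device can
    sustain; the stride grows as the per-frame cost exceeds that budget.
    """
    cost = inferences_per_frame(num_faces)
    stride = max(1, -(-cost // budget))  # ceiling division of cost by budget
    return frame_index % stride == 0


if __name__ == "__main__":
    faces, budget = 5, 2
    print("inferences per frame:", inferences_per_frame(faces))
    for i in range(6):
        print(i, "process" if should_process(i, faces, budget) else "skip")
```

With 5 faces and an assumed budget of 2, this would run the pipeline on every third frame and skip the rest, which is roughly the behaviour I would expect from the SDK when it can't keep up.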
Thanks, Erik