Hi All,
I'm trying to test a model for inference speed and I'm getting some unexpected results, so I was wondering if you could help.
I'm using the script below to test it. It consists of a Script node that creates dummy NNData inputs for the model, a NeuralNetwork node loaded with the model blob, and another Script node that reads the model's output and prints a warning when it receives the data.
import depthai as dai

# Device-side script that streams two constant dummy inputs to the model
dummyInputScript = '''
dataSize = 200
bufSize = 2 * dataSize * 3
list1 = [1.0, 1.0, 1.0]
list2 = [1.0, 1.0, 1.1]
dummy1 = [list1[i % 3] for i in range(dataSize * 3)]
dummy2 = [list2[i % 3] for i in range(dataSize * 3)]

buf1 = NNData(bufSize)
buf1.setLayer("points", dummy1)
buf2 = NNData(bufSize)
buf2.setLayer("points", dummy2)

while True:
    node.io["A"].send(buf1)
    node.io["B"].send(buf2)
'''

# Device-side script that logs a warning every time an inference result arrives
dummyOutputScript = '''
while True:
    data = node.io["in"].get()
    node.io["out"].send(data)
    node.warn("Inference Complete")
'''

if __name__ == "__main__":
    pipeline = dai.Pipeline()

    # Script node generating the dummy inputs
    dummyInput = pipeline.create(dai.node.Script)
    dummyInput.setScript(dummyInputScript)

    # NeuralNetwork node running the blob under test
    nnNode = pipeline.create(dai.node.NeuralNetwork)
    nnNode.setBlobPath("pathToModel.blob")
    nnNode.input.setBlocking(True)
    nnNode.input.setQueueSize(1)
    dummyInput.outputs["A"].link(nnNode.inputs["A"])
    dummyInput.outputs["B"].link(nnNode.inputs["B"])

    # Script node that reports when the model produces an output
    dummyOutput = pipeline.create(dai.node.Script)
    dummyOutput.setScript(dummyOutputScript)
    nnNode.out.link(dummyOutput.inputs["in"])

    # Forward the results to the host
    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName('host')
    dummyOutput.outputs["out"].link(xout.input)

    with dai.Device(pipeline) as device:
        device.setLogLevel(dai.LogLevel.WARN)
        device.setLogOutputLevel(dai.LogLevel.WARN)
        # Fetch the output queue once, then drain it
        queue = device.getOutputQueue("host", maxSize=1, blocking=False)
        while True:
            data = queue.get()
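If it helps, the receive loop at the end could also time successive results on the host as a cross-check against the device-side log timestamps; a minimal sketch using time.monotonic() (not something the output below depends on):

import time

prev = None
while True:
    queue.get()  # blocks until the next inference result arrives
    now = time.monotonic()
    if prev is not None:
        print(f"interval since previous result: {now - prev:.3f} s")
    prev = now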
When run on the model I am testing, I get output such as:
[1944301081FBF81200] [2.2] [1.892] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.033] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.251] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.392] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.640] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.793] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [3.043] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [3.188] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [3.431] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [3.559] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [3.799] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [3.880] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [4.149] [Script(2)] [warning] Inference Complete
For the blob I am testing, when I look at the time intervals between successive outputs (as a representation of inference time), I get graphs that look like the image below.
The time between print outputs seems to alternate repeatedly between a short and a long interval. I was wondering if you had any comments on this result or on my method?
Naively, I would expect the model to have a more regular inference time, assuming there isn't a bottleneck of some sort somewhere else.
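For reference, the kind of interval I mean can be computed from the timestamps in the log lines above, roughly like this (a quick sketch; I'm assuming the third bracketed field is the device timestamp in seconds):

log_lines = """\
[1944301081FBF81200] [2.2] [1.892] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.033] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.251] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.392] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.640] [Script(2)] [warning] Inference Complete""".splitlines()

# Third bracketed field of each line looks like a timestamp in seconds
timestamps = [float(line.split("] [")[2]) for line in log_lines]
intervals = [round(b - a, 3) for a, b in zip(timestamps, timestamps[1:])]
print(intervals)  # [0.141, 0.218, 0.141, 0.248] -> short/long alternation

Run over the full log, the deltas alternate in the same way.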
Thanks