Hi All,
I'm trying to test a model for inference speed and I'm getting some unexpected results, so I was wondering if you could help.
I'm using the script below to test it. It consists of a Script node that creates dummy NNData inputs for the model, a NeuralNetwork node loaded with the model blob, and another Script node that reads the model's output and prints a warning when it receives the data.
import depthai as dai

# Device-side script that streams two constant dummy inputs to the model
dummyInputScript = '''
dataSize = 200
bufSize = 2 * dataSize * 3
list1 = [1.0, 1.0, 1.0]
list2 = [1.0, 1.0, 1.1]
dummy1 = [list1[i % 3] for i in range(dataSize * 3)]
dummy2 = [list2[i % 3] for i in range(dataSize * 3)]

buf1 = NNData(bufSize)
buf1.setLayer("points", dummy1)
buf2 = NNData(bufSize)
buf2.setLayer("points", dummy2)

while True:
    node.io["A"].send(buf1)
    node.io["B"].send(buf2)
'''

# Device-side script that logs a warning every time an inference result arrives
dummyOutputScript = '''
while True:
    data = node.io["in"].get()
    node.io["out"].send(data)
    node.warn("Inference Complete")
'''

if __name__ == "__main__":
    pipeline = dai.Pipeline()

    # Script node generating the dummy inputs
    dummyInput = pipeline.create(dai.node.Script)
    dummyInput.setScript(dummyInputScript)

    # NeuralNetwork node running the blob under test
    nnNode = pipeline.create(dai.node.NeuralNetwork)
    nnNode.setBlobPath("pathToModel.blob")
    nnNode.input.setBlocking(True)
    nnNode.input.setQueueSize(1)
    dummyInput.outputs["A"].link(nnNode.inputs["A"])
    dummyInput.outputs["B"].link(nnNode.inputs["B"])

    # Script node that reports when the model produces an output
    dummyOutput = pipeline.create(dai.node.Script)
    dummyOutput.setScript(dummyOutputScript)
    nnNode.out.link(dummyOutput.inputs["in"])

    # Forward the results to the host
    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName('host')
    dummyOutput.outputs["out"].link(xout.input)

    with dai.Device(pipeline) as device:
        device.setLogLevel(dai.LogLevel.WARN)
        device.setLogOutputLevel(dai.LogLevel.WARN)
        # Fetch the output queue once, then drain it
        queue = device.getOutputQueue("host", maxSize=1, blocking=False)
        while True:
            data = queue.get()
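If it helps, the receive loop at the end could also time successive results on the host as a cross-check against the device-side log timestamps; a minimal sketch using time.monotonic() (not something the output below depends on):

import time

prev = None
while True:
    queue.get()  # blocks until the next inference result arrives
    now = time.monotonic()
    if prev is not None:
        print(f"interval since previous result: {now - prev:.3f} s")
    prev = now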
When run on the model I am testing, I get output such as:
[1944301081FBF81200] [2.2] [1.892] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.033] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.251] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.392] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.640] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.793] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [3.043] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [3.188] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [3.431] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [3.559] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [3.799] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [3.880] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [4.149] [Script(2)] [warning] Inference Complete
For the blob I am testing, when I look at the time intervals between successive outputs (as a representation of inference time), I get graphs that look like the image below.
The time between print outputs seems to alternate repeatedly between a short and a long interval. I was wondering if you had any comments on this result or on my method?
Naively, I would expect the model to have a more regular inference time, assuming there isn't a bottleneck of some sort somewhere else.
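For reference, the kind of interval I mean can be computed from the timestamps in the log lines above, roughly like this (a quick sketch; I'm assuming the third bracketed field is the device timestamp in seconds):

log_lines = """\
[1944301081FBF81200] [2.2] [1.892] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.033] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.251] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.392] [Script(2)] [warning] Inference Complete
[1944301081FBF81200] [2.2] [2.640] [Script(2)] [warning] Inference Complete""".splitlines()

# Third bracketed field of each line looks like a timestamp in seconds
timestamps = [float(line.split("] [")[2]) for line in log_lines]
intervals = [round(b - a, 3) for a, b in zip(timestamps, timestamps[1:])]
print(intervals)  # [0.141, 0.218, 0.141, 0.248] -> short/long alternation

Run over the full log, the deltas alternate in the same way.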
Thanks