ZeroMQ performance test: what is the actual latency?

I am using ZeroMQ to carry messages through a process, and I want to do some benchmarking to measure the latency at every point along the way.

The official site provides a guide on how to run performance tests.

For example, I've tried:

local_lat tcp://*:15213 200 100000
remote_lat tcp://127.0.0.1:15213 200 100000


and get the result:

message size: 200 [B]
roundtrip count: 100000
average latency: 13.845 [us]


But when I tried a pub-sub example in C++, I found that the time interval between sending and receiving is around 150 us. (I get this number from a print log with timestamps taken at send time and at receive time.)
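Roughly, the measurement looks like this. This is a simplified, self-contained sketch rather than my exact code; the port, the number of messages, the 1-second send interval, and the use of std::chrono::steady_clock are just placeholders:

// Sketch: publisher stamps each message with a send time,
// subscriber prints the send-to-receive delta.
#include <zmq.hpp>
#include <chrono>
#include <cstring>
#include <iostream>
#include <thread>

int main() {
    zmq::context_t ctx(1);

    std::thread pub_thread([&ctx]() {
        zmq::socket_t pub(ctx, ZMQ_PUB);
        pub.bind("tcp://*:15214");                                    // placeholder port
        std::this_thread::sleep_for(std::chrono::milliseconds(500));  // let the SUB connect
        for (int i = 0; i < 10; ++i) {
            long long sent = std::chrono::duration_cast<std::chrono::nanoseconds>(
                std::chrono::steady_clock::now().time_since_epoch()).count();
            zmq::message_t msg(sizeof(sent));
            std::memcpy(msg.data(), &sent, sizeof(sent));             // stamp, then send
            pub.send(msg);
            std::this_thread::sleep_for(std::chrono::seconds(1));     // arbitrary pacing
        }
    });

    zmq::socket_t sub(ctx, ZMQ_SUB);
    sub.connect("tcp://127.0.0.1:15214");
    sub.setsockopt(ZMQ_SUBSCRIBE, "", 0);                             // subscribe to everything
    for (int i = 0; i < 10; ++i) {
        zmq::message_t msg;
        sub.recv(&msg);                                               // blocking receive
        long long now = std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now().time_since_epoch()).count();
        long long sent;
        std::memcpy(&sent, msg.data(), sizeof(sent));
        std::cout << "latency: " << (now - sent) / 1000.0 << " us" << std::endl;
    }
    pub_thread.join();
    return 0;
}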

Can anyone explain the difference between the two?

EDIT: I found the 0mq question: is pubsub latency constantly increasing with posts? The result there shows a nearly constant latency of 0.00015 s, i.e. 150 us, just like my test, which is about 10x the official performance test result. Why the difference?



1 answer


I have the same problem: ZeroMQ - pub/sub latency

I ran Wireshark on my example code, which publishes a ZeroMQ message every second. Here is the Wireshark output:

145  10.900249     10.0.1.6 -> 10.0.1.6     TCP 89 5557→51723 [PSH, ACK] Seq=158 Ack=95 Win=408192 Len=33 TSval=502262367 TSecr=502261368
146  10.900294     10.0.1.6 -> 10.0.1.6     TCP 56 51723→5557 [ACK] Seq=95 Ack=191 Win=408096 Len=0 TSval=502262367 TSecr=502262367
147  11.901993     10.0.1.6 -> 10.0.1.6     TCP 89 5557→51723 [PSH, ACK] Seq=191 Ack=95 Win=408192 Len=33 TSval=502263367 TSecr=502262367
148  11.902041     10.0.1.6 -> 10.0.1.6     TCP 56 51723→5557 [ACK] Seq=95 Ack=224 Win=408064 Len=0 TSval=502263367 TSecr=502263367




As you can see, it takes about 45 microseconds to send and acknowledge each message. At first I thought the connection was being re-established for every message, but that is not the case. So I turned my attention to the receiver...

while (true) {
    if (subscriber.recv(&message, ZMQ_NOBLOCK)) {  // non-blocking receive; returns false if no message is ready
        // print time
    }
}


By adding ZMQ_NOBLOCK and polling in a tight loop I got the time down to about 100 us. That still seems high, and it comes at the cost of fully occupying one core. But I think I understand the problem a little better. Any insight would be appreciated.
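For what it's worth, one possible middle ground (just a sketch using the same old-style cppzmq API as above; I have not benchmarked it, and the endpoint and 10 ms timeout are placeholders) is to wait on the socket with zmq::poll instead of spinning with ZMQ_NOBLOCK. poll returns as soon as the socket becomes readable, so the timeout only bounds how long the loop sleeps when nothing arrives, and the core is not pegged:

#include <zmq.hpp>

int main() {
    zmq::context_t ctx(1);
    zmq::socket_t subscriber(ctx, ZMQ_SUB);
    subscriber.connect("tcp://127.0.0.1:5557");   // placeholder endpoint
    subscriber.setsockopt(ZMQ_SUBSCRIBE, "", 0);

    zmq::pollitem_t items[] = { { static_cast<void*>(subscriber), 0, ZMQ_POLLIN, 0 } };
    while (true) {
        zmq::poll(items, 1, 10);                  // sleep up to 10 ms waiting for input
        if (items[0].revents & ZMQ_POLLIN) {
            zmq::message_t message;
            subscriber.recv(&message);            // data is ready, so this does not block
            // print time
        }
    }
}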
