ZeroMQ pub/sub latency

I am evaluating ZeroMQ to see whether it is suitable for a real-time software application. I was very pleased to see that the quoted latency for small payloads is in the 30-microsecond range. However, in my simple tests, I get around 300 microseconds.

I have a simple publisher and subscriber, mostly copied from examples on the internet, and I am sending one byte over localhost.

I've been playing around for about two days with different sockopts, and I'm about to give up.

Any help would be appreciated!

Publisher:

#include <iostream>
#include <zmq.hpp>
#include <unistd.h>
#include <sys/time.h>


int main()
{
    zmq::context_t context(1);
    zmq::socket_t publisher(context, ZMQ_PUB);
    publisher.bind("tcp://*:5556");

    struct timeval timeofday;
    while (true)
    {
        // A fresh message each iteration: zmq_msg_send() nullifies the
        // message it sends, so reusing one message_t would publish
        // zero-length frames after the first send.
        zmq::message_t msg(1);
        gettimeofday(&timeofday, NULL);
        publisher.send(msg);
        std::cout << timeofday.tv_sec << ", " << timeofday.tv_usec << std::endl;
        usleep(1000000);    // one message per second
    }
}

      

Subscriber:

#include <iostream>
#include <zmq.hpp>
#include <sys/time.h>


int main()
{
    zmq::context_t context(1);
    zmq::socket_t subscriber(context, ZMQ_SUB);
    subscriber.connect("tcp://localhost:5556");
    subscriber.setsockopt(ZMQ_SUBSCRIBE, "", 0);   // subscribe to everything

    struct timeval timeofday;
    zmq::message_t update;
    while (true)
    {
        subscriber.recv(&update);                  // blocks until a message arrives
        gettimeofday(&timeofday, NULL);
        std::cout << timeofday.tv_sec << ", " << timeofday.tv_usec << std::endl;
    }
}
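
To see where the time goes, it may help to measure one-way latency directly instead of diffing the two printouts by hand: embed the send time in the payload and compute the difference at the receiver (valid on localhost, where both ends share the same clock). A minimal sketch of the two changed loop bodies, using the same older cppzmq API as above (memcpy needs <cstring>):

// publisher loop body:
struct timeval sent;
gettimeofday(&sent, NULL);
zmq::message_t msg(sizeof(sent));            // fresh message for every send
memcpy(msg.data(), &sent, sizeof(sent));
publisher.send(msg);

// subscriber loop body:
zmq::message_t update;
subscriber.recv(&update);
struct timeval recvd, sent;
gettimeofday(&recvd, NULL);
memcpy(&sent, update.data(), sizeof(sent));
long latency_us = (recvd.tv_sec - sent.tv_sec) * 1000000L
                + (recvd.tv_usec - sent.tv_usec);
std::cout << "one-way latency: " << latency_us << " us" << std::endl;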

      



2 answers


First, make sure you are running the producer and the consumer on different physical cores (not HT siblings). Second, it depends heavily on the hardware and the OS. Last time I measured kernel IO (4-5 years ago), the results were indeed 10 to 20 us around the send/recv syscalls. You should tune your kernel settings for low latency and set TCP_NODELAY.
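
On Linux, the pinning can be done from the shell with taskset (e.g. taskset -c 2 ./publisher and taskset -c 3 ./subscriber) or in code. A minimal sketch (Linux/glibc only; the core ids below are arbitrary examples):

#include <pthread.h>
#include <sched.h>

// Pin the calling thread to a single CPU core, so producer and
// consumer can be kept on distinct physical cores.
static void pin_to_core(int core_id)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_id, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

Call, say, pin_to_core(2) at the top of the publisher's main() and pin_to_core(3) in the subscriber's, checking lscpu or /proc/cpuinfo first so the two ids do not land on HT siblings of the same physical core.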





Is the task definition realistic?

When talking about any (soft- or hard-) real-time design, validating the architecture's capabilities is more important than the implementation itself.

If you take the source code as is, your readings (which weren't posted along with your replicated MCVE retest snippets) won't serve much, because the numbers do not distinguish which chunks of time (and how long) were spent where: on the sending side in zmq data handling / copying / scheduling / wire-level formatting / datagram dispatch, and on the receiving side in unloading from the medium / copying / decoding / pattern matching / propagation to the receiver's buffer.

If you're interested in the internals of ZeroMQ, there are some great performance-related notes there.

If the goal is a minimum-latency design:

  • remove all sources of overhead:
    • replace the rather rich header handling of the proposed tcp:// PUB/SUB channel
    • avoid overhead the task does not need: there is no point wasting time on SUB-side subscription pattern-matching (of course, newer ZMQ versions have moved topic filtering to the PUB side, but the idea is clear). The ZMQ_PAIR archetype avoids any such matching, regardless of the transport class. And if some part of the design would block, rather change the signalling-socket layout accordingly so that blocking is mostly avoided (this should be a real-time system, as you said above); see the sketch after this list
    • apply "latency masking" where possible on multicore / manycore hardware architectures, to squeeze the last drops of spare time out of your hardware / toolchain capabilities: experiment with setups where N > 1 I/O threads help, zmq::context_t context( N );
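
A minimal sketch of one end of the PAIR layout with more I/O threads, assuming the same older cppzmq API as the code above (the port number is an arbitrary example):

#include <zmq.hpp>

int main()
{
    zmq::context_t context( 2 );               // N > 1 I/O threads
    zmq::socket_t  peer( context, ZMQ_PAIR );  // exclusive pair: no topic matching at all
    peer.bind( "tcp://*:5557" );               // the other process connects here

    zmq::message_t msg( 1 );
    peer.send( msg );                          // no subscription handshake needed

    zmq::message_t reply;
    while ( !peer.recv( &reply, ZMQ_DONTWAIT ) )
    {
        /* non-blocking poll: do useful work here instead of stalling */
    }
    return 0;
}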

Missing target:

As Alice in Wonderland put it more than a hundred years ago: if there is no particular goal, any road will lead you to it.



With soft real-time ambitions, it shouldn't be a problem to specify the maximum allowable end-to-end latency and, from that, to derive the latency budget left for the transport layer.

Without doing this, 30 us, 300 us, or even 3 ms have no meaning as such, so nobody can decide whether these numbers are "sufficient" for a given subsystem or not.

The next smart step:

  • determine your real-time stability horizon ... if this is to be used for real-time control
  • define your real-time design constraints ... for the data acquisition / processing tasks and for the self-diagnostic and control services
  • avoid any blocking, and verify / prove that blocking can never appear under any possible real-world operation (formal proof methods are ready for such a task). Nobody wants to see an AlertPanel [Waiting for data] during their next jet landing, nor should a nice-looking [hour-glass] animated icon, shuffling its sand while the control system sits in a devastating deadlock, whatever the reason behind it, be the last thing seen before an autonomous car hits the wall

Quantitative targets make sense for testing.

If a given control loop has a stability horizon of 500 ms (which may be a safe value for a hydraulic actuator / slow-motion control loop, but may not work for a guided-missile control system, much less for any low-inertia system, like the DSP families of RT control systems), you can test end-to-end whether your processing fits within it.

If you know your incoming data stream brings about 10 kB every 500 us (that is, roughly 20 MB/s), you can check whether your design can keep pace with that packet traffic or not.

If your mock-up design misses a target in such a test (fails the performance / timing metrics), the test tells you exactly where the design or the architecture needs improvement.
