ECAL – Enhanced Communication Abstraction Layer / Pub-Sub Middleware

theamk 6 months ago

I can't speak for other domains, but do avoid this as a top-level framework for robotics applications. The reason is that the framework is designed to run only at real-time speed. You can see that in the examples: yes, "while (Ok()) { Publish("message"); std::this_thread::sleep_for(50 ms); }" looks nice and simple, but the loop is paced by the wall clock, so it can't run faster or slower depending on CPU load.

Why is this a problem? Imagine wanting to write an integration test for a module that replays messages and verifies the replies: you'd always be limited to real-time speed, even if your module only consumes 1% of the CPU and could theoretically run 100x faster. On the other hand, if your system gets a short hiccup during the test (such as random daemon activity), your test might miss messages and report spurious failures. This is quite a problem if you are running those tests in an automated fashion.

This is normally not an issue in academic circles (my experience shows that they don't really care about robust automated testing), but is quite a deal-breaker in real apps.

What's the fix? The modules should support two modes, "real-time" and "replay", and in replay mode message delivery should be controlled by the consumers only: replay a message, wait until all consumers have acknowledged processing it, then immediately send the next one. This means any direct system "sleep()" call is forbidden (it would break replay); only a wrapper should be used. The same goes for any function that queries the current time: call it only via a special wrapper. Threads could be problematic, too: you don't want to signal "ready for next message" while a thread is still working in the background, so those will need wrappers as well. Unfortunately, I don't see anything like this in the API docs.
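A minimal sketch of that two-mode idea, in plain C++ with no eCAL calls at all. Every name here (Clock, RealClock, ReplayClock, ReplayDriver, replay_example) is hypothetical and made up for illustration; none of it comes from the eCAL API. It assumes single-threaded consumers, so a consumer returning from its callback counts as the acknowledgment.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <utility>
#include <vector>

using Micros = std::chrono::microseconds;

// Hypothetical wrapper: modules call this instead of sleep_for()/now() directly.
class Clock {
public:
    virtual ~Clock() = default;
    virtual Micros now() const = 0;
    virtual void sleep_for(Micros d) = 0;
};

// Real-time mode: defer to the system clock and really block.
class RealClock : public Clock {
public:
    Micros now() const override {
        return std::chrono::duration_cast<Micros>(
            std::chrono::steady_clock::now().time_since_epoch());
    }
    void sleep_for(Micros d) override { std::this_thread::sleep_for(d); }
};

// Replay mode: time is just a counter, so "sleeping" costs no wall time.
class ReplayClock : public Clock {
public:
    Micros now() const override { return now_; }
    void sleep_for(Micros d) override { now_ += d; }
    void advance_to(Micros t) { if (t > now_) now_ = t; }
private:
    Micros now_{0};
};

struct RecordedMessage {
    Micros timestamp;
    std::string payload;
};

// Assumption: returning from the callback is the consumer's acknowledgment.
using Consumer = std::function<void(const RecordedMessage&, Clock&)>;

class ReplayDriver {
public:
    explicit ReplayDriver(ReplayClock& clock) : clock_(clock) {}
    void subscribe(Consumer c) { consumers_.push_back(std::move(c)); }

    // Deliver each message, wait for every consumer, then send the next one.
    std::size_t run(const std::vector<RecordedMessage>& log) {
        std::size_t delivered = 0;
        for (const auto& msg : log) {
            clock_.advance_to(msg.timestamp);  // jump virtual time, no waiting
            for (auto& c : consumers_) {
                c(msg, clock_);                // blocks until processed
                ++delivered;
            }
        }
        return delivered;
    }
private:
    ReplayClock& clock_;
    std::vector<Consumer> consumers_;
};

// Replays three messages recorded 50 ms apart; runs as fast as the CPU allows.
std::vector<std::string> replay_example() {
    ReplayClock clock;
    ReplayDriver driver(clock);
    std::vector<std::string> seen;
    driver.subscribe([&seen](const RecordedMessage& m, Clock& c) {
        c.sleep_for(Micros{10000});  // simulated 10 ms of work, virtual time only
        seen.push_back(m.payload + "@" + std::to_string(c.now().count()));
    });
    const std::vector<RecordedMessage> log = {
        {Micros{0}, "a"}, {Micros{50000}, "b"}, {Micros{100000}, "c"}};
    const std::size_t delivered = driver.run(log);
    assert(delivered == log.size());
    return seen;
}
```

Passing a RealClock to the same consumer code gives the real-time behaviour, while the replay path never touches the wall clock, so a slow or busy machine cannot make it drop messages. Consumers that hand work to background threads would need an explicit "done" signal instead of relying on the callback returning, which is the thread-wrapper problem mentioned above.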

(Note that you can implement that framework yourself and use eCAL only as the communication layer... but this will mean giving up on many of its tools, so it's not clear how much value you would get out of it.)

rex71 6 months ago

eCAL does not rely on real-time at all. The provided samples are simple demo applications designed to demonstrate general usage. The publish/subscribe pattern supports sending messages with or without acknowledgment from the subscriber side. Latency is very low on the same machine (using shared memory transport) and depends on the Ethernet speed for inter-host connections.
You can simply use a publisher to send data to one or multiple matching subscribers, regardless of where they are running, without requiring any additional configuration. If you need RPC functionality, you can use the client/server pattern or combine multiple communication patterns to suit your needs.