CAF: C++ Actor Framework

104 points by anarchyrucks 6 years ago · 40 comments

I was bored so I added CAF to pkgsrc, which you can use to install CAF on NetBSD/Linux/macOS and a variety of other Unix-ey platforms that pkgsrc supports:

https://github.com/NetBSD/pkgsrc-wip/tree/master/actor-frame...

    git clone https://github.com/NetBSD/pkgsrc --depth 1 ~/pkgsrc  
    git clone https://github.com/NetBSD/pkgsrc-wip --depth 1 ~/pkgsrc/wip  
    cd ~/pkgsrc/bootstrap   
    ./bootstrap --unprivileged  
    cd ~/pkgsrc/wip/actor-model  
    ~/pkg/bin/bmake install    
    ~/pkg/sbin/pkg_info -L actor-model

That is all.

asdfasgasdgasdg 6 years ago

Given the project is called "C++ Actor Framework", why did you name the package actor-model?
- Mofu-chan 6 years ago
  
  Just wanted to get the thingy into pkgsrc-wip earlier this afternoon before I had to leave (and before I got bored and ended up not committing anything).
  Also made a typo. I called it actor-model in the commit message and my post above, but its real name was actor-framework. Will probably find a better name for it later.

clarry 6 years ago

> Decoupling concurrently running software components via message passing avoids race conditions by design.

How am I supposed to read this? It's obviously not "message passing eliminates race conditions", so perhaps it's intended to say something like "you need to design the components to avoid race conditions."

dnautics 6 years ago

I don't think it's true. Message passing definitely doesn't eliminate race conditions or deadlocks. Although, it's much easier to design code that has fewer race conditions.
It's possible that they mean "it avoids data races by design".
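One way to read "avoids data races by design" is that each piece of mutable state is owned by exactly one actor, and other threads can only post messages to it. Below is a minimal hand-rolled sketch in standard C++ under that assumption. It does not use CAF's API; `CounterActor`, `send_increment`, `stop_and_get`, and `run_counter_demo` are hypothetical names chosen for illustration.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical minimal actor: it owns count_ and is the only code that mutates it.
class CounterActor {
public:
    CounterActor() : worker_([this] { run(); }) {}

    // Senders only enqueue a message; they never touch count_ directly.
    void send_increment() { post(Msg::Increment); }

    // Ask the actor to stop, wait until it has drained its mailbox, return the total.
    // Must be called exactly once before the object is destroyed.
    int stop_and_get() {
        post(Msg::Stop);
        worker_.join();
        return count_;  // safe to read: the actor thread has finished
    }

private:
    enum class Msg { Increment, Stop };

    void post(Msg m) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            mailbox_.push(m);
        }
        ready_.notify_one();
    }

    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lock(mutex_);
            ready_.wait(lock, [this] { return !mailbox_.empty(); });
            Msg m = mailbox_.front();
            mailbox_.pop();
            lock.unlock();
            if (m == Msg::Stop) return;
            ++count_;  // mutated by the actor thread only, outside the mailbox lock
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Msg> mailbox_;
    int count_ = 0;
    std::thread worker_;  // declared last so the other members exist before it starts
};

// Four sender threads each post 1000 increments; returns the final count.
int run_counter_demo() {
    CounterActor actor;
    std::vector<std::thread> senders;
    for (int t = 0; t < 4; ++t)
        senders.emplace_back([&actor] {
            for (int i = 0; i < 1000; ++i) actor.send_increment();
        });
    for (auto& s : senders) s.join();
    return actor.stop_and_get();
}
```

The only shared object is the mailbox, and it is the only thing protected by a lock. The sketch says nothing about ordering between senders, which is where the remaining race conditions discussed in this thread come from.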
- eao197 6 years ago
  
  > Message passing definitely doesn't eliminate race conditions or deadlocks.
  You mention two very different problems.
  1. Data race. It's a problem of mutation of some shared state. This problem is eliminated in the actor approach by removing the shared state completely. Every actor has its own state, and only that actor can change it, while handling incoming messages.
  Even if we don't speak about actors and use message passing within some other model (like CSP or Pub/Sub), it is hard to imagine how a data race can happen in a 1-to-1 interaction between entities in your application.
  2. Deadlocks. There is no such thing as deadlock if actors use async message-passing. But there can be another problem: livelocks (https://en.wikibooks.org/wiki/Operating_System_Design/Concur...)
  Or there is such a thing as a missed message. For example, actor A starts in state S1(A) and waits for message M1 to switch to state S2(A), where it will wait for M2. But actor B sends messages to A in reverse order: M2, then M1. If message M2 is not deferred by A, then M2 will be ignored by A in S1(A); then A switches to S2(A) and waits for M2, but that message will never come.
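The missed-message scenario can be reproduced in a few lines. This is a minimal single-threaded sketch, assuming an actor that silently drops any message its current state does not handle; `ActorA`, `handle`, and `deliver` are hypothetical names, not part of CAF or any other framework mentioned here.

```cpp
#include <vector>

enum class Msg { M1, M2 };
enum class State { S1, S2, Done };

// Hypothetical actor A: each state handles exactly one message type and
// drops everything else (there is no deferral).
struct ActorA {
    State state = State::S1;
    int dropped = 0;

    void handle(Msg m) {
        if (state == State::S1 && m == Msg::M1) {
            state = State::S2;    // S1(A) --M1--> S2(A)
        } else if (state == State::S2 && m == Msg::M2) {
            state = State::Done;  // S2(A) --M2--> Done
        } else {
            ++dropped;            // not expected in this state: lost for good
        }
    }
};

// Delivers the messages in the given order and returns the resulting actor.
ActorA deliver(const std::vector<Msg>& order) {
    ActorA a;
    for (Msg m : order) a.handle(m);
    return a;
}
```

With the order M1, M2 the actor reaches `Done`. With the order M2, M1 it drops M2, moves to `S2`, and stays there, even though delivery itself was in-order and lossless.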
  - signa11 6 years ago
    
    exactly right exposition on both data-races and deadlocks in general for message-passing systems.
    > Or there is such a thing as a missed message. For example, actor A starts in state S1(A) and waits for message M1 to switch to state S2(A), where it will wait for M2. But actor B sends messages to A in reverse order: M2, then M1. If message M2 is not deferred by A, then M2 will be ignored by A in S1(A); then A switches to S2(A) and waits for M2, but that message will never come.
    once message delivery is guaranteed to be in-order and lossless, then for the above scenario the issue is 'obviously' on the sender side. it can be easily solved with timers where 'A' expects to move from state-a to state-b in say 'n' seconds after startup etc.
    
    eao197 6 years ago
    
    > once message delivery is guaranteed to be in-order and lossless, then for the above scenario the issue is 'obviously' on the sender side.
    It depends on how the receiver handles incoming messages:
    * there could be a scheme where a message is lost if it isn't handled in the current actor's state (or if it isn't deferred explicitly);
    * there could be a scheme where a message that is not handled in the current state is deferred automatically (for example, Erlang's selective receive).
    The problem I described exists in the first case but does not arise in the second. However, schemes with selective receive can have their own drawbacks.
    > it can be easily solved with timers where 'A' expects to move from state-a to state-b in say 'n' seconds after startup etc.
    The main problem is: a developer should understand that problem and should implement that in code. But people make mistakes...
    
    signa11 6 years ago
    
    > * there could be a scheme where a message is lost if it isn't handled in the current actor's state (or if it isn't deferred explicitly);
    that's weird :) how can a message be 'lost' from a mailbox without any explicit 'read-message-from-mailbox' call from the actor itself. now, an actor can choose to ignore messages in some states, but then again a suitable protocol needs to exist between the interacting parties on how to proceed.
    > * there could be a scheme where a message that is not handled in the current state is deferred automatically (for example, Erlang's selective receive).
    sure, but that was conscious decision on part of the actor. message was not 'lost' per-se
    > The main problem is: a developer should understand that problem and should implement that in code. But people make mistakes...
    oh most definitely yes. which is why having callflow diagrams is ever so useful. more so when dealing with actor-like environments...
    
    eao197 6 years ago
    
    > how can a message be 'lost' from a mailbox without any explicit 'read-message-from-mailbox' call from the actor itself.
    It depends on the implementation of actors. If an actor is represented as a thread/fiber, then the actor is responsible for calling the `receive` method from time to time. The only example of such an approach I know in the C++ world is the Just::Thread Pro library. But even in that case, a message can be ignored if a user writes the wrong if-then chain (or `switch` statement).
    But actors are often implemented as objects with callbacks that are called by the actor framework at the appropriate time. The list of callbacks can differ from state to state.
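The callback style can be sketched as a per-state handler table. This is only an illustration under assumed names (`CallbackActor`, `on`, `dispatch`, `become` are hypothetical and do not correspond to any particular framework's API): the "framework" part is `dispatch`, which looks up a callback for the current state and reports whether one was found.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

enum class State { Idle, Working };

// Hypothetical callback-style actor with a separate handler set per state.
class CallbackActor {
public:
    using Handler = std::function<void(CallbackActor&)>;

    // Register a callback for a message type in a given state.
    void on(State s, const std::string& msg_type, Handler h) {
        handlers_[s][msg_type] = std::move(h);
    }

    // Called by the "framework" for each incoming message. Returns false
    // when the current state has no callback for msg_type, i.e. the
    // message would be ignored.
    bool dispatch(const std::string& msg_type) {
        auto state_it = handlers_.find(state_);
        if (state_it == handlers_.end()) return false;
        auto handler_it = state_it->second.find(msg_type);
        if (handler_it == state_it->second.end()) return false;
        Handler handler = handler_it->second;  // copy, so the callback may re-register
        handler(*this);
        return true;
    }

    void become(State s) { state_ = s; }
    State state() const { return state_; }

private:
    State state_ = State::Idle;
    std::map<State, std::map<std::string, Handler>> handlers_;
};
```

A message with no callback in the current state makes `dispatch` return `false`. Whether the framework then drops it, defers it, or logs it is exactly the design choice being discussed in this subthread.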
    
    signa11 6 years ago
    
    ah, i see where you are coming from.
    my notion of the whole thing was one where an actor is actually a pid (canonical or simulated), running an infinite loop with dequeue-process-wait on the msgq.
    
    eao197 6 years ago
    
    I understand, but it isn't enough. You have to have a selective receive primitive to write something like:
        for(;;) { // The main state.
          receive(
            when([](M1 msg1){ // Dive into another state.
              while(some_condition) {
                receive(
                  when([](M2 msg2){...}),
                  when([](M7 msg7){...}),
                  ...);
              }
            }),
            when([](M3 msg3){...}),
            when([](M5 msg5){...}),
            ...);
        }
    In that case, if you receive M2 before M1 the instance of M2 will be kept in the queue.
    But if you have to write something like this:
        for(;;) { // The main state.
          auto * m = receive();
          if(m->type == M1::msg_typeid) { // Dive into another state.
            while(some_condition) {
              auto * m2 = receive();
              if(m2->type == M2::msg_typeid) {...}
              else if(m2->type == M7::msg_typeid){...}
              else if...
            }
          }
          else if(m->type == M3::msg_typeid){...}
          else if(m->type == M5::msg_typeid){...}
          else if...
        }
    then you can easily lose a message by a mistake.
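For comparison, here is a minimal sketch of what a selective receive primitive does to the queue. It is non-blocking and single-threaded to keep it short, it needs C++17 for `std::optional`, and `Mailbox`, `receive_if`, and `Tag` are hypothetical names rather than the API of any library mentioned in this thread.

```cpp
#include <cstddef>
#include <deque>
#include <optional>
#include <utility>

enum class Tag { M1, M2, M3 };

// Hypothetical mailbox with a selective receive operation.
template <typename Msg>
class Mailbox {
public:
    void push(Msg m) { queue_.push_back(std::move(m)); }

    // Selective receive: remove and return the oldest message accepted by
    // pred. Messages that do not match stay queued in their original order.
    template <typename Pred>
    std::optional<Msg> receive_if(Pred pred) {
        for (auto it = queue_.begin(); it != queue_.end(); ++it) {
            if (pred(*it)) {
                Msg m = std::move(*it);
                queue_.erase(it);
                return m;
            }
        }
        return std::nullopt;
    }

    // Plain receive: always takes the head, whatever it is. The caller is
    // then responsible for not losing a message it did not expect.
    std::optional<Msg> receive() {
        return receive_if([](const Msg&) { return true; });
    }

    std::size_t size() const { return queue_.size(); }

private:
    std::deque<Msg> queue_;
};
```

If M2 is pushed before M1, `receive_if` for M1 skips over M2 and leaves it queued, so a later `receive_if` for M2 still finds it. One known cost of this scheme is that messages nobody ever matches accumulate in the queue and are rescanned on every call.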
    
    signa11 6 years ago
    
    if you are tangling msgq-receive with its processing then sure.
    however, imho, if you decouple the whole thing, i.e. msgq-receive and its processing via callbacks, then things are harder to get wrong. it might lead to a more elaborate state machine on the receiver side though, but then again imho that is not bad either.
- senderista 6 years ago
  
  Message passing without selective receive does eliminate deadlocks. See: Pony.
  - dnautics 6 years ago
    
    Single-threaded service A sends a call request to service B; as part of handling the request, service B makes a request back to service A. Deadlock.
    You may not have control over service b.
    
    signa11 6 years ago
    
    huh :) if you have async-messages without any blocking semantics you are just fine.
    
    dnautics 6 years ago
    
    My point is just that it's possible to construct one in any system that static analysis will be unable to detect or prevent. If you can construct it, it can also happen by accident.
    
    signa11 6 years ago
    
    > My point is just that it's possible to construct one in any system that static analysis will be unable to detect or prevent. If you can construct it, it can also happen by accident.
    how ? can you please explain ? thanks !
    
    dnautics 6 years ago
    
    most people implement "block on API response", because that's way easier to reason about, sometimes even without a timeout. Reading through the thread, what I was misunderstanding is that this situation technically gets rolled into what is called a "livelock". Fine, but from the higher-level API consumer's POV it's a deadlock.
    
    signa11 6 years ago
    
    with blocking semantics, you are sure to run into deadlocks in a single-threaded environment pretty easily a -> b -> c -> a.
    grpc f.e. has support for blocking requests but its 'solution' is to spawn a new thread for the request.
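The a -> b -> c -> a cycle with blocking calls can be modelled without actually hanging. In this minimal sketch, assuming each service handles one request at a time, a call into a service that is still busy throws instead of waiting forever. `Service`, `call`, and `call_chain_blocks` are hypothetical names; this models the situation and is not how gRPC or any real RPC layer is implemented.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical single-threaded service: it can handle one request at a time.
struct Service {
    bool busy = false;
    std::function<void()> on_request;  // work done while handling a request

    // Models a blocking call. A real blocking call into a busy
    // single-threaded service would wait forever; this sketch throws
    // instead so that it terminates.
    void call() {
        if (busy) throw std::runtime_error("callee is blocked on its own caller");
        busy = true;
        try {
            if (on_request) on_request();
        } catch (...) {
            busy = false;
            throw;
        }
        busy = false;
    }
};

// Builds a -> b -> c, optionally closing the cycle with c -> a.
// Returns true if the outermost call could never complete.
bool call_chain_blocks(bool close_cycle) {
    Service a, b, c;
    a.on_request = [&] { b.call(); };
    b.on_request = [&] { c.call(); };
    if (close_cycle) c.on_request = [&] { a.call(); };
    try {
        a.call();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

The chain without the back edge completes; with it, the innermost call targets a service that is still waiting for its own request to finish. With fully asynchronous sends there is no `busy` wait at all, which is the distinction being argued over in this subthread.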
    
    senderista 6 years ago
    
    You did not construct such a scenario for an async message-passing system.
- yellowapple 6 years ago
  
  Erlang makes similar promises around its implementation of the actor model; however, that's also combined with immutability, a lack of shared state (processes are isolated from one another), and preemptive multitasking, and I don't know how much of that CAF is able to reproduce.
  Then again, avoid != eliminate. You can avoid hitting a deer on the road, but that doesn't mean you're able to eliminate the possibility of doing so.
- jfkebwjsbx 6 years ago
  
  Exactly! Data races are just a subset of issues!

Koshkin 6 years ago

Honestly, the source code of the examples looks rather complicated, like something that should better be generated from a nicer high-level language that has a native actor concept. (Obviously, the power of C++ templates does not come for free - you pay with noisy code and increased cognitive load.)

eao197 6 years ago

> Honestly, the source code of the examples looks rather complicated
There are several CAF alternatives with very different usages of C++ features:
* QP/C++: https://www.state-machine.com/qpcpp/
* SObjectizer: https://github.com/Stiffstream/sobjectizer
* actor-zeta: https://github.com/jinncrafters/actor-zeta
* rotor: https://github.com/basiliscos/cpp-rotor
At least two of them (QP/C++ and SObjectizer) have been evolving and in use for longer than CAF.
ekez 6 years ago

Zeek uses CAF internally for data exchange over networks and workers [1]. It's not quite a "native actor concept", but it is a nice way to interact with the framework. I've used it before and it's worked quite smoothly.
[1] https://docs.zeek.org/en/current/frameworks/broker.html
alexeiz 6 years ago

On the contrary, I worked with CAF and I found it one of the simpler frameworks that provide the actor model (simpler to use, that is). Later I learned about Erlang and I noticed a lot of similarities between CAF and message passing in Erlang. Look at this https://github.com/actor-framework/actor-framework/blob/mast..., for example. There are multiple implementations of the same thing (calculator). But each implementation is simple enough to grasp without any prior knowledge of the framework.
jfkebwjsbx 6 years ago

And compilation times and memory requirements for it.

zoomablemind 6 years ago

In the Docs:Overview [0], the compile and build instructions are not showing up for some reason.

[0] https://actor-framework.readthedocs.io/en/latest/Overview.ht...

thedance 6 years ago

"Actor" architecture is supposed to enable high developer productivity but it's been a decade since the article so I wonder how it's working out? The most prominent project of which I am aware is FoundationDB.

dnautics 6 years ago

Basically microservices over ip/dns are actors (maybe not pure actors like Carl Hewitt would insist - but close enough), so I think it's safe to say that the dominant programming model today is actors.
Having spent a bit of time in actor-land at the language level, I think the primary advantages of actors are:
1) making concurrency relatively easy to reason about (vs. say strict async/await, promises.. shudder, async/yield, whatever twisted does etc.).
2) more importantly, and most actor frameworks don't get this right, fault tolerance and isolated failure domains. If your actor system doesn't limit the blast radius of a failure, then it's missing one of the biggest reasons to go with actors.
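To illustrate point 2, here is a minimal sketch of an isolated failure domain, assuming failures surface as C++ exceptions and that "restart" simply means resetting the failed actor to a fresh state. `Worker` and `Supervisor` are hypothetical names; real supervision schemes (Erlang/OTP supervisors, for example) are considerably richer than this.

```cpp
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <vector>

// Hypothetical worker actor with private state; it fails on negative input.
struct Worker {
    int sum = 0;

    void handle(int msg) {
        if (msg < 0) throw std::invalid_argument("negative input");
        sum += msg;
    }
};

// Hypothetical supervisor: owns the workers and contains their failures.
struct Supervisor {
    std::vector<Worker> workers;
    std::vector<int> restarts;

    explicit Supervisor(std::size_t n) : workers(n), restarts(n, 0) {}

    // Deliver msg to worker i. If its handler throws, only that worker is
    // reset to a fresh state; the other workers are not affected.
    void send(std::size_t i, int msg) {
        try {
            workers.at(i).handle(msg);
        } catch (const std::exception&) {
            workers.at(i) = Worker{};
            ++restarts.at(i);
        }
    }
};
```

The blast radius of a bad message is one worker's state. Because no state is shared between workers, resetting one of them cannot leave another in an inconsistent state.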
clarry 6 years ago

> it's been a decade since the article so I wonder how it's working out?
What article? The actor model dates back to the 70s, and much has been written about its various implementations and incarnations since then.
If you don't treat it like a theoretical model of computation (like Hewitt does), then what's left is various largely unrelated architectures that employ message passing, with various levels of granularity.
I don't think any claim about higher developer productivity can be made without specifying what it's being compared to.
janderland 6 years ago

Also, FoundationDB doesn’t use a traditional actor model. A single thread runs all the actors. It’s much closer to coroutines than concurrent actors.
staticassertion 6 years ago

The actor model is at least 50 years old if not more, and it is found in all sorts of places in some form, which is unsurprising since it is a fundamental model for computation. So, for example, it is not hard to argue that SOA/Microservices are implementations of an Actor approach to services.
- thedance 6 years ago
  
  Sure, sorry, I meant the actor model in C++ specifically, according to the papers published by the authors of the library that is in this post.

clarry 6 years ago

If someone's got a nice actor library written in and for C, I'm all ears.

frumiousirc 6 years ago

The actor pattern for ZeroMQ in C (or C++): http://czmq.zeromq.org/czmq4-0:zactor
A similar actor pattern is provided in the Python implementation of ZeroMQ Zyre: https://github.com/zeromq/pyre

nickysielicki 6 years ago

The activity around C++20 has gotten to the point where I’m skeptical of great libraries like this because I don’t want to learn something third-party when I might be able to use STL equivalents in just a few hundred days.

eao197 6 years ago

There are at least two corner cases in real-world usage of actors:
* traceability of an application (or part of the application). For example, you send a message and don't see any reaction to it. Why? Was this message lost (sent nowhere)? Was it ignored by the receiver (or received by a different actor)? Was it received by the right receiver but handled incorrectly? It could be hard to find an answer, especially if there are millions of actors in your app.
* testability of your actors. Writing a unit-test for an actor can be a more complex task than testing some C++ function or class.
I think a good actor framework should provide tools that simplify such tasks. And I can't imagine that those topics will ever be covered by a C++ standard.
jcelerier 6 years ago

The C++20 feature list was fixed a few months ago and there's no actor framework involved afaik.
At best there will be one in C++23, which means 2024 by the time it trickles down to all relevant compilers and 2030 until you can use it in Debian stable.
That's also the time required by 32 successive 3-month javascript "zero to hero" bootcamps.
