Flow: Actor-based language for C++, used by FoundationDB

github.com

189 points by SchwKatze 23 days ago · 54 comments

SoKamil 23 days ago

FoundationDB is awesome testing-wise: it has deterministic simulation testing [1] that can simulate distributed-system and operating-system failures.

> We wanted FoundationDB to survive failures of machines, networks, disks, clocks, racks, data centers, file systems, etc., so we created a simulation framework closely tied to Flow. By replacing physical interfaces with shims, replacing the main epoll-based run loop with a time-based simulation, and running multiple logical processes as concurrent Flow Actors, Simulation is able to conduct a deterministic simulation of an entire FoundationDB cluster within a single-thread! Even better, we are able to execute this simulation in a deterministic way, enabling us to reproduce problems and add instrumentation ex post facto. This incredible capability enabled us to build FoundationDB exclusively in simulation for the first 18 months and ensure exceptional fault tolerance long before it sent its first real network packet. For a database with as strong a contract as the FoundationDB, testing is crucial, and over the years we have run the equivalent of a trillion CPU-hours of simulated stress testing.

[1]https://pierrezemb.fr/posts/notes-about-foundationdb/#simula...
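The core idea in the quoted passage can be sketched in a few lines of C++: a single-threaded loop pops events from a queue ordered by virtual time, and all randomness comes from one seeded generator, so the same seed replays the same run. This is only a minimal illustration under those assumptions, not FoundationDB's simulator; `Sim`, `schedule`, `run`, and `simulate` are hypothetical names.

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single-threaded simulator: virtual time plus one seeded RNG.
struct Sim {
    struct Event {
        double time;          // virtual time at which the event fires
        std::uint64_t seq;    // tie-breaker so ordering is fully deterministic
        std::function<void()> fn;
    };
    struct Later {
        bool operator()(const Event& a, const Event& b) const {
            if (a.time != b.time) return a.time > b.time;
            return a.seq > b.seq;
        }
    };

    explicit Sim(std::uint32_t seed) : rng(seed) {}

    // Queue `fn` to run `delay` units of virtual time from now.
    void schedule(double delay, std::function<void()> fn) {
        events.push(Event{now + delay, nextSeq++, std::move(fn)});
    }

    // Run until no events remain; no wall-clock time is consulted.
    void run() {
        while (!events.empty()) {
            Event e = events.top();  // copy first: fn may schedule more events
            events.pop();
            now = e.time;
            e.fn();
        }
    }

    double now = 0.0;
    std::uint64_t nextSeq = 0;
    std::mt19937 rng;
    std::priority_queue<Event, std::vector<Event>, Later> events;
};

// Sends five messages over a simulated lossy, slow network and returns a
// trace of what happened. Roughly one message in four is dropped.
std::string simulate(std::uint32_t seed) {
    Sim sim(seed);
    std::string trace;
    for (int id = 0; id < 5; ++id) {
        sim.schedule(id, [&sim, &trace, id] {
            if (sim.rng() % 4 == 0) {  // injected fault: message lost
                trace += "drop " + std::to_string(id) + ";";
                return;
            }
            double latency = 1.0 + static_cast<double>(sim.rng() % 10);
            sim.schedule(latency, [&trace, id] {
                trace += "recv " + std::to_string(id) + ";";
            });
        });
    }
    sim.run();
    return trace;
}
```

Calling `simulate` twice with the same seed returns the same trace, which is what makes a failing run reproducible; a different seed explores a different interleaving of drops and delays.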

ttul 23 days ago

Type-safe message-passing is such a wonderful programming paradigm - and not just for distributed applications. I remember using QNX back in the 1990s. One of its fabulous features was a C message-passing library allowing you to send arbitrary binary structs from one process to another. In the context of realtime software development, you often find yourself having one process that watches for events from a certain device, modifies the information somehow, and then passes it on to another process that ends up doing something else. The message-passing idiom was far superior to what was available in Linux at the time (pipes and whatnot) because you were able to work with C structs. It was not strictly type safe (as is the case with FoundationDB’s library), but for the 1990s it was pretty great.

  • mrbnprck 23 days ago

    I remember that ASN.1 does something similar. You'd give an ASN.1 notation to a code generator (e.g. one producing C) and not have to worry about parsing the actual structure anymore!

    • IshKebab 23 days ago

      Literally every schema-based serialisation format does this. ASN.1 is a pretty terrible option.

      The best system for this I've ever used was Thrift, which properly abstracts data formats, transports and so on.

      https://thrift.apache.org/docs/Languages.html

      Unfortunately Thrift is a dead (AKA "Apache") project and it doesn't seem like anyone since has tried to do this. It probably didn't help that there are so many gaps in that support matrix. I think "Google have made a thing! Let's blindly use it!" also helped contribute to its downfall, despite Thrift being better than Protobuf (it even supports required fields!).

      Actually I just took a look at the Thrift repo and there are a surprising number of commits from a couple of people consistently, so maybe it's not quite as dead as I thought. You never hear about people picking it for new projects though.

      • computably 22 days ago

        FB maintains a distinct version of Thrift from the one they gave to Apache. fbthrift is far from dead as it's actively used across FB. However in typical FB fashion it's not supported for external use, making it open source in name (license) only.

        As an interesting historical note, Thrift was inspired by Protobuf.

      • mrbnprck 22 days ago

        Very true. ASN.1 is mostly not a great fit, yet it has been the choice for everything to do with certificates and telecommunication protocols (even the newer ones like 5G, for things like RRC and NGAP), mostly for its bit-level support and especially its long-term stability. And looking back in time, ASN.1 has definitely proven its long-term stability.

        Actually never heard of Thrift until today, thanks for the insight :)

      • p_l 22 days ago

        Honestly, first time I've seen someone praising Thrift in a long time.

        Wanted to do unspeakable and evil things to the people responsible for choosing it, as well as its authors, last time I worked on a project that used Thrift extensively.

        • IshKebab 22 days ago

          How come? I haven't used it for like a decade but I remember it being good.

          • p_l 22 days ago

            Lots of network issues coming from the Thrift RPC runtime apparently not handling anything well.

            I recall threatening to rewrite everything with ONC-RPC out of pure pettiness and a wish to see the network stack not go crazy.

  • CyberDildonics 22 days ago

    Reinventing QNX will be revolutionary for decades to come.

websiteapi 23 days ago

I'm always hearing about FoundationDB but not much about who uses it. I know Deno and obviously Apple are using it. Who else? I'd love to hear some stories about it.

boris 23 days ago

The strangest thing about Flow is that its compiler is implemented in C#. So if you decide to use it in your C++ codebase, you now have a C#/.Net dependency, at least at build time.

  • boxfire 23 days ago

    It’s also funny because it’s a small, incomplete, incompatible subset of C++… it seems like a perfect LLVM / clang rewriter case too; it would be easy to convert and be pure C++. Hell, even a clang plugin to put the compile step into one process wouldn’t be awful. But looking at the rewrites, I wonder if there isn’t a terribly janky way to avoid needing a compiler at all, at some runtime cost in contextual control-flow info.

  • jermaustin1 23 days ago

    I wonder why that decision was made. I know why I, a C# developer, would make that decision, but why Apple?

    • atn34 23 days ago

      The original developers (before Apple bought the company) used Visual Studio on Windows

    • jeffbee 23 days ago

      This entire codebase was acquired by Apple in a state of substantial completion and since then relatively little has changed.

    • rdtsc 22 days ago

      Someone knew C# and was good at parsers, would be my guess. It could have just as easily been Scala or something else.

culebron21 23 days ago

At first glance, it looks like Rust's channels with a polymorphic type -- when you receive from a channel, you do match and write branches for each variant of the type.

But I wonder if this can be a better abstraction than async. (And whether I can build something like this in existing Rust.)
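For comparison, that reading can be sketched in C++ as a queue carrying a `std::variant`, with the receiver branching once per alternative via `std::visit`. This is a rough analogue of the "channel with a polymorphic type" idea only; it is not Flow syntax and says nothing about how Flow is implemented. `Channel`, `Get`, `Set`, and `handle` are hypothetical names.

```cpp
#include <optional>
#include <queue>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// Two hypothetical request types sharing one channel.
struct Get { std::string key; };
struct Set { std::string key; std::string value; };
using Request = std::variant<Get, Set>;

// Minimal single-threaded channel: send enqueues, try_receive dequeues.
template <class T>
class Channel {
public:
    void send(T msg) { queue_.push(std::move(msg)); }

    std::optional<T> try_receive() {
        if (queue_.empty()) return std::nullopt;
        T msg = std::move(queue_.front());
        queue_.pop();
        return msg;
    }

private:
    std::queue<T> queue_;
};

// One branch per variant alternative, similar in spirit to a Rust `match`.
std::string handle(const Request& req) {
    return std::visit([](const auto& r) -> std::string {
        using R = std::decay_t<decltype(r)>;
        if constexpr (std::is_same_v<R, Get>) {
            return "get " + r.key;
        } else {
            return "set " + r.key + "=" + r.value;
        }
    }, req);
}
```

Sending `Get{"a"}` and then `Set{"b", "1"}` and passing each received value to `handle` yields `"get a"` followed by `"set b=1"`. The sketch needs C++17 for `std::variant` and `if constexpr`.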

pmarreck 23 days ago

How does this compare to the inbox and supervisor model of Erlang/Elixir?

  • yetihehe 23 days ago

    It doesn't. It's "promise" based, not "communicating sequential processes". Erlang has more preemptive scheduling: a "thread" can be preempted at any time, whereas here you can only be synchronized when you wait for a result. It is called "actor-based" because only functions tagged as "actor" can call waiting functions.

    This is more Node.js-like communication than Erlang.

    • jacquesm 23 days ago

      By the looks of it they changed the word 'async' to 'actor' because they thought it was cool, not because it actually uses the actor pattern. Which to me seems to be namespace pollution.

      • voidmain 23 days ago

        If I were designing it today rather than in... 2008?, I would use the terms 'async' and 'await' because they are a lingua franca now. And for a modern audience that already knows what promises are it probably makes sense to start the explanation with that part. But the thing as a whole was intended to build lightweight asynchronously communicating sequential processes with private state that can be run locally or in a distributed way transparently, restarted on failure, etc. I don't think the choice of terms was obviously a crime at the time.

      • junon 23 days ago

        Unfounded guess, they probably didn't want to bump into the new C++ keywords for async/await.

    • thesz 22 days ago

      They build channels on top of these "promises" and "futures", and this puts them squarely in the communicating sequential processes category. Also, you can look at a promise-future pair as a single-element channel; again, it's CSP.

      BTW, Erlang does not implement CSP fully. Its interprocess communication is TCP based in general case and because of this is faulty.
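
      The "single-element channel" view can be shown with the standard library's `std::promise` and `std::future`, used here purely as an illustration (Flow has its own promise and future types, which this does not reproduce). `OneShot` and `send_once` are hypothetical helpers.

      ```cpp
      #include <future>
      #include <string>
      #include <utility>

      // Hypothetical wrapper: the promise is the sending end, the future the
      // receiving end, and the pair carries exactly one message.
      template <class T>
      struct OneShot {
          std::promise<T> tx;
          std::future<T> rx = tx.get_future();
      };

      // Sends a value; returns false if the channel was already used, since
      // std::promise throws std::future_error on a second set_value.
      template <class T>
      bool send_once(std::promise<T>& tx, T value) {
          try {
              tx.set_value(std::move(value));
              return true;
          } catch (const std::future_error&) {
              return false;
          }
      }
      ```

      With `OneShot<std::string> ch;`, a first `send_once(ch.tx, ...)` succeeds, a second one is rejected, and `ch.rx.get()` returns the first value. A multi-element channel would need a queue on top of this.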

      • yetihehe 22 days ago

        It is not TCP based. In Erlang, processes have mailboxes. But they don't have promises: you send a message and wait for a response with a timeout, or do something else. And TCP is only used between nodes (VM instances). But you can use any communication channel (UDP, Unix sockets, TLS, serial port, some other process doing funny things).

        > Its interprocess communication is TCP based in general case and because of this is faulty.

        What? It's faulty because of TCP? No, in Erlang it is assumed that communication can be faulty for a lot of reasons, so you have to program to deal with that and the standard library gives you tools to deal with this.

        • thesz 20 days ago

          There is no such thing as "Communicating Sequential Processes with faulty channels and processes." I tried to find something like that, fruitlessly.

          This means that Erlang does not implement CSP, it implements something else.

          Again, the general case of communication between Erlang processes includes communication between processes on different machines.

      • pmarreck 21 days ago

        > BTW, Erlang does not implement CSP fully.

        Specific evidence?

        > Its interprocess communication is TCP based in general case

        No, it is not. Only between machines is that true.

        > and because of this is faulty.

        LOL, no. Why are you rolling with "speaking a whole lot of BS based on ignorance" today?

        On the other hand, I now understand that one impediment to Elixir adoption is apparently "people repeating a lot of bullshit misinformation about it"

        • thesz 20 days ago

            >> Its interprocess communication is TCP based in general case
            > No, it is not. Only between machines is that true.
          
          It is true for communication between two VMs on the same machine, isn't it?

          The general case includes same-VM processes, different VM processes and also different VMs on different machines.

            > Why are you rolling with "speaking a whole lot of BS based on ignorance" today?
          
          TCP is unreliable: https://networkengineering.stackexchange.com/questions/55581...

          That was acknowledged by Erlang's developers before 2012. I remember that an ICFP 2012 presentation about Cloud Haskell mentioned that "Erlang 2.0" apparently acknowledged TCP unreliability and tried to work around it.

        • thesz 20 days ago

          Here, page 31 on: https://wiki.haskell.org/wikiupload/4/46/Hiw2012-duncan-cout...

          Erlang circa 2012 was even less reliable than TCP on which its interprocess communication was based.

          Namely, TCP allows for any prefix of messages m1,m2,m3... to be received. But Erlang circa 2012 allowed for m1,m3... to be received, dropping m2.

          It may not be the case today, but it was the case about ten years ago.
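
          The distinction drawn above can be written as a small check: a delivery is "prefix-safe" if what arrived is exactly the first few messages sent, with no gaps. This only illustrates the property being described; `is_prefix` is a hypothetical helper and models neither TCP nor Erlang.

          ```cpp
          #include <cstddef>
          #include <vector>

          // True if `received` equals the first received.size() elements of `sent`.
          bool is_prefix(const std::vector<int>& sent, const std::vector<int>& received) {
              if (received.size() > sent.size()) return false;
              for (std::size_t i = 0; i < received.size(); ++i) {
                  if (received[i] != sent[i]) return false;
              }
              return true;
          }
          ```

          With messages numbered 1, 2, 3, receiving {1, 2} passes the check, while receiving {1, 3} fails it because message 2 was skipped.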

  • hawk_ 23 days ago

    On a related note, how does it compare to SeaStar?

srinikhilr 23 days ago

IIRC there was a ticket/doc about FoundationDB deprecating usage of this and moving to C++ coroutines.

thisisauserid 23 days ago

How did they come up with such an original and unique name? Apple does it again.

  • Hayvok 23 days ago

    FoundationDB was originally a startup, purchased by Apple in 2015.
