Neat Rust Tricks: Passing Closures to C

blog.seantheprogrammer.com

215 points by rabidferret 6 years ago · 68 comments

This is a really well-written article about a common way to trampoline Rust methods into C embedded RTOS tasks, which often work in exactly this way. I've done just this in the service of getting ChibiOS and uC/OS II running Rust code.

cheez 6 years ago

Is this a neat trick or just standard operating procedure for calling C from <your favorite lang>? As it was billed as a trick, I was expecting some sort of runtime code generation to pass the data pointer and some jump instruction to jump to the right spot and unpack the data pointer.

Maybe I just overcomplicate things ;-)

DougBTX 6 years ago

I’d say that “standard procedure” would be to do it the same way as it would be done in C: define a struct, allocate one somewhere, then pass a pointer to it as the data pointer. Using the anonymous struct which represents the closure itself seems like skipping a step: the user doesn’t need to spell out which values are stored in the struct.
- zozbot234 6 years ago
  
  If the language supports closures which capture variables from their surrounding environment, there's no way around using "the closure itself" as your data object. After all, "the user" is not expected to "know" what any given closure is capturing from the environment; part of the point of closures is implementing a sort of information hiding.
marcan_42 6 years ago

It's standard procedure. I've done the exact same thing when wrapping C APIs into Python using Cython, several times. You pass the Python closure as the void *data and then register a shared generic callback which casts it and calls it. Easy. Getting the memory management right is slightly tricky, but not too bad.
Fun fact: you can't safely do this with ctypes. Since it is called as pure Python, it cannot do watertight Python exception handling in a callback context (because even if you have a try/except block, an exception can always happen right before or after it), and ctypes provides no usable internal way of doing it - it just eats exceptions inside callbacks. This is what motivated me to rewrite Ceph's librbd bindings from ctypes to Cython.
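The pattern described above (pass the closure as the `void *data`, register one shared generic callback that casts it back and calls it) can be sketched in Rust roughly as follows. This is a minimal sketch, not the article's code: `c_call_three_times` is a hypothetical stand-in for a C API, written in Rust only so the example is self-contained, and `RawCallback`, `trampoline` and `call_with_closure` are illustrative names.

```rust
use std::os::raw::{c_int, c_void};

// Shape of the C callback: `void (*)(void *data, int value)`.
type RawCallback = unsafe extern "C" fn(data: *mut c_void, value: c_int);

// Hypothetical stand-in for a C function that invokes the callback
// with the data pointer it was given. A real one would live in C.
unsafe extern "C" fn c_call_three_times(cb: RawCallback, data: *mut c_void) {
    for i in 1..=3 {
        unsafe { cb(data, i) };
    }
}

// Generic trampoline: one instance is generated per closure type `F`.
// It casts the data pointer back to the closure and calls it.
unsafe extern "C" fn trampoline<F: FnMut(c_int)>(data: *mut c_void, value: c_int) {
    // SAFETY: `data` was created from `&mut F` in `call_with_closure`.
    let closure = unsafe { &mut *(data as *mut F) };
    closure(value);
}

fn call_with_closure<F: FnMut(c_int)>(mut closure: F) {
    let data = &mut closure as *mut F as *mut c_void;
    // SAFETY: `closure` outlives the call, and the stand-in only uses
    // `data` while it is running.
    unsafe { c_call_three_times(trampoline::<F>, data) };
}

// The closure captures `total` by mutable reference.
fn sum_via_callback() -> c_int {
    let mut total = 0;
    call_with_closure(|v| total += v);
    total
}
```

Here `sum_via_callback()` evaluates to 6 (1 + 2 + 3). If the C side kept the pointer beyond the call, the closure would need to be boxed and freed later, which is the "memory management" part mentioned above.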
- cheez 6 years ago
  
  I thought as much, thanks for the confirmation :-)
Mathnerd314 6 years ago

It does seem quite similar to Haskell FFI code: https://github.com/bobfrank/hasqlite/blob/4e38801d969a43e88b...
The "neat" factor comes from how little type wrangling and unsafe code is needed.
- Ericson2314 6 years ago
  
  I believe this actually JITs a trampoline with libffi, so only one code pointer is needed, not separate code and data pointers.
  (Also hi, go contribute to Nixpkgs again!)

KenanSulayman 6 years ago

Interestingly this is very similar to how I implemented passing closures into JavaScriptCore as hooks for JS class invocations ("function calls"). [0]

Essentially it's taking advantage of the fact that closures are static methods with "implicit" data pointers. It should be fairly obvious that this is a massive violation of safety and undefined behavior, and most likely to break when debugging symbols etc. are inserted.

The safest way to do this until Rust has figured out a stable-enough-ABI for closure passing would be a thread-local trampoline, I guess. Not very nice..

[0] https://github.com/psychonautwiki/rust-ul/blob/master/src/he...

Gibbon1 6 years ago

I read an article by a guy talking about stupid C tricks. One of them was to 'mangle' raw pointers into an index before passing them, and then de-mangle them to get back a raw pointer. The advantages are that you can pass metadata with the 'pointer', which also allows you to invalidate a pointer. The pointer can't be dereferenced, and the enclosed variables/data aren't accessible and cannot be modified by the target.
For callbacks the overhead likely isn't significant.
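One possible reading of that trick, sketched in Rust under stated assumptions: the "mangled pointer" is an integer handle packing a slot index and a generation counter (the metadata), and only the owner of the table can turn it back into a reference. `Handle`, `HandleTable` and the 32/32 bit split are hypothetical choices for illustration, and slots are never reused in this sketch.

```rust
/// Opaque handle: low 32 bits are a slot index, high 32 bits a generation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Handle(u64);

struct Slot<T> {
    generation: u32,
    value: Option<T>,
}

pub struct HandleTable<T> {
    slots: Vec<Slot<T>>,
}

impl<T> HandleTable<T> {
    pub fn new() -> Self {
        HandleTable { slots: Vec::new() }
    }

    /// Store a value and hand out an integer that is not a real pointer.
    /// Assumes fewer than 2^32 slots are ever created.
    pub fn insert(&mut self, value: T) -> Handle {
        let index = self.slots.len() as u64;
        self.slots.push(Slot { generation: 0, value: Some(value) });
        Handle(index) // generation 0 sits in the high bits
    }

    fn split(handle: Handle) -> (usize, u32) {
        ((handle.0 & 0xFFFF_FFFF) as usize, (handle.0 >> 32) as u32)
    }

    /// "De-mangle": only succeeds while the handle is still valid.
    pub fn get_mut(&mut self, handle: Handle) -> Option<&mut T> {
        let (index, generation) = Self::split(handle);
        let slot = self.slots.get_mut(index)?;
        if slot.generation == generation {
            slot.value.as_mut()
        } else {
            None
        }
    }

    /// Invalidate a handle; later lookups with it return `None`.
    pub fn invalidate(&mut self, handle: Handle) -> Option<T> {
        let (index, generation) = Self::split(handle);
        let slot = self.slots.get_mut(index)?;
        if slot.generation != generation {
            return None;
        }
        slot.generation = slot.generation.wrapping_add(1);
        slot.value.take()
    }
}

fn demo_handles() -> (Option<i32>, Option<i32>) {
    let mut table = HandleTable::new();
    let h = table.insert(41);
    if let Some(v) = table.get_mut(h) {
        *v += 1;
    }
    let before = table.get_mut(h).map(|v| *v);
    table.invalidate(h);
    let after = table.get_mut(h).map(|v| *v);
    (before, after)
}
```

The C side would only ever see the `u64` inside `Handle`, so it has nothing it can dereference; the cost is one bounds-checked lookup per callback.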
doomrobo 6 years ago

Where's the UB? It casts a boxed closure to a raw pointer, and then back to a boxed closure. There's no tricky introspection being done here.
- KenanSulayman 6 years ago
  
  I'm not entirely sure you read the code I'm referring to. There's no box there.

kazinator 6 years ago

You can pass closures to C as C functions in TXR Lisp, a language I created.

Example:

http://rosettacode.org/wiki/Window_creation#Win32.2FWin64

In this program, a translation of Microsoft's "Your First Windows Program" from MSDN, defun is used to define a WindowsProc callback. defun generates a lambda under the hood, which carries a lexical scope.

The lambda is passed directly to Win32 as a callback, which is nicely called for repainting the window. (Or at least, things appear that way to the programmer.)

Setting this up requires a few steps. We need a target function, of course, which can be any callable object.

Then there is this incantation:

  (deffi-cb wndproc-fn LRESULT (HWND UINT LPARAM WPARAM))

The deffi-cb operator takes a name and some type specifications: return type and parameters. The name is defined as a function; so here we get a function called wndproc-fn. This function is a converter. If we pass it a Lisp function, it gives back a FFI closure object.

Then in the program, we instantiate this closure object, and stick it into the WNDPROC structure as required by the Windows API. Here we use the above wndproc-fn converter to obtain WindowProc in the shape of a FFI closure:

  (let* ((hInstance (GetModuleHandle nil))
         (wc (new WNDCLASS
                  lpfnWndProc [wndproc-fn WindowProc]
         ...

The lpfnWndProc member of the WNDCLASS FFI structure is defined as having the FFI type closure; that will correspond to a function pointer on the C side. The rest is just Windows:

  (RegisterClass wc)

kazinator 6 years ago

Here is another example of callbacks at work from the TXR Lisp test suite: using the C library function qsort to sort a Lisp array of strings.
http://www.kylheku.com/cgit/txr/tree/tests/017/qsort.tl
It's done in two ways, as UTF-8 char * strings and as wchar_t * strings.
What's used as the callback is the function cmp-str which is in TXR Lisp's standard library. A lambda expression could be used instead.
Also tested is the perpetration of a non-local control transfer out of the callback, instead of the normal return. This properly cleans up the temporary memory allocated for the string conversions.
- cellularmitosis 6 years ago
  
  TXR looks interesting. Is there a project README?
  - kazinator 6 years ago
    
    There is a boring and poorly maintained home page: http://www.nongnu.org/txr.
    And big honkin' manual.

jgtrosh 6 years ago

How does this relate to nested functions in C? (And resulting “infectious executable stacks”?)

https://nullprogram.com/blog/2019/11/15/

Diggsey 6 years ago

It doesn't. This is just showing the normal way that callbacks are implemented in vanilla C and how you would make that programming pattern interoperate with Rust closures. Neither one relies on the compiler trickery/runtime code generation described in the earlier article.
zozbot234 6 years ago

The executable stack trick is only required if you want to implement closures that can be called as if they were plain C functions, with only a function pointer and no extra (void *) argument.
rabidferretOP 6 years ago

It doesn't relate to it at all. The issues around linking to problematic object files mentioned in that article will apply to Rust as well, but that's unrelated to the subject of this article; it's a property of the linker you're using and the toolchain used to compile whatever C dependencies you have.
richardwhiuk 6 years ago

The problems there don't apply I believe because Rust closures don't require an executable stack.
- mmastrac 6 years ago
  
  That's correct - a Rust closure generally [1] can't be converted to a function pointer as it requires both code and state.
  [1] https://github.com/rust-lang/rust/issues/39817
  - twic 6 years ago
    
    The whole point of jgtrosh's link is that there is a way to hide data behind a function pointer, so Rust could convert any closure to a function pointer. But it requires writable-and-executable memory, so it's a pretty bad idea (in GCC's implementation, that memory is on the stack, which is an extra bad idea, but i don't think it needs to be).
    
    mmastrac 6 years ago
    
    Definitely.
    Technically this can also be done via static code trampolines that are mmap'd as well [1]. That approach has been used on iOS in the past to turn blocks into raw function pointers.
    If you have a platform that allows W+X on code (yikes!), you can do [2] as well.
    [1] https://github.com/plausiblelabs/plblockimp/blob/master/Sour... [2] https://www.mikeash.com/pyblog/friday-qa-2010-02-12-trampoli...
    
    rabidferretOP 6 years ago
    
    Anything that doesn't require W+X would need an entire page allocated per closure, wouldn't it?
    
    devit 6 years ago
    
    No, you can of course allocate W+X pages from the OS and put multiple closures in them using a standard userspace memory allocator.
    Or if the OS doesn't support W+X allocation at all, then you can have a bunch of tightly packed pregenerated trampolines in the binary.
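The "pregenerated trampolines in the binary" idea can be approximated in safe Rust without any runtime code generation. The following is a rough sketch under assumptions, not how any particular runtime does it: a fixed pool of two trampolines is stamped out at compile time via a const generic, each reading its closure from its own static slot. `SLOTS`, `TRAMPOLINES`, `bind` and the pool size are all hypothetical.

```rust
use std::os::raw::c_int;
use std::sync::Mutex;

type Callback = Box<dyn FnMut(c_int) -> c_int + Send>;

// One storage slot per pregenerated trampoline.
static SLOTS: [Mutex<Option<Callback>>; 2] = [Mutex::new(None), Mutex::new(None)];

// Each value of `N` produces a distinct function in the binary, so each
// one has its own address and its own slot. No data pointer is needed.
extern "C" fn trampoline<const N: usize>(x: c_int) -> c_int {
    let mut slot = SLOTS[N].lock().unwrap();
    match slot.as_mut() {
        Some(cb) => cb(x),
        None => 0, // slot is empty; arbitrary fallback for this sketch
    }
}

const TRAMPOLINES: [extern "C" fn(c_int) -> c_int; 2] =
    [trampoline::<0>, trampoline::<1>];

// Claim a free slot and return the plain function pointer bound to it.
// Returns `None` when the fixed pool is exhausted.
fn bind(closure: Callback) -> Option<extern "C" fn(c_int) -> c_int> {
    for (i, slot) in SLOTS.iter().enumerate() {
        let mut guard = slot.lock().unwrap();
        if guard.is_none() {
            *guard = Some(closure);
            return Some(TRAMPOLINES[i]);
        }
    }
    None
}

fn demo_bind() -> Option<c_int> {
    let offset = 10;
    // The closure captures `offset`, yet C would see only a code pointer.
    let f = bind(Box::new(move |x: c_int| x + offset))?;
    Some(f(5))
}
```

The trade-offs are visible in the sketch: the number of live closures is capped by the pool size, slots must be released explicitly (not shown), and because the slot's lock is held during the call, a callback must not re-enter its own trampoline.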
    
    saagarjha 6 years ago
    
    Right, this is how Objective-C's implementation works, except it keeps around one page of trampolines and remaps that around when necessary to be able to "create" more trampolines on the fly, I believe.
    
    a1369209993 6 years ago
    
    Nope! You'd do something to the effect of:
    clo_code:
      4C8B1501100000  mov r10 [rel clo_code+0x1008]
      FF25F30F0000    jmp [rel clo_code+0x1000]
      0F1F00          nop3
    # one page away...
    struct clo_slot { void (*func)(void* _R10,...); void* data; };
    Edit: to use r10 rather than rotating all the argument registers.
    
    barrkel 6 years ago
    
    For example, every platform that has a virtual machine with JIT compilation support.
zabzonk 6 years ago

C doesn't have nested functions - they are a GCC extension.

tedunangst 6 years ago

Now call qsort with a closure.

dfox 6 years ago

qsort(3) or even ftw(3) is the simple case. You can either dynamically generate trampoline code with exactly bounded dynamic scope (and even do the gcc-style executable stack hack) or simply stash the whole context in some TLS region and completely sidestep the whole issue.
Side point: ftw(3) is a much more interesting Unix API to call from some FFI layer than qsort(3). And I spent about a year pestering people from Sun with "you should implement fts_open(3) and friends", because it presents a saner API for FFIs for the same functionality.
- wahern 6 years ago
  
  And it appears that you succeeded. Awesome!
  It seems Solaris has been adding many BSD and, especially, Linux compatibility APIs lately. It seems too little, too late; or perhaps the initiative is part of their effort to EoL Solaris, providing an upgrade path to Linux.
  - dfox 6 years ago
    
    Well in fact I gave up about 10 years ago :)
rabidferretOP 6 years ago

`void (*)(void *, everything_else)` is in my personal experience a much more common API than that of `qsort` (probably for exactly the point you're pedantically trying to make), so I chose to focus on that API for this article.
There's really no reason to pass a rust closure to `qsort` instead of sorting in Rust. That said, if there's demand for real world use cases that require passing Rust closures to C APIs that take only a function pointer and not a data pointer, I'll be happy to write a follow up.
- kazinator 6 years ago
  
  In any decent language, all functions are first-class, so if you want to use them as callbacks, you need that to work.
  That's still true even if the API takes a separate context pointer that is given to your function as an argument.
  There is still a function pointer there, and what you'd like to use as a function pointer is a function in your local language, and that's an object with an environment. Even if some instances of it have no environment, the FFI mechanism might want to tack on its own. For instance, the FFI needs to be able to route the callback to the appropriate function. Whatever function pointer FFI gives to the C function, when the C library calls that function, FFI has to dispatch it back to the original high level function. That requires context. Now that context could be shoehorned into that context parameter, but it could be inconvenient to do so; that parameter belongs to the program and to the conversation that program is having with the C API.
  - comex 6 years ago
    
    Generating native function pointers on the fly:
    - is inherently slow, because CPUs have separate data and instruction caches;
    - is extra slow in practice because you need a separate allocation for executable memory (unless your stacks and heap are RWX, which is a terrible idea);
    - is not portable, requiring architecture- and OS-specific code; and
    - is not supported at all in many environments (of varying levels of braindeadness).
    For a statically compiled language like Rust, it makes much more sense to use the context pointer.
- tedunangst 6 years ago
  
  XSetErrorHandler for instance.
  - kazinator 6 years ago
    
    signal, sigaction.
loeg 6 years ago

I don't think qsort changes anything about the mechanism described in the blog post, but maybe I'm missing something. (I.e., use qsort_r...)
(qsort is really only for C. Other languages can potentially inline the comparison function, so using FFI for that is kind of insane.)
- 0db532a0 6 years ago
  
  Gets me thinking: I wonder how good Intel CPUs are with dealing with this sort of thing. Can the CPU detect repeated jumps to comparators and in-line them from there? I’d be interested to see a comparative benchmark.
marcan_42 6 years ago

Use `qsort_r()`, which gives you an extra argument for your closure.
comex 6 years ago

If a Rust closure doesn’t actually close over any variables, it can be coerced to a function pointer. Otherwise, you’re stuck with the workarounds others have mentioned.
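A small sketch of that distinction, with hypothetical names (`BinOp`, `apply`). Note one assumption worth flagging: as far as I know, the coercion yields a Rust-ABI `fn` pointer, so a real C API would still need an `extern "C" fn` item rather than the coerced closure itself.

```rust
// A plain function pointer type: code only, no room for captured state.
type BinOp = fn(i32, i32) -> i32;

fn apply(op: BinOp, a: i32, b: i32) -> i32 {
    op(a, b)
}

fn non_capturing() -> i32 {
    // Captures nothing, so it coerces to `fn(i32, i32) -> i32`.
    let add: BinOp = |a, b| a + b;
    apply(add, 2, 3)
}

fn capturing(offset: i32) -> i32 {
    // This closure captures `offset`. Writing `let f: BinOp = add_offset;`
    // would be rejected by the compiler, because the captured value has
    // nowhere to live in a bare function pointer.
    let add_offset = |a: i32, b: i32| a + b + offset;
    add_offset(2, 3)
}
```

`non_capturing()` returns 5 and `capturing(10)` returns 15; only the first closure could be handed to something that accepts a bare function pointer.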
jstarks 6 years ago

I’m sure we could have rustc generate a trampoline function directly on the stack like gcc can. What could go wrong?
- saagarjha 6 years ago
  
  Nothing, if your Rust implementation is bug-free ;)

richardwhiuk 6 years ago

Just stash the data in a stack in thread local storage.

Problem solved.

gpderetta 6 years ago

Then it wouldn't be a lexical closure.

richardwhiuk 6 years ago

It can be from the perspective of the calling code.

Roughly speaking:

  thread_local! {
    static CBQ: Option<Box<impl FnMut(i32, i32) -> i32>>;
  }

  #[no_mangle]
  extern "C" fn qsort(array: *mut i32, val: usize, callback: impl FnMut(i32, i32) -> i32);

  pub fn rust_qsort(array : Vec<i32>, callback: impl FnMut(i32, i32) -> i32){
    CBQ.replace(Box::new(callback)).unwrap_none();
    
    unsafe {
        qsort(array.as_mut_ptr(), array.len(), &rust_qsort_callback);
    }
    
    CBQ.take().unwrap();
  }

  fn rust_qsort_callback(a: *mut i32, b: *mut i32) -> i32 {
    let callback = CBQ.take().unwrap();

    let (a, b) = unsafe { 
        (*a, *b)
    };

    let result = callback(a, b);
    
    CBQ.replace(callback).unwrap_none();

    result
  }

  fn main() {
    let a = vec![4,5,6,3,2];

    rust_qsort(a, |a, b| {
        if a < b {
            -1
        } else if a > b {
            1
        } else {
            0
        }
    })
  }

ought to work. (There's some fun with generics and panics, which is some fun to solve, but nothing which breaks the premise above).

a1369209993 6 years ago
This fails horribly if called recursively (or from a signal handler). You need something like:
```
  wrapped_qsort(/* array,callback */)
    {
    auto tmp = CBQ;
    CBQ = wrap(callback);
    qsort(array.ptr,array.len,cbq_callback);
    CBQ = tmp; /* pop old value from stack */
    }
```
- richardwhiuk 6 years ago
  
  It'll fail from a signal handler inside qsort, or inside the callback function.
  It won't fail recursively - while the second call is happening, the first will be stored on the stack (see the take and replace in the callback shim function)
imstuff 6 years ago

Your `fn rust_qsort` takes ownership of the vector, so it frees its memory after sorting and it can't be used after sorting in the caller function. And generic `impl FnMut` won't work in `extern "C"`, it only accepts function pointers.
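For reference, a variant of the thread-local idea that tries to address the points raised in this sub-thread: it borrows the data as `&mut [i32]`, hands the C side a real `extern "C"` function pointer, and saves and restores the previous slot value so nested calls work. This is a sketch under assumptions: `c_sort_i32` is a hypothetical stand-in for a `qsort`-like C function (written in Rust so the example is self-contained), and `CURRENT_CMP`, `shim` and `sort_with` are illustrative names. It is still not safe to use from a signal handler.

```rust
use std::cell::Cell;
use std::os::raw::c_int;
use std::ptr;

type RawCmp = unsafe extern "C" fn(*const i32, *const i32) -> c_int;

thread_local! {
    // Type-erased pointer to the comparison closure active on this thread.
    static CURRENT_CMP: Cell<*mut ()> = Cell::new(ptr::null_mut());
}

// Hypothetical stand-in for a C sort that takes only a function pointer
// and no context argument. Implemented as an insertion sort.
unsafe extern "C" fn c_sort_i32(base: *mut i32, len: usize, cmp: RawCmp) {
    for i in 1..len {
        let mut j = i;
        // SAFETY: the caller guarantees `base` points to `len` elements,
        // and every index used here stays below `len`.
        unsafe {
            while j > 0 && cmp(base.add(j - 1), base.add(j)) > 0 {
                ptr::swap(base.add(j - 1), base.add(j));
                j -= 1;
            }
        }
    }
}

// Context-free callback: recovers the closure from thread-local storage.
unsafe extern "C" fn shim<F: FnMut(i32, i32) -> c_int>(a: *const i32, b: *const i32) -> c_int {
    let raw = CURRENT_CMP.with(|slot| slot.get()) as *mut F;
    // SAFETY: `sort_with::<F>` stored a pointer to a live `F` before
    // calling the sort, and the sort passes valid element pointers.
    unsafe {
        let cmp = &mut *raw;
        cmp(*a, *b)
    }
}

pub fn sort_with<F: FnMut(i32, i32) -> c_int>(data: &mut [i32], mut cmp: F) {
    // Save the previous value so a nested `sort_with` restores it afterwards.
    let previous = CURRENT_CMP.with(|slot| slot.replace(&mut cmp as *mut F as *mut ()));
    // SAFETY: the pointer and length come from a valid mutable slice.
    unsafe { c_sort_i32(data.as_mut_ptr(), data.len(), shim::<F>) };
    CURRENT_CMP.with(|slot| slot.set(previous));
}

fn sort_ascending_counting(mut v: Vec<i32>) -> (Vec<i32>, usize) {
    let mut calls = 0usize;
    sort_with(&mut v, |a, b| {
        calls += 1; // captured state, so this is a genuine closure
        a.cmp(&b) as c_int
    });
    (v, calls)
}
```

Storing a raw pointer to the caller's closure avoids boxing and the take/replace dance, at the cost of an `unsafe` cast in the shim. A panic inside the comparison closure would have to cross an `extern "C"` boundary, so a production version would want to catch it before returning to C.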
gpderetta 6 years ago

you have reinvented dynamic scoping :).

kazinator 6 years ago

  This is the TXR Lisp interactive listener of TXR 228.
  Quit with :quit or Ctrl-D on empty line. Ctrl-X ? for cheatsheet.
  1> (with-dyn-lib nil
      (deffi qsort "qsort" void ((ptr (array wstr)) size-t size-t closure))
      (deffi-cb qsort-cb int ((ptr wstr-d) (ptr wstr-d))))
  #:lib-0005
  2> (let ((vec #("the" "quick" "brown" "fox"
                  "jumped" "over" "the" "lazy" "dogs")))
       (prinl vec)
       (qsort vec (length vec) (sizeof wstr)
              [qsort-cb (lambda (a b) (cmp-str a b))])
       (prinl vec))
  #("the" "quick" "brown" "fox" "jumped" "over" "the" "lazy" "dogs")
  #("brown" "dogs" "fox" "jumped" "lazy" "over" "quick" "the" "the")
  #("brown" "dogs" "fox" "jumped" "lazy" "over" "quick" "the" "the")

The lambda is pointless; we could create the FFI closure directly from cmp-str with [qsort-cb cmp-str]. It shows more clearly that we can use any closure.

dmitrygr 6 years ago

> If you’re not familiar with C’s syntax, here’s the equivalent signature in Rust

Author is hilarious. Who is familiar with that but not c?

alkonaut 6 years ago

I came to Rust without writing C before. Most of my experience with C comes from problems exactly like this. I doubt I'm alone in this.

jonny383 6 years ago

Rust is already doomed. The amount of literature being published about either comparisons or compatibility with C is a strong indicator C is here to stay.

cellularmitosis 6 years ago

If Rust is intended to replace C, wouldn't you expect lots of this sort of literature? i.e. isn't this actually a sign of its _success_?
- steveklabnik 6 years ago
  
  Also, being able to add Rust to an existing C or C++ codebase was an important design consideration. Big projects like Firefox aren’t just going to re-write millions of lines of code all at once.
  - The_rationalist 6 years ago
    
    And this is why Rust is not going to succeed. It does not have great compatibility with C++.
- jonny383 6 years ago
  
  No. It shows that people are still struggling with changing the way in which they write software to the "rust" way. The attitude of falling back to C or using unsafe rust just undermines the premise of the argument of why you should use rust.
  This is just like the node.js craze a few years ago - people will rant on trying to justify why you should use rust and write the "rust" way before realising that what they already had worked as intended.
  A true replacement for C (when one is finally developed) will remove all of these doubts and back-shadowing behaviour almost instantly (kind of like the react way of ux did)
  EDIT: typo
  - MaulingMonkey 6 years ago
    
    > It shows that people are still struggling with changing the way in which they write software to the "rust" way.
    It shows no such thing.
    I generally work on relatively small ~1MLOC C++ codebases. There are codebases out there measured in the hundreds of MLOC. These are not the comparatively tiny javascript codebases you find React used in - where additional milliseconds of download / parse / evaluation time has a measurable effect on your user retention statistics. There is no "near instant" at 100M+ LOC scales. There is only incremental rewrites, and incremental rewrites means making your new code talk to your old code, and to other people's existing code.
    This means interop with existing C ABIs. There is no such thing as a C "replacement" that can't talk to an existing C ABI, or expose an existing C ABI.
    Of course, a C ABI doesn't mean C. It's more frequently C++ in my ecosystem, for example. But it could just as easily be Rust, or any number of other languages capable of exposing a C ABI.
    
    jonny383 6 years ago
    
    >There are codebases out there measured in the hundreds of MLOC.
    I agree with your argument, but I think in practice trying to "port" something with hundreds of MLOC is a losing battle (especially away from C). By the time you finished porting to rust (or your other language of the week), rust will likely have come and gone and will have been replaced by something either better or "better".
    IMO people should spend less time trying to re-invent the wheel in rust and more time either improving C or the tooling / static analysis for C. It would avoid _so many_ of these issues.
    
    MaulingMonkey 6 years ago
    
    I agree that porting 100MLOC is a losing battle no matter the language. And I'll concede I'm not certain Rust will have the staying power - although I hope it might.
    And I'm all for more C tooling. Static analysis, fuzzing, sanitizers, valgrind, clang thread safety annotations, etc. are all wonderful tools I lean on heavily. But these are opt in, patchwork, platform specific, rife with false positives, false negatives, inconsistent, slow, painful to configure and use... I've wanted far more out of my C and C++ tooling than it's been able to give me for many years now. I'll frequently try out new attributes and annotations, only to curse when they fail to handle really trivial edge cases.
    Meanwhile, Rust? It already catches things I didn't even realize I wanted to catch. Static checks opt-out into fast dynamic checks opt-out into heisenbugs in auditable unsafe blocks. The defaults are great.
    I doubt C or C++'s tooling will reach the state of Rust, as frozen today, within the next decade. Smart people have tried long and hard to improve things, with quite middling results, convincing me it's a hard problem. I'm a bit more optimistic that it might catch up within the next century, but if I'm not long dead by then, I'll almost certainly be long retired. If it eventually does catch up, I suspect it'll have taken more than a few notes from Rust's approach.
    I share your wariness of re-inventing the wheel, but the C & C++ static analysis ecosystem has left enough to be desired that I think it's warranted in this case. It's to rust's credit that they aren't re-inventing everything, and e.g. leverage LLVM for codegen, optimizations, debug info generation, etc.
    
    pjmlp 6 years ago
    
    Given the 30+ years of proven security exploits due to memory corruption, the C community has proven multiple times that it doesn't care about those solutions, except when required to do so in certified software.
    That Solaris, iOS and in the future Android pursue hardware memory tagging as a workaround for memory corruption exploits is proof of how bad the situation is in terms of security.
pjmlp 6 years ago

Rome wasn't built in a day.
As for C staying around, unfortunately yes, until we get rid of POSIX based OSes, C will be around.
After all we need to keep those <UNIX clone OS> Security conferences alive. /s
gnode 6 years ago

I think a language being highly compatible with C is what would have the greatest potential to replace it. In some ways it's similar to Microsoft's "embrace, extend, extinguish" strategy.
