Zero-Cost POSIX Compliance: Encoding the Socket State Machine in Lean's Types

ngrislain.github.io

34 points by ngrislain 4 months ago · 23 comments

Reader

e-dant 4 months ago

This is cool stuff, but a nitpick: It’s not undefined behavior in the language sense in C to do socket ops on a bad file descriptor. It’s just an error from the kernel’s point of view, and the kernel will throw -errno at you.

singron 4 months ago

Yes it's not UB, but the consequences are not limited to a EINVAL/EBADF/EBADFD. Calling close twice is essentially the same problem as calling free twice, so you get all the use-after-free problems on your file descriptors.
ngrislainOP 4 months ago

Fair point! Updated. I’m definitely coming at this more from a Lean 4/formal methods perspective than a POSIX one.

yjftsjthsd-h 4 months ago

> Lean 4 offers a fourth option: make the bug unrepresentable at the type level, then erase the proof at compile time so the generated code is identical to raw C.

Couldn't you do that in a more conventional type/class system without using an actual proof system? Instead of there being a Socket type/class, just make a Socket_Fresh, Socket_Bound, Socket_Listening, Socket_Connected, and maybe Socket_Closed (not 100% sure, would have to think about whether that's a thing or not), each of which takes the previous in its constructor. Or does that make it too hard to use?

hackyhacky 4 months ago

That wouldn't work because there would be nothing stopping you from re-using a value representing an old state.
- ridiculous_fish 4 months ago
  
  That's exactly what affine / linear types do.
paulddraper 4 months ago

The innovation is making that have zero runtime cost. (Though to be fair, I doubt the runtime cost is really significant...)
- skavi 4 months ago
  
  Their suggestion is also zero runtime cost.
- jibal 4 months ago
  
  That's very odd response if you know what a type system is.

russdill 4 months ago

This is a based on such a surface level understanding of one type of posix socket. Calling close twice on a socket is a normal allowed thing, particularly for non blocking sockets. Datagram sockets can be operated with bind, without bind, with connect and bind and with both called multiple times.

comex 4 months ago

Some of what you said is true, but you definitely can’t call close multiple times on the same file descriptor. close always immediately drops the file descriptor and isn’t like non-blocking socket operations that you have to try repeatedly until they succeed. You could, however, create multiple file descriptors pointing to the same socket with dup or other methods, in which case you’d need to close all of them to disconnect the socket.
- russdill 4 months ago
  
  Bah, I was thinking of shutdown

diowldxiks 4 months ago

Does this work in the face of state changing out from under the socket? I'm not super familiar with low level socket details but I'm thinking something like connect returning EINPROGRESS and you not knowing if the connection has completed. It may complete, it may fail, but during that time this state machine is invalid I think. It seems like strict logical programming like this gets much harder in the face of mutable state changing out from under the program, but that can probably be worked around with enough effort.

ngrislainOP 4 months ago

You are right, what I wrote is more of a PoC. It's valid for blocking sockets on the happy path.

LtdJorge 4 months ago

One thing I don’t understand is the table at the end. It says Rust requires 30 lines of code for a type state pattern, and I’m sure that Rust will be more verbose than Lean here, but wouldn’t all the typing shown in the article count as lines for Lean? I don’t see how it’s 0.

khaledh 4 months ago

Interesting take on enforcing state machine rules using a proof system. I'm interested in this space, and have been developing a new programming language to enable typestate / state-machine representation at the type system level[0].

I don't know where it will end up on the spectrum of systems languages; it may end up being too niche or incomplete, but so far I think I'm scratching the right itch, at least for myself.

[0] https://github.com/khaledh/machina

wk_end 4 months ago

Lean doesn’t have any kind of substructural typing, does it? At a glance it looks like you need to manually (lexically) rebind the socket at each step in the operation, and there’s nothing stopping you from holding onto a socket in a now-invalid state and making mess of things, right?

Also, boo AI slop. If you’re going to use AI to help write your technical blog posts please please please edit out all the “No X. No Y. Just pure Z.” marketing-speak.

tczMUFlmoNk 4 months ago

This is what I was thinking, too. Without some kind of linearity, `connect` et al. don't give the claimed guarantees if you can just reuse the old socket handle. Especially if it's aliased in a list or something. I was surprised to see this not mentioned at all in the section specifically dedicated to double-close prevention.
Likewise, with implicit weakening, nothing stops you from dropping the socket without closing it.
ngrislainOP 4 months ago

Yes the user has to be cooperative somehow. You could emulate linear/affine types like features with indexed monads though.

12_throw_away 4 months ago

I'm like 3 sentences in and already things do not quite make sense.

> Calling [socket] operations in the wrong order [...] is undefined behaviour in C.

UB? For using a socket incorrectly? You sure about that?

> Documentation — trust the programmer to read the man page (C, Rust).

I'm sorry, are they saying that rust's socket interface is unsound? Looks to me like it's a pretty standard Rust-style safe interface [1], what am I missing?

[1] https://doc.rust-lang.org/std/net/struct.TcpListener.html

tom_ 4 months ago

The C standard doesn't have anything to say about sockets at all, so not like it's defined to use them even in the right order.
WhyNotHugo 4 months ago

They say “undefined behaviour”. They mean “returns an error”, or “can return an error”.

strongly-typed 4 months ago

Feels like we're living in parallel universes. You're building Hale, and I'm building Abs XD : https://github.com/rdavison/abs

Settings

Zero-Cost POSIX Compliance: Encoding the Socket State Machine in Lean's Types

Keyboard Shortcuts