Settings

Theme

The Type of Sprintf

ryanbrewer.dev

13 points by verdagon 2 years ago · 23 comments

Reader

trealira 2 years ago

Finding out about how, in a programming language with dependent types, types can be returned from functions like values, feels about the same way as when I learned functions can be returned as first-class values in functional programming. It went from "wait, what?" to "of course, why wouldn't you be allowed to do that?" From seeming confusing to seeming perfectly natural, leaving you more confused of its abscence than its presence. Though admittedly, I don't have much experience with using dependent types seriously.

This property of dependent types:

> In fact, if you don't mind writing a proof or two, our sprintf function can be called on format strings that aren't fully known at compile-time! You only need to be able to prove that the string will have certain format specifiers, and then calling sprintf can typecheck.

It's not new to me, but I also find it pretty cool. Proving things at compile-time that certain runtime conditions will hold using a programming language is so much cooler than just proving it to yourself in your head or on paper, and then doing a runtime assertion in the actual code. Or, not doing a check at all, and just having crashes or security vulnerabilities if you're mistaken.

_nalply 2 years ago

Sometimes I dream about a Rust-like programming language with optional garbage collection similar to Swift that runs in three stages: The first one is an idempotent interpreter so heavily sandboxed that it can only read the source code and output only diagnostics and the second stage: a more static programming language. Meta-programming!

This would allow simplifying the second stage massively because it is generated code and won't need ergonomy. This programming language would parse the sprintf format string exactly as if the compiled program would do it at runtime. When I read the article I thought to myself, aha, dependent types, could it be that it is similar to a user-provided program that computes the types?

I have to disclaim that designing programming languages is not my expertise. I am only an amateur.

These two stages avoid the complexity of Rust compile-time handling. Instead of hard coding things like lifetimes, generics, const generics, return position impl trait in trait (RPITIT), async fn in trait (AFIT), and so on, have a first stage standard library handling all these things.

And the third stage? A sandboxed interpreter with the same or similar syntax as the rest of the thing. I think this would be useful as glue code or just for tiny throw-away scripts. All these three stages would profit from a common base, so they should be provided together.

Makes sense?

anonymousiam 2 years ago

I am at a loss to understand this. I learned C in 1989 and sprintf takes a minimum of two arguments, but typically takes three or more, the first one being the string for the output to be written to. Perhaps the author was writing about printf, which takes a minimum of one argument, but typically takes two or more?

  • hoping1 2 years ago

    Author here. I originally wrote this about printf, but changed it because I wanted to stay away from any discussion of side effects. The rewrite wasn't perfect! In my head, a printf that returns the new string instead of printing it is called "sprintf," and I know others have this impression in a culture that is more general than just C. But you're absolutely right!

  • trealira 2 years ago

    Yes, I was also confused at first.

    The closest C function to this one is asprintf, because this function gives you back a new, separate string, which you may or may not decide to print. Just imagine if asprintf simply returned a char* pointing to an allocation by malloc, instead of taking an extra char ** as an out parameter.

      char *my_sprintf(const char *, ...);
  • jznsis 2 years ago

    This is just a choice of having to provide an output buffer to the function instead of returning a new buffer since it's easier to inject custom allocator behaviour that way

    • jwilk 2 years ago

      There's no return value here:

        void my_sprintf(char* format, ???) {
            sprintf(format, ???);
        }
      • hoping1 2 years ago

        Author here, thanks for pointing that out! It was a mistake on my part (: I originally wrote this about printf, but decided I should include the section where I implement it, and I decided I didn't want to say anything about side effects, so I changed everything to a version that just returns the new string. In my head this is called "sprintf" (and the notes I based this on, linked in the post, did the same thing) though I can see now how this was confusing to people. Especially since I didn't rewrite it perfectly!

    • anonymousiam 2 years ago

      Can you elaborate please? Your description does not match what I see. My understanding is that this is a way to more strongly type function arguments, as an alternative to va_arg.

      • jznsis 2 years ago

        Yes, but the final result is out_str, not a forward to console, isn't it?

        Either way, the format string discussion for the printf familiy of functions is very similar between them

jznsis 2 years ago

Nice writeup! It would be helpful to seen an actual invocation of the new sprintf function

I know it's supposed to be pseudo code and most likely it's something like sprintf "% world %d" "hello" 42 but I got a little confused by the type list being backwards from what I expected

  • trealira 2 years ago

    I'm confused why it's said to be backwards (in the article and by you).

    This paragraph shows that with a format string with an int format specifer, and then a string format specifier, the token list shows is exactly that, and the type it returns is a function that takes an Int and a String as input and returns a new string.

    > For example, if the format string was "%d %s", the tokenization would be [IntFormat, NormalChar ' ', StrFormat], so the resulting type would be Int -> String -> String`. This might be confusing because it's a post-order traversal of the list, and might feel kind of backwards.

    Isn't that exactly the order you'd expect?

    I'm confused why it's called backwards, and why they say "post-order" traversal of a list, since I'm only used to hearing about traversing trees that way.

    • hoping1 2 years ago

      Author here. It's a linked list, which is a tree, so post-order here means "recurse, then operate on the result" as opposed to "operate on something and then recurse on the result." So `f(head, r(tail))` instead of `r(f(head, tail))`. This post-order-ness can make it appear strange because first we go to the end of the list and produce the type String, then we process the second-to-last to get, say, the type Int->String, and so on until the first element, adding something to the front each time instead of to the back. The final result still has the parameters listed in the right order, as you mentioned, it's just that the order of operations to get there feels weird, where we're prepending to the return type instead of appending to the first parameter type. Sorry that was poorly written!

    • jznsis 2 years ago

      I'm not that familiar with syntax of functional programming languages.

      Isn't Int -> String -> String a shorthand for Int -> (String -> String)?

      Thus I would have assumed the first argument to the result of sprintf "%s %d" would be an integer, then a string and the final result would be a string.

      But the invocation would be sprintf "%s %d" "hello" 42, wouldn't it?

      • trealira 2 years ago

        > Isn't Int -> String -> String a shorthand for Int -> (String -> String)?

        Yes. Though I find it more convenient think of it as a function with multiple arguments and not a function that returns another function.

        > Thus I would have assumed the first argument to the result of sprintf "%s %d" would be an integer, then a string and the final result would be a string.

        Why assume that? (Not rhetorical.) Wouldn't that be like if in C, you had to write the arguments in reverse order of the format string, like this?

          printf("%s %d!\n", 5, "hi");
        • jznsis 2 years ago

          I seem to be missing something, thus thank you for taking your time

          > Wouldn't that be like if in C, you had to write the arguments in reverse order of the format string, like this?

          Exactly and that's not what we want

          Let's go through this step by step:

          sprintf "%s %d" returns a function, taking a string and returning a function taking an integer and returning a string, right?

          sprintf "%s %d" "hello" returns a function taking an integer, returning a string, right?

          sprintf "%s %d" "hello" 42 returns a string, right?

          Thus my intuition of the return type of sprintf "%s %d" would be string -> (int -> string) not int -> (string -> string)

          • hibernator149 2 years ago

            Your intuition is correct. I think you accidentally swapped %s and %d in one of your previous posts, leading to the confusion.

            The "backwards" in the article is referring to the way the list is processed by the recursive function. The list and the resulting type are in the correct/intuitive order. I also found this sentence confusing.

            By the way, thinking of a function of type a->b->c as a->(b->c) is technically correct and useful in certain cases, for example, when you are currying. But in most cases you can just read this as a function taking two arguments of types a and b, returning something of type c.

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection