The new Gödel Prize winner tastes great and is less filling
blog.computationalcomplexity.org
108 points by baruchel a day ago
I feel like this blogpost isn't filling either
What exactly was the awarded paper about? What does "extractor" and "resilient function" mean?
Why did the discussion shift towards Ramsey theory? Why waste half of the post arguing about something already discussed in linked previous blogposts?
Wow, crazy randomly seeing my Algo prof at the top of HN winning this! Congrats Eshan!!
Eshan's Bachelor's thesis advisor from IIT Kanpur, Prof Manindra Agrawal, also won the Gödel prize in 2006. Wow.
This just generally reminded me about STOC. Need to read the last few years of the proceedings. I totally forgot about it. I like that 2026 will be held in the US. Some very fun to read papers at STOC. Even if they seem narrowly technical.
If I give you a biased coin can you simulate a truly random coin flip with it? The answer turns out to be yes. Flip the biased coin twice: HT = heads, TH = tails, and HH/TT = flip twice again.
The general study of such things is called “randomness extractors”. The new Gödel prize goes to this paper which shows how to extract a nearly perfect random bit out of two sources with low min-entropy.
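The von Neumann trick described above is short enough to sketch directly. This is a minimal simulation (the function names and the 90% bias are mine, chosen for illustration):

```python
import random

def von_neumann(biased_flip):
    """Turn a biased coin-flip function into a fair one.

    biased_flip() returns 'H' or 'T' with some fixed, unknown bias.
    Flip twice: HT -> heads, TH -> tails, HH or TT -> try again.
    P(HT) = P(TH) = p*(1-p), so the two outcomes are equally likely.
    """
    while True:
        a, b = biased_flip(), biased_flip()
        if a != b:
            return a  # 'H' came from HT, 'T' from TH

# An example coin that lands heads 90% of the time.
def unfair():
    return 'H' if random.random() < 0.9 else 'T'

flips = [von_neumann(unfair) for _ in range(100_000)]
print(flips.count('H') / len(flips))  # ≈ 0.5
```

Note that the `while True` loop is exactly the point being debated downthread: termination is only probabilistic, though the failure probability shrinks geometrically with each pair of flips.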
Yes, but - you need to replace "twice" there with "an unbounded number of times". If you apply this in an environment where the biased coin is coming from an external source, your system becomes susceptible to DoS attacks.
While I obviously think randomness extractors over adversarial sources are very interesting, I think talking about them specifically in this example complicates the point I'm trying to make, which is that it's incredible it can be done at all.
Note that adversarial is kind of a red herring, not sure why they mentioned that. The number of flips is unbounded regardless. Which is why it's not really incredible that it can be done: it can't, not as the problem was originally stated. What can be done is solving a different (but useful) problem than the one originally posed.
I realize this sounds like a minor detail to someone who finds this cool (and so do I), but I don't think it is. It's kind of frustrating to be told that your intuition is wrong by someone smarter than you, when your intuition is actually correct and the experts are moving the goalposts. IMO, it makes people lose respect for experts/authority.
So, the problem in its original framing is: can we simulate a fair coin flip with an unfair coin? As stated, I do actually think the von Neumann answer ("this is actually technically possible") is fair, in that if I wanted a solution in O(1), I think I should have to say so ahead of time.
I suppose we'll have to disagree about whether this is incredible. The response shows (1) that this can be done at all, and (2) that the probability of not having an answer decays exponentially, not just asymptotically, but at every finite n. Incredible! You don't see finite-decay bounds very often! If you don't think that's incredible I invite you to ask a room full of people, even with the qualifications you deem appropriate, e.g., "solution does not need to be constant-time", or whatever.
What do you mean by "exponentially, not asymptotically, but for finite n"? Exponential is by definition asymptotic and continues infinitely, no?
And to be clear, I'm not disagreeing (or agreeing) with the result being inherently incredible. I'm just saying it's not an incredible example of simulating a fair coin, because it just... isn't doing that. As an analogy: communicating with the Voyager spacecraft might be incredible, but it's not an incredible example of infinitely fast communication... because it just isn't. Telling me to go ask a room full of people whether they find either of these incredible is missing the point.
> What do you mean by "exponentially, not asymptotically, but for finite n"? Exponential is by definition asymptotic and continues infinitely, no?
In statistics, we generally separate asymptotic bounds—which usually make guarantees only asymptotically—from finite-decay bounds like Chernoff, which decay exponentially not only in the limit, but also at each particular finite n. The second is much rarer and much more powerful.
This is an important distinction here: we are not talking about an asymptotic limit of coin flips. Each and every round of coin flips reduces the probability of not having an answer exponentially.
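To make the per-round decay concrete: each round is two flips, and a round produces no output only on HH or TT, which has probability p² + (1-p)². A quick sketch (the function name is mine):

```python
def p_no_answer(p, rounds):
    """Probability the von Neumann trick has produced no output
    after `rounds` pairs of flips of a coin with P(heads) = p."""
    per_round = p * p + (1 - p) * (1 - p)  # P(HH or TT) in one round
    return per_round ** rounds

# Fair coin: each round fails with probability exactly 1/2.
print(p_no_answer(0.5, 10))   # 0.5 ** 10, about 0.001
# Heavily biased coin: each round fails with probability 0.82,
# but the failure probability still shrinks geometrically.
print(p_no_answer(0.9, 100))  # well below 1e-8
```

This is the finite-n guarantee in the comment above: the bound holds at every round count, not just in some limit.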
> I'm just saying it's not an incredible example of simulating a fair coin, because it just... isn't doing that. As an analogy: communicating with the Voyager spacecraft might be incredible, but it's not an incredible example of infinitely fast communication... because it just isn't.
Ok, no problem, it's up to you to decide if you want to use your own definition here.
But, FYI, in computability theory, it is definitely fair game and very common to "simulate" some computation with something that is either much more memory-intensive or compute-intensive, in either the deterministic or probabilistic case. For example, it's pretty common to add or remove memory and see whether the new "simulated" routine takes more or less compute power than the "standard" algorithm, and that is kind of what people are up to with the PSPACE stuff that is going on right now.
Using that lens—and this is entirely off the cuff, but in the 10 seconds of thinking, I believe probably mostly correct—this algorithm "simulates" a fair coin toss by using 1 extra bit of memory and O(p) compute, where p is the reciprocal of |0.5-&lt;coin's bias&gt;|. You can choose p to be infinite but you can do that for a sorting algorithm too (and that is why both sorting and this algorithm have DoS implications). Is this the best we can do? Well, that's the problem we're studying with randomness extractors: given this imperfect source of randomness, can we extract perfect randomness out of it at all?
> In statistics, we generally separate asymptotic bounds—which usually make guarantees only asymptotically—from finite-decay bounds like Chernoff, which decay exponentially not only in the limit, but also at each particular finite n. The second is much rarer and much more powerful.
I don't know if you're using different terminology than I understand here, but I'm reading "exponential in the limit" to mean "eventually exponential", or in other words "there is some finite n_0 where for n > n_0 the decay becomes exponential"? In which case it basically means that the first n_0 tosses would have to be discarded? It's nice that that's not the case, I guess. Somehow I don't find myself blown away by this, but perhaps I'm misunderstanding what it means to be "exponential but only in the limit but not for finite n".
> in computability theory, it is definitely fair game and very common to "simulate" some computation with something that is either much more memory-intensive or compute-intensive, in either the deterministic or probabilistic case.
"Much" more is one thing, "arbitrarily more if you're unlucky" is another. I'm not an expert in computability theory by any means, but whenever I've heard of simulation in that context, it's been with a hard bound (whether constant, polynomial, or whatever). I've never heard it called simulation in a context where the simulator can take arbitrarily long to simulate step #2. Even if this is normal to those in the field, I don't think the average person thinks of it this way -- I very much think most people would want their money back if they were sold a "simulator" that took arbitrarily long to get to step #2.
What is our goal here? I'm willing to continue this discussion but after answering these questions and seeing your responses elsewhere it does not seem like your understanding is improving (e.g., still misunderstanding that the algorithm does not take "arbitrary" time), so I'm not sure you're getting the value here... If you want to continue maybe we should try you asking specific questions instead.
The difference is deterministic termination:
There is no N such that your algorithm is guaranteed to terminate before N flips — even for a single bit. The complexity class in the worst case (100% T) is infinite; it’s only the average cases that have something reasonable.
To borrow your phrasing:
If you only have a stochastic algorithm, I think you should have to say so.
Kind of, kind of not. I think a lot of theory people would argue that it is effectively guaranteed to terminate, for any reasonable definition of "guaranteed."
As I mention elsewhere, for a coin where P(H)=0.75, with 600 flips, the probability of not having received an answer is less than 1 divided by the number of atoms in the universe. While this is not 0, neither is the probability of the atoms of the coin lining up just right that it falls through the table, producing neither H nor T. Theoretically and practically we do generally consider these kinds of things immaterial to the problem at hand, perhaps papering over them with terms-of-art like "with high/extremely high/arbitrarily high probability". They all mean roughly the same thing: this algorithm is "essentially guaranteed" to terminate after n flips.
You can relax the constraints to bound the cost.
If I asked someone gambling whether it matters that this coin is biased 0.0001%, they would probably say no. And just like that, the algorithm is now guaranteed to terminate.
It's a bit unfair to talk about the case (100% T) because in that case you don't have a source of randomness anymore, we've dropped the assumption that you have a source of randomness, and as expected you can't make one from nothing.
You have a source of randomness — you just got exceptionally unlucky. A truly random source can produce the same value indefinitely.
My point is that you have stochastic termination, but not guaranteed termination. Those are different things.
A random source can produce the same value indefinitely, but it cannot produce the same value forever. It is impossible in the sense that the probability that it will happen is zero. That is, even the unbiased version of the algorithm will terminate with 100% probability.
> My point is that you have stochastic termination, but not guaranteed termination. Those are different things.
You missed their suggestion about that.
If you rephrase the algorithm as one that almost almost almost almost almost completely eliminates bias, you can guarantee termination by giving up after a certain number of repetitions.
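That bounded rephrasing is easy to write down. A sketch under my own assumptions (the cutoff of 64 rounds and the fallback-to-raw-flip behavior are illustrative choices, not from the thread):

```python
import random

def nearly_fair_flip(biased_flip, max_rounds=64):
    """Von Neumann trick with guaranteed termination: give up after
    max_rounds failed pairs and fall back to one raw (biased) flip.
    The residual bias is bounded by the chance that every round
    fails, (p^2 + (1-p)^2) ** max_rounds, which is astronomically
    small unless the coin is extremely biased."""
    for _ in range(max_rounds):
        a, b = biased_flip(), biased_flip()
        if a != b:
            return a
    return biased_flip()  # the "give up" branch

def unfair():
    return 'H' if random.random() < 0.75 else 'T'

results = [nearly_fair_flip(unfair) for _ in range(50_000)]
print(results.count('H') / len(results))  # ≈ 0.5
```

For P(H)=0.75, the fallback branch fires with probability 0.625^64, roughly 10^-13, so in practice the output is indistinguishable from fair while the flip count is hard-capped at 129.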
Sure — but now we’re back in the case the person above me was calling out and I was emphasizing by pointing out the stochastic nature: you moved the goal posts.
From their comment:
> Which is why it's not really incredible that it can be done: it can't, not as the problem was originally stated. What can be done is solving a different (but useful) problem than the one originally posed.
You can have an arbitrarily small bias, at the cost of increased runtime — but you can’t have a zero bias algorithm that always terminates. You have to move the goalposts (allowing some bias or exceedingly rare cases to not terminate).
I’m not sure why people have such a hard time admitting that.
I don't think the termination thing is moving the goalposts.
If you have infinite time, then you can wait for it to take more than a few hundred iterations.
If you don't have infinite time, it won't take more than a few hundred iterations.
Also "constant time" wasn't even part of the original promise! If a goalpost was moved, it was by the person that decided on that requirement after the fact.
> If you have infinite time, then you can wait for it to take more than a few hundred iterations.
To be clear, you're making this argument while also arguing "this wouldn't happen in the real world". [1] You can't have it both ways.
> Also "constant time" wasn't even part of the original promise!
We're not even asking for "constant time". We're literally only asking for "will finish in any bounded amount of time". Even an exponential time bound would've been better than unbounded!
A simulator that can't even promise to get to step #2 of the simulation in any bounded amount of time very much needs a giant proactive asterisk on its labeling, because that's not what people understand to be a simulator. Again, if you sold someone that in such a manner that obscured this fact, they would absolutely be justified in wanting their money back.
> You can't have it both ways.
I addressed that in my next sentence! The whole point of making it two sentences was to split up the real world and not real world cases. Come on.
> We're not even asking for "constant time". We're literally only asking for "will finish in any bounded amount of time". Even an exponential time bound would've been better than unbounded!
N is 1. Every bound is a constant bound.
> if you sold someone that in such a manner that obscured this fact, they would absolutely be justified in wanting their money back.
They would not be entitled to a penny back because it's impossible for them to hit the failure case.
Even if they could hit it, nobody is getting a refund for "it crashes once per googolplex runs".
> I addressed that in my next sentence! The whole point of making it two sentences was to split up the real world and not real world cases. Come on.
Your next sentence was just obviously flat-out wrong though. Not just because who says that's the case (maybe I don't have that much time?) but because it's trivial to find P(H) that makes it false for any duration of time. "A few hundred iterations" literally doesn't guarantee anything unless you make unstated assumptions about the biases your device works for.
I don't get why we're going in circles here. It seems we've hashed everything out.
> N is 1. Every bound is a constant bound.
Kind of a meaningless statement when you don't even say what your N is. I can imagine lots of N where that's not the case. But whatever you want to call it, my point entirely stands.
> They would not be entitled to a penny back because it's impossible for them to hit the failure case.
No, it is very possible. All they need to be given is a coin whose bias they don't know beforehand, whose bias is unfortunate. Or a coin whose bias they do know to be much worse than whatever you imagined a few hundred tosses would be enough for.
> because it's trivial to find P(H) that makes it false
As I already said in the comments you read, it's proportional to the bias. A few hundred iterations multiplied by the ratio between heads and tails (or vice versa) will never be exceeded.
> when you don't even say what your N is
Number of outputs?
Look, if you want to invoke concepts like exponential time then you tell me what N is. Exponential of what?
> All they need to be given is a coin whose bias they don't know beforehand
See first answer.
If someone expects the bias to not matter, even a one in a million coin, they're the one with the problem, not the algorithm.
If they accept the bias affects speed, things are fine.
> the worst case (100% T)
You're no longer randomly flipping a coin to get heads or tails at that point. There's a good argument that this is not within the original scenario.
Yes you are — you just have exceptionally bad luck to generate that sequence.
Oh, I thought you were saying the coin bias was 100%. I misread how you were using complexity class.
Still, when a worst case is physically impossible I don't think it needs a mandatory disclaimer.
All infinite sequences are physically impossible, but they’re the basis of asymptotics; arbitrary failure and non-production of entropy are still physically possible.
Note that the solution dataflow objects to already operates in constant time on an average-case basis. There isn't room to make a complexity improvement.
I have the exact opposite reaction, that if someone told me the answer is "no" because it requires an unbounded number of coin flips that they were the ones trying to bullshit me. In antic's formulation, nothing is said about requiring a bounded number of flips.
"Simulate a truly random coin" implies it IMO. You're not simulating a truly random coin if you need unbounded time for a single flip. The truly random coin definitely doesn't need that. It just feels like a scam if someone sold me such a machine with that description - I'd want my money back. I don't expect everyone would feel the same, but I think a lot of people would.
I'm not sure we're on the same page about what this result practically means, so let me re-state it a few different ways, so that people can draw their own conclusions:
* The von Neumann approach will "appear to be O(1) (constant-time)" for any particular biased coin, but that constant might be big if the coin is very, VERY biased in one direction.
* How can this be true? Every flip reduces the probability you do not have an answer exponentially. The "concentration" around the mean is very sharp—e.g., at 275 coin tosses for P(H)=0.5 (the fair case), the probability of not having an answer is smaller than 1 divided by the number of atoms in the known universe. It is technically possible, but I think most people would say that it's "effectively constant time" in the sense that we'd expect the coin to phase-shift through the desk before flipping it 275 times in a row and not getting an answer. So it takes 275 flips, it's "constant" time! Interpret it how you like though.
* As you make the coin more and more biased, that "horizon" increases linearly, in that 0.99^1,000 is approximately the same thing as 0.999^10,000. So, an order of magnitude increase in probability requires roughly an order of magnitude increase in the number of flips. This is why it's not useful for the adversarial case, and why adversarial extractors are held apart from normal extractors.
Whether this is a "give me my money back" type thing is for you to decide. I think for most people the claim that you can simulate a fair coin from a biased coin in, effectively, O(1), and that the constant grows only linearly with the bias, is plainly incredible. :)
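The linear-scaling claim in the last bullet is easy to sanity-check numerically (a two-line sketch, nothing here beyond the numbers quoted above):

```python
# An order-of-magnitude increase in bias should cost roughly an
# order-of-magnitude increase in flips for the same residual
# "no answer yet" probability: 0.99^1,000 vs 0.999^10,000.
a = 0.99 ** 1_000     # biased coin, 1,000 rounds
b = 0.999 ** 10_000   # ten times more biased, 10,000 rounds
print(a, b)  # both ≈ 4.3e-5 to 4.5e-5, within a few percent
```

Both are approximately e^-10, since ln(0.99) ≈ -0.01 and ln(0.999) ≈ -0.001, which is exactly why the horizon scales linearly in the bias.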
You don't need unbounded time for a single flip, that's all in your imagination. The worst-case time is unbounded, but you can't achieve the worst case.
It's very easy to achieve if someone hands you a "coin" that is made such that it never lands on tails.
Sorry for still being in the adversarial mindset, but this means that you essentially have to hardcode a maximum number of same-side flips after which you stop trusting the coin.
That's not a situation of "can't be done", that's just a consequence of casually describing the problem instead of exhaustively specifying it.
Yes the bias has to be finite / the entropy can't be zero, and the slowdown is related to how big the bias is / how low the entropy is.
> You don't need unbounded time for a single flip, that's all in your imagination. The worst-case time is unbounded, but you can't achieve the worst case.
There literally isn't a bound, it can be arbitrarily large. This isn't just in my head, it's a fact. If it's bounded in your mind then what is the bound?
It's not a fact of the real world, at least. "You can't achieve it" is true. A pretty small number of failures and you're looking at a trillion years to make it happen. And if you buffer some flips that number gets even smaller.
> A pretty small number of failures and you're looking at a trillion years to make it happen.
This depends on the bias of the original coin. P(H) can be arbitrarily close to 1, making HH the likeliest outcome of every round even over a trillion years. "This wouldn't happen in the real world" would be a sorry excuse for the deliberate refusal to clearly state the problem assumptions upfront.
IMO, if you really want to pleasantly surprise people, you need to be forthcoming and honest with them at the beginning about all your assumptions. There's really no good excuse to obfuscate the question and then move the goalposts when they (very predictably) fall into your trap.
> This depends on the bias of the original coin. P(H) can be arbitrarily close to 1
> There's really no good excuse to obfuscate the question and then move the goalposts when they (very predictably) fall into your trap.
Interesting. Because I see the guy pulling out the one-in-a-million coin and expecting it to run at a similar speed to be doing a gotcha on purpose, not falling into a trap and having the goalposts moved.
And I think "well if it's a million times less likely to give me a heads, then it takes a million times as many flips, but it's just as reliable" is an answer that preserves the impressiveness and the goalposts.
It's fast relative to the bias. Which seems like plenty to me when the original claim never even said it was fast.
(And if the coin never gives you a heads then I'd say it no longer qualifies as randomly flipping a coin.)
i read "flip twice" as recursion, so, given we're talking randomness, yes, that could go on forever. but i don't think you really need to replace "twice."
Speaking from complete ignorance, with apologies to those whom this will annoy:
I'm sure it's possible to make a coin with what one might term "complex bias" where the bias extends over two events (thick gloop inside the coin, possibly heat activated or non-Newtonian).
This method sounds like the bias needs to be fixed ("simple bias" if you like)?
I guess that's just out of scope here.
Aside: There's a strong 'vibe' that HHHH HHHH with a single coin is somehow less random than HTHTHTHT HTHTHTHT with two coins when discarding HH and TT. I wonder what the origin of that feeling is - maybe just HT doesn't seem like it's a constant result simply due to nomenclature? If we call HT "type-A", then we just have HHHH HHHH vs. AAAA AAAA; and I _feel_ happier about the result!
Could the vibe be due to the fact that HHHH… seems like the coin could not just be biased - it could be completely broken and come up heads every time. There are two distinct possibilities here:
1) the coin is broken and is always H
2) the coin is random or possibly biased, and you got a string of H by chance
And observing the string of H… increases the probability of 1) in the Bayesian sense.
With the two coins you eliminate this possibility altogether - a broken coin can never produce HT or TH.
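That Bayesian updating can be made concrete. A small sketch with an illustrative prior of my choosing (1% that the coin is broken and always lands H, 99% that it is fair):

```python
# Posterior probability the coin is broken (always heads) after
# observing n heads in a row, by Bayes' rule.
def p_broken(n, prior=0.01):
    like_broken = 1.0      # a broken coin produces n heads for sure
    like_fair = 0.5 ** n   # a fair coin does so with probability 2^-n
    num = prior * like_broken
    return num / (num + (1 - prior) * like_fair)

print(p_broken(1))   # barely moved from the 1% prior
print(p_broken(10))  # already above 90%
```

And as the comment says, a single HT or TH observation sets the broken-coin likelihood to zero, collapsing the posterior back to "not broken" outright.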
It’s tough. You take Math 55, you’re a smart kid, you learn all this math, and it opens up all these opportunities from VCs in the real world for you, so long as it has to do with payments.