Last Updated: 2025-10-09
Binary solo!
zero zero zero zero zero zero one
zero zero zero zero zero zero one one
zero zero zero zero zero zero one one one
zero zero zero zero one one one one!
keywords: numbers, conversion, digits, floats, binary, decimal
Introduction
The strtod function in the C standard library seems innocent enough: take a decimal representation of a number, such as "10.35" and return the nearest floating-point binary number (the double C type) [note 1].
With this, we can write numbers in a human-friendly way and the computer will convert them to a computer-friendly representation for internal use.
A novice programmer could write a routine that gets this conversion mostly correct, most of the time, and it would not be too hard. It's the sort of task you might get for a homework assignment in a college programming class. But to make it 100% correct, 100% of the time is actually pretty difficult — I was surprised how deep the rabbit hole goes.
The strtod Interface
The first hints of the depth involved can be seen in the strtod function prototype, which defines its interface:
double strtod(const char *beginptr, char **endptr);
It seems innocuous, but notice that there is no limit to the size of the input string. In theory, this function might need to scan millions of digits, and still produce the correct answer. It is sounding slightly more difficult.
For the simpler case of converting an integer (such as with strtol), there are only a few tricky edge cases. You might have a number like: 12340000000… — this is arbitrarily long, but as more input is consumed, it will quickly become too big to be represented. So, once that happens, the algorithm can terminate with a "too big" error code. Perhaps slightly harder is a case like: 00000…00000123 — this is also arbitrarily long, but there is a real answer at the end of it. Obviously, this only happens with leading zeroes; it is still not especially complex to handle properly.
Tricky inputs
Of course, strtod has to handle all the integers, but also quite a bit more.
Given the freedom that arbitrarily-long input allows, combined with the allowable syntax for floating-point numbers, we can come up with more tricky cases:
- There are many ways to say the same thing:
1000=100e1=0.1e4=0.00000000001e14=100000000000e-8, etc. Since the input size is unlimited, we can add as many zeroes in either direction as we please, in this manner. - Accuracy:
strtodwants to give the answer which is as close to the mathematically perfect answer as possible. So, if a trailing 0.0000000001 would push it to a different result, the algorithm must account for it. For numbers that are exactly between two possible answers, that means that an *arbitrarily small* extra amount can change the final value.- You can run this example on godbolt.org:
printf("val: %a\n", 0x1.88888888888888000000000000000000000000001p1023); // insert any number of zeroes here, and ▲ △ // the trailing "1" will still make the △ // difference between two possible results.
- You can run this example on godbolt.org:
- Rounding: There are a variety of rounding choices, and IEEE floats have certain rules in that regard. The
strtodalgorithm must account for rounding.
The Implementation
Considering the above, it becomes clear that a robust strtod implementation must employ some sort of arbitrary-precision arithmetic code.
This is getting into Serious Math™ territory.
Fortunately, some smart people already did the hard work back in the 80s and 90s, when C was young.
When you look at common¹ strtod² implementations³, you'll quickly start seeing the name "David M. Gay" as a core author of the routines. With a little more research, you'll find some papers, such as "Correctly Rounded Binary-Decimal and Decimal-Binary Conversions" by the same person [note 2].
That work, in turn, is based on earlier results such as W. D. Clinger's "How to Read Floating Point Numbers Accurately," and more.
We'll focus on Gay's implementation, which is largely the basis for many versions used today.
He's still around, by the way — and still fixes bugs when they're discovered.
As you look at it, you may notice that the strtod code is ... *pretty intense*.
There's the arbitrary-precision integer type (Bigint), as expected.
The routine can also allocate an unbounded amount of memory(!), in the worst case.
I also notice it doesn't check for out-of-memory (ಠ_ಠ), but oh well.
Aside: I'm not the first person to be surprised by all this memory allocation happening deep inside a "simple" C function like strtod.
Here is a discussion amongst embedded developers, where it uses too much memory on their constrained systems.
There is also an entirely different lineage of strtod implementation, not derived from David Gay's work — for example: LibCL/strtod.c.
In that Sun Microsystems-derived implementation, there is less emphasis on perfect accuracy, and (as far as I can tell) no memory allocation.
Both are valid, since the C standard does not specify the behavior of the function very precisely.
Update: Helpful commenters on HackerNews (link) and Lobsters (link) informed me that there are some additional, more modern implementations to consider, too:
- The Grisu family of algorithms (PDF paper), one implementation being in Google's double-conversion library.
- The Eisel-Lemire algorithm, which handles most cases efficiently, and simply fails in the uncommon cases it cannot handle. It is used in the Go and Rust languages.
- Also from Lemire, the fast_float library, purportedly about 4x the speed of classic
strtod.
The rabbit hole deepens (:
Relevance
I stumbled upon this topic as I was doing research for arbitrary-precision arithmetic, and representing numbers in that way.
In my game, since the player can infinitely zoom, there needs to be a way to track their position with arbitrary precision.
It might not be immediately obvious, but you only need integers to do this.
This is similar to how strtod only needs a "Bigint" type, not anything involving fractions.
If you can zoom forever, numeric precision becomes a problem. This is an early proof-of-concept from (checks notes) over four years ago (⊙▂⊙) That was before I fixed this problem.
In fact, in the game, in order to do infinite pan+zoom, the only arbitrary-precision math needed is integers, addition, subtraction, and bit-shifting (i.e. "moving the decimal point," but in binary). All of those operations are relatively easy to perform. Compare this with something like a fractal zoom application, which needs to (for example) multiply arbitrary-precision numbers frequently. [note 3] That is much more computationally intense, so I am lucky to avoid it. But that is a story for another post!
Take care
– John
Discussion:
Footnotes:
- In truth the C standard makes few guarantees about the quality of the conversion — "nearest" is not required. But in practice, most strtod implementations give pretty strong guarantees about finding the best possible answer in all cases.
- See the original paper (.ps.gz) at archive.org or a PDF version I host here.
- In the description for that video, they say rendering the last frame took about 6 hours — it gets harder to compute the farther you zoom, since the numbers need so much precision.
If you like these articles, feel free to sign up for the mailing list:
We also have some social media links, if you want to follow or contact us.