Large single compilation-unit C programs (2006)

people.csail.mit.edu

50 points by Sadkov 5 years ago · 22 comments

lou1306 5 years ago

May be relevant to the discussion: SQLite is also compiled from a single, 220k LOC C file called "the amalgamation".

https://www.sqlite.org/amalgamation.html

  • dhekir 5 years ago

    I also found this "amalgamate" script on GitHub, intended to allow creating such amalgamations from C/C++ projects:

    https://github.com/rindeal/Amalgamate

    It seems interesting; however, when I tried the FreeType example, there seemed to be a preprocessing issue: some function definitions are conditionally excluded even though they are called later. I didn't have time to find out whether this was an issue in the original code or whether the amalgamation script introduced it.

    In any case, such single-C programs are very useful for quickly testing tools, so having more of them would be great.

    • blacksqr 5 years ago

      I'm not a C programmer, but I have heard of amalgamation, and I wonder why a standard workflow to create a single compilation unit from multiple source files isn't more straightforward.

      • TylerE 5 years ago

        Because C is stuck in the dark ages.

        Textual preprocessors are evil.

    • beached_whale 5 years ago

      That took a few minutes to get working on newer macOS.

klelatti 5 years ago

I've been working on a project that auto-generates C programs, sometimes up to 1.5m lines of code, in a single file (actually two files, but the second is only 35 lines).

Not open source but happy to share benchmarks if that would be useful.

  • dhekir 5 years ago

    Too bad it's not open source, but will some of the generated programs be?

    Also, would you mind comparing it to Csmith (https://embed.cs.utah.edu/csmith/)?

    • klelatti 5 years ago

      There is quite a lot of IP in the generated programs so probably not possible to share sadly.

      I wasn't aware of Csmith, so thanks for highlighting it. My C code doesn't really test many features of the compiler, so I suspect it is mainly of interest for seeing just how the compiler handles a really large single file.

  • klelatti 5 years ago

    Some compile times for those interested:

    Hardware: 2016 12" MacBook (1.1GHz Core m3). OS: Ubuntu 20.04 running in Docker. Compiler: Clang 9 at -O0 (more optimisation increases the compile times a lot!)

    LOC      Size     Compile time
    0.53m    41MB     34s
    0.99m    76MB     91s
    1.44m    110MB    167s

    I suspect the code is relatively straightforward to compile - few function calls etc.

  • pulse7 5 years ago

    Please share the benchmarks...

enriquto 5 years ago

I like to code this way. You just include "foo.c" instead of "foo.h", which does not exist at all. The compilation is really simple, and there are half as many files!

Sadkov (OP) 5 years ago

1283 = continue, 1432 = license, 1766 = gnu

So for every loop continue statement there is a GPL license text :D

  • dvfjsdhgfv 5 years ago

    I know it's half serious, but it's simply not true, in the same way that grepping for "Stallman" in the leaked Windows source code turned up only false positives (nobody actually mentioned RMS there). In this case, some headers contain multiple occurrences of "GNU". Then there are several #ifdef macros like "__GNU_LIBRARY__" or "__GNUC__", and e-mail addresses of people in the gnu.org domain.

    In practice, it doesn't matter at all as the preprocessor replaces all license headers with a single space even before the compiler has the chance to look at it.

    • colejohnson66 5 years ago

      The preprocessor removes the comments? I thought that was the compiler?

    • gridlockd 5 years ago

      The preprocessor doesn't remove comments and at least in clang, comments are parsed into the AST.

      • dvfjsdhgfv 5 years ago

        Well, according to C99, it should. Section "5.1.1.2 Translation phases" says (in phase 3): "Each comment is replaced by one space character."

        Edit: just checked and clang behaves just like gcc with -E. Maybe you didn't mean comments but preprocessor directives?

        • tom_mellior 5 years ago

          Clang, like GCC, has -C and -CC flags to preserve comments during preprocessing. However, these are really flags for the underlying preprocessor. Your parent might be thinking of some application of the Clang frontend that does not run preprocessing. For example, clang-format will probably not want to preprocess the code nor strip out comments.
