Nuke the C++ implementation of Zig from orbit using WASI by andrewrk · Pull Request #13560 · ziglang/zig

The idea here is to use a small WASI binary as a stage1 kernel that is committed to source control and therefore can be used to build any commit from source. We provide a minimal WASI interpreter implementation that is built from C source by the system C compiler; it runs the WASI binary, which translates the Zig self-hosted compiler source code into C code. The C code is then compiled & linked, again by the system C compiler, into a stage2 binary. The stage2 binary can then be used repeatedly with `zig build` to build from source from that point on.
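To make the ordering of these steps concrete, here is a small sketch that models the bootstrap chain as data and checks that every stage only uses things that already exist at that point. The artifact labels (`wasi_blob`, `interpreter_exe`, and so on) are made up for illustration and are not the file names used in the branch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    tool: str       # what runs this stage
    inputs: tuple   # artifacts the stage reads
    outputs: tuple  # artifacts the stage produces


# Illustrative labels for what a fresh checkout contains (assumption, not real paths).
COMMITTED = {"wasi_blob", "interpreter_c_source", "zig_compiler_source"}

PIPELINE = [
    Stage("build interpreter", "system C compiler",
          ("interpreter_c_source",), ("interpreter_exe",)),
    Stage("translate compiler to C", "interpreter_exe",
          ("wasi_blob", "zig_compiler_source"), ("compiler_c_source",)),
    Stage("compile and link stage2", "system C compiler",
          ("compiler_c_source",), ("stage2_exe",)),
    Stage("build from source", "stage2_exe",
          ("zig_compiler_source",), ("final_zig_exe",)),
]


def check_pipeline(pipeline, committed, host_tools=("system C compiler",)):
    """Return stage names in order; raise if a stage needs something not yet available."""
    available = set(committed) | set(host_tools)
    order = []
    for stage in pipeline:
        missing = [x for x in (stage.tool, *stage.inputs) if x not in available]
        if missing:
            raise ValueError(f"{stage.name}: missing {missing}")
        available.update(stage.outputs)
        order.append(stage.name)
    return order


if __name__ == "__main__":
    for step in check_pipeline(PIPELINE, COMMITTED):
        print(step)
```

The only host requirement in this model is a C compiler; everything else is either committed or produced by an earlier stage.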

The WASI stage1 blob only needs to be updated when a breaking change or new feature affects the self-hosted compiler when building itself. For example, a bug fix that the self-hosted compiler does not trigger when building itself can be ignored. However, if the bug fix is required for zig to build itself, then the stage1 WASI blob needs to be updated. Similarly, when the language is changed and the compiler wants to use the changes to build itself, the blob needs to be updated.
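The rule above boils down to one question per change: does the compiler rely on it when building itself? A minimal sketch of that decision follows; the `Change` type and its field names are hypothetical and exist only to spell the rule out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    description: str
    # True if the self-hosted compiler depends on this change to build itself.
    needed_for_self_build: bool


def blob_needs_update(change: Change) -> bool:
    """The blob is refreshed only when the compiler's own build depends on the change."""
    return change.needed_for_self_build


if __name__ == "__main__":
    examples = [
        Change("bug fix the compiler never triggers on its own source", False),
        Change("bug fix required for zig to build itself", True),
        Change("language change the compiler source starts using", True),
    ]
    for c in examples:
        print(f"{c.description}: update blob = {blob_needs_update(c)}")
```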

The WASI blob is produced with `zig build update-zig1`, which uses the LLVM backend to produce a ReleaseSmall binary that targets wasm32-wasi with a CPU of generic+bulk_memory. This produces a 2.6 MiB file. It is then optimized with `wasm-opt -Oz --enable-bulk-memory`, bringing the total down to 2.4 MiB. Finally, it is compressed with zstd, bringing the total down to 655 KB. This is offset by the size of the zstd decoder implementation in C; however, it is worth it because the zstd implementation will change rarely if ever, saving a total of 1.8 MiB every time the blob is updated.
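As a quick check of the size figures, the sketch below recomputes the per-update saving as the difference between the optimized blob (2.4 MiB) and the compressed blob (655 KB). It assumes that this difference is what the 1.8 MiB refers to, and since the text does not say whether "KB" means 1000 or 1024 bytes, it tries both.

```python
MIB = 1024 * 1024


def saved_mib(optimized_mib: float, compressed_bytes: float) -> float:
    """Size difference between the optimized and compressed blob, in MiB."""
    return (optimized_mib * MIB - compressed_bytes) / MIB


OPTIMIZED_MIB = 2.4  # after wasm-opt, from the text above

saving_if_kib = saved_mib(OPTIMIZED_MIB, 655 * 1024)  # reading "KB" as KiB
saving_if_kb = saved_mib(OPTIMIZED_MIB, 655 * 1000)   # reading "KB" as 1000 bytes

if __name__ == "__main__":
    print(f"saving if KB means KiB:        {saving_if_kib:.2f} MiB")
    print(f"saving if KB means 1000 bytes: {saving_if_kb:.2f} MiB")
```

Either reading rounds to the 1.8 MiB quoted above.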

I built this branch and master branch from source at the same time and got these results:

compiling from source with `ninja install`,
configured with `-DCMAKE_BUILD_TYPE=Release -DZIG_NO_LIB=ON`:

        master branch: 13m20s with 10.3 GiB peak RSS
wasi-bootstrap branch: 10m53s with  3.2 GiB peak RSS
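
For a sense of scale, the numbers above can be turned into relative differences. This is plain arithmetic on the two measurements quoted here, not additional benchmark data.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    return minutes * 60 + seconds


master_s = to_seconds(13, 20)  # master branch wall time
wasi_s = to_seconds(10, 53)    # wasi-bootstrap branch wall time
master_rss_gib = 10.3
wasi_rss_gib = 3.2

time_reduction_pct = (master_s - wasi_s) / master_s * 100
rss_reduction_pct = (master_rss_gib - wasi_rss_gib) / master_rss_gib * 100

if __name__ == "__main__":
    print(f"wall time: {master_s}s -> {wasi_s}s ({time_reduction_pct:.1f}% less)")
    print(f"peak RSS:  {master_rss_gib} GiB -> {wasi_rss_gib} GiB ({rss_reduction_pct:.1f}% less)")
```

That works out to roughly 18% less wall time and roughly 69% less peak memory for this one pair of runs.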

Big thanks to @jacobly0 who has done significant work on the C backend to enable this possibility, as well as helping write the WASI interpreter and make it go fast, over in the external zig-wasi repo. In fact, after rewriting the interpreter a few times he figured out how to make it run even faster by translating the wasm code to C instead of interpreting it directly.
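
To illustrate the difference between interpreting and translating, here is a toy example on a made-up three-instruction stack machine. It is not WebAssembly and not the approach used in zig-wasi; it only shows the general idea that a translator resolves the stack at translation time and emits straight-line C, so the C compiler can optimize what an interpreter would otherwise dispatch at run time.

```python
# Toy program: (2 + 3) * 4, in a hypothetical stack-machine encoding.
PROGRAM = [("const", 2), ("const", 3), ("add",), ("const", 4), ("mul",)]


def interpret(program):
    """Execute the program directly, dispatching on each instruction at run time."""
    stack = []
    for op, *args in program:
        if op == "const":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown op: {op}")
    return stack.pop()


def translate_to_c(program, func_name="run"):
    """Emit C source for the program; the operand stack only exists during translation."""
    lines = [f"int {func_name}(void) {{"]
    stack = []  # names of C temporaries, not run-time values
    for index, (op, *args) in enumerate(program):
        if op == "const":
            expr = str(args[0])
        elif op in ("add", "mul"):
            b, a = stack.pop(), stack.pop()
            expr = f"{a} {'+' if op == 'add' else '*'} {b}"
        else:
            raise ValueError(f"unknown op: {op}")
        name = f"t{index}"
        lines.append(f"    int {name} = {expr};")
        stack.append(name)
    lines.append(f"    return {stack.pop()};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(interpret(PROGRAM))
    print(translate_to_c(PROGRAM))
```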

Closes #5246
Closes #6378
Closes #6485

Prerequisites:

These are already merged inside this branch:

Enhancements Needed

Merge blockers:

make zig1.c detect the host -target and pass correct flags to WASI argv
avoid hard-coding --color on in CMake
fix the -target parameter computation in CMake

Nice to have:

look into compressing zig1.wasm with zstd or gzip and having a mini zstd or gzip decoder in zig1.c
enhance C backend so it generates fewer bytes of .c code
enhance C backend so it generates .c code that compiles faster
enhance C backend so compiling the .c code generates no warnings
C backend: generate MSVC-compatible code #13574
C backend: generate GCC-compatible code #13575