Cosmopolitan Libc: your build-once run-anywhere C library
justine.lol
I understand the love these incredible projects are getting on HN right now, but this one was discussed a couple months ago:
Cosmopolitan Libc: build-once run-anywhere C library - https://news.ycombinator.com/item?id=25556286 - Dec 2020 (166 comments)
and the current related thread is still high on the front page—long may it reign:
Show HN: Redbean – Single-file distributable web server - https://news.ycombinator.com/item?id=26271117 - Feb 2021 (182 comments)
There are also these related threads:
Actually Portable Executable - https://news.ycombinator.com/item?id=26273960 - Feb 2021 (133 comments)
αcτµαlly pδrταblε εxεcµταblε - https://news.ycombinator.com/item?id=24256883 - Aug 2020 (286 comments)
How Fat Does a Fat Binary Need to Be? - https://news.ycombinator.com/item?id=26103769 - Feb 2021 (68 comments)
Others?
"That's a huge improvement in generated code size. The above two compiles used the same gcc flags"
It would have been awfully nice to state the version of the compiler and the flags being used. With gcc 10 on Linux/AMD64, libc 4.15.0, I get with '-Os':
--8<--
strlcpy:
.LFB5:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rdi, %rbp
movq %rsi, %rdi
pushq %rbx
.cfi_def_cfa_offset 24
.cfi_offset 3, -24
movq %rdx, %rbx
subq $24, %rsp
.cfi_def_cfa_offset 48
movq %rsi, 8(%rsp)
call strlen
testq %rbx, %rbx
movq 8(%rsp), %rsi
je .L1
leaq -1(%rbx), %rdx
movq %rbp, %rdi
cmpq %rax, %rdx
cmova %rax, %rdx
movq %rdx, %rcx
rep movsb
movb $0, 0(%rbp,%rdx)
.L1:
addq $24, %rsp
.cfi_def_cfa_offset 24
popq %rbx
.cfi_def_cfa_offset 16
popq %rbp
.cfi_def_cfa_offset 8
ret
.cfi_endproc
-->8--
Which doesn't seem so bad.
GCC optimizations are also smart enough to remove memcpy calls for small numbers of bytes entirely and just output the unrolled movs/loads inline. If you're memcpy'ing a struct from a byte buffer and reading some fields, the optimizer is usually smart enough to output the minimum instructions necessary. As I recall, MSVC's optimizations perform similarly.
I missed an opportunity to ask in the previous thread: what would it take to link an app in a different language (say, Rust) with this library? Is it enough to just build an object file that leaves the libc functions, assuming the LP64 ABI, as unresolved symbols?
Previous discussion https://news.ycombinator.com/item?id=25556286
Run anywhere x86 ;-)
Getting less relevant by the minute but maybe they can do the same thing for ARM.
I think x86's death is one of those arbitrary dates that are continually referenced yet never come. See also: Year of the Linux Desktop, Windows 11, and The Repairable iPhone.
Try the Graviton servers on AWS. Compare r5.large to r6g.large. Way cheaper (.10 vs .166) and faster for the things I have tested (also, no hyperthreaded cores). No doubt in my mind that folks are going to be using them in cloud for sure. This isn't comparable to the stuff you listed. Not even counting the M1, which is crazy.
This seems incredibly ideal for writing malware / RATs?