How to speed up the Rust compiler some more in 2019
blog.mozilla.org
> But I was able to work around this by using a trick: creating two variants of the function, one marked with #[inline(always)] (for the hot call sites) and one marked with #[inline(never)] (for the cold call sites).
Can't PGO make inlining decisions like this? Otherwise, Propeller or LTO might work well.
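For anyone who hasn't seen the two-variant trick from the quote, here is a minimal sketch of what it might look like. The names `checksum_impl`, `checksum_hot`, and `checksum_cold` are made up for illustration and are not from the article or rustc; the body is an arbitrary stand-in.

```rust
/// Shared body (hypothetical example function, not from rustc).
/// Marked `#[inline(always)]` so each wrapper below gets its own copy.
#[inline(always)]
fn checksum_impl(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(u32::from(b)))
}

/// Variant for hot call sites: the compiler is told to inline it.
#[inline(always)]
pub fn checksum_hot(data: &[u8]) -> u32 {
    checksum_impl(data)
}

/// Variant for cold call sites: kept out of line to limit code size.
#[inline(never)]
pub fn checksum_cold(data: &[u8]) -> u32 {
    checksum_impl(data)
}
```

Both variants compute the same result; the only difference is the inlining hint, so the caller picks one per call site based on how hot that site is. PGO would, in principle, let the compiler make that choice from profile data instead of by hand.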
> But there’s a trade-off. Sometimes a simpler, smaller function is slower.
Without a doubt! Imagine the naive/simple/portable memcpy versus a target-aware one that capitalizes on wider or aligned loads and stores.
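A rough sketch of that contrast, assuming safe Rust and 8-byte words. `copy_bytewise` and `copy_wordwise` are hypothetical names; a real memcpy would also use SIMD and alignment handling, and the optimizer may well vectorize the simple loop on its own, so treat this as an illustration of the size-versus-speed shape rather than a benchmark.

```rust
/// Naive, portable copy: one byte per iteration. Small and simple.
pub fn copy_bytewise(dst: &mut [u8], src: &[u8]) {
    assert_eq!(dst.len(), src.len());
    for i in 0..src.len() {
        dst[i] = src[i];
    }
}

/// Wider copy: moves 8 bytes at a time, then handles the tail bytewise.
/// More code, but fewer loads and stores on the bulk of the data.
pub fn copy_wordwise(dst: &mut [u8], src: &[u8]) {
    assert_eq!(dst.len(), src.len());
    let mut d = dst.chunks_exact_mut(8);
    let mut s = src.chunks_exact(8);
    for (dc, sc) in d.by_ref().zip(s.by_ref()) {
        // Load one 8-byte word from the source chunk...
        let mut buf = [0u8; 8];
        buf.copy_from_slice(sc);
        let word = u64::from_ne_bytes(buf);
        // ...and store it to the destination chunk.
        dc.copy_from_slice(&word.to_ne_bytes());
    }
    // Copy the remaining 0..=7 bytes that did not fill a whole word.
    for (db, sb) in d.into_remainder().iter_mut().zip(s.remainder()) {
        *db = *sb;
    }
}
```

The second function is clearly the larger one, which is the trade-off in the quote: the smaller version is not automatically the faster one.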