More optimizations in the compiler and JIT

erlang.org

172 points by asabil 3 years ago · 18 comments

samsquire 3 years ago

This is interesting, thank you.

I really should study BEAM and OTP and learn Erlang. I get the feeling it's super robust, reliable, and low maintenance. I wrote a userspace multithreaded scheduler which distributes N lightweight threads across M kernel threads.

https://github.com/samsquire/preemptible-thread

I recently wrote a JIT compiler and got lazy compilation of machine code working, and I'm nowhere near beginning optimisation.

https://github.com/samsquire/compiler

How do you write robust software that doesn't crash when something unexpected happens?

I looked at sozu https://github.com/sozu-proxy/sozu

and I'm thinking how to create something that just stays up and running regardless.

  • fredrikholm 3 years ago

    > I get the feeling it's super robust and reliable and low maintenance

    The patterns defined by OTP are a work of art and tremendously rewarding. I've yet to use any other system/runtime that so elegantly solves, up front, the range of issues that come up when writing these kinds of systems.

    The BEAM is a work of art. Nothing comes close.

    • mst 3 years ago

      OTP itself is very definitely worth learning about even if you aren't using it directly.

      Reading up on the Erlang GC was fun.

      So was reading Kernel.ex (Elixir's bootstrap file)

alberth 3 years ago

Did this just make numerical computations, something Erlang is known to be very slow at, 4x faster?

  • wahern 3 years ago

    It looks more like the JIT improvements made it profitable to manually unroll some loops in the base64 module: https://github.com/erlang/otp/commit/a03cf1601605dee767cd9d5... IOW, the 4x improvement seems to largely come from a refactor of the base64 module, not from compiler improvements per se.
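
    Not the actual OTP change, but the general shape of the trick in C terms (function names here are illustrative): process several elements per iteration so the loop-control overhead is paid once per group instead of once per element, with a scalar tail loop for the leftovers.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Straightforward loop: one byte per iteration. */
    uint32_t sum_simple(const uint8_t *p, size_t n) {
        uint32_t s = 0;
        for (size_t i = 0; i < n; i++)
            s += p[i];
        return s;
    }

    /* Manually unrolled x4: fewer branches per element, and more
     * independent work in flight for the CPU to overlap. */
    uint32_t sum_unrolled(const uint8_t *p, size_t n) {
        uint32_t s = 0;
        size_t i = 0;
        for (; i + 4 <= n; i += 4)
            s += p[i] + p[i + 1] + p[i + 2] + p[i + 3];
        for (; i < n; i++)          /* scalar tail for the remainder */
            s += p[i];
        return s;
    }
    ```

    The base64 case is more involved (it decodes several characters into one word per step), but the profitability argument is the same: the JIT has to generate good enough code that the saved loop overhead actually shows up.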

  • di4na 3 years ago

    No. What makes it "slow" is fundamentally that 1. everything is a bignum, and 2. any operation can yield.

    It makes it really hard to optimise things that could be sped up by fusing operations.

    • PhilipRoman 3 years ago

      I wish x86 had an easy way to accumulate overflow flags: you could compile entire basic blocks as if they were using native integers, do a single check at the end, and, if needed, roll back the computation.

      The yielding part is harder. You need to have infrastructure in place to dynamically flush certain operations when unexpected yields happen.
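
      A software approximation of that accumulation, sketched with GCC/Clang's checked-arithmetic builtins (the function and its fallback contract are made up for illustration): each operation ORs its overflow bit into one accumulator, and a single test at the end decides whether to keep the native-integer result or discard it and redo the block on the bignum path.

      ```c
      #include <stdbool.h>
      #include <stdint.h>

      /* Hypothetical fast path for (a*b + c) on native 64-bit ints.
       * No branch per operation: overflow bits accumulate into one
       * flag, checked once at the end of the "basic block". */
      bool muladd_fast(int64_t a, int64_t b, int64_t c, int64_t *out) {
          int64_t t, r;
          bool ovf = false;
          ovf |= __builtin_mul_overflow(a, b, &t);
          ovf |= __builtin_add_overflow(t, c, &r);
          if (ovf)
              return false;   /* caller "rolls back" to the bignum path */
          *out = r;
          return true;
      }
      ```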

      • moonchild 3 years ago

        Floats can do it. Do your computation, and then check the inexact flag.
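
        A sketch of that idea in C (the function name and fallback contract are made up): clear the inexact flag, run the whole block in doubles, then one check at the end tells you whether every step was exact.

        ```c
        #include <fenv.h>
        #include <stdbool.h>

        /* Integer (a*b + c) done in doubles. If the inexact flag is
         * still clear afterwards, nothing rounded and the double
         * result IS the exact integer answer; otherwise the caller
         * falls back to a bignum path. (Strictly, ISO C wants
         * "#pragma STDC FENV_ACCESS ON" for this; GCC/Clang handle
         * this runtime-argument pattern at default settings.) */
        bool muladd_exact(double a, double b, double c, double *out) {
            feclearexcept(FE_INEXACT);
            *out = a * b + c;
            return !fetestexcept(FE_INEXACT);
        }
        ```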

        • PhilipRoman 3 years ago

          TIL such a thing exists. Although it seems like it is rarely used; there is some discussion here https://cs.stackexchange.com/questions/152373/what-are-the-u...

          I guess most practical calculations would end up triggering the inexact flag.

          • moonchild 3 years ago

            Floats can represent all integers with magnitude <=2^53. Which is a sight smaller than integers, under the standard representation (magnitude <2^62), but still plenty for most applications—the point is to use this to implement integer math, not float math. That covers multiplication, addition, and subtraction; div/mod is admittedly a bit trickier, though it's a doozy with avx512.

            (There's also a cute consequence: you can construct numbers outside the contiguous range of representable integers; so long as you never incur any rounding error, the result will be correct, and the inexact flag will not be set.)
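
            Both halves of that are easy to check with a round-trip through double (the helper name is made up):

            ```c
            #include <stdbool.h>

            /* True iff the integer n survives a round-trip through
             * double, i.e. is exactly representable. */
            bool roundtrips(long long n) {
                double d = (double)n;
                return (long long)d == n;
            }
            ```

            Below 2^53 every integer round-trips; just above it, only every second integer does (the spacing between adjacent doubles there is 2), which is the non-contiguous range the parent comment is exploiting.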

    • hinkley 3 years ago

      IBM's early versions of the J9 compiler had a clever way of yielding (I assume you mean green thread yielding and not generator/closure yielding?)

      Basically the thread scheduling system could trigger an overflow/underflow exception via a fairly fast operation by pushing an illegal value into a watchdog for the thread. An instruction was injected at the top of each function and loop, where the state of the registers was knowable, and I think in a few other places, to guarantee a degree of fairness.

      For more realtime behavior you'd need to pepper these calls in many places, and any fusion operations would need to inject something similar into the instruction stream. Then you'd have to be very, very careful to avoid cache line aliasing that would crash the throughput via false sharing.
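
      This isn't J9's actual mechanism (which abused a hardware exception, as described above), but a much simpler cooperative version of the same shape, with the watchdog replaced by an atomic flag and the yield stubbed out: a cheap check injected at function entries and loop back-edges, points where the locals are in a known state.

      ```c
      #include <stdatomic.h>
      #include <stdbool.h>

      /* Set asynchronously by a scheduler/watchdog thread. */
      static atomic_bool yield_requested = false;
      static int yields_taken = 0;            /* stub bookkeeping */

      /* Stub: a real runtime would spill state and switch here. */
      static void do_yield(void) {
          yields_taken++;
          atomic_store(&yield_requested, false);
      }

      /* The injected "safepoint" check: one relaxed load and a
       * rarely-taken branch on the fast path. */
      static inline void safepoint(void) {
          if (atomic_load_explicit(&yield_requested, memory_order_relaxed))
              do_yield();
      }

      long sum_to(long n) {
          long s = 0;
          for (long i = 0; i < n; i++) {
              safepoint();                    /* at the loop back-edge */
              s += i;
          }
          return s;
      }
      ```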

    • gregors 3 years ago

      An older conversation about multiplication benchmarks. I saw a huge improvement when the JIT was introduced. Can't wait to see further improvements.

      https://news.ycombinator.com/item?id=26707354

  • macintux 3 years ago

    I have no idea, but I suppose when every integer is effectively a bignum, there’s plenty of room for optimization.

    • di4na 3 years ago

      Funnily enough...

      It is really hard if not impossible to optimise when everything is bignum.

      • titzer 3 years ago

        While JavaScript didn't have BigInt until a couple of years ago, it did start with all numbers being floats. JSVMs optimize heavily for the small integer (SMI) case: they use dynamic profiling, speculative optimization, and significant type analysis to get very good code for the SMI cases.
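
        A minimal sketch of the tagging that makes the SMI fast path cheap (this is the general scheme, not V8's exact layout; the fallback is stubbed): a low tag bit distinguishes an integer stored directly in the word from a pointer to a heap object, so hot arithmetic never touches the heap.

        ```c
        #include <stdbool.h>
        #include <stdint.h>

        typedef intptr_t value;

        /* Low bit 0: small integer stored in the word, shifted left
         * by 1. Low bit 1 would mark a pointer to a heap object
         * (boxed double, bignum, ...). */
        static inline bool     is_smi(value v)  { return (v & 1) == 0; }
        static inline value    smi(intptr_t n)  { return (value)((uintptr_t)n << 1); }
        static inline intptr_t smi_val(value v) { return v >> 1; }

        /* Stub for the boxed/bignum fallback a real VM would take. */
        static value slow_add(value a, value b) { (void)a; (void)b; return smi(0); }

        /* Fast-path add: two tagged SMIs can be added without
         * untagging, since (a<<1) + (b<<1) == (a+b)<<1 preserves the
         * tag; one overflow check routes big results to the slow path. */
        value add(value a, value b) {
            value r;
            if (is_smi(a) && is_smi(b) && !__builtin_add_overflow(a, b, &r))
                return r;
            return slow_add(a, b);
        }
        ```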

        • di4na 3 years ago

          You realise floats are far easier, because you know up front you're always limited to 64 bits?

      • Joker_vD 3 years ago

        Yeah, just look at Scheme or any other Lisp!
