YJIT: Building a New JIT Compiler for CRuby
shopify.engineering

That's very interesting! It seems that big Ruby shops like Shopify and Stripe are investing a lot in improving Ruby's performance. Here is a tweet from Stripe's CEO about the performance of their custom Ruby compiler: https://twitter.com/patrickc/status/1410269843585069056?lang...
I kind of think it would be more accurate to describe it as the Sorbet compiler, since it only works on a subset of Ruby (which Stripe runs on).
But it is AOT! That means we could at least have some widely adopted and supported AOT solution.
There is already one, but it is commercial so might not please everyone, and has a limited focus.
Just in case anybody finds the subject interesting and would like to play with this topic, there's an experimental native-Ruby JIT project by @tenderlove: https://github.com/tenderlove/tenderjit (and the companion native-Ruby assembler Fisk: https://github.com/tenderlove/fisk), which is (IMO) considerably easier to work with.
How does it compare to TruffleRuby? Is it really harder to run C FFI libs inside Graal than to write a new JIT compiler?
I went to a talk by one of the leads behind Truffle and it looks really great, but getting all C libs to work reliably did seem to be a big issue.
The performance of Truffle is amazing when it works, but I think the compatibility issue is going to hamstring it for a while.
The reason I’m asking is that Graal also has an LLVM frontend, meaning that C libs should mostly work just fine within the VM (with the added benefit of possibly inlining them, further increasing performance)
This one is Oracle free :)
This is so exciting to see! I'm excited about the internal changes to MRI to unlock even bigger performance. I'd also love to see some research on what FFI changes would be helpful for enabling more performance across the FFI boundary. A lot of conversation ends up at "well, we go into C and it's terrible" but that feels defeatist.
Is the main difference in performance between JS and the others (Perl, Ruby, PHP) the fact that JS is JIT'ed and the others are not? I mean, JS used to be slowish until V8 came along - what is V8 doing? Why can't Python / Ruby do the same thing?
> I mean, JS used to be slowish until V8 came along
There was a JS engine performance war before Chrome's V8 came around, from JavaScriptCore to the whole bunch of monkeys [1] from Mozilla. Google's V8 just made the competition super heated, and every one of them was working around the clock trying to out-compete the others on the latest benchmarks. (That was the dark era when browsers only cared about JS benchmark scores and nothing in the real world.)
It would not be an exaggeration to say the man-hours spent on JS JITs exceed those spent on Perl's, Ruby's and PHP's VMs combined. That is why I often say Ruby is the only top-10 language that gets little to no funding or backing from FAANG. So both YJIT and Sorbet are much-needed contributions from Stripe and Shopify. Along with help from GitHub and GitLab (hopefully :) ).
The VMs that get more resources than JS would be the JVM and .NET. And the JVM is a monster on its own, easily a multi-billion-dollar investment over all these years. Or something using far fewer man-hours and resources, like LuaJIT. But then Mike Pall is a superhuman.
On-going project to clone Mike: https://github.com/LuaJIT/LuaJIT/issues/45
Python does, via pypy [1], which has very impressive performance and Python3 support. However, there are some subtle drawbacks, e.g., building from source takes hours instead of seconds (like CPython), and C extensions are slower. Still, the performance of pypy on a wide range of pure Python microbenchmarks is extremely impressive, and comparable to V8 (I've been collecting such benchmarks at [2] lately).
[2] https://github.com/sagemathinc/JSage/tree/main/packages/jpyt...
Not sure - Google pouring money into faster JavaScript execution is probably a big reason, but maybe JavaScript's comparative simplicity and fewer built-ins make it easier to get more out of it?
My understanding is that the main obstacle to higher performance Python is the huge value of preexisting extensions written in C. Maintaining compatibility with existing C extensions, or at least minimizing the porting effort for such extensions, puts a lot of constraints on the solution space.
I think one could radically change the way Python objects work internally, and have the C foreign function interface (FFI) wrap every object passed to a C extension in an API/ABI-preserving facade (which itself would wrap any objects returned from its methods). However, this would probably greatly slow down C extensions, which are often performance-critical sections of Python applications. It's also possible that there are portions of the C extension API that expose enough details of object internals to even make such facades herculean to implement. (I've only written some small simple C extensions and am not very familiar with the API.)
V8 didn't have to deal with API/ABI compatibility with any preexisting C extensions that may have made too many abstraction-violating assumptions about how objects and the VM worked.
Breaking too many important C extensions would almost certainly send Python the way of Perl 6.
Edit: as an aside, a big difficulty with JS is that objects can have their prototype changed arbitrarily at runtime. Even with metaclass programming in Python, the class of an object can't be changed after creation, making it much easier to cache/memoize dynamic method dispatch. On the other hand, a high-performance implementation of Python's bound methods requires a bit more flow analysis than you need in JS. In Python, if you write f = x.y, f is a "bound method" (a closure that ensures x is passed as "self" to y). It's expensive to create closures for each and every method invocation, so a high-performance implementation would need to do a bit of static analysis to identify which method look-ups are used purely for invocation, and which look-ups need to create the closures because the method itself is passed around or stored in a variable.
> I think one could radically change the way Python objects work internally, and have the C foreign function interface (FFI) wrap every object passed to a C extension in an API/ABI-preserving facade
HPy is building an API abstraction layer which is designed to be used with both the CPython API and JITs. However, IIUC they are not proposing any changes to CPython itself, but rather to provide a smaller API surface and fewer JIT impedance mismatches when extensions are built against something other than CPython. The lead developer is a longtime PyPy developer.
To a certain extent yes, but the largest obstacle is simply that, until rather recently, the core python team was hostile to including complex stuff like a JIT into the main python interpreter codebase.
What is simple in Javascript today? To me it seems feature bloated and you can do a thing in ten different ways.
Look past the syntax and it has very simple semantics. Ruby has extraordinarily complicated semantics due to an enormous core library.
I don't disagree, but there is less to the base language than there is in Ruby and Python (though all three languages continue to add new stuff). Even looking at the primitive types - JavaScript has three, Python has four, Ruby has... classes, but probably more base types.
MRI has more primitives of a sort, but they're an implementation detail. Ruby itself conceptually only has objects.
MRI, however, implements some of them (integers, floats, symbols, true, false, nil; I might have forgotten one or two) using type-tagged values instead of pointers to heap objects.
Type tagging is easier to make fast for a heavily GC'd dynamic language: not having to allocate lots of small objects reduces GC pressure substantially, without requiring massive amounts of complex optimisations.
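A quick irb session (my own illustration, not from the comments above) shows the difference: tagged immediates such as small integers and symbols keep a stable object identity across uses, while a string literal allocates a fresh heap object each time it's evaluated.

    # Observed on current CRuby: immediates are tagged values, strings are heap objects.
    p 1.object_id == 1.object_id          # => true  (small integers are immediates)
    p :foo.object_id == :foo.object_id    # => true  (symbols are interned)
    p "foo".object_id == "foo".object_id  # => false (each literal is a new heap object)
    p nil.frozen?, 1.frozen?              # => true, true (immediates can't be mutated)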
Actually, Ruby 2.5 without MJIT is faster than Python 3.8 and Perl 5.28 at prime-number crunching (using quite mundane stuff: basic math + for loops + hash/dict + array operations), but Node/V8, Dart and LuaJIT are still an order of magnitude faster even with the added JIT warm-up penalty. Racket, Julia and Lua are somewhere in between. The story used to be quite different two years ago, with Ruby being slower than both Perl and Python.
I'd be very surprised if well-written Julia was in the middle speed pack here. I suspect there are some known performance anti-patterns in use if it's slower than LuaJIT.
It's most probably due to the startup and JIT penalty, but it was on par with Racket, which is quite fast. I'm sure it can be optimized, but the implementation is quite similar to the other languages in order to have a level playing field. It's basically two nested for loops (in Julia and Racket it's a single one with two indexes) adding values to a set, and another one doing set lookups and pushing into an array.
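The exact benchmark isn't linked, so here's a hedged Ruby sketch of the shape described (two nested loops filling a Set, then one pass doing lookups and pushing into an array) - essentially a Sundaram-style prime sieve. The limit and other details are illustrative assumptions, not the commenter's actual code.

    require 'set'

    def primes_up_to(limit)
      k = (limit - 1) / 2
      excluded = Set.new
      # Two nested loops adding values to a set...
      (1..k).each do |i|
        j = i
        while (v = i + j + 2 * i * j) <= k
          excluded.add(v)
          j += 1
        end
      end
      # ...then one pass doing set lookups and pushing into an array.
      primes = [2]
      (1..k).each { |n| primes << 2 * n + 1 unless excluded.include?(n) }
      primes
    end

    p primes_up_to(100)  # => [2, 3, 5, 7, 11, ..., 97]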
Ah, you're benchmarking wall time including startup and compilation?
If you're interested, I'd be happy to take a look anyway and see if there are any easy, idiomatic performance changes that can be made to the Julia code without changing the algorithm.
IMHO, basically money & talent. Google hires the best and pours money into V8. The very best developers can work full time and without any distractions. Also, Google is personally invested in making Chrome fast. Not to mention the vast sea of developers Google has and can throw at any project.
Other languages struggle in this regard. Comparatively I imagine far fewer developers working full time on Ruby/Python, not to mention they would have budget constraints to hire and retain talent.
How many JIT compiler attempts did we already see? It seems to me there has never been substantial progress towards speed, or is it just that other language implementations improve in speed at the same pace?
Source: Computer Languages Shootout
How many? Quite a lot, actually; I personally can count at least 10 attempts at JIT-ing Ruby.
There are 2 main reasons for that.
1. For a good decade - from ~2005 to ~2015 - Ruby was among the most often used tech stacks at startups, and any performance work on Ruby (of which the two major fronts were GC and JIT compilation) was perceived as extremely impactful and attractive.
2. Ruby is actually one of the most, if not the most, dynamic and complex programming languages out there, and thus one of the most challenging to build a runtime for and to optimize. It became a de-facto benchmark for JIT research (alongside the more conventional Java). Over the years many vendors invested in Ruby compilers to push their underlying VM technology: Microsoft sponsored IronRuby to improve their .NET runtime, Sun sponsored JRuby, and Oracle sponsors TruffleRuby.
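To make point 2 concrete, here is a small illustration of my own (the class and method names are invented, not taken from the thread) of the dynamism a Ruby JIT has to guard against: even core operators can be redefined at runtime, and methods can be synthesized on the fly, so compiled code needs guards and deoptimization paths.

    # Redefining a core operator at runtime: even `1 + 2` cannot be compiled
    # to a plain machine add without a redefinition check.
    class Integer
      def +(other)
        "surprise"
      end
    end
    p 1 + 2  # => "surprise"

    # Methods synthesized on the fly defeat static call-target resolution.
    class Config
      def method_missing(name, *args)
        "looked up #{name} dynamically"
      end
    end
    p Config.new.database_url  # => "looked up database_url dynamically"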
As for progress, the biggest roadblock to Ruby JIT adoption has always been Rails. Rails uses a lot of Ruby features and pushes the language pretty far. Thus, you can't run Rails without your Ruby implementation being very complete and very MRI-compatible (MRI is the default Ruby implementation).
Plus, Rails uses Ruby in a way that defeats virtually all the best practices for producing JIT-friendly code. Thus, there's no JIT compiler that offers any performance improvement for Rails apps - and in truth there might never be one. In fact, YJIT is exciting because it's the first JIT compiler that seems to offer some speedups on at least a few Rails benchmarks. People follow it closely, because these speedups might be a fluke, and as the compiler becomes more compliant they may disappear (that has happened in the past with some JITs).
Other people tackle this problem by switching away from Rails to other frameworks. AFAIK Stripe themselves don't use Rails in most of their code, so they might benefit a lot from this work even without big improvements on common Ruby benchmarks (the language shootout, TechEmpower, the Discourse benchmarks).
> People follow it closely, because these speedups might be a fluke, and as the compiler becomes more compliant they may disappear (that has happened in the past with some JITs).
They're not. YJIT is really 100% compatible with the regular MRI interpreter; it already runs a small % of production traffic at Shopify and fully passes the gigantic test suite of Shopify's 10+ year old monolith, as well as GitHub's test suite.
This is not a fluke; that's what you get by building a JIT directly inside MRI rather than starting from scratch. It's harder and slower, but you get full compatibility from day 1.
Oh, I sure know that. YJIT is such a marvel of a technology!
I'm so happy for Maxime Chevalier-Boisvert to have such a resounding success with her JIT research after so many years!
I've seen this mentioned a lot, and certainly the history of compiled Ruby + Rails benchmarks indicates this is true.

> Rails uses Ruby in a way that defeats virtually all the best practices for producing JIT-friendly code.

However, I've never quite understood -- why exactly is this the case?
Is it just the sheer size of Rails? Or is it (ab)using Ruby in weird ways?
edit:
This is the more or less definitive answer, I guess, though it's a bit over my head!
https://k0kubun.medium.com/ruby-3-jit-can-make-rails-faster-...
There have been at least 16 attempts to write a compiler for Ruby!
I’m giving a talk about the history of compiling Ruby at RubyConf.
Wow! I'll try to count.
- 2 Rubinius attempts (GNU Lightning and LLVM)
- MacRuby
- JRuby (do you count InDy separately?)
- IronRuby (do you count DLR separately? They didn't start with DLR, as far as I recall)
- MagLev - I think they hoped GemStone would JIT users' Ruby code together with bits of the interpreter.
- TruffleRuby
- RubyOMR
- Vladimir's MJIT
- Koichi's MJIT
- YJIT
Do we count HotRuby and Opal? - both compile to JS.
I'm missing a few.
EDIT: mRuby, duh
A few AOT attempts as well, including mine (I've hardly touched mine in the last couple of years, but I have a bunch of GC improvements that haven't been pushed to GitHub that bring it close to self-compiling; still huge gaps, though - e.g. no float or regex support, no exceptions).
Adding a few more:
- DragonRuby - I don't know if it should count separately from MacRuby
- There were at least 5 Ruby-to-JavaScript compilers I remember from about 10 years ago, but they are very hard to find. Surprisingly, one of them - ruby2js - is still alive and actively developed!
DragonRuby's just mRuby scripting a framework written in C.
RubyMotion, the compiler created after MacRuby's creator left Apple.
Could you provide some online resources available on that topic?
It's a bit late. If it had been released back when Rails was hip and cool, it would certainly have helped with Ruby adoption and retention. But people in search of the new and shiny have already jumped ship to platforms such as Node.js, which has decent performance.
Even for PHP the new VMs and speed improvements might be too late.
Too late how? Shopify isn't doing this so all the Node people will come back to Rails. They're doing it to run their platform faster and eventually save money. They also like Ruby. Also - for the companies running on Ruby (there are still quite a few) this is big news.
Too late to stop Rails falling out of favor with developers. Too late to retain its user base or grow it.
Ruby is not Rails. Ruby has been my primary language for 16 years by now (my work projects are in Ruby; my text editor is written in Ruby; I'm close to switching to a terminal written in Ruby), and in that time only a couple of side-projects have involved Rails. For web projects I prefer Sinatra or Padrino depending on requirements.
While the loss of the Rails hype has certainly made things quieter in the Ruby community, for people like me who never liked Rails in the first place the reduced Rails hype has been a blessing, because it means I don't get asked about Rails every time I bring Ruby to the table anymore.
What is your text editor?
Unimaginatively named "re" for "Ruby Editor" - I started with the very tiny "Femto" [1] and expanded it heavily.
It's on GitHub [2], but note that the version on GitHub lacks a huge amount of changes sitting in my local tree that I haven't gotten around to cleaning up and pushing (including dependencies on how I've structured my local setup), and it'd be rough for anyone who isn't me and doesn't know where the issues are. I'll eventually get around to putting it into a somewhat more usable state for others to try...
A couple of "fun" aspects of it, though:
- It uses Rouge for syntax highlighting, and generally I've tried to rely as much as possible on gems rather than writing custom code (and I'm on a quest to split out whatever I can make generic enough into separate gems).
- It talks to a server process via DRb (going to change that for various reasons). The server process holds all the buffers, and snapshots them to a JSON file frequently. As a result, every single file I've opened in Re over the last several years - all 1600 of them - is retained as a buffer and loaded into memory when I restart my laptop. I've not gotten around to adding a way to kill a buffer because they take up "only" 68MB total. The client-server approach meant I could switch to using it long before it stopped crashing, since as long as my changes don't corrupt the server-side buffers, a bug usually only crashes the client (even server-side exceptions get passed along by DRb to the client). A minimal sketch of this client/server split follows at the end of this comment.
- I rely on it calling out to a script to split panes, since I use a tiling wm (bspwm), so I have Emacs-like keybindings to split horizontally and vertically that use bspc to control bspwm (I have a script for i3 as well) to split the pane accordingly and start another copy of Re that opens a view onto the same server-side buffer. E.g. "split-vertical" just does "bspc node -p south ; exec #{cmd}" where "cmd" is a command line passed from Re that will typically be <full path to re.rb> --buffer <numeric id of the buffer to open>.
- To open files, switch between buffers, or select themes, I use rofi rather than building a selector into Re. But it calls rofi by calling out to a script, so anything that can take a list of buffers and open a dialog to select one will work (such as e.g. dmenu). E.g. here's the "select-buffer" script that is executed when I do ctrl-x + b:
    re --list-buffers | rofi -dmenu -filter `pwd`/ -p "buffer"

Part of the idea is to make the editor itself as minimal as possible, and farm out everything that can be farmed out to either separate tools or separate gems.
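For the curious, here is a minimal sketch of what a DRb-backed buffer server along these lines could look like; the class name, port and methods are invented for illustration and are not taken from "re" itself.

    require 'drb/drb'

    class BufferStore
      def initialize
        @buffers = {}   # path => contents, kept alive in the server process
      end

      def open(path)
        @buffers[path] ||= File.read(path)
      end

      def list
        @buffers.keys
      end
    end

    # Server process: holds all buffers, so a client crash loses nothing.
    DRb.start_service('druby://localhost:8787', BufferStore.new)
    DRb.thread.join

    # A client in another process would do something like:
    #   store = DRbObject.new_with_uri('druby://localhost:8787')
    #   store.open('/etc/hostname')
    #   p store.list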
What's funny to me is that Rails 7 is the release I've most looked forward to in years.
Being able to do page interactions over Hotwire without Webpacker will be a real productivity boost for me.
Having stuff like YJIT and Sorbet Compiler available if needed only makes it more attractive.
> Too late to stop Rails falling out of favor with developers.
I don't think anyone investing in this aims to restore peak Rails hype, so saying it's too late to do that misses the point entirely.
Yes, so that's not why they are doing it, and I am not quite sure people flocked to Node because the runtime is faster...