Building a Compiler with Multi-Level Intermediate Representation (MLIR) (2020) [pdf]
llvm.org
I love MLIR. The modularity and friendly abstractions make it incredibly flexible. I've now used it to write _multiple_ domain-specific optimizations and transformations for some of my recent research! It truly bridges the gap between different devices (CPUs, GPUs, TPUs, etc.). I pray more people adopt it so it doesn't end up abandoned!
It’s so hard to get a complete environment up and running. I had a 700-level project that was ripe for MLIR, and I just couldn’t figure out all the tools. I ended up managing with just clang.
Much of MLIR requires compiling from source, from what I can tell, and I just couldn’t figure out exactly what to build so that I had access to the whole toolchain, from clang to MLIR.
this is quickly getting better
>Much of MLIR requires compiling from source, from what I can tell
you can get apt packages from https://apt.llvm.org/ and build projects out of tree. you can also get packages from conda (https://github.com/conda-forge/mlir-feedstock). finally, if you look around on github you'll also find tarred-up releases maintained by downstream users (e.g. https://github.com/ptillet/triton-llvm-releases).
you can also (as of very recently) build mlir-opt plugins just like for clang:
https://github.com/llvm/llvm-project/tree/main/mlir/examples...
It definitely seems like an active and vibrant community.
Saw the notes about putting distributions together while I was putzing around.
Part of the issue is that I couldn’t find MLIR-specific packages on Ubuntu jammy. Built-from-source was a different version than the packaged LLVM and clang. I couldn’t find a way to get mlir-translate, for example, without compiling from source.
I’m sure I could have figured it out with enough time, but I was short on time. Hopefully graduate with my master’s on Thursday actually.
Sounds very interesting. Could you explain a bit more what you did and why it was easier with MLIR?
What is the difference between MLIR and the LLVM IR code that we're used to?
LLVM IR is just one IR dialect in the MLIR ecosystem, and there are a bunch of (included) higher-level IR dialects that can be transformed (usually, automatically) into the LLVM IR dialect, and from there, the normal LLVM compiler can take over and produce runnable machine code.
Your own languages can target the higher-level IR dialects in MLIR, or directly target the LLVM IR dialect, or both. MLIR is unique in that multiple IR dialects are allowed to be "live" at any time in the compiler: there are no strict "phases" where one IR is lowered, one-shot, into a lower-level IR, like most compilers require (and compiler books teach).
MLIR is a really, really neat bit of technology.
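To make the "multiple dialects live at once" point concrete, here is a minimal toy model in Python. It is not the MLIR API: the `Op` class, `lower_dialect`, and the rewrite functions are hypothetical names made up for illustration, and the op names only loosely mirror real dialects. It just shows ops from several namespaces sitting in one module while each pass rewrites a single dialect and leaves the rest alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """A toy operation: a dialect namespace, an op name, and operand names."""
    dialect: str
    name: str
    operands: tuple = ()

    @property
    def qualified(self):
        # Ops are addressed as "<dialect>.<name>", like a namespaced identifier.
        return f"{self.dialect}.{self.name}"


def lower_dialect(ops, source, rewrite):
    """Rewrite only the ops belonging to `source`; pass everything else through."""
    out = []
    for op in ops:
        if op.dialect == source:
            out.extend(rewrite(op))
        else:
            out.append(op)
    return out


def toy_to_arith(op):
    # Hypothetical rule: toy.square(x) becomes arith.muli(x, x).
    if op.name == "square":
        (x,) = op.operands
        return [Op("arith", "muli", (x, x))]
    return [op]


# Illustrative name mapping for the second lowering step (an assumption).
ARITH_TO_LLVM = {"muli": "mul", "addi": "add"}


def arith_to_llvm(op):
    return [Op("llvm", ARITH_TO_LLVM[op.name], op.operands)]


def dialects(ops):
    """Return the sorted set of dialects currently present in the module."""
    return sorted({op.dialect for op in ops})


# One module containing three dialects at the same time.
module = [
    Op("toy", "square", ("%a",)),
    Op("arith", "addi", ("%a", "%b")),
    Op("llvm", "return", ("%a",)),
]

step1 = lower_dialect(module, "toy", toy_to_arith)    # toy ops are gone
step2 = lower_dialect(step1, "arith", arith_to_llvm)  # only llvm ops remain

for stage in (module, step1, step2):
    print(dialects(stage), [op.qualified for op in stage])
```

Each pass is partial: after the first one the module still mixes two dialects, and nothing forces the whole program to sit at a single level of abstraction between passes.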
As I understand it, MLIR is the new subsystem that the LLVM project is transitioning to in the long term, and LLVM IR is the old.
As such, LLVM IR isn't a proper subset of MLIR. Rather, there is an LLVM "dialect" in the MLIR system which can be translated 1:1 to LLVM IR.
MLIR in its structure and textual syntax is a bit different. A "dialect" is more like a namespace for your ops than a different language, in my view.
In the transition, traditional LLVM IR isn’t being left behind. It’s simply a later step in the compilation process. All MLIR is (eventually) translated through LLVM IR before machine code.
>As I understand it, MLIR is the new subsystem that the LLVM project is transitioning to in the long term, and LLVM IR is the old.
this is very much not a foregone conclusion and many people in LLVM would boo vociferously at the idea (see last year's LLVM US meeting where Johannes Doerfert actually argued the exact opposite - extending LLVM IR to do some/many of the things that MLIR does).
It depends what you mean by "new subsystem" and "transitioning to": what seems like a given is that the notion of "one size fits all" of LLVM IR is behind us and the need for multi-level IR is embraced. LLVM IR is evolving to accommodate this better, within reason (that is: it stays organized around a pretty well defined core instruction set and type system), and MLIR is just the fully extensible framework beyond this. It remains to be seen whether anyone has the appetite to port LLVM IR (and the LLVM framework) to be a dialect; I think there are challenges for this.
Can this handle undelimited continuations?
I thought you could implement those in terms of delimited continuations easily by making your top level the delimiter.
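A rough sketch of that idea, written in continuation-passing style in plain Python. Nothing here is MLIR-specific, and `unit`, `bind`, `reset`, `shift`, and `callcc` are hypothetical helper names for this illustration. It assumes computations are functions that take a continuation, and it defines a call/cc-style operator using only `shift`, with a single `reset` at the top level acting as the delimiter.

```python
def unit(v):
    """A computation that just returns v to its continuation."""
    return lambda k: k(v)


def bind(m, g):
    """Run computation m, then feed its value to g, which yields a computation."""
    return lambda k: m(lambda v: g(v)(k))


def reset(m):
    """Delimit m: run it with the identity continuation."""
    return m(lambda v: v)


def shift(f):
    """Capture the continuation up to the nearest reset and hand it to f.

    The captured continuation is exposed as a function from a value to a
    computation, so it can be composed with bind.
    """
    def computation(k):
        captured = lambda v: unit(k(v))
        return f(captured)(lambda v: v)
    return computation


def callcc(f):
    """call/cc built from shift, assuming the only reset is at the top level."""
    def body(c):
        # Invoking `escape` throws away the current continuation and
        # resumes the one captured at the callcc site.
        escape = lambda v: shift(lambda _ignored: c(v))
        return bind(f(escape), c)
    return shift(body)


# Delimited use: 1 + shift(c => c(c(2))), i.e. apply "+1" twice to 2.
delimited = bind(shift(lambda c: bind(c(2), c)), lambda v: unit(v + 1))
print(reset(delimited))  # 4

# Escaping use: 1 + callcc(esc => 10 + esc(5)); the "10 +" is discarded.
escaping = bind(
    callcc(lambda esc: bind(esc(5), lambda x: unit(10 + x))),
    lambda y: unit(1 + y),
)
print(reset(escaping))  # 6

# Normal return: 1 + callcc(esc => 10); esc is never invoked.
normal = bind(callcc(lambda esc: unit(10)), lambda y: unit(1 + y))
print(reset(normal))  # 11
```

Because the single `reset` wraps the whole program, the continuation that `shift` captures extends to the end of the program, which is what an undelimited continuation would capture in this encoding.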
That's something I have been thinking about lately. It would be interesting to know if there's any prior work on implementing delimited continuations in MLIR.
I say programs are digraphs and any compiler/compiler-ish library which represents them as serialized textual statements (other than as the final backend output) is just doing it all wrong.
Any idea if/when Clang will move to MLIR?