OCaml, all the way down
Some history
In 2021, we decided to evaluate Melange as an alternative to ReScript for compiling Ahrefs’ frontend codebase. We wrote about the reasons that led us there, as well as the limitations we encountered at the time, in a previous article.
After this experiment, discussions continued inside the team. Switching to a different compiler, which was at a very early stage, involved considerable risk. But so did the continued use of ReScript, which seemed to be diverging further and further away from OCaml.
Finally, in September 2022 (during ICFP in Ljubljana), we decided to bite the bullet and kicked off a project to deepen the integration between Dune (OCaml’s most widely used build system) and Melange. This better integration was the key to solving two of the three limitations we had encountered during our initial exploration of Melange:
- Build speed would increase due to less work needed to parse dune files, and more efficient rules planning and execution.
- Developer ergonomics would get better, as Melange would become a first-class citizen in Dune, with concepts like Dune libraries and other stanzas becoming available to Melange users.
Ahrefs’ leadership backed the project and agreed to financially support the development of this tighter integration. With this support, we set about building a team that included Rudi Grinberg, who maintains Dune as part of its development team, and Antonio Monteiro, who created Melange and is also part of the Dune development team.
Heads down
During the following months, we focused on two tasks, iterating over multiple cycles where the progress on one task would inform the next steps to take for the other:
- Evolve Dune to add stanzas, fields, and documentation to support Melange projects.
- Migrate Ahrefs’ frontend codebase to use the Melange compiler and Dune, adapt third-party libraries and bindings to Melange, and polish the editor integration, build scripts, and other aspects of the development experience.
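To give an idea of what the Dune side of this work looks like from a user's perspective, here is a minimal sketch of a Melange project configuration. It assumes Dune 3.8 with the Melange extension enabled; the directory layout and the names `components` and `output` are hypothetical and are not taken from the Ahrefs codebase.

```dune
; dune-project: opt in to the Melange extension of the Dune language
(lang dune 3.8)
(using melange 0.1)

; src/components/dune (hypothetical path): a library that Melange can compile
(library
 (name components)
 (modes melange))

; src/app/dune (hypothetical path): emit JavaScript for the modules in this
; directory, together with the libraries they depend on
(melange.emit
 (target output)
 (libraries components)
 (module_systems es6))
```

With a setup along these lines, the generated JavaScript files should end up under the build directory, in a folder named after the `target` field.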
We believe that tackling these two tasks in parallel led us to better results, compared to a more waterfall-based approach. As we applied the changes across Ahrefs’ large-ish frontend codebase (it will soon reach 5000 modules), we kept finding and fixing bugs, improving the ergonomics of the Dune and Melange integration, and in general making the solution more robust, real-world ready, and developer-friendly.
Another upside of the way the project was implemented is that we developed it initially in stealth mode, keeping it quite private. By working on it within a tight-knit team, before making a public release, we could make progress faster. We believe that this approach saved future Melange users a lot of churn and burn, as we made many changes to Dune stanza options, Melange flags, and other configuration along the way while we learned more about this integration.
Migration strategies
Initially, our plan was to progressively migrate Ahrefs’ code to Melange. As the frontend codebase is divided into different tools, each being self-contained, we thought we could introduce Melange to build one tool, then another tool, gradually migrating them one by one.
However, this approach turned out to be too complex, because configuring a development environment that works with both Melange and ReScript is challenging. As developers could be working on multiple tools during the same week, or even within the same day, we realized that it was infeasible to reconfigure the environment every time a developer switched from a tool built with Melange to a tool built with ReScript.
Therefore, we changed our minds and opted for a one-shot migration. We would ensure that CI, development, and staging environments were working with Melange and Dune. And we would do this on separate branches, while still using ReScript on our main branch CI and development scripts. Once we were confident everything was building and functioning correctly with Melange, we switched all CI and development scripts to use the Melange and Dune commands. We tried to keep the PR that applied this switch as small as possible, with just a few hundred lines of changes, so that we could switch back to ReScript if needed. In fact, after a first attempt in March, we had to switch back to ReScript due to some issues on the developer experience side, related to build performance and ergonomics, which took a few more weeks to solve.
In terms of package management and third-party Melange dependencies, we followed a more gradual approach. Dune is quite flexible when it comes to vendoring, so in the initial phase, we downloaded Melange libraries with npm, and had Dune include them in the project as if they were local sources. Now we have started migrating some of these libraries so that we can consume them using opam, the OCaml package manager. This will involve first publishing them in our private opam mirror, but the plan is to have them published in the public opam repository in the future so that other Melange developers can also use them.
Timings
You may be curious about the performance differences between the previous and current approaches. Measuring performance is tricky, but we attempted to measure a few different scenarios with both setups. The results can be seen below.
Keep in mind that Ahrefs’ frontend setup has specific characteristics, which affect the performance measurements:
- Before migration: Dune generated ml files from atd files, then the ReScript build tool bsb built all hand-written source files plus the ones generated from atd files.
- After migration: everything is built with Dune and Melange.
All measurements were taken on a node with 2x AMD EPYC 7742 CPUs @ 3.2 GHz (nproc=256), 1TB RAM, Debian 11 x86_64 GNU/Linux. The build target is always the entire Ahrefs frontend codebase.
Cold build:
- Before: real 0m28.232s, user 9m23.883s, sys 13m33.939s
- After: real 1m14.208s, user 10m33.708s, sys 5m45.644s
Warm build, noop (no file is built):
- Before: real 0m14.687s, user 3m17.058s, sys 3m57.903s
- After: real 0m21.895s, user 0m20.528s, sys 0m1.372s
Watch mode, modifying an “edge” file with almost no reverse dependencies:
- Before: 1002ms
- After: 1576ms
Watch mode, modifying an “inner” file belonging to a library, with many reverse dependencies:
- Before: 7032ms
- After: 15394ms
In general, Melange and Dune are slower than ReScript for cold builds in our setup. However, the differences are smaller for warm builds. For watch mode, the difference gets reduced when modifying edge files.
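To make this comparison easier to read, the following small OCaml sketch computes the slowdown factor for each scenario. It uses only the wall-clock ("real") figures reported above, converted to seconds; the type and function names are ours and are purely illustrative.

```ocaml
(* Wall-clock timings from the measurements above, in seconds. *)
type scenario = { name : string; before : float; after : float }

let scenarios =
  [ { name = "cold build"; before = 28.232; after = 74.208 };
    { name = "warm noop build"; before = 14.687; after = 21.895 };
    { name = "watch, edge file"; before = 1.002; after = 1.576 };
    { name = "watch, inner file"; before = 7.032; after = 15.394 } ]

(* Slowdown factor: how many times longer the Melange and Dune setup takes. *)
let slowdown s = s.after /. s.before

let () =
  List.iter
    (fun s -> Printf.printf "%-18s %.2fx\n" s.name (slowdown s))
    scenarios
```

The factors come out at roughly 2.6x for cold builds, 1.5x for warm noop builds, 1.6x for edge files in watch mode, and 2.2x for inner files in watch mode. Note that these ratios only cover wall-clock time; the user and sys figures for the warm build show a large drop in total CPU time after the migration.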
There is room for improvement in the way the Melange and Dune rules are arranged so that cold builds can get faster. For example, delaying some optimizations in Melange might allow more work to be parallelized.
Conclusions
The results so far are quite encouraging. These are some of the things that are possible thanks to the deeper integration between Dune and Melange, and its application within the Ahrefs codebase:
- The same OCaml compiler is used on both frontend and backend codebases.
- Access to all the bug fixes, error improvements, and new features that the OCaml compiler team added between versions 4.06 and 4.14 of the compiler.
- A shared developer environment across teams, including editor extensions, OCaml LSP server, etc. No more need to maintain a different set of tooling for backend and frontend.
- Removal of hand-written CI checks that were ensuring different tools in the frontend codebase would not access components from other tools. This is now solved by Dune libraries, and the OCaml compiler will complain if logical units try to reach outside their bounds.
- Frontend and backend shared dependencies, such as anuragsoni/routes, can now be defined in a single place: an opam file.
- Faster rebuilds and better watch mode, as Dune now controls all the build artifacts. Previously, Dune and ReScript were sharing responsibilities, which led either to unnecessary rebuilds of some artifacts or to rebuilds not starting when required, because the build system was not tracking changes in some subsets of the sources.
- Easier PPX maintenance, as there is no longer a need to publish pre-built versions of these tools.
- Melange allows running all PPXs from a single executable file, which has some nice performance benefits.
- All the other advantages of using Dune: virtual libraries, watch mode, integrations with tools like odoc…
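As an illustration of the point above about Dune libraries replacing hand-written CI checks, here is a minimal sketch of how library boundaries can be expressed. The directory layout and the library names `shared`, `tool_a`, and `tool_b` are hypothetical and do not reflect the actual structure of the Ahrefs codebase.

```dune
; tools/shared/dune (hypothetical path): code that every tool may use
(library
 (name shared)
 (modes melange))

; tools/tool_a/dune (hypothetical path): depends on shared only
(library
 (name tool_a)
 (modes melange)
 (libraries shared))

; tools/tool_b/dune (hypothetical path): also depends on shared only
(library
 (name tool_b)
 (modes melange)
 (libraries shared))
```

Under this arrangement, a module in `tool_a` that refers to `Tool_b` would fail to compile with an unbound module error, because `tool_b` is not listed in its `libraries` field. The boundary is checked on every build, without a separate CI script.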
What’s next?
We are excited about this project becoming a reality, and we believe that the deeper integration between OCaml and Melange through Dune, together with Melange’s ergonomic integration with the JavaScript ecosystem through its bindings, can enable projects that were previously impossible to imagine. For example, full-stack React by hydrating components that are rendered server-side using native OCaml.
Now, we have to make it easier for other people to try and use Melange and Dune. So our focus is shifting to documenting how Melange works.
There is a section for Melange in the Dune manual that will be included in the next stable release, and that can be consulted today in the latest branch: https://dune.readthedocs.io/en/latest/melange.html.
The next step will be to design and create a site where everyone can read and learn about what is needed to create and maintain a project using Melange and Dune. This site will include a playground, in the spirit of the ReasonML one, so that we can share snippets, see the resulting JavaScript compilation output, and iterate on ideas together.
Besides the above, we have plenty of other things we will be working on in the coming months. We will share more information about the roadmap as soon as Dune 3.8 and its respective Melange version are published in the main public opam repository, which should happen in the coming weeks.
How to contribute?
If you want to be a part of this, or you want to write or port your libraries to Melange, the best way to do so is by reaching out on the ReasonML Discord. There is a dedicated #melange channel where one can get help and advice on how to get started.
Otherwise, if you are missing features, find bugs, or run into confusing errors, please open an issue in the Melange public repo.
We hope you share our excitement about this update. Our journey to integrate our frontend stack more naturally within the incredible language and ecosystem of OCaml will be well-documented. Stay tuned for further updates in the future!