The future for Tyr


The team behind Tyr started 2025 with little to show in our quest to produce a Rust GPU driver for Arm Mali hardware, but by the end of the year we were able to play SuperTuxKart (a 3D open-source racing game) at the Linux Plumbers Conference (LPC). Our prototype was a joint effort between Arm, Collabora, and Google; it ran well for the duration of the event, and the performance was more than adequate for players. Thankfully, we picked up steam at precisely the right moment: Dave Airlie just announced at the Maintainers Summit that the DRM subsystem is only "about a year away" from disallowing new drivers written in C and requiring the use of Rust. Now it is time to lay out a possible roadmap for 2026 in order to upstream all of this work.

What are we trying to accomplish with Tyr?

Miguel Ojeda's talk at LPC this year summarized where Rust is being used in the Linux kernel, with drivers like the anonymous shared memory subsystem for Android (ashmem) quickly being rolled out to millions of users. Given Mali's extensive market share in the phone market, supporting this segment is a natural aspiration for Tyr, followed by other embedded platforms where Mali is also present. In parallel, we must not lose track of upstream, as the objective is to evolve together with the Nova Rust GPU driver and ensure that the ecosystem will be useful for any new drivers that might come in the future. The prototype was meant to prove that a Rust driver for Arm Mali could come to fruition with acceptable performance, but now we should iterate on the code and refactor it as needed. This will allow us to learn from our mistakes and settle on a design that is appropriate for an upstream driver.

What is there, and what is not

A version of the Tyr driver was merged for the 6.18 kernel release, but it is not capable of much, as a few key Rust abstractions are missing. The downstream branch (the parts of Tyr not yet in the mainline kernel) is where we house our latest prototype; it is working well enough to run desktop environments and games, even if there are still power-consumption and GPU-recovery problems that need to be fixed. The prototype will serve the purpose of guiding our upstream efforts and let us experiment with different designs.

A kernel-mode GPU driver such as Tyr is a small component backing a much larger user-mode driver that implements a graphics API like Vulkan or OpenGL. The user-mode driver translates hardware-independent API calls into GPU-specific commands that can be used by the rasterization process. The kernel's responsibility centers around sharing hardware resources between applications, enforcing isolation and fairness, and keeping the hardware operational. This includes providing the user-mode driver with GPU memory, letting it know when submitted work finishes, and giving user space a way to describe dependency chains between jobs. Our talk (YouTube video) at LPC2025 goes over this in detail.

[SuperTuxKart running on Tyr at LPC]

Having a working prototype does not mean it's ready for real-world usage, however, and a walkthrough of what is missing reveals why. Mali GPUs are usually found on mobile devices where power is at a premium. Conserving energy and managing the thermal characteristics of the device are paramount to user experience, and Tyr does not have any power-management or frequency-scaling code at the moment. In fact, Rust abstractions to support these features are not available at all.

Something else worth considering is what happens if the GPU hangs. It is imperative that the system remains working to the extent possible, or users might lose all of their work. Owing to our "prototype" state, there is no GPU-recovery code right now. Power management and GPU recovery are both hard requirements for deployability. One simply cannot deploy a driver that gobbles all of the battery in the system (making it hot and unpleasant in the process) or that crashes and takes the user's work with it.

On top of that, Vulkan must be correctly implementable on top of Tyr, or we may fail to achieve drop-in compatibility with our Vulkan driver (PanVK). This requires passing the Vulkan Conformance Test Suite (CTS) when using Tyr instead of the C driver. At that point, we would be confident enough to add support for more GPU models beyond the currently supported Mali-G610. Finally, we will turn our attention to benchmarking to ensure that Tyr can match the C driver's performance while benefiting from Rust's safety guarantees. We have demonstrated running a complex game with acceptable performance, so results are good so far.

Which Rust abstractions are missing

Some required Rust infrastructure is still work-in-progress. This includes Lyude Paul's work on the graphics execution manager (GEM) shmem objects, needed to allocate memory for systems without discrete video RAM. This is notably the case for Tyr, as the GPU is packaged in a larger system-on-chip and must share system memory. Additionally, there are still open questions, like how to share non-overlapping regions of a GPU buffer without locks, preferably encoded in the type system and checked at compile time.
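To make the last point a bit more concrete, here is a minimal user-space sketch of the simplest version of the idea: two non-overlapping regions of one buffer handed to two threads with no lock, with disjointness enforced by the borrow checker. The `Buffer` type and the function names are hypothetical stand-ins, not kernel APIs, and the sketch sidesteps the hard parts of the real question (regions chosen dynamically, memory that the GPU also touches).

```rust
use std::thread;

/// Hypothetical stand-in for a CPU-mapped GPU buffer. A real driver
/// would wrap a GEM object rather than a Vec.
struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    fn new(len: usize) -> Self {
        Buffer { data: vec![0; len] }
    }

    /// Split the buffer into two non-overlapping mutable regions at `mid`.
    /// While the returned slices are alive, the borrow checker refuses any
    /// other access through `self`. Panics if `mid` is past the end.
    fn split_regions(&mut self, mid: usize) -> (&mut [u8], &mut [u8]) {
        self.data.split_at_mut(mid)
    }
}

/// Fill both regions concurrently without taking any lock.
fn fill_in_parallel(buf: &mut Buffer, mid: usize) {
    let (left, right) = buf.split_regions(mid);
    // Scoped threads may borrow from the caller's stack; both are joined
    // before `scope` returns.
    thread::scope(|s| {
        s.spawn(move || left.fill(0xAA));
        s.spawn(move || right.fill(0x55));
    });
}
```

Trying to hand the same region to both threads, or to touch the buffer while the threads run, is rejected at compile time, which is the property one would like to preserve in a more general design.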

On top of allocating GPU memory, modern kernel drivers must let the user-mode driver manage its own view of the GPU address space. In the DRM ecosystem, this is delegated to GPUVM, which contains the common code to manage those address spaces on hardware that offers memory-isolation capabilities similar to modern CPUs. The GPU firmware also expects control over the placement of some sections in memory, so it will not work until this capability is available. Alice Ryhl is working on the Rust abstractions for GPUVM as well as the io-pgtable abstractions that are needed to manipulate the IOMMU page tables used to enforce memory isolation. These are both based on the previous work of Asahi Lina, who pioneered the first Rust abstractions for the DRM subsystem.
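As a rough mental model of what "managing an address space" means here, the following user-space sketch tracks the ranges mapped in one GPU virtual address space and rejects overlapping requests. The `AddressSpace` type and its methods are hypothetical and are not the GPUVM API; the real code does considerably more, including programming the page tables.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum MapError {
    Empty,
    Overflow,
    Overlap,
}

/// Toy model of one GPU virtual address space (hypothetical type).
/// Keys are start addresses, values are lengths in bytes.
struct AddressSpace {
    mappings: BTreeMap<u64, u64>,
}

impl AddressSpace {
    fn new() -> Self {
        AddressSpace { mappings: BTreeMap::new() }
    }

    /// Record a mapping of `len` bytes at `start`, refusing overlaps.
    fn map(&mut self, start: u64, len: u64) -> Result<(), MapError> {
        if len == 0 {
            return Err(MapError::Empty);
        }
        let end = start.checked_add(len).ok_or(MapError::Overflow)?;
        // The closest mapping that begins at or before `start` must
        // already have ended by `start`.
        if let Some((&s, &l)) = self.mappings.range(..=start).next_back() {
            if s + l > start {
                return Err(MapError::Overlap);
            }
        }
        // No existing mapping may begin inside [start, end).
        if self.mappings.range(start..end).next().is_some() {
            return Err(MapError::Overlap);
        }
        self.mappings.insert(start, len);
        Ok(())
    }

    /// Remove the mapping that begins at `start`, returning its length.
    fn unmap(&mut self, start: u64) -> Option<u64> {
        self.mappings.remove(&start)
    }
}
```

In this model, the firmware's requirement to place some sections at specific addresses amounts to calling `map()` with a fixed `start` before the firmware is booted.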

Another unsolved issue is DRM device initialization. The current code requires an initializer for the driver's private data in order to return a drm::Device instance, but some drivers need the drm::Device to build the private data in the first place, which leads to an impossible-to-satisfy cycle of dependencies. This is also the case for Tyr: allocating GPU memory through the GEM shmem API requires a drm::Device, but some fields in Tyr's private data need to store GEM objects — for example, to parse and boot the firmware. Lyude Paul is working on this by introducing a drm::DeviceCtx that encodes the device state in the type system.
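The general idea of encoding device state in the type system can be illustrated with a small typestate sketch. Everything below (`Device<S>`, `Uninit`, `Ready`, `alloc_gem()`, `finish()`) is a hypothetical user-space model and should not be read as the actual drm::DeviceCtx design; it only shows how splitting initialization into two typed phases breaks the cycle.

```rust
/// Stand-in for a GEM object; creating one requires a device handle.
struct GemObject {
    size: usize,
}

/// Driver-private data that itself owns a GEM object (for example,
/// memory holding the firmware image).
struct PrivateData {
    firmware: GemObject,
}

/// State marker: the device exists but has no private data yet.
struct Uninit;

/// State marker: the device is fully initialized and owns its data.
struct Ready {
    data: PrivateData,
}

/// Hypothetical device handle, parameterized by its state.
struct Device<S> {
    state: S,
}

impl<S> Device<S> {
    /// Allocation only needs a device handle, in any state.
    fn alloc_gem(&self, size: usize) -> GemObject {
        GemObject { size }
    }
}

impl Device<Uninit> {
    fn new() -> Self {
        Device { state: Uninit }
    }

    /// Consume the half-built device and return a fully built one.
    fn finish(self, data: PrivateData) -> Device<Ready> {
        Device { state: Ready { data } }
    }
}

impl Device<Ready> {
    /// Only callable once the private data is known to exist.
    fn firmware_size(&self) -> usize {
        self.state.data.firmware.size
    }
}

/// The probe path: get a device, allocate with it, then attach the data.
fn probe() -> Device<Ready> {
    let dev = Device::<Uninit>::new();
    let firmware = dev.alloc_gem(4096);
    dev.finish(PrivateData { firmware })
}
```

Because `firmware_size()` exists only on `Device<Ready>`, code that needs the private data cannot be called on a device that does not have it yet.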

The situation remains the same as when the first Tyr patches were submitted: most of the roadmap is blocked on GEM shmem, GPUVM, io-pgtable and the device initialization issue. There is room to integrate some work by the Nova team, as well: the register! macro and bounded integers. Once we can handle those items, we expect to quickly become able to boot the GPU firmware and then progress unhindered until it is time to discuss job submission.

Another area needing consideration is the set of paths where the driver makes forward progress on completing fences, which are synchronization primitives that GPU drivers signal once jobs finish executing. These paths must be carefully annotated or the system may deadlock, and the driver must ensure that only safe locks are taken in the signaling path. Additionally, DMA fences must always signal in finite time, or someone elsewhere in the system may block forever. Allocating memory in these paths using anything other than GFP_ATOMIC must be disallowed, or the shrinker may kick in under memory pressure and wait on the very job that triggered it. All of this is covered in the documentation. We conveniently ignore this in the prototype, meaning it can randomly deadlock under memory pressure. Addressing this is straightforward: it is just a matter of carefully vetting key parts of the driver. Doing so elegantly, however, and perhaps in a way that takes advantage of Rust's type system, is something that remains to be discussed.
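One direction such a discussion could take is to use capability tokens, so that functions which may block can only be called with proof that the caller is outside the signaling path. The sketch below is purely speculative: `MayBlock`, `Signaling`, and every function in it are invented for illustration, and nothing stops code from bypassing the tokens. It only shows the shape of the idea, with allocation done up front and the signaling path working on pre-allocated storage.

```rust
/// Hypothetical token: the holder may sleep or allocate with blocking.
struct MayBlock {
    _private: (),
}

/// Hypothetical token: the holder is inside a fence-signaling path.
struct Signaling {
    _private: (),
}

impl MayBlock {
    /// Minted where blocking is known to be allowed.
    fn new() -> Self {
        MayBlock { _private: () }
    }

    /// Entering the signaling path gives up the right to block.
    fn begin_signaling(self) -> Signaling {
        Signaling { _private: () }
    }
}

impl Signaling {
    /// Leaving the signaling path hands the right to block back.
    fn end(self) -> MayBlock {
        MayBlock { _private: () }
    }
}

struct Fence {
    signaled: bool,
}

/// Models an allocation that may trigger reclaim, so it demands proof
/// that blocking is allowed.
fn alloc_may_block(_proof: &MayBlock, len: usize) -> Vec<u8> {
    vec![0; len]
}

/// Runs in the signaling path: uses storage allocated earlier and
/// never allocates.
fn complete_job(_proof: &Signaling, scratch: &mut [u8], fence: &mut Fence) {
    scratch.fill(1);
    fence.signaled = true;
}

fn run_job() -> Fence {
    let ctx = MayBlock::new();
    // Allocate everything the signaling path needs before entering it.
    let mut scratch = alloc_may_block(&ctx, 16);
    let mut fence = Fence { signaled: false };

    let sig = ctx.begin_signaling();
    // alloc_may_block(&ctx, 16); // would not compile: `ctx` was moved
    complete_job(&sig, &mut scratch, &mut fence);
    let _ctx = sig.end();
    fence
}
```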

Looking into the future

We have not touched upon what is next for Linux GPU drivers as a whole: reworking the job-submission logic in Rust. The current design assumes that drm_gpu_scheduler is used, but this has become a hindrance for some drivers in an age where GPU firmware can schedule jobs itself, and it's been plagued by hard-to-solve lifetime problems. Quite some time was spent at the X.Org Developer's Conference in 2025 discussing how to fix it.

The current consensus for Rust is to write a new component that merely ensures that the dependencies for a given job are satisfied before the job is eligible to be placed in the GPU's ring buffer, at which point the firmware scheduler takes over. This seems to be where GPU hardware is going, as most vendors have switched to firmware-assisted scheduling in recent years. As this component will not schedule jobs, it will probably be called JobQueue instead. The name correctly conveys the idea of a queue into which new work is deposited, and from which a job is removed once its dependencies are met and it is ready to run. Philipp Stanner has been spearheading this work.
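The behavior described above can be modeled in a few lines of user-space Rust. This is only a sketch of the concept, not the design under discussion: the types, the integer fence IDs, and the strict first-in-first-out release order are all assumptions made for the example.

```rust
use std::collections::{HashSet, VecDeque};

type FenceId = u32;
type JobId = u32;

struct Job {
    id: JobId,
    /// Fences that must signal before this job may run.
    deps: HashSet<FenceId>,
}

/// Toy model of a dependency-tracking queue (hypothetical API).
struct JobQueue {
    signaled: HashSet<FenceId>,
    pending: VecDeque<Job>,
    /// Stand-in for the GPU ring buffer read by the firmware.
    ring: VecDeque<JobId>,
}

impl JobQueue {
    fn new() -> Self {
        JobQueue {
            signaled: HashSet::new(),
            pending: VecDeque::new(),
            ring: VecDeque::new(),
        }
    }

    /// Deposit a job, ignoring dependencies that have already signaled.
    fn push(&mut self, id: JobId, deps: &[FenceId]) {
        let deps: HashSet<FenceId> = deps
            .iter()
            .copied()
            .filter(|f| !self.signaled.contains(f))
            .collect();
        self.pending.push_back(Job { id, deps });
        self.flush();
    }

    /// Record that a fence has signaled and release any unblocked jobs.
    fn signal(&mut self, fence: FenceId) {
        self.signaled.insert(fence);
        for job in self.pending.iter_mut() {
            job.deps.remove(&fence);
        }
        self.flush();
    }

    /// Move jobs to the ring in submission order, stopping at the first
    /// job that still has unmet dependencies.
    fn flush(&mut self) {
        while self.pending.front().map_or(false, |j| j.deps.is_empty()) {
            if let Some(job) = self.pending.pop_front() {
                self.ring.push_back(job.id);
            }
        }
    }

    fn ring_contents(&self) -> Vec<JobId> {
        self.ring.iter().copied().collect()
    }
}
```

Note that the queue never decides which job the hardware runs next; it only gates entry to the ring, which matches the division of labor with a firmware scheduler.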

The plan is to also expose an API for C drivers using a technique I have described here in the past. This will possibly be the first Rust kernel component usable from C drivers, another milestone for Rust in the kernel, and a hallmark of seamless interoperability between C and Rust.
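For readers unfamiliar with how Rust code can be called from C at all, the usual building blocks are functions with the C calling convention and an opaque pointer that C passes back on every call. The sketch below shows only that generic pattern; it is not necessarily the technique referred to above, and the `jobqueue_*` names and the `Queue` type are made up for the example.

```rust
use std::collections::VecDeque;
use std::ffi::c_void;

/// Rust-side state. C callers only ever see an opaque pointer to it.
struct Queue {
    jobs: VecDeque<u32>,
}

const EINVAL: i32 = 22;

// Note: a real export would also carry a no_mangle attribute so that C
// can link against the symbol name. It is omitted here because its
// spelling differs between Rust editions.

/// Create a queue and hand ownership to the caller.
pub extern "C" fn jobqueue_new() -> *mut c_void {
    Box::into_raw(Box::new(Queue { jobs: VecDeque::new() })) as *mut c_void
}

/// Safety: `q` must be null or a live pointer from `jobqueue_new`.
pub unsafe extern "C" fn jobqueue_push(q: *mut c_void, job_id: u32) -> i32 {
    if q.is_null() {
        return -EINVAL;
    }
    let queue = unsafe { &mut *(q as *mut Queue) };
    queue.jobs.push_back(job_id);
    0
}

/// Safety: `q` must be null or a live pointer from `jobqueue_new`.
pub unsafe extern "C" fn jobqueue_len(q: *const c_void) -> usize {
    if q.is_null() {
        return 0;
    }
    let queue = unsafe { &*(q as *const Queue) };
    queue.jobs.len()
}

/// Safety: `q` must be null or a pointer from `jobqueue_new` that has
/// not been freed yet. The pointer must not be used afterwards.
pub unsafe extern "C" fn jobqueue_free(q: *mut c_void) {
    if !q.is_null() {
        // Rebuild the Box so the queue is dropped normally.
        drop(unsafe { Box::from_raw(q as *mut Queue) });
    }
}
```

The C side would see these as ordinary functions taking a `void *` handle; all of the ownership rules that Rust normally checks become documented obligations on the caller.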

One way that Tyr can fit into this overall vision is by serving as a testbed for the new design. If the old drm_gpu_scheduler can be replaced with the JobQueue successfully in the prototype, it will help attest to its suitability for other, more complex drivers like Nova. Expect this discussion to continue for a while.

In all, Tyr has made a lot of progress this past year. Hopefully, it will continue to do so through 2026 and beyond.


Index entries for this article
GuestArticles: Almeida, Daniel