The Lustre filesystem has a long history, some of which intersects with Linux. It was added to the staging tree in 2013, but was bounced out of staging in 2018, due to a lack of progress and a development model that was incompatible with the kernel's. Lustre may be working its way back into the kernel, though. In a filesystem-track session at the 2025 Linux Storage, Filesystem, Memory Management, and BPF Summit (LSFMM+BPF), Timothy Day and James Simmons led a discussion on how to get Lustre into the mainline.
Day began with an overview of Lustre, which is a "high-performance
parallel filesystem". It is typically used by systems with lots of GPUs that
need to be constantly fed with data (e.g. AI workloads) and for
checkpointing high-performance-computing (HPC) workloads. A file is split
up into multiple chunks that are stored on different servers. Both the
client and server implementations run in the kernel, similar to NFS. For
the past ten or more years, the wire and disk formats have been "pretty
stable" with "very little change"; Lustre has good
interoperability between different versions, unlike in the distant past where both
server and client needed to be on the same version.
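
To make the chunking concrete, here is a minimal sketch, assuming simple round-robin (RAID-0-style) striping of a file across servers. The function name, the stripe size, and the stripe count are hypothetical choices for illustration; they are not taken from the Lustre code or from the session.

```python
from typing import NamedTuple


class StripeLocation(NamedTuple):
    server_index: int   # hypothetical numbering of the servers holding the file
    object_offset: int  # byte offset inside that server's portion of the file


def locate(file_offset: int, stripe_size: int, stripe_count: int) -> StripeLocation:
    """Map a byte offset in a file to a server and an offset on that server.

    Assumes chunks of `stripe_size` bytes are handed out to `stripe_count`
    servers in round-robin order (an assumption made for this sketch).
    """
    if file_offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe size and count must be > 0")

    chunk_number = file_offset // stripe_size      # which chunk of the file
    server_index = chunk_number % stripe_count     # round-robin server choice
    chunks_before = chunk_number // stripe_count   # earlier chunks on this server
    object_offset = chunks_before * stripe_size + file_offset % stripe_size
    return StripeLocation(server_index, object_offset)


if __name__ == "__main__":
    MIB = 1024 * 1024
    # With 1 MiB chunks over four servers, consecutive chunks land on
    # consecutive servers, so a large sequential read touches all of them.
    for offset in (0, MIB, 4 * MIB + 5):
        print(offset, locate(offset, stripe_size=MIB, stripe_count=4))
```

Under that assumption, a large sequential read is spread over every server at once, which is where the parallelism comes from.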
The upstreaming project has been going on for a long time at this
point, he said. A fork of the client was added to the staging tree and
resided there for around five years before "it got ejected, essentially
due to insufficient progress". It was a "bad fit" for the
kernel, since most developers worked on the out-of-tree version, rather
than what was in staging.
But "the dream of actually getting upstream still continued". There
have been more than 1000 patches aimed at getting the code ready for the
kernel since it got ejected; around 600 of those were from Neil Brown and 200
came from Simmons. Roughly 1/3 of the patches that have gone into the
out-of-tree repository since the staging removal
have been related to the upstream goal, Simmons said.
Day said that the biggest question is how the project can move from its
out-of-tree development model to one that is based around the upstream
kernel repository. The current state is "a giant filesystem, #ifdef-ed
to hell and back to get it working with a bunch of kernel versions".
The next stage, which is currently being worked on and is slated to
complete in the next year or so, is to split the compatibility code out of
the core filesystem code; the goal is to eventually have two separate trees
for those pieces. The core filesystem tree would go into the kernel tree,
while the compatibility code, which is meant to support customers on
older kernels, would continue to
live in a Lustre repository.
Another area that needs attention is changes to the development process to
better mesh with kernel development. The Lustre project does not use
mailing lists; it uses a Gerrit instance
instead. "We have got to figure out how to adapt." Simmons said
that there are some developers who are totally Gerrit-oriented and some who
could live with a mailing list; "we have to figure out how to please
both audiences".
Amir Goldstein said that the only real requirement is that the project post the patches once to the mailing list before merging; there is no obligation to do patch review on the list. Simmons said that he and Brown have maintained a Git tree since Lustre was removed from staging; it is kept in sync and updated to newer kernels. All of the patches are posted to the lustre-devel mailing list and to a Patchwork instance, so all of the history is open and available for comments or criticism, he said.
Josef Bacik asked about what went wrong when Lustre was in the staging
tree. From his perspective, Lustre has been around for a long time and
there are no indications that it might be abandoned, so why did it not make
the jump into the mainline fs/ directory? Simmons said that the
project normally has a few different features being worked on at any given
time, but Greg Kroah-Hartman, who runs the staging tree, did not want any
patches that were not cleanups. So the staging version fell further and
further behind the out-of-tree code. Bacik said that made sense to him;
"that's like setting you up to fail".
Christian Brauner said that he would like to come up with a "more streamlined
model" for merging new filesystems, where the filesystem community
makes a collective decision on whether the merge should happen. The
community has recently been "badly burned by 'anybody can just send a
filesystem for inclusion' and then it's upstream and then we have to deal with
all of the fallout". As a VFS maintainer, he does not want to be the
one making the decision, but merging a new filesystem "puts a burden on
the whole community" so it should be a joint decision.
Bacik reiterated that no one was concerned that Lustre developers were
going to disappear, but that there are other concerns. It is important
that Lustre is using folios everywhere, for example, and is using "all
of the modern things"; that sounds a little silly coming from him,
since Btrfs is still only halfway there, he said. Simmons said that the
Lustre developers completely agree; there is someone working on the folio
conversion currently. At the summit, he and Day have been talking with David
Howells about using his netfs library.
Jeff Layton asked if the plan was to merge both the client and the server.
Simmons said that most people are just asking for the client and that is a
slower-moving code base. The client is "a couple-hundred patches per
month, the server is three to four times the volume of patches", which
makes it harder to keep up with the kernel. Layton said: "baby steps
are good", though Day noted that it is harder to test Lustre without
having server code in the kernel.
The reason he has been pushing for just merging the client, Ted Ts'o said,
is that the server is "pretty incestuous with ext4". The server
requires a bunch of ext4 symbols and there is "a need to figure out how
to deal with that". That impacts ext4 development, but other changes,
such as the plan to rewrite the jbd2
journaling layer to not require buffer heads,
may also be complicated by the inclusion of the Lustre server, he said.
Simmons asked about posting patches to the linux-fsdevel mailing list before Lustre is upstream so that the kernel developers can start to get familiar with the code. Bacik said that made sense, but that he would not really dig into the guts of Lustre; he is more interested in the interfaces being used and whether there will be maintenance problems for the kernel filesystem community in the Lustre code. Goldstein suggested setting up a Lustre-specific mailing list, but Simmons noted that lustre-devel already exists and is being archived; Brauner suggested getting it added to lore.kernel.org.
The intent is that when Linus Torvalds receives a pull request for a new filesystem, he can see that the code has been publicly posted prior to that, Goldstein said. It will also help if the Git development tree has a mirror on git.kernel.org, Ts'o said. Bacik said that he thought it was probably a lost cause to try to preserve all of the existing Git history as part of the merge, though it is up to Torvalds; instead, he suggested creating a git.kernel.org archive tree that people can consult for the history prior to the version that gets merged.
Given that Lustre targets high performance, Ts'o said, it will be important to support large folios. Simmons said that someone was working on that, and that it is important to the project; the plan is to get folio support, then to add large folios. Matthew Wilcox said that was fine, as long as the page-oriented APIs were getting converted. Many of those APIs are slowly going away, so the Lustre developers will want to ensure the filesystem is converted ahead of those removals.
| Index entries for this article | |
|---|---|
| Kernel | Filesystems/Lustre |
| Conference | Storage, Filesystem, Memory-Management and BPF Summit/2025 |