Global Kernel Locks in APFS
gregoryszorc.com

I wonder if this is why recent versions of macOS perform so poorly on old HDD-based MacBooks. The difference between my work's 2012 MacBook (HDD) and my personal 2015 MacBook (SSD) is night and day. Most apps open in a second on my SSD MacBook yet take up to a minute to open on my employer's HDD MacBook. It's not just this MacBook; I see the same on a work iMac as well. Such a shame, as these older machines previously didn't run so slowly.
Replacing an HDD with an SSD really is a night-and-day kind of difference for any laptop I've ever seen, regardless of the OS or filesystem.
IMO, that's likely to be a much bigger factor than the filesystem.
Maybe it's analogous to the old adage "what Intel giveth, Microsoft taketh away". As soon as a critical mass has SSDs, no one cares to test for regressions on spinning-disk systems...
Actually, bad I/O patterns are still bad with SSDs; they're just less bad. So while an HDD user may have to wait a week for your app to do its thing, the SSD user may only need to wait an hour or two. But if you bothered to do things the right way, either could be done in a few minutes.
One of the large World of Tanks (online game) patches last year turned 10-second map load times into a full-minute wait for HDD owners ;-) SSD users didn't notice a thing. Some idiot with a dev workstation (Xeon, high xx GB RAM, NVMe storage) decided to rearrange data structures in files loaded every game round, and nobody had time for testing, apparently.
Unfortunately, that's not the biggest factor, though. I too have an old 2012 MacBook Pro with an HDD. While it flew under Snow Leopard, it now grinds to a halt every time I open any app or document. I've already ordered an SSD but am still dreaming of downgrading it back to Snow Leopard.
Microsoft Windows used to be notorious for slowing down over time. A fresh OS install would restore performance. Many would recommend doing a fresh Windows install yearly to maintain performance.
I have a late-2012 MacBook Pro Retina with an SSD and unfortunately notice the same decreased performance over time with macOS. A fresh OS install breathes new life into the machine.
When you get your SSD, do a fresh OS install instead of restoring the complete HDD to the SSD.
Or just clean out your random LoginItems and kexts and whatever now and then. Check Activity Monitor and uninstall anything running that you don’t need.
Could be that the rewritten areas for the OS are showing some wear, requiring multiple reads.
Cause and effect are ambiguous because it's all so transparent; one doesn't know which interrupts, and where, are taking the most time.
Might also be that newer versions of macOS need more RAM than Snow Leopard and therefore less is available for filesystem caches.
This might be a factor although IIRC there has been some work after Lion (Mavericks?) where OS RAM usage went down significantly. It was really noticeable on 2GB VMs.
Mavericks introduced compressed memory, so that may be the improvement you’re talking about.
2015 was the year that Apple moved to NVMe SSDs, so they are bound to feel much more snappy than spinning rust disks, regardless of the file system involved.
https://www.anandtech.com/show/9136/the-2015-macbook-review/...
Except for their desktop models: the base models of even the 5K iMac still ship with a Fusion Drive, and we only just got support for APFS on those configurations with macOS 10.14. As the base models are typically all that authorized Apple resellers and pop-up stores get, they must sell a bunch of those, especially in regions where there aren't a lot of Apple Stores (and where people don't want to pay full price for a year-old system).
They are going to need to support and perform well on non-SSD systems for quite a while.
APFS is still (AFAIK) only used on SSDs, so your HDD boot volume is probably HFS+ even on Mojave.
The author worked very hard to determine the root cause of the problem but he was stymied because the APFS source code is not available. Why not? Apple has open-sourced the Swift compiler with great success, but there seems to be no movement within Apple to open up other system software components.
They do plan to document it fully later:
> Is APFS open source?
> An open source implementation is not available at this time. Apple plans to document and publish the APFS volume format specification.
Source: https://developer.apple.com/library/archive/documentation/Fi...
My theory is that Apple rushed APFS and its implementation isn't in great shape for public review. APFS doesn't feel any faster on my SSDs on macOS. I suspect they'll open it up in a few years, after a few more iterations.
> My theory is that Apple rushed APFS and its implementation isn't in a great shape for the public review.
The rollout of APFS was _incredibly_ well done; I don't think there is even a remote comparison to how successful it was. A new file system was rolled out to millions of devices in an automated fashion with very few issues. I don't think anyone else has even considered such a thing. I highly doubt the code is that rough; I can't imagine pulling this off without fairly solid core code in place.
Agreed. For iOS, the rollout was so smooth that most people weren't even aware it was taking place, with the only observable effect being that the iOS upgrade took a bit longer than usual. For the macOS rollout, there were more hiccups, but still, extremely smooth. In fact, I just had to launch Disk Utility to double-check that this computer was in fact migrated to APFS because I couldn't recall actually seeing anything specific about it (as this computer has a fusion drive, as well as a Bootcamp partition).
The Mojave update trashed my Fusion Drive to the extent that plugging any other Mac into my iMac in target disk mode made the host Mac instantly kernel panic. And I lost all my files, obv.
But you did backup before update, didn’t you?
You mean besides the fact that the volume encryption password was stored in plain text in the disk?
They released the format spec here: https://developer.apple.com/support/apple-file-system/Apple-...
What makes you think it’s a rush job?
Perhaps it’s just the lawyer reviews holding it up.
> What makes you think it’s a rush job?
A global lock on readdir().
My personal guess is that they wanted to release the printed spec before they released the code. The spec has now been released; the 10.14 source has not. Maybe apfs.kext will be in the 10.14 source drop.
(also, HFS+ and its fsck have always been open-source with OS X)
HFS+ was first released for Mac OS 8.1 in 1998. The earliest source code I see is from Mac OS X 10.0, which was released in March of 2001 (not sure when the source was released). The first iPod was released in October of 2001 which used HFS+. I imagine by the time that code was released it had been battle hardened quite a bit.
Because on two separate occasions APFS ate all of my free space, resulting in me reinstalling macOS as the system was locking up. (Yes, I filed a radar, and yes, they fixed it, only for it to regress 6-8 months later.)
> They do plan to document it fully later:
> > Is APFS open source?
> > An open source implementation is not available at this time. Apple plans to document and publish the APFS volume format specification.
They claim to plan to document it fully later.
The spec has already been released.
It's a partial spec.
The APFS source code isn't available, but you're free to disassemble apfs_vnop_readdir in apfs.kext. I'm seeing a couple of calls to lck_rw_lock_shared, so it's entirely possible that there's a lock here.
This. It's what everyone I know who works in the Windows space does when they have a deep problem to investigate. Lack of source doesn't mean lack of ability to investigate, and sometimes the source doesn't tell the whole story either.
(Yes, I know about the anti-RE clauses in EULAs. If they were actually enforced as strictly as they claim, people like Mark Russinovich and Matt Pietrek would've been sued out of existence long ago, along with just about every Windows security researcher.)
The Wine source is often helpful too.
Apple releases most of the macOS kernel but I guess it doesn't include APFS: https://opensource.apple.com/source/xnu/
That's missing a good chunk of the interesting stuff related to how users interact with the os though. (app and framework source code)
Yeah, apps and frameworks aren't part of the kernel. But APFS is.
It's loaded into the kernel, but it's a kext (kernel extension) rather than being part of the xnu source tree itself. Only a subset of kexts are open source. (By comparison, HFS used to be in the xnu tree, but it was moved into a kext as well a few years back; that one is open source.)
Kind of talking out of my ass, but Darwin can act as a microkernel, so APFS could be implemented as a user-mode service (no idea if that's the case, though).
Darwin's XNU is not a microkernel, in spite of there being a microkernel version of XNU. While Darwin does support FUSE, performance would almost certainly be inadequate, given the constant transitions between user and kernel space.
I thought it was a mixture of OSFMK and BSD, which is why I said it can act like a microkernel. Don't know more than what wikipedia says, though.
Yes, it is, but the Mach layer and BSD layer were fused into a monolithic kernel before NeXTSTEP turned into OS X. In OS X, HFS (and presumably APFS) lives in the BSD layer.
Pretty sure the code would expose just how much of a rush job Apple did with APFS. It may be released one day...
I'm too late to the thread, but I wonder if the locking might have to do with updating access times on the dirs on read -- that would help justify a read-only workload taking some sort of exclusive lock. The "fix" there would be something like relatime, so atimes get written infrequently.
There are definitely a ton of other possibilities. Some locking could be overbroad, like an exclusive lock where a reader lock would suffice. I don't see super-strong evidence that the locks involved are actually global, just that they're under contention here. And I can understand shipping a product with somewhat excessive locking: perf issues with specific workloads are a better problem to have with your shiny new FS rollout than data loss.
Anyway it's mostly shooting in the dark for us here (though cool some folks disassembled the functions), but there are some shots in the dark that look different from what I could find in the comments already here :)
Absolutely nothing there indicates the presence of global kernel locks. readdir() (obviously!) needs to hold a lock; it's not global, it's per-DIR.
As someone who used to work on BitKeeper, I'm tickled to see a more popular SCM's test suite as a kind of de facto filesystem benchmark -- confirming what we had been observing ourselves forever.