Most M1 Macs appear to have a serious SSD wear defect

twitter.com

31 points by jabberwcky 5 years ago · 31 comments

marcan_42 5 years ago

Okay, hold on for a second. I am the OP. This title is misleading. We didn't say "most". I wasn't able to clearly reproduce this, the people on Twitter are reporting wildly different numbers, some of which are reasonable, and certain usage patterns with heavy swapping have to wear down the SSD. And this also affects Intel macs to some extent.

What we know so far is that some people are reporting completely unreasonable, dangerous levels of SSD write volume, which on some configurations would be expected to prematurely wear down SSDs (which are not replaceable) well before the expected lifetime of the device. We don't know yet what triggers this; it looks like some kind of bug or software combination, but it's too early to have good leads.

The worst example so far is David's, which, if scaled proportionally (by Flash cell wear) from his 2TB SSD to a base 256GB model, would be reaching 100% lifetime usage within less than a year. This calculation is based on an assumption of equal (proportional) overprovisioning for those two models, and an assumption that whatever triggers this behavior doesn't care about total SSD size; these are sensible assumptions to make at this stage but not verified.

https://twitter.com/marcan42/status/1361160838854316032

This is clearly a software problem, and fixable with an update. It's not a hardware defect, it's the kernel (we think it's a VM* issue) hammering the SSD way too much.

There is reason to be concerned, and to make sure Apple fixes this if it is a real bug. There is no reason to be alarmed and panic about these machines. If this is a real issue they have to fix it; they aren't stupid, Apple knows full well that the SSDs in some M1 Macs dropping like flies within a year would be a PR disaster for them. What we need to do now is gather data and try to find a way to reproduce this.

* VM means Virtual Memory, not Virtual Machine, in this context, for all you non-kernel folks.

  • marcan_42 5 years ago

    Just as an update, it turns out the endurance ratings are not proportional. That makes the known-worst-case prediction for a 256GB model that max writes could be reached after ~2 years, for usage similar to what David saw above.

uniqueid 5 years ago

No problem, just make sure you clone to an external dri... oh.

  • coldtea 5 years ago

    No problem, period. Just a guy that can't understand the diagnostics he reads.

wooger 5 years ago

Never buy 1st Gen Mac hardware. As true today as it was in 1995

8fingerlouie 5 years ago

Considering the data available, 1% used in 2 months, that is 6% in a year, or roughly 16 years to 100%. SSDs have wearout indicators for a reason, but how long do you honestly expect it to last? Had it been spinning rust you'd be lucky to get 5 years out of it.

Most users will be able to get a decade worth of usage out of it.

Nothing to see here, move along.
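
The projection above is a straight-line extrapolation. A minimal sketch of that arithmetic, assuming wear accrues at a constant rate and that the whole-number "Percentage Used" reading is exact (both are assumptions, and the function name is made up for illustration):

```python
def years_to_full_wear(percent_used: float, months_elapsed: float) -> float:
    """Project years until 'Percentage Used' reaches 100, assuming a constant rate."""
    percent_per_year = percent_used / months_elapsed * 12
    return 100 / percent_per_year


# 1% in 2 months, the figure used in the comment above
print(round(years_to_full_wear(1, 2), 1))  # 16.7

# 3% in 2 months, the worst case reported elsewhere in this thread
print(round(years_to_full_wear(3, 2), 1))  # 5.6
```

Because the reading is a whole number, a reported 1% carries a large relative uncertainty, so the 16-year figure is only a rough estimate.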

  • teruakohatu 5 years ago

    > or roughly 16 years to 100%

    That is two big assumptions: assuming linear degradation and assuming the mac is usable until it reaches 0%.

    I would think that it starts degrading faster as it wears, and that by the time it is at least 90% on a 256GB or even 512GB SSD it will be effectively unusable.

    • 8fingerlouie 5 years ago

      > That is two big assumptions: assuming linear degradation and assuming the mac is usable until it reaches 0%.

      I thought I already accounted for that by stating "most users will get a decade of use". That's roughly 66% of linear degradation to 0%.

      > and that by the time it is at least 90% on a 256GB or even 512GB SSD it will be effectively unusable.

      As I understand it, the "lifetime" is a reflection of the spare sectors, meaning once the spare sectors run out you'll start seeing errors instead of relocated sectors. It will probably continue for some time after that.

      Knowing Apple, if it becomes a problem anytime within the first 5-6 years, they'll replace it for free. I had a 2008 iMac with a manufacturing problem on the Seagate 1TB drive it shipped with, and 5 years after purchase (2013), I was able to get the drive replaced for free despite my machine showing no signs of the error. 5 years after purchase, and a full 3 years after most consumer laws stop protecting you.

      Other than that, the only reason this is a "problem" is because SSDs have an indicator that tells you when they'll expire. Spinning rust doesn't have that, but some spinning rust will also expire after 5-6 years, and most will have expired by the time a decade has passed (assuming daily usage).

      • kalleboo 5 years ago

        Lifetime is just "numbers of bytes written"/"expected lifetime bytes written"

        There's a separate SMART attribute for remaining spare, and that is supposed to stay at 100% until you reach 100% lifetime, but the actual failure of cells can come sooner or later.

        • marcan_42 5 years ago

          The scary conclusion I came to is based on the assumption of "expected lifetime bytes written" being proportional to SSD size (i.e. based on fixed Flash erase counts, which is usually how it goes); for the worst example we have so far, scaling his 3% from 2T to 256GB machines would mean some of those could be reaching 100% before the first year is up.
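
The proportional-scaling argument above can be written out as a short calculation. This is only a sketch under the stated assumptions: rated endurance scales linearly with capacity, the write volume does not depend on drive size, and 2T is taken as 2000 GB. The function names are hypothetical.

```python
def scale_percent_used(percent_used: float, from_capacity_gb: float, to_capacity_gb: float) -> float:
    """Wear percentage the same write volume would cause on a smaller drive,
    assuming rated endurance is proportional to capacity."""
    return percent_used * from_capacity_gb / to_capacity_gb


def months_to_full_wear(percent_used: float, months_elapsed: float) -> float:
    """Months until 100% at a constant wear rate."""
    return 100 * months_elapsed / percent_used


scaled = scale_percent_used(3, 2000, 256)  # 3% on a 2 TB drive, mapped to 256 GB
print(scaled)  # 23.4375
print(round(months_to_full_wear(scaled, 2), 1))  # 8.5
```

Elsewhere in the thread the proportionality assumption turns out not to hold, which moves the worst-case estimate from under a year to roughly two years.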

supermatt 5 years ago

Seeing terabytes of writes may seem scary, but the device is reporting only 1% of its lifetime used ("percentage used"). It seems the author doesn't know how to read (or for some reason doesn't trust) the report he ran.

This is the figure that storage manufacturers use in their warranties to decide if a device has been excessively used - NOT the amount of writes.

  • marcan_42 5 years ago

    The worst example we have so far shows 3% lifetime usage in 2 months, on a 2TB model. Since the total lifetime write spec normally scales with SSD size (these things are usually specced in "drive writes per day" for a given fixed lifetime for that reason), assuming this excessive write volume issue itself does not scale with drive size, a 256GB SSD model would be at 23% lifetime usage under the same conditions, or about 100% in 8 months.

    Even if you just look at the 2TB model itself, 3% in 2 months means the thing will reach 100% in 5 years which is... not great.

    So there is reason to be concerned here.

    • supermatt 5 years ago

      Instead of "assuming", I checked my machine. I have a 256GB version with 8GB RAM. I received it 3 days after launch, and it has been used consistently for iOS, Android and web development. It is showing 1% "Used" - the same figure as the OP. I'm not sure where you are getting 3% from.

      • marcan_42 5 years ago

        Please don't judge me (OP) by the HN submission title, which is not at all an assertion I made. Nobody ever said "most" machines are affected.

        This is the worst example so far, 3% on a 2TB SSD.

        https://twitter.com/david_rysk/status/1361155414994407424

        Care to share the actual write volume that corresponds to 1% used for you on your 256GB drive, so we can validate whether the linear scaling by drive size assumption is reasonable?

        • supermatt 5 years ago

            SMART/Health Information (NVMe Log 0x02)
            Critical Warning:                   0x00
            Temperature:                        32 Celsius
            Available Spare:                    100%
            Available Spare Threshold:          99%
            Percentage Used:                    1%
            Data Units Read:                    42,513,003 [21.7 TB]
            Data Units Written:                 35,670,814 [18.2 TB]
            Host Read Commands:                 444,309,142
            Host Write Commands:                237,126,099
            Controller Busy Time:               0
            Power Cycles:                       213
            Power On Hours:                     199
            Unsafe Shutdowns:                   13
            Media and Data Integrity Errors:    0
            Error Information Log Entries:      0
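
For reference, the bracketed terabyte figures in the report above can be reproduced from the raw counters. A minimal sketch, assuming the NVMe convention that one data unit is 1000 blocks of 512 bytes, and assuming the drive is 256 GB in decimal units (the capacity comes from the comment, not from the log):

```python
DATA_UNIT_BYTES = 512 * 1000  # NVMe counts data units in thousands of 512-byte blocks


def data_units_to_tb(units: int) -> float:
    """Convert an NVMe 'Data Units' counter to decimal terabytes."""
    return units * DATA_UNIT_BYTES / 1e12


written_tb = data_units_to_tb(35_670_814)
read_tb = data_units_to_tb(42_513_003)

# Full-drive writes so far, assuming a 256 GB (256e9 byte) drive
drive_writes = written_tb * 1000 / 256

print(round(written_tb, 2))    # 18.26
print(round(read_tb, 2))       # 21.77
print(round(drive_writes, 1))  # 71.3
```

The bracketed values in the log appear to be truncated rather than rounded, which would explain 18.26 showing up as 18.2.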

          • marcan_42 5 years ago

            The SMART info does not include the drive size, which was an omission in my request for info tweet which I regret :)

            So 18TB / 256GB = 70 drive writes is 1%.

            For the 2T drive, we had 150TB = ~75 drive writes being 3%.

            So, it seems this isn't linear after all. But it's also not constant; 150TB would put you at 8% used on your 256GB model with this scaling (instead of the ~23% if it were linear), which is still not insignificant, and would still get you to 100% within a couple years. Though there is significant rounding error with the "1%" figure.

            I saw someone else mention 3% on Twitter too and asked for their drive size, so I hope that can give me another more accurate data point.

            • supermatt 5 years ago

              I just assumed that this was some deep "app nap", where it's effectively swapping the entire state of an inactive app to disk. That would likely explain why the usage is scaling with available RAM, rather than storage space.

              I get that it's a much higher wear than we would traditionally expect, but given I have 8GB of RAM that seems near inexhaustible, maybe it's just a different approach to memory management, permitted by the blazingly fast storage we now have access to? I simply couldn't run all this stuff on my 16GB 2015 machine.

              I can't run Figma though - that thing eats 4GB for a smallish project, so it almost always complains about my available memory when I'm using Figma.

              • marcan_42 5 years ago

                So, going back over the numbers and with a 256GB 3% user confirmed on Twitter, it seems like the ratings are:

                2TB model: 5000TBW

                256GB model: 2000TBW

                That means that, at the worst known rate that David had (150TB in 2 months), a 256GB drive would reach lifetime writes in ~2 years. That's still way too fast, so this is still something Apple needs to adjust if it is happening to people (and given what he told me about the write rates he was seeing while using and switching apps, it really does seem like a bug, not normal app swapping).

                FWIW, the "app nap" thing is what iOS does, but macOS can't do that because Mac apps aren't really designed that way. macOS just does traditional swap plus compression, as far as I know.
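
The rating inference and the ~2 year figure above follow from simple division. A rough sketch, assuming a constant write rate and taking the numbers quoted in the comment (150 TB and 3% in 2 months, 2000 and 5000 TBW) at face value; the function names are made up for illustration:

```python
def implied_tbw(tb_written: float, percent_used: float) -> float:
    """Total rated writes implied by a (TB written, Percentage Used) pair."""
    return tb_written * 100 / percent_used


def years_until_rated_writes(rated_tbw: float, tb_written: float, months_elapsed: float) -> float:
    """Years to reach the rated write total at the observed monthly write rate."""
    tb_per_month = tb_written / months_elapsed
    return rated_tbw / tb_per_month / 12


print(implied_tbw(150, 3))                               # 5000.0 (2TB model)
print(round(years_until_rated_writes(2000, 150, 2), 1))  # 2.2 (256GB model)
print(round(years_until_rated_writes(5000, 150, 2), 1))  # 5.6 (2TB model)
```

Since "Percentage Used" is a whole number, the implied ratings are only as precise as that rounding allows.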

                • supermatt 5 years ago

                  App Nap is definitely a macOS thing - but it hasn't traditionally behaved in this way.

                  I'm aware it will be a form of virtual memory management - but given I'm currently (and actively) running Safari, Chrome, Firefox, Affinity Designer, Affinity Photo, Slack, Mail, a handful of terminals, a couple of iOS apps, an iPhone 11 simulator, 2 Android emulators, 4 instances of VS Code recompiling code as I type, and a bunch of utility apps, there's definitely more than just a traditional swap going on here. It's an 8GB machine, and it really feels like it has unlimited RAM (until I open Figma, as mentioned before - there's something about that app that the M1 really doesn't like).

                  Of course, if it is a bug, I'll be happy to have it fixed :)

                  Also, I just realised you are the OP :) And the guy working on Asahi (awesome!). Would love to know more if you get to the bottom of this behaviour - intentional or otherwise!

                • rasz 5 years ago

                  Those ratings are unrealistic; they suggest ~8000 write cycle flash for the small drive.

      • rasz 5 years ago

        "Percentage used" might be as accurate as Cellphone reception bars. What is really important is the

            SSD capacity * ~600 / Bytes written 
        
        and quality of wear leveling firmware in Apple SSD.
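
The cycle counts being argued about here can be checked against the ratings quoted upthread. A back-of-the-envelope sketch, assuming the "~600" in the expression above means program/erase cycles per cell (one reading of it, not confirmed in the thread) and ignoring write amplification and overprovisioning:

```python
def implied_write_cycles(rated_tbw: float, capacity_gb: float) -> float:
    """Full-drive write cycles implied by a TBW rating (1 TB = 1000 GB)."""
    return rated_tbw * 1000 / capacity_gb


def write_budget_tb(capacity_gb: float, cycles: float = 600) -> float:
    """Total host writes in TB if every cell survives `cycles` full-drive writes."""
    return capacity_gb * cycles / 1000


print(implied_write_cycles(2000, 256))   # 7812.5, the "8000 write cycle" figure
print(implied_write_cycles(5000, 2000))  # 2500.0
print(round(write_budget_tb(256), 1))    # 153.6
```

Under the 600-cycle assumption, a 256GB drive would have a write budget of roughly 154 TB, far below the 2000 TBW inferred from the "Percentage Used" readings.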

tinus_hn 5 years ago

No, this is about the machines, probably for software reasons, writing a lot to the SSD, which might cause defects in the future. This is not a hardware issue (apart perhaps from the 8GB memory limitation).

  • jabberwckyOP 5 years ago

    We can conveniently nitpick whether or not we treat Apple products as an integrated hardware-software product, the reality is thousands of these units have shipped and from those screenshots, many have already racked up 2-4 years wear in a matter of weeks, for a component that cannot be replaced without replacing the entire motherboard.

    The product configuration as shipped inflicts permanent damage on itself, it's very difficult to see this as anything but a manufacturing defect.

    • coldtea 5 years ago

      > the reality is thousands of these units have shipped and from those screenshots, many have already racked up 2-4 years wear in a matter of weeks

      What "2-4 years wear"?

      It's 1% of usage in 2 months. That's par for the course for any SSD in any laptop with regular usage, and it should last long before the laptop is updated in 6 or so years...

      Heck, 50% "percentage used" would take 8 years with this rate...

      • feffe 5 years ago

        These are my stats from 2 identical sticks of 250 GB Samsung SSD bought in the summer of 2018. So yea, Apple probably want to tweak some settings. I use my computer a lot, Linux more than the Windows install which is mostly for gaming.

        Windows 10: Data Units Read: 11 332 875 [5,80 TB] Data Units Written: 8 173 369 [4,18 TB]

        Linux: Data Units Read: 5 837 296 [2,98 TB] Data Units Written: 4 891 744 [2,50 TB]

    • tinus_hn 5 years ago

      Then would ‘Most’ devices have this effect? Like a random hardware defect? Or all, like a software issue, only dependent on use?

rasz 5 years ago

It's OK people, Apple will offer you a 1499.95 upgrade option once you run out of write endurance.

perryizgr8 5 years ago

No worries, just get a nice Samsung SSD and repla... oh.
