I wouldn’t put my mission-critical file server on BTRFS.
Oh, but I and a lot of people do, and it is way more reliable than ext* filesystems ever were. Maybe ZFS or XFS is more your style then? Ext4 is very, very prone to total failure and complete data loss at the slightest hardware issue. I’m not saying you should rely on any filesystem ever; backups are important and should be there. The thing is that recovering from backups takes time, and the amount of recovery that ext forced me into over the years just isn’t acceptable.
ZFS is still the de facto standard for a reliable filesystem. It’s super stable, and annoyingly strict about what you can do with it. Its RAID5 and RAID6 equivalents (RAID-Z1 and RAID-Z2) are the only available software RAIDs at those levels that are guaranteed not to eat your data. I’ve run a TrueNAS server with the RAID6 equivalent for years now, with absolutely no issues and tens of terabytes of data.
But copy-on-write filesystems such as ZFS or btrfs are not great for all purposes. For example, running a Postgres server on any CoW filesystem will require a lot of tweaking to get reasonable speeds out of the database. It’s doable, but it’s a lot of settings to change.
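To give a rough feel for why database-style workloads and CoW don't mix well out of the box, here is a toy model (not how ZFS or btrfs actually allocate space). It assumes a file that starts as one contiguous run of blocks and then gets random small overwrites, the way a database rewrites pages. In the CoW case every overwritten block is written to a fresh location, so the file splinters into many extents; in the in-place case the layout never changes. The names `count_extents` and `simulate` are made up for this sketch.

```python
import random


def count_extents(locations):
    """Count runs of physically contiguous blocks, walking the file in logical order."""
    extents = 1
    for prev, cur in zip(locations, locations[1:]):
        if cur != prev + 1:
            extents += 1
    return extents


def simulate(num_blocks, num_overwrites, copy_on_write, seed=0):
    """Overwrite random blocks and return how many extents the file ends up with."""
    rng = random.Random(seed)
    # Logical block i initially lives at physical location i (one contiguous extent).
    locations = list(range(num_blocks))
    next_free = num_blocks  # toy allocator: always hand out the next unused location
    for _ in range(num_overwrites):
        block = rng.randrange(num_blocks)
        if copy_on_write:
            # CoW never overwrites in place: the new version goes somewhere else.
            locations[block] = next_free
            next_free += 1
        # Update-in-place: the block keeps its physical location, nothing to do.
    return count_extents(locations)


in_place = simulate(num_blocks=1000, num_overwrites=200, copy_on_write=False)
cow = simulate(num_blocks=1000, num_overwrites=200, copy_on_write=True)
print(f"extents after in-place overwrites: {in_place}")
print(f"extents after CoW overwrites:      {cow}")
```

Real filesystems are much smarter than this allocator, but the trend is the same, and it's a big part of why people end up tuning things (or turning CoW off for the database files) on these filesystems.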
About the code quality of Linux filesystems, Kent Overstreet, the author of the next new CoW filesystem bcachefs, has a good write-up of the ups and downs:
ext4, which works - mostly - but is showing its age. The codebase terrifies most filesystem developers who have had to work on it, and heavy users still run into terrifying performance and data corruption bugs with frightening regularity. The general opinion of filesystem developers is that it’s a miracle it works as well as it does, and ext4’s best feature is its fsck (which does indeed work miracles).
xfs, which is reliable and robust but still fundamentally a classical design - it’s designed around update in place, not copy on write (COW). As someone who’s both read and written quite a bit of filesystem code, the xfs developers (and Dave Chinner in particular) routinely impress me with just how rigorous their code is - the quality of the xfs code is genuinely head and shoulders above any other upstream filesystem. Unfortunately, there is a long list of very desirable features that are not really possible in a non COW filesystem, and it is generally recognized that xfs will not be the vehicle for those features.
btrfs, which was supposed to be Linux’s next generation COW filesystem - Linux’s answer to zfs. Unfortunately, too much code was written too quickly without focusing on getting the core design correct first, and now it has too many design mistakes baked into the on disk format and an enormous, messy codebase - bigger than xfs. It’s taken far too long to stabilize as well - poisoning the well for future filesystems because too many people were burned on btrfs, repeatedly (e.g. Fedora’s tried to switch to btrfs multiple times and had to switch at the last minute, and server vendors who years ago hoped to one day roll out btrfs are now quietly migrating to xfs instead).
zfs, to which we all owe a debt for showing us what could be done in a COW filesystem, but is never going to be a first class citizen on Linux. Also, they made certain design compromises that I can’t fault them for - but it’s possible to do better. (Primarily, zfs is block based, not extent based, whereas all other modern filesystems have been extent based for years: the reason they did this is that extents plus snapshots are really hard).
I started evaluating bcachefs on my main workstation when it arrived in the stable kernels. It can do pretty good raid-1 with encryption and compression. This combination, integrated into the filesystem, is not really available anywhere else but zfs. And zfs doesn’t work with all kernels, which prevents updating to the latest and greatest. It is already a pretty usable system, and in a few years it will probably take the crown as the default filesystem in mainstream distros.
Yeah. I would not, for example, install ZFS on a laptop. It’s just not great there, and it doesn’t like things such as sudden power failure, and it uses kind of a lot of memory…
Meanwhile BTRFS provides me with snapshots and rollbacks that are useful when I’m messing with the system. And subvolumes bring a lot of flexibility for containers and general management.
For sure. I would say if you run a distro like Arch, using it without a CoW filesystem and snapshots is not a good idea… You can even integrate snapshots with pacman and the bootloader.
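For anyone wondering what "snapshot before messing with the system" boils down to, here's a minimal sketch. It only builds the `btrfs subvolume snapshot -r` command (a read-only snapshot) and prints it rather than running it. The `/.snapshots` directory, the `pre-upgrade` label and the `snapshot_command` helper are assumptions for illustration; your subvolume layout will differ, and existing tools already handle the pacman and bootloader integration for you.

```python
from datetime import datetime
from pathlib import PurePosixPath
import shlex


def snapshot_command(subvolume, snapshot_dir, label, now):
    """Build (but do not run) a command for a read-only, timestamped btrfs snapshot."""
    stamp = now.strftime("%Y%m%d-%H%M%S")
    target = PurePosixPath(snapshot_dir) / f"{label}-{stamp}"
    # -r makes the snapshot read-only, so it cannot be modified by accident.
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume, str(target)]


# Hypothetical layout: root subvolume mounted at "/", snapshots kept in "/.snapshots".
cmd = snapshot_command("/", "/.snapshots", "pre-upgrade", datetime.now())
print(shlex.join(cmd))
```

To actually take the snapshot you would pass the list to `subprocess.run(cmd, check=True)` as root, right before the upgrade.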
I’ve been running NixOS for so long that I don’t really need snapshots. You can always boot to the previous state if needed.
If you write software and run tests against a database, I’d avoid having the docker volumes on a btrfs pool. The performance is not great.
Do you have a source for the ext4 failure stuff? I use ext4 currently and want to see if there’s something I need to do now other than frequent backups
They don’t. ext4 has been the primary production filesystem for over 15 years. And it’s basically modified ext3 so it’s been around even longer as a format.
It’s very stable. It’s still the default for many distros even.
I used ext4 extensively in an HPC setting a few jobs ago (many petabytes). Some of the server clusters were in areas with very unreliable power grids, like Indonesia. Using fsck.ext4 had become our bread and butter, but it was also nerve-wracking because in the worst failures that involved power loss or failed RAID cards, we sometimes didn’t get clean fscks. Most often this resulted in loss of file metadata, which was a pain to try to recover from. To its credit, as another quote in this thread mentioned, fsck.ext4 has a very high success rate, but honestly you shouldn’t need to manually intervene as a filesystem admin in an ideal world. That’s the sort of thing next-gen filesystems attempt to provide.
Not seen a fs corruption yet. But I have only run ext4 on around 350 production servers since 2010-ish.
Have of course seen plenty of hardware failures. But if a disk is doing the clicky, it is not another filesystem that saves you.
ext4 is still solid for most use cases (I also use it). It’s not innovative, and possibly not as performant as the newer file systems, but if you’re okay with that there’s nothing wrong with using it. I intend to look into xfs and btrfs the next time I spin up a new drive or a new machine, but there’s no hurry (and I may not switch even then).
There’s an unfortunate tendency for people who like to have the newest and greatest software to assume that the old code their new-shiny is supposed to replace is broken. That’s seldom actually the case: if the old software has been performing correctly all this time, it’s usually still good for its original use case and within the scope of its original limitations and environment. It only becomes truly broken when the appropriate environment can’t be easily reproduced or one of the limitations becomes a significant security hole.
That doesn’t mean that shiny new software with new features is bad, or that there isn’t some old software that has never quite performed properly, just that if it ain’t broke, it’s okay to set a conservative upgrade schedule.
Well, a few years ago I actually did some research into that but didn’t find much about it. What I said was my personal experience, but now we also have companies like Synology pushing BTRFS for home and business customers, and they have analytics on that for sure… since they’re trying to move everything…
I used to associate btrfs with the word unreliable for years based on what I’ve read here and there, while ext4 appears to be rock solid. Pointing to sources for this is not easy though. Here’s a start.
ext4 (fourth extended file system) is an open source disk filesystem and most recent version of the extended series
of filesystems. It is the primary file system in use by many Linux systems rendering it to be arguably the most stable
and well tested file system supported in Linux.
The “Caveats” section for BTRFS is trash. It is all about an ENOSPC issue that requires you to either mess with the thing at a low level or run the fs for years under constant writes without any kind of maintenance (with automatic defragmentation explicitly disabled). Frankly, off the top of my head I can point to real issues they aren’t speaking about: RAID56 (everything?), RAID10 (improve reading performance with more parallelization).
If we take subvolumes, snapshots, deduplication, CoW, checksums and compression into consideration, then there’s no reason to ever use ext4 as it is just… archaic. Synology is pushing BTRFS for home and business, so they must have analytics backing that as well.
Yes and that’s the reason why I usually pick BTRFS for less complex things.
Have regularly tested backups!
Oh ok, that makes sense. Thanks!
I love XFS but it is much more prone to failure than Ext4. No idea what they are talking about.
Huh interesting!
See Features and Caveats here for Btrfs: https://wiki.gentoo.org/wiki/Btrfs#Features
For Ext4: https://wiki.gentoo.org/wiki/Ext4