Re: [DNG] btrfs

Author: Adam Borowski
Date:
To: Hendrik Boom
CC: dng
Subject: Re: [DNG] btrfs

On Sat, Aug 13, 2016 at 02:26:13PM -0400, Hendrik Boom wrote:
> > For anything that can break your system, and for running unstable, btrfs is
> > awesome. You can make snapshots at any point (most people have at least a
> > daily cronjob), and then restore or mount live when you want. And when you
> > make it unbootable, you append subvol=backups/@2016-08-12 (or whatever
> > you named it) in grub, and there you go.
>
> I seem to remember that the btrfs developers are still developing it,
> and have said it isn't really ready for a production release yet. I
> figured that might be dated information, but then I read the headline,
> only a few days ago:
>
> Btrfs RAID 5/6 Code Completely Hosed
>
> ( https://soylentnews.org/article.pl?sid=16/08/06/2118237 )
>
> Maybe it's not time to use btrfs yet.

Heh, trusting Phoronix as a news source is not exactly the best idea :p
That's the worst tabloid in the Linux world.

Indeed, btrfs raid 5/6 code has some nasty bugs. That's natural for an
experimental feature (although it's somewhat embarrassing it takes so long to
stabilize). If it happened in parts of the filesystem declared stable,
that'd be a reason to worry.

Phoronix picked an exaggerated post by a list member. You wouldn't fully
trust a random DNG regular either, would you? That person is not a btrfs
developer and has not a single commit in either the kernel or btrfs-progs.
He's not a troll, though, and the concern is real: DON'T USE BTRFS RAID5 OR
6 FOR PRODUCTION!

On the other hand, most parts of btrfs are considered fit for use. Some
mainstream distributions like SUSE already have it as default. I for one
use btrfs for a variety of tasks too: backups, vservers, a ceph cluster, an
ARM box doing Debian archive rebuilds, etc, for years.

There are two points to consider:
* btrfs's code is moving quite fast, thus less stable
* btrfs has a number of data safety features
Thus, you need to balance trusting the filesystem itself vs it protecting
you from PEBKAC, hardware failures and bugs in _other_ software.

Thus, it's a natural thing to hesitate before running it on stable.
Unstable, on the other hand, breaks regularly: on any given day, your
dist-upgrade may give you broken X drivers, someone may pull runit-init from
under you, etc. That's why I recommend running btrfs on such systems for
its snapshot functionality.
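
A minimal Python sketch of the snapshot-before-upgrade routine follows. It
only builds the command lines and does not run them, so nothing here touches
a disk. The helper names (snapshot_command, rollback_kernel_arg) and the
/backups location are assumptions made for illustration; the @YYYY-MM-DD
naming follows the example quoted above. The sketch also assumes the subvol=
mount option is handed to the kernel as rootflags=subvol=..., which is the
usual spelling on the kernel command line.

```python
import datetime
import shlex


def snapshot_command(source, snapshot_dir, day, read_only=False):
    """Build a 'btrfs subvolume snapshot' command for a dated snapshot.

    The snapshot is writable by default so it can be booted into later;
    pass read_only=True if it is meant as a source for 'btrfs send'.
    """
    name = "@" + day.isoformat()  # e.g. @2016-08-12
    cmd = ["btrfs", "subvolume", "snapshot"]
    if read_only:
        cmd.append("-r")
    cmd += [source, f"{snapshot_dir}/{name}"]
    return cmd


def rollback_kernel_arg(snapshot_path):
    """Kernel command-line argument to boot from a given subvolume (assumed form)."""
    return f"rootflags=subvol={snapshot_path}"


if __name__ == "__main__":
    # Hypothetical layout: root subvolume mounted at /, snapshots under /backups.
    cmd = snapshot_command("/", "/backups", datetime.date.today())
    print(shlex.join(cmd))
    print(rollback_kernel_arg("backups/@2016-08-12"))
```

Run the printed command as root from a cron job or right before a
dist-upgrade, and adjust the paths to your own subvolume layout.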

So let's list some of those features:
* snapshots. Most people do it daily, or to clone a chroot, but if you want
to go extreme and snapshot every hour, go ahead!
* reflinks (CoW files/extents): copy a big file (like a VM image) and change
just a small part: it takes only as much space as your changes. Makes it
easy to keep many versions in case you make a mistake.
* deduplication: cram more backups on one disk. Can save your butt if you
realize you made a mistake months ago.
* data and metadata checksums. On a single device, lets you know your data
went kaputt so you can restore from backups. On RAID, allows recovery
even if the disk claims the read went ok (in this case, md and hw raid
can't tell which copy is good, thus failing both).
* scrub: cheap test of validity of all extents on a mounted filesystem
* all writes are CoWed: on hardware failure, it's likely a recent earlier
generation of a file you've written to is still there. Been there, done
that, saved my ass.
* if you mount with flushoncommit (some versions have it as default, some
don't), all files on the entire filesystem are consistent in the event of
a crash. This is like data=journal on ext[34], except that there's only
a minor performance cost (unlike ext, which drops to less than half speed) and
the consistency guarantee also applies _between_ files (ie, no reorders).
* send/receive: O(delta) rsync. If your filesystem takes 30 minutes to stat
all files, btrfs send can do so instantly, knowing what has been changed.
Want to do hourly backups?
* likewise, O(delta) enumeration of new writes
* fancy RAID1/10 modes on differently sized disks
* mounting with -ocompress gets you more disk space; on incompressible files
the compression attempt is quickly aborted so you don't need to micromanage
this. Especially with -ocompress=lzo, compression can speed up I/O if your
CPU is better than your disk.
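
The O(delta) backup idea from the send/receive bullet can be sketched roughly
like this. The function builds a "btrfs send | btrfs receive" pipeline and
picks the most recent earlier snapshot as the parent for "btrfs send -p". It
assumes the snapshots are read-only (btrfs send requires that), that their
names sort chronologically, and that the chosen parent already exists on the
receiving side. The function name and all paths are hypothetical.

```python
import shlex


def send_pipeline(existing, new, dest):
    """Build an incremental 'btrfs send | btrfs receive' shell pipeline.

    existing: snapshot paths already present on both sides
    new:      path of the snapshot to transfer
    dest:     directory on the receiving btrfs filesystem
    """
    # Names like @2016-08-12 sort chronologically as plain strings.
    earlier = sorted(s for s in existing if s < new)
    send = ["btrfs", "send"]
    if earlier:
        # Only the difference against the parent is sent.
        send += ["-p", earlier[-1]]
    send.append(new)
    receive = ["btrfs", "receive", dest]
    return shlex.join(send) + " | " + shlex.join(receive)


if __name__ == "__main__":
    print(send_pipeline(
        ["/backups/@2016-08-10", "/backups/@2016-08-11"],
        "/backups/@2016-08-12",
        "/mnt/usb",
    ))
```

With no earlier snapshot in the list, the sketch falls back to a full send,
which is what the first backup to a new disk needs anyway.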

On the other hand, downsides:
* stability: less mature code base. The only such bug I've personally met
on stable kernels was one occurrence of ENOSPC on a filesystem only ~80%
full ("btrfs balance" to fix this)
* fragmentation: random writes to a file (like, a database, especially
sqlite as in Firefox's history db) on HDD can make the file ridiculously
slow. Mounting with -oautodefrag helps somewhat.
* when performance sucks, it sucks more than when it shines. Good:
unpacking a tarball, compression on slow disks. Bad: random writes.
Extreme good cases (compiling on eMMC on Odroid-U2) can be 2x ext4, extreme
bad cases (fsyncs in postgres on ext4 in a VM stored on btrfs) can be even
1/10x ext4.
* fsync in general. Consider copious use of eatmydata if you have snapshots
and/or the interrupted data would be worthless on power loss (sbuild for
example).
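
The knobs named in the two lists above (compress, autodefrag, flushoncommit,
and a balance to recover from a premature ENOSPC) can be put together in a
small sketch like the one below. It only formats an fstab line and a balance
command; it does not mount or balance anything. The function names, the
device and mount point, and the usage threshold of 50 are assumptions picked
for illustration, not recommendations from this post.

```python
def fstab_line(device, mountpoint, compress="lzo",
               autodefrag=True, flushoncommit=True):
    """Format an /etc/fstab entry for a btrfs filesystem."""
    opts = ["defaults"]
    if compress:
        opts.append(f"compress={compress}")
    if autodefrag:
        opts.append("autodefrag")
    if flushoncommit:
        opts.append("flushoncommit")
    # Last two fields: dump flag and fsck pass, both 0 here.
    return f"{device} {mountpoint} btrfs {','.join(opts)} 0 0"


def balance_command(mountpoint, data_usage=50):
    """Build a filtered balance that only rewrites data chunks at most
    data_usage percent full (threshold is an arbitrary example value)."""
    return ["btrfs", "balance", "start", f"-dusage={data_usage}", mountpoint]


if __name__ == "__main__":
    print(fstab_line("/dev/sdb1", "/srv"))
    print(" ".join(balance_command("/srv")))
```

For a filesystem holding databases or VM images, calling fstab_line with
compress=None keeps autodefrag while leaving compression off.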

Features not to use:
* raid 5/6 (dangerous!)
* quota
* there are unconfirmed reports of badness of compression on encrypted

A quite close alternative to btrfs is zfs.
* stable
* sucks on Linux. Thank Oracle for not fixing intentionally anti-Linux
licensing inherited from Sun! This prevents integrating it well.
* ridiculous memory needs

--
An imaginary friend squared is a real enemy.
