Author: Curtis Maurand Date: To: Dng Subject: [DNG] BTRFS a story
Hello all,
I currently run several Beowulf VMs under KVM that have btrfs as the
main filesystem. One, in particular, has been giving me trouble. I kept
finding it in a read-only state first thing in the morning. I was
attributing that to my backup scheme, which avoids shutting the machine
down for half an hour while the image transfers: take an external
snapshot, copy the image, then pivot back. That scheme is working well
for all of my virtual machines.
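
For anyone who wants to try the snapshot/copy/pivot scheme described above, here is a rough sketch of the sequence. It is not the exact script in use here: the domain name, disk target (vda), image and backup paths, and snapshot name are all made-up placeholders, and it only prints the commands rather than running them. The virsh subcommands (snapshot-create-as, blockcommit) are standard libvirt.

```python
import shlex


def snapshot_backup_commands(domain, disk, image_path, backup_path,
                             snap_name="backup-snap"):
    """Return the commands for an external-snapshot backup, in order.

    All arguments are hypothetical placeholders; adjust to your setup.
    """
    return [
        # 1. Create an external, disk-only snapshot: guest writes now go to
        #    a new overlay file and the base image stops changing.
        ["virsh", "snapshot-create-as", "--domain", domain, snap_name,
         "--disk-only", "--atomic", "--no-metadata"],
        # 2. Copy the now-stable base image while the VM keeps running.
        ["cp", "--sparse=always", image_path, backup_path],
        # 3. Merge the overlay back into the base image and pivot the
        #    running VM back onto the base image.
        ["virsh", "blockcommit", domain, disk,
         "--active", "--pivot", "--verbose"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in snapshot_backup_commands(
            "beowulf-vm", "vda",
            "/var/lib/libvirt/images/beowulf-vm.qcow2",
            "/backup/beowulf-vm.qcow2"):
        print(shlex.join(cmd))
```

After the pivot the overlay file is no longer in use but is still on disk, so a real script would also delete it.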
Last week, the file system started failing every couple of hours with
btrfs errors. I cringe when that happens because btrfs does not recover
from errors very well. However, in addition to copying the entire image
I also take regular snapshots and have a pretty good backup chain
using btrbk. Generally I'm impressed with the tool. It works well.
Restoring data from it works really well for little things. With
a filesystem failure, restoration of the complete VM is necessary. Ok,
now that wasn't working. It's a 100GB VM, but the image is actually
200GB. It takes about half an hour to copy that.
btrfs has, in the past, never recovered very well. I'm not sure why that
was, but if I ever used btrfs check --repair on a file system it always
resulted in disaster. I had to rebuild a machine from scratch once and
I was able to restore data from the snapshots successfully, but that
took days to complete.
In my preparations for hara-kiri I decided to try to recover the
partition using the btrfs recovery tool. I attached a chimaera minimal
live ISO to the CD drive and booted from that. The btrfs-tools weren't
there, but apt let me install them. I gave btrfs check --repair a
shot and lo ... success! Repairs were made and the server has been up
for a few days. This is a major improvement from just 3 years ago.
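
The recovery steps above boil down to something like the following sketch. Some assumptions: the device name /dev/vda1 is a placeholder, the package name btrfs-progs is what I believe current Debian-derived releases call the tools, and the read-only check before the repair is an extra precaution not mentioned in the steps above. The filesystem has to be unmounted, which booting from the live ISO takes care of. Again this only prints the commands.

```python
import shlex


def btrfs_repair_commands(device):
    """Return the rescue-environment commands for checking and repairing
    the btrfs filesystem on `device` (a hypothetical device path)."""
    return [
        # Install the btrfs userspace tools in the live environment
        # (package name is an assumption; it may differ on your release).
        ["apt", "install", "btrfs-progs"],
        # Optional precaution: report problems without changing anything.
        ["btrfs", "check", "--readonly", device],
        # The actual repair attempt; have a copy of the image first.
        ["btrfs", "check", "--repair", device],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in btrfs_repair_commands("/dev/vda1"):
        print(shlex.join(cmd))
```

Since the whole VM image gets copied anyway, running the repair against a VM booted from the copy costs nothing extra and leaves the original untouched if it goes badly.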
I know, I know, I'm a glutton for punishment for using btrfs, but the
features are worth it. It really reminds me of a couple of previous
filesystems from the old days, reiserfs and IBM's HPFS from OS/2. It
has the features of both (b-tree, extent based, dynamic inodes) and then
some: copy on write and striping of drives together. It says not to run
raid 5 or raid 10, but research shows that the problems that crop up
with that are due to trying to run raid 5 or 10 with non-identical
drives (varying sizes, etc.). I never do that.