Upgrading My Zpool
If you are not familiar, ZFS is a file system, device manager, and all-around storage solution. It started life at Sun Microsystems and has a lot of attractive features: snapshots, transactional writes, and block-level error detection/correction, to name a few. What is more, you don't even need to use the ZFS file system itself to get most of these benefits; if you want, you can create a zvol (roughly analogous to a logical volume in LVM), place ext4 or swap or XFS on top, and retain many of the features that make ZFS great. The zvol feature alone is probably one of the biggest reasons I prefer ZFS to btrfs. ZFS is very mature and has a sound, battle-tested design. I may run btrfs on my workstations, but for my bulk data storage, there is no substitute for ZFS.
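For instance, carving a zvol out of a pool and formatting it as ext4 takes just two commands; the pool name tank, volume name vol0, and 100G size here are placeholders:
~$ zfs create -V 100G tank/vol0
~$ mkfs.ext4 /dev/zvol/tank/vol0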
I have had my zpool for around 5 or 6 years now; it has gone through many OpenZFS driver updates and a couple of disk upgrades/replacements. Two releases I was really looking forward to were 0.7.0 and 0.8.0, because of sequential scrubs and native encryption. Now that 0.8.4 has made it into Debian Buster backports, I want to start using native encryption. Plus, the skein hashing algorithm is an attractive alternative to the fletcher4 algorithm my zpool was originally configured with.
While we can require that every dataset in the zpool be encrypted, there is no such thing as true pool-wide encryption; each dataset is encrypted separately. Instead, we'll create an encrypted dataset that acts as a root for other encrypted datasets to inherit from, leaving us free to add plaintext datasets to the zpool later.
First things first, let's make sure we have an up-to-date backup (or two).
~$ zfs snap zp0/prod@today
~$ zfs send -vPi zp0/prod@previous zp0/prod@today | zfs receive -v backup/crypted/prod
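If you want to double-check that the incremental made it over, listing the snapshots on the backup dataset is a quick sanity check:
~$ zfs list -t snapshot -r backup/crypted/prod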
After making sure backups are up-to-date, delete the old zpool. This is technically not required, but I wanted to start fresh; it also ensures that the pool metadata itself picks up the new hashing algorithm. As an added bonus, I can make sure my disks are organized optimally, and when I later restore from my backups the data will be defragmented and balanced across the individual disks.
~$ zpool destroy zp0
~$ zpool create -o ashift=12 -o autoexpand=on -o listsnapshots=on zp0 \
mirror /dev/disk/by-id/$DISK1 /dev/disk/by-id/$DISK2 mirror /dev/disk/by-id/$DISK3 /dev/disk/by-id/$DISK4 \
mirror /dev/disk/by-id/$DISK5 /dev/disk/by-id/$DISK6 mirror /dev/disk/by-id/$DISK7 /dev/disk/by-id/$DISK8
This is effectively a RAID10, which for my workload gives the best balance between performance and redundancy.
- The ashift option sets the pool's sector size as a power of two; with ashift=12, or 2^12, we force 4096-byte sectors. In some cases an ashift of 13 (8192-byte sectors) might be even better; check what your drives support (see the command after this list).
- The autoexpand option makes it easier to grow the pool later by resilvering onto larger disks.
- I like to see what snapshots are available on a pool, so I usually set the listsnapshots option.
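Speaking of checking what the drives support, lsblk can report the logical and physical sector sizes (substitute your own device path):
~$ lsblk -o NAME,LOG-SEC,PHY-SEC /dev/disk/by-id/$DISK1
Many 4K-sector drives report a 512-byte logical sector for compatibility, which is why it pays to set ashift explicitly rather than letting ZFS guess.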
With a fresh zpool assembled, it is time to start adding datasets.
~$ zfs create -o compression=lz4 -o checksum=skein -o snapdir=visible -o mountpoint=none -o encryption=aes-256-gcm -o keyformat=passphrase zp0/crypted
~$ zfs create -o compression=lz4 -o checksum=skein -o snapdir=visible -o mountpoint=none zp0/plaintext
These will be top-level datasets for other datasets to inherit from.
- Enable lz4 compression; it offers a good compression-to-performance balance, and ZFS simply won't bother compressing blocks it detects won't achieve a useful compression ratio.
- Use skein for integrity checksums. Skein was a SHA-3 finalist, and the ZFS implementation seeds it with a secret per-pool salt, which makes corruption detection that much more trustworthy.
- Setting snapdir=visible lets users browse the .zfs/snapshot directory from the shell; in a multiuser setting you may want to leave this unset.
- Setting mountpoint=none means the datasets must be mounted manually; that way, bootup doesn't hang waiting for a passphrase.
- Right now, aes-256-gcm is the best cipher available to ZFS users; relatively recent CPUs from AMD and Intel have accelerated instructions for AES.
- You can use a keyfile to automatically unlock the datasets, but I prefer entering a passphrase manually (see the unlock example after this list).
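After a reboot, a single load-key on the encryption root unlocks every child dataset beneath it, since children created under zp0/crypted inherit its encryption settings and share its encryption root. A quick sketch (the /srv/prod mountpoint is a placeholder, and zp0/crypted/prod won't exist until we restore it below):
~$ zfs load-key zp0/crypted
~$ mount -t zfs zp0/crypted/prod /srv/prod
You can confirm which datasets share an encryption root with zfs get -r encryptionroot zp0.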
Now it is time to move the data back onto the production zpool from the backups; note that zp0/crypted acts as a container for our datasets.
~$ zfs send -vP backup/crypted/prod@snapshot | zfs receive -v zp0/crypted/prod
Give that a few hours and you should be good to go. Don't forget to follow a similar process for the backup zpools!
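A nice side effect of native encryption on the backup side: once the backup pools are rebuilt the same way, future sends can be raw (zfs send -w), which transmits the ciphertext as-is, so the backup pool holds the data encrypted without ever loading the key. A sketch, with a placeholder snapshot name:
~$ zfs snap zp0/crypted/prod@offsite
~$ zfs send -vPw zp0/crypted/prod@offsite | zfs receive -v backup/crypted/prod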
When everything is done copying, the free space on the plaintext dataset can be wiped by temporarily filling it with one large file. One caveat: because compression is enabled, a file of zeros would compress down to nothing (ZFS stores all-zero blocks as holes) and never actually overwrite the old blocks, so pull from /dev/urandom instead.
~$ mount -t zfs zp0/plaintext /mnt
~$ dd if=/dev/urandom of=/mnt/big.fill bs=4096 status=progress
~$ zpool sync zp0
~$ rm -f /mnt/big.fill
~$ zpool sync zp0
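Once the file is deleted and the pool has synced, the space should show up as available again; a quick way to eyeball it:
~$ zfs list -o name,used,avail -r zp0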
That is it. Now the zpool is using native AES encryption and the skein hashing algorithm.