Upgrading My Zpool

If you are not familiar, ZFS is a file system, volume manager, and all-around storage solution. It started life at Sun Microsystems and has a lot of attractive features: snapshots, transactional writes, and block-level error detection and correction, to name a few. What is more, you don't even need to use the ZFS file system itself to get most of these benefits: you can create a zvol (roughly analogous to a logical volume in LVM), put ext4 or swap or XFS on top, and retain many of the features that make ZFS great. The zvol feature alone is probably one of the biggest reasons I prefer ZFS to btrfs. ZFS is very mature and has a sound, battle-tested design. I may run btrfs on my workstations, but for my bulk data storage, there is no substitute for ZFS.
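
For example (using a hypothetical pool named tank and a made-up volume name), carving out a zvol and putting ext4 on top looks roughly like this:


~$ zfs create -V 10G tank/vol0              # create a 10 GiB zvol
~$ mkfs.ext4 /dev/zvol/tank/vol0            # format the block device it exposes
~$ mount /dev/zvol/tank/vol0 /mnt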

I have had my zpool for around 5 or 6 years now; it has gone through many OpenZFS driver updates and a couple of disk upgrades and replacements. Two releases I was really looking forward to were 0.7.0 and 0.8.0, thanks to sequential scrubs and native encryption. Now that 0.8.4 has made it into Debian Buster backports, I want to start using native encryption. Plus, the skein checksum algorithm is an attractive alternative to the fletcher4 algorithm my zpool was originally configured with.
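
Before relying on any of this, it is worth double-checking that the backported release is actually what is running; something along these lines should confirm both the userland tools and the kernel module:


~$ zfs version
~$ modinfo zfs | grep -i '^version'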

While we could set encryption on the pool's root dataset at creation time and force every dataset in the zpool to be encrypted, there is no such thing as true pool-wide encryption; each dataset is encrypted separately. Instead, we'll create a dedicated encrypted dataset that acts as an encryption root for other encrypted datasets, which leaves us free to add plaintext datasets to the zpool later.
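
To illustrate (with a hypothetical pool named tank and throwaway dataset names), anything created beneath an encrypted dataset inherits its encryption and shares its encryption root, while siblings created outside of it stay plaintext:


~$ zfs create -o encryption=aes-256-gcm -o keyformat=passphrase tank/secure
~$ zfs create tank/secure/documents         # inherits encryption from tank/secure
~$ zfs create tank/scratch                  # plaintext sibling outside the encryption root
~$ zfs get -r encryption,encryptionroot tank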

First things first, let's make sure we have an up-to-date backup (or two).


~$ zfs snap zp0/prod@today
~$ zfs send -vPi zp0/prod@previous zp0/prod@today | zfs receive -v backup/crypted/prod
        

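Before going any further, it doesn't hurt to double-check that the snapshot actually landed on the backup pool:


~$ zfs list -t snapshot -r backup/crypted/prod
~$ zfs get -H -o value used backup/crypted/prod
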
After making sure the backups are up-to-date, delete the old zpool. This is technically not required, but I wanted to start fresh, and it also ensures that the pool metadata itself picks up the new checksum algorithm. As an added bonus, I can make sure my disks are organized optimally, and when I later restore from my backups the data will come back defragmented and balanced across the individual disks.


~$ zpool destroy zp0
~$ zpool create -o ashift=12 -o autoexpand=on -o listsnapshots=on zp0 \
    mirror /dev/disk/by-id/$DISK1 /dev/disk/by-id/$DISK2 mirror /dev/disk/by-id/$DISK3 /dev/disk/by-id/$DISK4 \
    mirror /dev/disk/by-id/$DISK5 /dev/disk/by-id/$DISK6 mirror /dev/disk/by-id/$DISK7 /dev/disk/by-id/$DISK8
        

This is effectively RAID10 (four two-way mirrors striped together), which gives a good balance of performance and redundancy for my purposes.
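
Before putting any data back, a quick look at the pool confirms the mirrors paired up the way I intended:


~$ zpool status zp0
~$ zpool list -v zp0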

With a fresh zpool assembled, it is time to start adding datasets.


~$ zfs create -o compression=lz4 -o checksum=skein -o snapdir=visible -o mountpoint=none -o encryption=aes-256-gcm -o keyformat=passphrase zp0/crypted
~$ zfs create -o compression=lz4 -o checksum=skein -o snapdir=visible -o mountpoint=none zp0/plaintext
        

These will be top-level datasets for other datasets to inherit from.
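
One quick way to confirm the inheritance behaves as expected is to create a throwaway child under zp0/crypted (scratch here is just a placeholder name), inspect its properties, and delete it again:


~$ zfs create zp0/crypted/scratch
~$ zfs get encryption,checksum,encryptionroot zp0/crypted/scratch
~$ zfs destroy zp0/crypted/scratch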

Now it is time to move the data back onto the production zpool from the backups. Note that zp0/crypted acts as a container for our datasets, so the received dataset inherits its encryption settings even though the backup stream itself is not encrypted.


~$ zfs send -vP backup/crypted/prod@snapshot | zfs receive -v zp0/crypted/prod
        

Give that a few hours and you should be good to go. Don't forget to follow a similar process for the backup zpools!
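
It is also worth spot-checking that the received datasets really did pick up the encryption and checksum settings from their new parent; something like this should show aes-256-gcm, an available key, and skein across the crypted tree:


~$ zpool get feature@encryption zp0
~$ zfs get -r encryption,keystatus,checksum zp0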

When everything is done copying, the plaintext dataset can be temporarily filled with a large file full of zeros to wipe the free space. One caveat: with compression=lz4 enabled, ZFS stores all-zero blocks as holes, so if the goal is to actually overwrite the old, unencrypted blocks on disk, a non-compressible source such as /dev/urandom may be needed instead of /dev/zero.


~$ mount -t zfs zp0/plaintext /mnt
~$ dd if=/dev/zero of=/mnt/big.zero bs=4096 status=progress
~$ zpool sync zp0
~$ rm -f /mnt/big.zero
~$ zpool sync zp0
        

That is it. The zpool is now using native AES-256-GCM encryption and the skein checksum algorithm.
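
One last note: because the key format is a passphrase (with the default keylocation of prompt), the encrypted datasets will not mount on their own after a reboot or a fresh import; the key has to be loaded first. Assuming the pool is already imported, something along these lines, run interactively, takes care of it:


~$ zfs load-key -r zp0/crypted              # prompts for the passphrase
~$ zfs mount -a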