ZFS Setup

https://blog.heckel.xyz/2017/01/08/zfs-encryption-openzfs-zfs-on-linux/

ZFS never allows zfs_arc_min to drop below 1/32 of total memory.
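As a rough illustration of that floor, the sketch below computes 1/32 of total memory from the MemTotal value in /proc/meminfo. The function name arc_min_floor_bytes is made up for this example, and the sysfs path is only readable when the zfs module is loaded.

```shell
# Hypothetical helper: 1/32 of RAM in bytes, given MemTotal in kB
# (the unit /proc/meminfo reports).
arc_min_floor_bytes() {
    echo $(( $1 * 1024 / 32 ))
}

# Report the floor for this machine, if /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    echo "1/32 of RAM: $(arc_min_floor_bytes "$mem_kb") bytes"
fi

# Show the current tunable when the zfs module is loaded.
if [ -r /sys/module/zfs/parameters/zfs_arc_min ]; then
    echo "zfs_arc_min: $(cat /sys/module/zfs/parameters/zfs_arc_min)"
fi
```

For example, a machine with 32 GiB of RAM (33554432 kB) gives a floor of 1 GiB.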

System Requirements

RAM Requirement: It is highly recommended that you use ECC RAM.

As a rule of thumb for general use, you should have at least 1 GB of RAM per 1 TB of disk.

If you are using the deduplication feature, a good rule of thumb is 5 GB of RAM for every 1 TB of disk. The reason is that deduplicated blocks are tracked in the deduplication table (DDT), which is held in the ZFS ARC and therefore in RAM. You should use ECC RAM for all your critical data, regardless of ZFS.
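The two rules of thumb above can be turned into a quick estimate. This is only a sketch of the arithmetic (1 GB per TB, or 5 GB per TB with deduplication); the function name zfs_ram_gb and its arguments are invented for this example.

```shell
# Hypothetical helper: suggested RAM in GB for a pool of the given size.
#   $1 = pool size in TB (whole number)
#   $2 = "yes" if deduplication is enabled, anything else otherwise
zfs_ram_gb() {
    tb=$1
    dedup=${2:-no}
    if [ "$dedup" = "yes" ]; then
        echo $(( $tb * 5 ))   # 5 GB per TB with deduplication
    else
        echo $(( $tb * 1 ))   # 1 GB per TB for general use
    fi
}

zfs_ram_gb 12 no
zfs_ram_gb 12 yes
```

The first call reports the general-use estimate for a 12 TB pool, the second the estimate with deduplication enabled.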

Installation

Add backports to sources list

vi /etc/apt/sources.list.d/bookworm-backports.list

deb http://deb.debian.org/debian bookworm-backports main contrib
deb-src http://deb.debian.org/debian bookworm-backports main contrib

vi /etc/apt/preferences.d/90_zfs

Package: src:zfs-linux
Pin: release n=bookworm-backports
Pin-Priority: 990

Now install ZFS

apt update
apt install dpkg-dev linux-headers-$(uname -r) linux-image-amd64
DEBIAN_FRONTEND=noninteractive apt install zfs-dkms zfsutils-linux

Check Disks in your system

To check the disks, their sizes, WWNs, physical sector sizes, etc., run:

lsblk -do name,size,tran,phy-sec,model,serial,wwn

Note down the output so that you can use the correct ashift and device WWNs when setting up the zpool.

You may need to prefix “wwn-” to the values displayed in the wwn column when creating the zpool.
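A small sketch of that prefixing step, assuming the WWN is printed by lsblk in the usual 0x... form and that the matching link lives under /dev/disk/by-id/. The function name wwn_to_by_id and the WWN value are made up for illustration.

```shell
# Hypothetical helper: turn a WWN as shown by lsblk into its by-id path.
wwn_to_by_id() {
    printf '/dev/disk/by-id/wwn-%s\n' "$1"
}

# Example with a made-up WWN value.
wwn_to_by_id 0x5000000000000001

# List the WWN links that actually exist on this machine, if any.
ls -l /dev/disk/by-id/ 2>/dev/null | grep 'wwn-' || true
```

Comparing the generated path with the real links is a quick way to confirm the name before passing it to zpool create.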

The following can also be tried, but may not return the actual physical sector size. For the “sda” drive, run:

cat /sys/block/sda/queue/hw_sector_size
sudo hdparm -I /dev/sda | grep -i physical
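The ashift value is the base-2 logarithm of the sector size, so 512 bytes corresponds to 9 and 4096 bytes to 12. The sketch below derives it from a reported sector size and never goes below 12, in line with this document's recommendation of ashift=12. The function name ashift_for_sector is invented for this example.

```shell
# Hypothetical helper: ashift for a sector size in bytes (a power of two),
# with 12 as the minimum value returned.
ashift_for_sector() {
    size=$1
    bits=0
    # Count how many times the size can be halved: log2(size).
    while [ "$size" -gt 1 ]; do
        size=$(( size / 2 ))
        bits=$(( bits + 1 ))
    done
    if [ "$bits" -lt 12 ]; then
        bits=12
    fi
    echo "$bits"
}

ashift_for_sector 512
ashift_for_sector 4096
ashift_for_sector 8192
```

A drive that reports 512-byte sectors still gets 12 here, since some drives report 512 bytes while using 4K sectors internally.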

Setup Zpool

sudo zpool create tank /dev/sda

To create mirror zpool (recommended if you have just 2 devices)

zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb

To create raidz2

zpool create -o ashift=12 tank raidz2 wwn-<11111> wwn-<22222> wwn-<33333> wwn-<44444>

Using WWNs to specify disks is highly recommended.

NOTE: Always use “-o ashift=12”. It is highly recommended for all drives, and especially for 4K-sector drives. Today’s 8 TB WD Reds are all 4K drives. If you leave that option out, performance will suffer and can drop to about 25% of what it should be.

Non-privileged users are allowed to run zpool list, zpool iostat, zpool status, zpool get, zfs list, and zfs get.

Enable LZ4 compression (recommended):

zfs set compression=lz4 tank

LZ4 compression runs on multiple CPU cores, so a multicore CPU helps. LZ4 is fast and adds at most about 20% more CPU load. It can also abort early on incompressible data, so compression is applied only where it is worthwhile.

ZFS slows down once a pool fills past 80% of its storage capacity, so do not fill your filesystems past 80%. If you don’t follow this advice, the performance of your ZFS pool may be permanently degraded.
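One way to keep an eye on this is to parse the scripted output of zpool list. The sketch below assumes the tab-separated format produced by zpool list -H -o name,capacity (for example "tank<TAB>85%"). The function name pools_over_threshold is made up for this example.

```shell
# Hypothetical helper: read "name<TAB>capacity%" lines on stdin and print
# the names of pools whose capacity is above the given percentage.
pools_over_threshold() {
    awk -F'\t' -v limit="$1" '{
        used = $2
        sub(/%/, "", used)              # strip the trailing percent sign
        if (used + 0 > limit + 0) print $1
    }'
}

# Only query real pools when the zpool command is installed.
if command -v zpool >/dev/null 2>&1; then
    zpool list -H -o name,capacity | pools_over_threshold 80
fi
```

Run from cron or a monitoring script, any pool name printed is past the 80% mark.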

You should know that a stripe of mirrors (RAID10) has a higher failure probability than RAIDZ2 (or higher) on most common setups, by several orders of magnitude. However, a mirror can resilver much faster, and resilvering does not significantly affect vdev performance.

URE risks: a 6x8TB RAID10 has a 47.3% chance of encountering an unrecoverable read error (URE) during a rebuild, while a 6x8TB RAIDZ2 has a 0.0002% chance!
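The 47.3% figure is consistent with reading one full 8 TB disk (the surviving mirror partner) at an assumed error rate of one URE per 10^14 bits read. That rate is an assumption here, not something stated above, and the sketch uses the approximation P = 1 - exp(-bits * rate). The function name ure_probability_pct is invented for this example, and it does not attempt to reproduce the RAIDZ2 figure.

```shell
# Hypothetical helper: chance (in percent) of at least one URE while
# reading the given number of TB, assuming 1 error per 1e14 bits.
ure_probability_pct() {
    awk -v tb="$1" 'BEGIN {
        bits = tb * 1e12 * 8            # decimal TB to bits
        p = 1 - exp(-bits * 1e-14)      # approximation of 1 - (1 - rate)^bits
        printf "%.1f\n", p * 100
    }'
}

ure_probability_pct 8   # prints 47.3
```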

So generally you should go for RAIDZ2.

Encryption

Do not use encryption for the root of the pool.

Encrypted datasets cannot have unencrypted children.

Create an encryption root under the pool and put all encrypted datasets under it (they will automatically inherit encryption).

Create an encrypted root dataset:

sudo zfs create -o encryption=on -o keyformat=passphrase tank/data

If an interactive passphrase is set, you need to manually load the key and mount the encrypted root dataset after each reboot, then mount its children:

sudo zfs mount -l tank/data
sudo zfs mount tank/data/child

Compression / Optimization

zfs set compression=lz4 tank
zfs set xattr=sa tank

Set autoreplace=on (if you have hotspares)

Periodically scrub your pool

zpool scrub tank

SIMD Issue

Linux 5.1 (and other kernels) stopped providing SIMD acceleration to ZFS; this has since been fixed.

Check whether the kernel you are using has FPU support enabled:

grep -ir __kernel_fpu_begin /usr/src/kernels
cat /proc/spl/kstat/zfs/fletcher_4_bench

If AES-NI is being used then you will see ” ” at top in “perf top” output.

Once Issue#9749 is resolved, AES-GCM performance should improve greatly!

High Availability

See this page

Running PostgreSQL

zfs set atime=off tank/data/lxc

This doc explains using PostgreSQL on ZFS

Further, ZFS and compression actually improve performance when queries are IO bound.

A user argues that memory usage was extremely inefficient.

Running MySQL

See for running MySQL