[talk] Does swap still matter?

John Baldwin jhb at FreeBSD.org
Thu Apr 7 20:16:29 EDT 2022

On 4/7/22 1:46 PM, Charles Sprickman wrote:
>> On Apr 7, 2022, at 1:32 PM, John Baldwin <jhb at FreeBSD.org> wrote:
>> FWIW, I still configure swap on systems I install, but really it's just to
>> allocate space to hold kernel crashdumps, not because I really need/want
>> swap.
> I’m in the same habit, and I’m generally fairly generous on anything where space isn’t scarce (but on the odd “cloud” VM, I’m much more stingy as storage is pricey).
> A few questions, as I’ve been at this long enough that some of my information is probably outdated:
> - Is there a formula for how much swap space to allocate to be able to always fit the full kernel dump? I have one server at home that’s had a few issues, and with 8GB of RAM (which generally sees at least half of that being used by zfs ARC), my vmcore files are around 4GB. Curious what I’d see on a host with 64GB or 128GB of RAM.

Hummm, I don't know of a good one.  minidumps should mean that a crashdump
is less than all of RAM, but it can be a sizable chunk of it, especially
with ZFS.  The last machine I installed from scratch is a laptop with 16G
of RAM and I went with 16G of swap.  I recently upgraded my desktop to 64G
of RAM (good for all the compiling I do), but still went with 16G of swap
on each disk (and a core can't span multiple disks, so it's the single
disk value that counts).  I haven't had a panic on my new desktop yet to
see if it will fit. :-/
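
For reference, this is roughly how the dump side gets wired up; minidumps
are the default these days, and dumpdev in rc.conf points crashdumps at
swap (device names here are only examples):

```shell
# /etc/rc.conf -- use the first configured swap device for crash dumps
dumpdev="AUTO"

# Minidumps only write pages the kernel has in use, so the dump is
# usually a good deal smaller than total RAM; confirm they are enabled:
sysctl debug.minidump    # 1 = minidumps enabled (the default)
```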

> - With SSDs being so common, especially for “boot” drives where swap would likely live, are there any special considerations? Do swap partitions get TRIM support? I also assume SSDs make a system going into swap due to physical memory exhaustion suffer far less than it would with spinny drives that are thrashing around between swap at the beginning of the drive and other data being spread out all over the remainder of the drive (but could also eat up your precious write cycles if swap usage is not being monitored).

Hmmmmm, I don't know if we try to TRIM swap partitions.  It is true that
SSDs make swapping less terrible.  An anecdote I heard about iPhones
early on is that they swap like mad, but using flash storage means it
still performs ok.  I'm not sure if that is still true of newer versions,
but it wouldn't surprise me as the amount of DRAM in mobile devices has
a power cost (and power is the primary "currency" of a mobile BOM).

In terms of write cycles, in my experience to date (and of folks I've talked
to at other places), that seems a bit overblown as SSDs seem to last a
fairly long time such that it's not really a practical worry.
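
For what it's worth, I believe recent releases do have a knob for this:
swapon(8) grew a trimonce fstab option (the -E flag) that issues a TRIM
over the whole device when swap is enabled. A sketch, assuming that
option is available and with an example device name:

```shell
# /etc/fstab -- TRIM the entire swap partition once at swapon time
# ("trimonce" is equivalent to running swapon -E by hand)
/dev/ada0p2    none    swap    sw,trimonce    0    0
```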

> - Anywhere I have multiple drives, I always mirror my swap, using gmirror. Any drawbacks to mirroring? Is gmirror still the only proper option for this?

So I don't use gmirror for swap, I just add the disks as individual swap
devices in /etc/fstab.  The swap pager in FreeBSD will spread swap
access across multiple devices just fine, and it's not clear to me that
mirroring is a win for a swap workload (you are doing double writes, so
swapping out is slower vs multiple swap devices which can potentially
schedule writes of different data to different disks).  Maybe mirroring
would help with reading the data back in faster, but usually if you are
reading enough from swap for that to matter you are thrashing and already
in trouble.
> - Is swap on zfs advisable in any situations yet?

I create a separate GPT partition for swap rather than trying to swap to
a ZVOL (assuming that is what you mean).  I suspect that will generally
perform better than a ZVOL for swap.  Also, swap gets used when you are
in a low-memory situation, and you ideally want to avoid complexity in
the I/O path to swap to avoid potential low-memory deadlocks (e.g.
needing to allocate memory to duplicate an I/O request crossing a
gmirror device, or to allocate data structures ZFS needs for writes to a
zvol) vs just a dumb write straight to a partition on a disk, which can
generally avoid malloc.
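
If it helps, carving out such a partition looks something like this
(sizes and the ada0 device name are examples; the partition index in the
swapon line depends on your layout):

```shell
# Add a dedicated freebsd-swap partition alongside the ZFS partitions,
# 1 MB aligned, then enable it directly -- no zvol in the path
gpart add -t freebsd-swap -s 16g -a 1m ada0
swapon /dev/ada0p2
```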

John Baldwin
