Memory compression on hypervisor vs host

rbos@lemmy.ca · 2 days ago


moonpiedumplings · 2 days ago

Unless you are running at really large scales, or at really small scales and trying to fit stuff that doesn’t quite fit, memory compression may not be a significant enough optimization to spend a lot of time experimenting with. But I’m bored and currently on an 8 GB device, so here are my thoughts dumped out from my recent testing:

Zram vs Zswap (can be done at hypervisor or at host):

One or the other is commonly enabled on many modern distros. It is a perfectly reasonable position to simply use the distro’s defaults and not push it any further.
Zram has much, much better compression, but suffers from LRU inversion. Essentially, after zram is full, fresh pages (memory) go to the disk swap instead, while the old, cold pages stay in zram. Since the fresh pages will probably be needed again soon, it is slower to get them from the disk than it would be to get them from zram.
Zswap has much, much worse compression, but cold, unused pages are moved to the disk swap automatically, freeing up space.
I am investigating ways to get around the above. See my thoughts on this and other differences here: https://github.com/moonpiedumplings/moonpiedumplings.github.io/blob/main/playground/asahi-setup/index.md#memory-optimization
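
If you want to check what compression you are actually getting from zram rather than guess, here is a minimal sketch. It assumes the device is `zram0` and relies on the first three fields of `/sys/block/zram0/mm_stat` being `orig_data_size`, `compr_data_size` and `mem_used_total` (in bytes), which is how the kernel zram docs describe them. The helper names and the sample numbers are made up for illustration.

```python
from pathlib import Path
from typing import NamedTuple


class ZramStats(NamedTuple):
    orig_data_size: int   # bytes of uncompressed data stored in the device
    compr_data_size: int  # bytes of that data after compression
    mem_used_total: int   # bytes of RAM actually used, allocator overhead included


def parse_mm_stat(text: str) -> ZramStats:
    """Parse the whitespace-separated mm_stat line; only the first three fields are used."""
    fields = [int(x) for x in text.split()]
    return ZramStats(*fields[:3])


def effective_ratio(stats: ZramStats) -> float:
    """Uncompressed bytes stored per byte of RAM consumed (0.0 for an empty device)."""
    if stats.mem_used_total == 0:
        return 0.0
    return stats.orig_data_size / stats.mem_used_total


def read_zram_stats(device: str = "zram0") -> ZramStats:
    """Read live stats; the device name is an assumption, adjust it for your system."""
    return parse_mm_stat(Path(f"/sys/block/{device}/mm_stat").read_text())


# Made-up sample line, only to show the arithmetic.
sample = "4096000 1024000 2048000 0 2048000 0 0 0"
stats = parse_mm_stat(sample)
print(f"effective ratio: {effective_ratio(stats):.2f}x")
```

On a machine with zram enabled, `effective_ratio(read_zram_stats())` gives the live number. Dividing by `mem_used_total` instead of `compr_data_size` is a deliberate choice here, since it counts the memory the device really costs you.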

Kernel same-page merging (KSM) (would be done at the hypervisor level; ESXi has an equivalent feature under a different name, Transparent Page Sharing):

Only really efficient if you have lots of identical or near-identical virtual machines
Used to overcommit (promise more re
- Dangerous, but highly cost saving. Many cheap VPS providers do this in order to save money. You can run four 8 GB VPSes on 24 GB of RAM and take a semi-safe bet that not all of the memory will be used at once.
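
To put numbers on that bet, here is a small sketch of the overcommit arithmetic using the four 8 GB guests on 24 GB example. The function names and the "typical usage" figures are hypothetical, and it ignores the hypervisor's own memory needs and whatever KSM manages to merge.

```python
def overcommit_ratio(promised_gb: list[float], physical_gb: float) -> float:
    """Memory promised to guests divided by physical RAM; above 1.0 means overcommitted."""
    return sum(promised_gb) / physical_gb


def fits_in_ram(used_gb: list[float], physical_gb: float) -> bool:
    """True if what the guests are actually using right now fits in physical RAM."""
    return sum(used_gb) <= physical_gb


guests = [8, 8, 8, 8]  # four 8 GB VPSes
physical = 24          # GB of RAM on the hypervisor

print(f"overcommit ratio: {overcommit_ratio(guests, physical):.2f}")

# Hypothetical usage snapshots: a normal day, and the day every guest fills up.
print("typical day fits:", fits_in_ram([5, 6, 4, 7], physical))
print("worst case fits:", fits_in_ram(guests, physical))
```

The bet only pays off while real usage stays under the physical total. In the worst case the hypervisor is 8 GB short and has to swap or kill something.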

In my opinion, the best thing is to enable zram or zswap at the virtual machine level and kernel same-page merging at the hypervisor level, assuming you take into account and accept the marginal security risk and slightly weaker isolation that comes with KSM. There isn’t any point running zswap at two layers, because the hypervisor is just gonna spend a lot of time trying to compress stuff that’s already been compressed. Then KSM deduplicates memory across the virtual machines. Although you may actually see worse savings overall if zram/zswap compression is only semi-deterministic and makes deduplication harder.
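
The "compressing stuff that’s already been compressed" point is easy to see for yourself. This is a rough sketch that uses Python's `zlib` as a stand-in, which is an assumption on my part: zram and zswap use kernel compressors like lzo, lz4 or zstd, and the sample data here is synthetic, not real guest memory. The general effect should carry over though.

```python
import random
import zlib


def make_sample(n_words: int = 20000, seed: int = 0) -> bytes:
    """Build compressible fake data from a small vocabulary (synthetic, for illustration)."""
    rng = random.Random(seed)
    vocab = [b"kernel", b"page", b"cache", b"guest", b"swap", b"zram", b"zswap", b"memory"]
    return b" ".join(rng.choice(vocab) for _ in range(n_words))


def compress_twice(data: bytes) -> tuple[int, int, int]:
    """Return sizes in bytes: raw, after one compression pass, after a second pass."""
    once = zlib.compress(data)
    twice = zlib.compress(once)  # second layer only sees already-compressed bytes
    return len(data), len(once), len(twice)


raw, once, twice = compress_twice(make_sample())
print(f"raw: {raw} bytes")
print(f"first pass: {once} bytes ({raw / once:.1f}x smaller)")
print(f"second pass: {twice} bytes ({once / twice:.2f}x relative to first pass)")
```

The first pass shrinks the data a lot, while the second pass has almost nothing left to work with and still costs CPU time, which is the same situation the hypervisor is in when the guests already compress their own memory.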

I agree with the other commenter as well about zram being weird with some workloads. For example, I’ve heard of (I think it was) Blender interacting weirdly with zram: zram is a swap device carved out of RAM, so it leaves less total memory directly available, whereas zswap is a compressed cache in front of the real swap. If you really need to know, you gotta test.