I’ve been wondering whether it’s better for memory pages to be compressed at the hypervisor level, or on the VM level.
I’m leaning toward the VM level, because
1: VMs have better knowledge of memory pressure by the application, and can better decide when to swap pages out to zram. The VM has access to information about memory pages that the hypervisor doesn’t have.
2: if pages are compressed on the hypervisor level, the VM doesn’t “see” any increased memory available. The host box gains free memory, but the application never gets to use it. It just sees the same 8GB as it always has, so it never really benefits. This maybe lets you host more VMs on one box, but at the cost of the applications not being as efficient.
Is this a reasonable position? I’m wondering if I’m missing something obvious.


Unless you are running at really large scales, or at really small scales and trying to fit stuff that doesn’t quite fit, memory compression may not be a significant enough optimization to spend a lot of time experimenting with. But I’m bored and currently on an 8 GB device, so here are my thoughts dumped out from my recent testing:
Zram vs Zswap (can be done at the hypervisor or inside the VM):
Kernel same-page merging (KSM) (would be done at the hypervisor level; ESXi has an equivalent feature called Transparent Page Sharing):
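If you want a rough number for what KSM is actually saving on a Linux/KVM host, the kernel exposes counters under /sys/kernel/mm/ksm. Below is a minimal sketch, assuming those sysfs files are present and readable. The function name `ksm_saved_bytes` is my own, and treating `pages_sharing` times the page size as "bytes saved" is only an approximation.

```python
import os
from pathlib import Path


def ksm_saved_bytes(ksm_dir="/sys/kernel/mm/ksm", page_size=None):
    """Return KSM state and an approximate saving, or None if KSM is unavailable.

    ksm_saved_bytes is a hypothetical helper name; the sysfs files it reads
    (run, pages_sharing) are the ones the Linux kernel documents for KSM.
    """
    base = Path(ksm_dir)
    try:
        # "1" in the run file means the KSM scanner is active.
        running = (base / "run").read_text().strip() == "1"
        # pages_sharing counts the extra mappings that point at a shared page,
        # which is roughly the number of pages that did not need their own copy.
        sharing = int((base / "pages_sharing").read_text())
    except (OSError, ValueError):
        return None
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    return {"running": running, "saved_bytes": sharing * page_size}


if __name__ == "__main__":
    stats = ksm_saved_bytes()
    if stats is None:
        print("KSM counters not found on this machine")
    else:
        print(f"KSM running: {stats['running']}, "
              f"approx saved: {stats['saved_bytes'] / 2**20:.1f} MiB")
```

Run it on the hypervisor, not inside a guest, since that is where the merging would happen in this setup.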
In my opinion, the best thing is to enable zram or zswap at the virtual machine level and kernel same-page merging at the hypervisor level, assuming you take into account and accept the marginal security risk and slightly weaker isolation that come with KSM. There isn’t any point running zswap at two layers, because the hypervisor is just gonna spend a lot of time trying to compress stuff that’s already been compressed. Then KSM deduplicates memory across the VMs on that host. You may actually see worse savings overall, though, if zram/zswap compression is only semi-deterministic: identical data in two VMs might no longer end up as identical pages, which makes deduplication harder.
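To see why a second compression layer mostly wastes CPU, here is a small sketch that compresses a fake 4 KiB "page" twice. It uses zlib from the Python standard library purely as a stand-in (zram and zswap normally use compressors like lzo, lz4 or zstd), and the page contents and the `make_page` helper are made up for illustration.

```python
import random
import zlib


def make_page(size=4096, seed=0):
    """Build a fake, moderately compressible page from a small vocabulary.

    Hypothetical helper: real memory pages look nothing like this, it only
    gives the compressor some redundancy to find.
    """
    rng = random.Random(seed)
    words = [b"alloc", b"free", b"page", b"cache",
             b"swap", b"guest", b"host", b"dirty"]
    buf = bytearray()
    while len(buf) < size:
        buf += rng.choice(words) + b" "
    return bytes(buf[:size])


page = make_page()
once = zlib.compress(page)    # what the guest-level layer would store
twice = zlib.compress(once)   # a second layer working on already-compressed data

print(f"original: {len(page)} bytes")
print(f"compressed once: {len(once)} bytes")
print(f"compressed twice: {len(twice)} bytes")
```

The first pass should shrink the page a lot, while the second pass has almost nothing left to squeeze and can even come out slightly larger. Exact sizes depend on the data and the compressor, so treat this as a demonstration of the effect rather than a benchmark.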
I agree with the other commenter as well about zram being weird with some workloads. I’ve heard of (I think it was) Blender interacting weirdly with zram: zram is a swap device that lives in RAM, so it takes away from the total memory available, whereas zswap compresses pages in memory before they go out to swap. If you really need to know, you gotta test.
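For the testing part, one thing you can measure inside the VM is the compression ratio zram is actually getting. This is a rough sketch, assuming a device named zram0 and the `mm_stat` field order from the kernel's zram documentation (orig_data_size, compr_data_size, mem_used_total first). The function names are my own.

```python
from pathlib import Path


def parse_zram_mm_stat(text):
    """Parse the first three fields of a zram mm_stat line (all in bytes)."""
    fields = [int(x) for x in text.split()]
    orig, compr, used = fields[0], fields[1], fields[2]
    # Ratio of uncompressed data to its compressed size; None if nothing stored yet.
    ratio = orig / compr if compr else None
    return {
        "orig_data_size": orig,
        "compr_data_size": compr,
        "mem_used_total": used,
        "ratio": ratio,
    }


def read_zram_stats(device="zram0"):
    """Read stats for a zram device, or return None if it does not exist."""
    path = Path("/sys/block") / device / "mm_stat"
    try:
        return parse_zram_mm_stat(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    stats = read_zram_stats()
    if stats is None:
        print("no zram0 device found")
    elif stats["ratio"] is None:
        print("zram0 exists but holds no data yet")
    else:
        print(f"zram0 compression ratio: {stats['ratio']:.2f}x, "
              f"RAM used: {stats['mem_used_total'] / 2**20:.1f} MiB")
```

Check it while the real workload is running and under memory pressure, since an idle VM tells you very little.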