I was wondering: do you know of a limit on how many rootless containers one can run on a Linux host?
I'm running Fedora Server and have resources to spare, but once I pass about 15 containers, Podman starts to hang and crash.
I then need to manually delete the storage folder under ~/.local/share/...
for Podman to work again.
It might be related to the userns keep-id flag.
My config is nothing special: all containers run as user 1000 via Quadlets. Sporadically, I get:
Dec 05 13:40:27 home-lab systemd-coredump[200811]: [🡕] Process 200795 (podman) of user 1000 dumped core.
Module libbz2.so.1 from rpm bzip2-1.0.8-18.fc40.x86_64
Module libsepol.so.2 from rpm libsepol-3.7-2.fc40.x86_64
Module libpcre2-8.so.0 from rpm pcre2-10.44-1.fc40.x86_64
Module libcap-ng.so.0 from rpm libcap-ng-0.8.4-4.fc40.x86_64
Module libgpg-error.so.0 from rpm libgpg-error-1.49-1.fc40.x86_64
Module libpam_misc.so.0 from rpm pam-1.6.1-4.fc40.x86_64
Module libpam.so.0 from rpm pam-1.6.1-4.fc40.x86_64
Module libattr.so.1 from rpm attr-2.5.2-3.fc40.x86_64
Module libacl.so.1 from rpm acl-2.3.2-1.fc40.x86_64
Module libcrypt.so.2 from rpm libxcrypt-4.4.36-10.fc40.x86_64
Module libeconf.so.0 from rpm libeconf-0.6.2-2.fc40.x86_64
Module libsemanage.so.2 from rpm libsemanage-3.7-2.fc40.x86_64
Module libselinux.so.1 from rpm libselinux-3.7-5.fc40.x86_64
Module libaudit.so.1 from rpm audit-4.0.2-1.fc40.x86_64
Module libseccomp.so.2 from rpm libseccomp-2.5.5-1.fc40.x86_64
Module podman from rpm podman-5.3.1-1.fc40.x86_64
Stack trace of thread 200805:
#0 0x0000558789bfa4a1 runtime.raise.abi0 (podman + 0x934a1)
#1 0x0000558789bd6cc8 runtime.sigfwdgo (podman + 0x6fcc8)
#2 0x0000558789bd51a5 runtime.sigtrampgo (podman + 0x6e1a5)
#3 0x0000558789bfa7a9 runtime.sigtramp.abi0 (podman + 0x937a9)
#4 0x00007efdbc0cad00 __restore_rt (libc.so.6 + 0x40d00)
#5 0x0000558789bfa4a1 runtime.raise.abi0 (podman + 0x934a1)
#6 0x0000558789bbda26 runtime.fatalpanic (podman + 0x56a26)
#7 0x0000558789bbc998 runtime.gopanic (podman + 0x55998)
#8 0x0000558789bd64d8 runtime.sigpanic (podman + 0x6f4d8)
#9 0x000055878a5a7842 github.com/containers/storage.(*layerStore).load (podman + 0xa40842)
#10 0x000055878a5a9608 github.com/containers/storage.(*store).newLayerStore (podman + 0xa42608)
#11 0x000055878a5bc7dd github.com/containers/storage.(*store).getLayerStoreLocked (podman + 0xa557dd)
#12 0x000055878a5bc935 github.com/containers/storage.(*store).getLayerStore (podman + 0xa55935)
#13 0x000055878a5cc451 github.com/containers/storage.(*store).Mounted (podman + 0xa65451)
#14 0x000055878ac99b88 github.com/containers/podman/v5/libpod.(*storageService).UnmountContainerImage (podman + 0x1132b88)
#15 0x000055878abec81a github.com/containers/podman/v5/libpod.(*Container).unmount (podman + 0x108581a)
#16 0x000055878abe8865 github.com/containers/podman/v5/libpod.(*Container).cleanupStorage (podman + 0x1081865)
#17 0x000055878abe965b github.com/containers/podman/v5/libpod.(*Container).cleanup (podman + 0x108265b)
#18 0x000055878ac6c2ce github.com/containers/podman/v5/libpod.(*Runtime).removeContainer (podman + 0x11052ce)
#19 0x000055878ac6aad0 github.com/containers/podman/v5/libpod.(*Runtime).RemoveContainer (podman + 0x1103ad0)
#20 0x000055878ad05948 github.com/containers/podman/v5/pkg/domain/infra/abi.(*ContainerEngine).removeContainer (podman + 0x119e948)
#21 0x000055878ad06745 github.com/containers/podman/v5/pkg/domain/infra/abi.(*ContainerEngine).ContainerRm.func1 (podman + 0x119f745)
#22 0x000055878ace297b github.com/containers/podman/v5/pkg/parallel/ctr.ContainerOp.func1 (podman + 0x117b97b)
#23 0x000055878aade678 github.com/containers/podman/v5/pkg/parallel.Enqueue.func1 (podman + 0xf77678)
#24 0x0000558789bf8c41 runtime.goexit.abi0 (podman + 0x91c41)
ELF object binary architecture: AMD x86-64
I have enough RAM, CPU, and disk to spare…
When this error happens, I can't run any podman commands without a core dump; for example, I can't run
podman images
podman ps
and so on… The only fix is to delete the storage folder manually and pull all the images again, after which everything is back to normal.