Booting NixOS with a missing drive

leisesprecher@feddit.org · 2 months ago


Starfighter@discuss.tchncs.de · edit-2 2 months ago

You can boot a live USB, mount the remaining drive(s), and use nixos-enter to get into the installed system, just like you would when installing NixOS for the first time.

This allows you to make changes and build a new generation using the network connection of the live-usb.
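In case it helps, here is a rough sketch of those steps as a script. The device names (/dev/sdX1, /dev/sdX2) and the ROOT_DEV, BOOT_DEV and DRY_RUN variables are placeholders I made up for illustration, so check your actual layout with lsblk first. It only prints the commands unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Dry-run by default: prints each command instead of running it.
# Set DRY_RUN=0 (as root, on the live USB) to actually execute.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

enter_installed_system() {
    # Hypothetical device names; replace them with your real partitions.
    root_dev="${ROOT_DEV:-/dev/sdX2}"
    boot_dev="${BOOT_DEV:-/dev/sdX1}"

    # Mount the surviving root filesystem, then the boot partition inside it.
    run mount "$root_dev" /mnt
    run mount "$boot_dev" /mnt/boot

    # nixos-enter chroots into the NixOS installation found under /mnt.
    run nixos-enter --root /mnt
}

enter_installed_system
```

Once the printed commands look right for your disks, something like DRY_RUN=0 ROOT_DEV=... BOOT_DEV=... would run them for real.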

leisesprecher@feddit.org · edit-2 2 months ago

I didn’t know NixOS has its own chroot.

I’m “chrooted” into the system, however, user mapping doesn’t seem to work currently. Logrotate fails to build because of a missing user ID.

EDIT: I commented out all the “advanced” features and had to add some flags.

Setting services.logrotate.checkConfig = false; in the config solved the “user or group not found” issue, and I had to add NIXOS_SWITCH_USE_DIRTY_ENV=1 to nixos-rebuild (https://nixos.wiki/wiki/Change_root).

Now it seems to work, and I can reactivate all the deactivated parts piece by piece.
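For anyone finding this later, a minimal sketch of what that rebuild step might look like from inside the chroot. The boot subcommand is my assumption (the post above only says nixos-rebuild), and DRY_RUN is a made-up helper variable so that the script only prints the command by default.

```shell
#!/bin/sh
# Dry-run by default: prints the command instead of running it.
# Set DRY_RUN=0 inside the nixos-enter shell to actually rebuild.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_in_chroot() {
    # Assumes the configuration already contains the workaround from above:
    #   services.logrotate.checkConfig = false;
    # The "boot" subcommand is an assumption, not taken from the thread.
    run env NIXOS_SWITCH_USE_DIRTY_ENV=1 nixos-rebuild boot
}

rebuild_in_chroot
```

With boot, the new generation would only become active on the next reboot, which seemed like the safer choice to illustrate from inside a chroot.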

over_clox@lemmy.world · 2 months ago

Can I ask what kind of drive the failed drive is?

If by chance it’s a Western Digital or Seagate spinning hard drive, I recommend taking the controller board off the drive and cleaning the contact traces with a pencil eraser.

You’d be surprised how many of those drives I’ve fixed that way. It’s like they didn’t even properly finish the manufacturing process, and those contact points corrode over time. Hitachi did their drives right though, they put solder beads on those contact points to prevent corrosion.

leisesprecher@feddit.org · 2 months ago

It’s a Seagate, yes. And that might actually work, but currently it looks more like about 2 bazillion bad sectors.

over_clox@lemmy.world · 2 months ago

I can’t speak for whether those existing bad sectors can or can’t be recovered, but yeah, when there are bad connections between the controller board and the drive, the actuator heads will have a less than stable current going to them and won’t be able to reliably align over the tracks.

It still might be enough to get you back up and running long enough to edit your fstab file and remove the failing drive from the configuration.