Hi,

Reading here and there, I'm getting more and more scared about bit rot or similar problems (firmware errors? worn-out SSD NAND?): the disks seem to work fine, but the data might be silently corrupted, so my simple backup strategy - two rsynced copies made at different times and kept in different places, one on SSD and the other on HDD - may not be enough. Please note that all my personal data are on ext4 filesystems and total less than 1 TB (OK, it's not a datahoarding size, but this is a sub where the experts on this topic are). Maybe the probability is low, and the probability that a critical file is affected is lower still, but you know Murphy? I do.

Now, the gold-standard solution would be to replace all of my physical servers with ones that support ECC RAM; then I'd have to buy at least 3 CMR disks to build a ZFS RAID or a similar btrfs one. In practice this solution is not sustainable because of time, space and cost, so I have to accept some risk and go for a second-best solution… but which one? I would also like to avoid using other media types (i.e. optical).

For example, might using a backup tool - restic/kopia or Proxmox Backup Server - reduce the risk? I ask because an incremental approach would allow me to restore data from a selected point in the past. Of course, I have no way of finding the right point in the past and, moreover, I would lose all data produced after that point. Maybe I could apply this strategy just to a subset of very critical and immutable data (official documents)? Or, for these documents, could I just use rsync with the checksum option?

As usual, thanks for any suggestion!

  • bofh2023@alien.topB
    1 year ago

    Checksum your files, as you say. Just md5sum or sfv tools will do this and are quick and easy to check at any time.
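
The checksum idea above can be sketched in a few lines of Python. This is only an illustration, not a replacement for md5sum or SFV tools: it uses SHA-256 instead of MD5, and the function names (`file_digest`, `build_manifest`, `verify_manifest`) are made up for this example.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose digest changed."""
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p) != expected:
            bad.append(rel)
    return bad
```

The manifest would be built once for the immutable documents, saved somewhere outside the scanned directory (for example as JSON), and later passed to `verify_manifest` against the source or against either backup copy.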

    If you want a way to recover as well, create a certain % of parity data with a tool like par2. If you're just worried about the occasional flipped bit, a very small % of parity will go a really long way. Creating the par2 files is NOT super fast, but you only have to do it once.

    This is basically file-level RAID.
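
To make the "file-level RAID" comparison concrete, here is a toy sketch of the parity idea. It is a simplification and not how par2 works internally: par2 uses Reed-Solomon codes and can repair several damaged blocks, while this example keeps a single XOR parity block and can rebuild only one. It also assumes the damage is flipped bits, so the file length is unchanged. All names are hypothetical.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def split_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Split data into fixed-size blocks, zero-padding the last one."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if blocks and len(blocks[-1]) < block_size:
        blocks[-1] = blocks[-1].ljust(block_size, b"\x00")
    return blocks


def make_recovery_info(data: bytes, block_size: int) -> dict:
    """Store per-block digests (to locate damage) and one parity block."""
    blocks = split_blocks(data, block_size)
    return {
        "length": len(data),
        "block_size": block_size,
        "digests": [hashlib.sha256(b).hexdigest() for b in blocks],
        "parity": xor_blocks(blocks),
    }


def repair(data: bytes, info: dict) -> bytes:
    """Rebuild at most one damaged block from the others plus parity."""
    blocks = split_blocks(data, info["block_size"])
    bad = [
        i for i, b in enumerate(blocks)
        if hashlib.sha256(b).hexdigest() != info["digests"][i]
    ]
    if len(bad) > 1:
        raise ValueError("more than one damaged block; single parity cannot repair")
    if bad:
        others = [b for i, b in enumerate(blocks) if i != bad[0]]
        # XOR of all intact blocks and the parity block yields the missing one.
        blocks[bad[0]] = xor_blocks(others + [info["parity"]])
    return b"".join(blocks)[:info["length"]]
```

The recovery data costs one extra block no matter how many blocks the file has, which is why a small percentage of parity is enough to fix an occasional flipped bit.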

    • wireless82@alien.topOPB
      1 year ago

      Interesting, I will dig deeper. I think I should add to or reformulate the question, or I am missing something: I first have to make sure - or check - that the source files are not bitrotten, otherwise all the backups will contain bitrotten data, right? I mean, I could checksum, but on working data how can I get evidence of corruption? And on "cold" data - e.g. pictures - how can I be sure that damaged source files are not copied onto the backups? I can't - because of time - checksum 1 TB of data every night…
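
One possible way to approach the two concerns above, sketched under assumptions that are not from the thread: a file whose size and modification time are unchanged but whose hash differs is treated as suspected silent corruption, and only a limited number of bytes is re-hashed per run so that the whole 1 TB gets covered over many nights. The function `scrub`, the `state` dictionary layout and the `byte_budget` parameter are all hypothetical.

```python
import hashlib
import time
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def scrub(root: Path, state: dict, byte_budget: int) -> list[str]:
    """Update `state` in place; return paths suspected of silent corruption."""
    unchanged = []
    for p in sorted(root.rglob("*")):
        if not p.is_file():
            continue
        rel = p.relative_to(root).as_posix()
        st = p.stat()
        rec = state.get(rel)
        if rec is None or (rec["size"], rec["mtime_ns"]) != (st.st_size, st.st_mtime_ns):
            # New or deliberately modified file: record a fresh baseline.
            state[rel] = {
                "size": st.st_size,
                "mtime_ns": st.st_mtime_ns,
                "sha256": file_digest(p),
                "verified": time.time(),
            }
        else:
            unchanged.append(rel)

    # Re-hash the least recently verified files first, within the budget.
    unchanged.sort(key=lambda rel: state[rel]["verified"])
    suspects = []
    spent = 0
    for rel in unchanged:
        size = state[rel]["size"]
        if spent and spent + size > byte_budget:
            break  # at least one file is always checked per run
        spent += size
        if file_digest(root / rel) != state[rel]["sha256"]:
            suspects.append(rel)  # same size and mtime, different content
        else:
            state[rel]["verified"] = time.time()
    return suspects
```

If `scrub` returns anything, the nightly rsync could be skipped until the listed files are inspected. This cannot detect corruption that happened before a file's first baseline, and it cannot tell an intended edit from damage on files that change often, so it fits "cold" data such as pictures better than working data.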