I know that for data storage the best bet is a NAS and RAID1 or something in that vein, but what about all the docker containers you are running, carefully configured services on your rpi, installed *arr services on your PC, etc.?

Do you have a simple way to automate backups and re-installs of these as well or are you just resigned to having to eventually reconfigure them all when the SD card fails, your OS needs a reinstall or the disk dies?

  • ikidd@lemmy.world
    8 months ago

    I run everything on a two-node Proxmox cluster with ZFS mirror volumes and replication of the VMs and CTs between the nodes. PBS takes hourly snapshots, and I sync those to multiple USB drives that I rotate off-site.

    The Docker VM gets a ZFS snapshot before major updates so I can roll back.
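
A minimal sketch of the snapshot-before-update routine described above, assuming the VM disk lives on a ZFS dataset. The dataset name `rpool/data/vm-100-disk-0` and the `pre-update-` label are hypothetical placeholders, not details from the post. The script only builds and prints the `zfs snapshot` / `zfs rollback` commands unless `dry_run=False` is passed.

```python
import subprocess
from datetime import datetime, timezone

# Hypothetical dataset name; substitute the dataset backing your own VM disk.
DATASET = "rpool/data/vm-100-disk-0"


def snapshot_cmd(dataset: str, label: str) -> list[str]:
    """Build the command that creates the snapshot dataset@label."""
    return ["zfs", "snapshot", f"{dataset}@{label}"]


def rollback_cmd(dataset: str, label: str) -> list[str]:
    """Build the command that rolls the dataset back to dataset@label.

    Plain `zfs rollback` only targets the most recent snapshot; going
    further back needs `-r`, which destroys the newer snapshots.
    """
    return ["zfs", "rollback", f"{dataset}@{label}"]


def run(cmd: list[str], dry_run: bool = True) -> None:
    """Print the command, and execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    label = "pre-update-" + datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    run(snapshot_cmd(DATASET, label))   # take this before the update
    run(rollback_cmd(DATASET, label))   # use this only if the update goes wrong
```

The VM should be shut down before rolling back, since the disk contents change underneath it.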

    • twei@feddit.de
      7 months ago

      You should get another node. Otherwise, when node1 fails, node2 will reboot itself and then do nothing, because it has no quorum.

        • twei@feddit.de
          7 months ago

          I know, but every time I had to do that it felt like a janky solution. If you have a Raspberry Pi or something like that, you can also set it up as a QDevice.

          …and if you’re completely fine with how it is, you can also just leave it as it is.
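
The quorum point in this exchange comes down to simple vote arithmetic. Here is a rough sketch, assuming one vote per node and a simple-majority rule (more than half of the expected votes); the function names are made up for illustration and are not part of any Proxmox or corosync API.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that is a strict majority."""
    return expected_votes // 2 + 1


def is_quorate(expected_votes: int, votes_online: int) -> bool:
    """True if the surviving members still hold a majority."""
    return votes_online >= quorum_threshold(expected_votes)


# (description, expected votes, votes still online)
scenarios = [
    ("2 nodes, both up", 2, 2),
    ("2 nodes, one down", 2, 1),
    ("2 nodes + QDevice, one node down", 3, 2),
    ("3 nodes, one down", 3, 2),
]

for description, expected, online in scenarios:
    state = "quorate" if is_quorate(expected, online) else "NOT quorate"
    print(f"{description}: need {quorum_threshold(expected)}, have {online} -> {state}")
```

With two votes the majority is two, so losing either node leaves the survivor without quorum. A third vote, whether from a full node or a QDevice, lets the cluster lose one member and keep going.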

          • ikidd@lemmy.world
            7 months ago

            So I started to write a reply saying that I was basically OK doing that manually, but then thought, “hell, I have a PBS box on the network that would do that fine”. It took about 3 minutes to install the corosync-qdevice packages on all three machines and enable it. Good to go.

            Thanks for the kick in the ass.

          • ikidd@lemmy.world
            7 months ago

            Since I now had a “quorate” cluster again, I thought I’d try out HA. I’d always been under the impression that you couldn’t HA anything unless you had a shared storage LUN. But I thought I’d trigger a replication and then take down the second node, just as a test. Lo and behold, the first node brought up my OPNsense VM from the replicated image about 2 minutes after the second node lost contact, and the internet started working again.

            I’m really excited about having that feature working now. This was a good night, thank you.

            • twei@feddit.de
              7 months ago

              If you need another thing to do, you could try making OPNsense itself HA, so your internet never stops working while a node reboots. It’s pretty simple to set up; you might finish it in 1-2 evenings. Happy clustering!

              • ikidd@lemmy.world
                7 months ago

                I’ll look into that. I did see the option in OPNsense once upon a time but never investigated it.