
Setup file replication between web servers to reduce risk in case of critical failure #1

Open
JonTheNiceGuy opened this issue Jul 23, 2020 · 6 comments
Labels
discussion This is an option for the future, and needs weight from one or more of the lug admins

Comments

@JonTheNiceGuy
Member

JonTheNiceGuy commented Jul 23, 2020

Considerations are:

  1. Ceph
  2. Syncthing

Are there other options?

I considered using a single NFS share, but that introduces a single point of failure if the NFS node goes down. At least this way we have two nodes, both sharing data!

I've found more success with Syncthing than Ceph, BUT... Syncthing needs to be paired with BindFS, as I can't see how to make Syncthing replicate owner/group permissions on its own, which BindFS does give you.

Syncthing experiments are spread across the lugorguk.syncthing and lugorguk.admins roles (mostly for bindfs and putting the right files in the right places).

The Ceph install is in the lugorguk.ceph role, but no configuration or setup has been done yet.
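For reference, here's a minimal sketch of the BindFS arrangement (the paths and the www-data user are assumptions for illustration, not our actual layout): Syncthing replicates a plain directory between the hosts, and BindFS re-presents it at the web root with the ownership the web server expects.

```sh
# Syncthing replicates /srv/sync/web between HOST1 and HOST2 as-is;
# BindFS then exposes that tree at the web root, forcing ownership to
# www-data so the web server sees consistent permissions on both hosts.
sudo bindfs --force-user=www-data --force-group=www-data \
    /srv/sync/web /var/www/html
```

The equivalent /etc/fstab line (`/srv/sync/web /var/www/html fuse.bindfs force-user=www-data,force-group=www-data 0 0`) would make the mount survive reboots.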

JonTheNiceGuy added the discussion label on Jul 23, 2020
@JonTheNiceGuy
Member Author

One potential benefit of doing this is that we can set up round-robin DNS entries with short-ish TTLs targeting both HOST1 and HOST2 (arbitrary names for the "new" web servers); then, when we want to perform an upgrade, reboot or whatever, we "just" comment out the host that needs to be maintained.
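As a sketch of what that might look like (the zone name and addresses are invented for illustration):

```sh
# Hypothetical zone snippet: two A records with a 300-second TTL, so
# clients re-resolve within ~5 minutes of an edit. To take HOST2 out
# for maintenance, comment out its record and reload the zone.
#
#   www  300  IN  A  192.0.2.10   ; HOST1
#   www  300  IN  A  192.0.2.11   ; HOST2
#
# Confirm both records are being served:
dig +short www.example.lug.org.uk A
```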

@RalphCorderoy

What does the hosting infrastructure provide? Are the filesystems backed by block devices which are already redundant? Wouldn't that mean it's human error we're protecting against, like fat fingers, rather than component failure?

@JonTheNiceGuy
Member Author

I'm thinking more of critical failure like "I've just rebooted Server X and it's not come back up" than "I just deleted all the files". Syncthing does offer versioning too, if we wanted to try that route?

As mentioned, we could also use Syncthing to transfer files between hosts, which may be useful if we wanted to add an extra web server (going from HOST1 and HOST2 to HOST1, HOST2 and HOST3), or if we wanted an off-site backup, just by adding that remote host as a receive-only node. Does that make sense?
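A rough sketch of how that could be wired up through Syncthing's REST config API (this assumes Syncthing 1.12 or later, which added the /rest/config endpoints; the "webroot" folder ID and the API key variable are hypothetical):

```sh
# On the off-site host, mark the shared folder receive-only so it can
# never push changes back to the web servers.
curl -s -X PATCH \
     -H "X-API-Key: $SYNCTHING_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"type": "receiveonly"}' \
     http://localhost:8384/rest/config/folders/webroot

# Enable trash-can versioning on the same folder, keeping deleted or
# replaced files for 30 days.
curl -s -X PATCH \
     -H "X-API-Key: $SYNCTHING_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"versioning": {"type": "trashcan", "params": {"cleanoutDays": "30"}}}' \
     http://localhost:8384/rest/config/folders/webroot
```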

@RalphCorderoy

If Ansible can provision a container as our web server, say, then I'd have thought no package upgrades are applied to it. Instead, a new container is provisioned by Ansible, which could pull in later versions of the packaging, and we switch over. If there's a problem, the old container is still present for rollback.
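To make the switchover idea concrete (the container names, image tags and nginx front end here are all invented for illustration, not a description of anyone's setup):

```sh
# Start the replacement container alongside the current one.
docker run -d --name web-ac2ba7 lug/web:ac2ba7

# Repoint the front-end proxy at the new container and reload; the
# old container (web-37de3a) stays running for instant rollback.
ln -sf /etc/nginx/upstreams/web-ac2ba7.conf /etc/nginx/conf.d/web-upstream.conf
nginx -s reload
```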

Adding syncing on top of Ansible and the package manager adds complexity, and another layer to debug.

Note, I've no practical experience with any of this; it's just based on my understanding.

@gwestwood

Running multiple mirrored servers isn't something I've had much experience of, but I tend to agree with Ralph. The only thing I'm wondering is: is this to do with the hosting of LUG sites? In that case we need something to trigger replication when a Lugmaster (or other person within the LUG managing the website) makes changes to their website, if we are mirroring it across two servers.

@JonTheNiceGuy
Member Author

> In that case we need something to trigger replication when a Lugmaster (or other person within the LUG managing the website) makes changes to their website, if we are mirroring it across two servers.

Syncthing sets a filesystem watcher on a path or set of paths and pushes updates to all the attached nodes. Its web interface or REST API can be checked to confirm there are no issues with the sync.
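For example (the API key variable and "webroot" folder ID are hypothetical; both endpoints are standard Syncthing REST calls):

```sh
# Liveness check: returns {"ping": "pong"} if Syncthing is up.
curl -s -H "X-API-Key: $SYNCTHING_API_KEY" \
     http://localhost:8384/rest/system/ping

# Per-folder sync state: "state" should be "idle" once everything has
# replicated.
curl -s -H "X-API-Key: $SYNCTHING_API_KEY" \
     "http://localhost:8384/rest/db/status?folder=webroot"
```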

I've been using Syncthing at home for replicating all my projects between my NAS, my home server and my laptop. If I work on my home server (perhaps over SSH from my phone), then when I next fire up my laptop, it automatically re-connects and re-syncs all the changes.

> If Ansible can provision a container as our web server, say, then I'd have thought no package upgrades are applied to it. Instead, a new container is provisioned by Ansible, which could pull in later versions of the packaging, and we switch over. If there's a problem, the old container is still present for rollback.

Honestly, I don't think many of our lugmasters are at that point yet! I think we're more likely to be at the point where the web service is more in line with what we're doing today, "you have a [DOKUWIKI] web service; here's where your files live", than "you have just upgraded from static.some.lug.org.uk:37de3a to static.some.lug.org.uk:ac2ba7; rolling out to hosts A, B and C".
