Thoughts on (not) automating the setup of our first cloud server

May 16, 2024

I recently set up our first cloud server, in a flailing way that's probably familiar to anyone who still remembers their first cloud VM (complete with a later discovery of cloud provider 'upsell'). The background for this cloud server is that we want to check external reachability of some of our systems, in addition to the internal reachability already checked by our metrics and monitoring system. The actual implementation of this is quite simple; the cloud server runs an instance of the Prometheus Blackbox agent for service checks, and our Prometheus server performs a subset of our Blackbox service checks through it (in addition to the full set of service checks that are done through our local Blackbox instance).

(Access to the cloud server's Blackbox instance is guarded with firewall rules, because giving access to Blackbox is somewhat risky.)

The proper modern way to set up cloud servers is with some automated provisioning system, so that you wind up with 'cattle' instead of 'pets' (partly because every so often the cloud provider is going to abruptly terminate your server and maybe lose its data). We don't use such an automation system for our existing physical servers, so I opted not to try to learn both a cloud provider's way of doing things and a cloud server automation system at the same time, and set up this cloud server by hand. The good news for us is that the actual setup process for this server is quite simple, since it does so little and reuses our existing Blackbox setup from our main Prometheus server (all of which is stored in our central collection of configuration files and other stuff).

(As a result, this cloud server is installed in a way fairly similar to our other machine build instructions. Since it lives in the cloud and is completely detached from our infrastructure, it doesn't have our standard local setup and customizations.)

In a way this is also the bad news. If this server and its operating environment was more complicated to set up, we would have more motivation to pick one of the cloud server automation systems, learn it, and build our cloud server's configuration in it so we could have, for example, a command line 'rebuild this machine and tell me its new IP' script that we could run as needed. Since rebuilding the machine as needed is so simple and fast, it's probably never going to motivate us into learning a cloud server automation system (at least not by itself, if we had a whole collection of simple cloud VMs we might feel differently, but that's unlikely for various reasons).

Although setting up a new instance of this cloud server is simple enough, it's also not trivial. Doing it by hand means dealing with the cloud vendor's website and going through a bunch of clicking on things to set various settings and options we need. If we had a cloud automation system we knew and already had all set up, it would be better to use it. If we're going to do much more with cloud stuff, I suspect we'll soon want to automate things, both to make us less annoyed at working through websites and to keep everything consistent and visible.

(Also, cloud automation feels like something that I should be learning sooner or later, and now I have a cloud environment I can experiment with. Possibly my very first step should be exploring whatever basic command line tools exist for the particular cloud vendor we're using, since that would save dealing with the web interface in all its annoyance.)

Written on 16 May 2024.
« Turning off the X server's CapsLock modifier
The trade-offs in not using WireGuard to talk to our cloud server »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Thu May 16 22:52:33 2024
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.