Webmail providers (and others) hiding user IPs was the right decision
Once upon a time, when GMail was new, one of my gripes about it was that, unlike Hotmail, Yahoo, and other webmail providers of the time, it didn't add a header to outgoing email recording the IP address that the user submitted the message from. My mail filtering systems of the time liked to use that origin IP address as part of their filtering decisions, and GMail's lack of it made it harder to selectively deal with spam from GMail (there has always been spam from GMail).
In the spirit of admitting past mistakes, I was wrong here. Yes, not having that information made my life harder in dealing with GMail spam issues. But the history of people mining every piece of privacy-invasive information they can has made it clear that GMail made the right decision overall. Denying other people potentially sensitive information about where particular GMail users are is the right decision for the modern Internet and has been for some time. All webmail providers should be doing the same if they aren't already, and in fact really everyone should be. Where your users submit email from is no one else's business, and outsiders shouldn't be able to snoop into it, because it reveals potentially sensitive information.
(These days it may be a violation of various privacy regulations to pass this information over to other people by putting it in email headers.)
We used to be pretty decent about this ourselves because all of our
email had to be submitted from local networks, including our VPN
servers, and so the outside location of our users wasn't revealed
(if they were outside and VPN'ing in). These days we have an
authenticated SMTP submission server with a standard MTA configuration,
which means that it leaks the submitting client's IP address in the
default Received header, and also a webmail server that adds its own synthetic
Received header with IP address information. At some point we
should probably deal with both of these issues.
(The authenticated SMTP submission server can drop the IP address
from the Received header it generates and just put in the
authenticated user and the Exim message ID (which is enough to trace
the message in our logs and recover that information). As for the webmail
system, perhaps it can be configured to leave out that information
and only put it into logs or the like.)
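To make the idea concrete, here is a minimal Python sketch of what such a scrubbed Received header could look like. It is an illustration only, not how our Exim setup does or would do it (in Exim the header text is normally set in the MTA configuration, for example through the `received_header_text` option, rather than by post-processing). The function name, the exact header wording, and the `submit.example.org` host name are all made up for the example, and it assumes the topmost Received header is the one the submission server added.

```python
from email import message_from_string
from email.message import Message
from email.utils import formatdate
from typing import Optional


def scrub_submission_received(msg: Message, auth_user: str, message_id: str,
                              server: str, when: Optional[float] = None) -> Message:
    """Replace the topmost Received header with one that has no client IP.

    Assumption: the first Received header is the one our own submission
    server added. The replacement wording is illustrative, not Exim's.
    """
    replacement = (
        f"by {server} with esmtpsa (authenticated user {auth_user}) "
        f"id {message_id}; {formatdate(when)}"
    )
    # replace_header() rewrites the first matching header in place and
    # keeps the overall header order; it raises KeyError if there is none.
    msg.replace_header("Received", replacement)
    return msg


if __name__ == "__main__":
    # Hypothetical message as the submission server might have stamped it.
    raw = (
        "Received: from laptop ([203.0.113.7]) by submit.example.org "
        "with esmtpsa id 1abcDE-000123-4F; Thu, 01 Jan 1970 00:00:00 +0000\n"
        "From: alice@example.org\n"
        "Subject: hello\n"
        "\n"
        "body\n"
    )
    cleaned = scrub_submission_received(
        message_from_string(raw), "alice", "1abcDE-000123-4F",
        "submit.example.org",
    )
    print(cleaned["Received"])
```

The authenticated user and the message ID are all that is kept, since those are what you would need in order to look the message up in the server's logs.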
PS: If this feels like an obvious thing today, that shows how far things have shifted on the Internet since GMail was originally introduced, or at least how perceptions and understandings have shifted.
Our (unusual) freedom to use alerts as notifications
Many guides to deciding what to alert on draw a strong distinction between alerts and less important things (call them 'notifications'). The distinction generally comes about because alerts will disturb the people who are on call outside of working hours, and that should be reserved for serious things that they can and should take action on. This is often threaded through assumptions and guidelines in metrics and alerting systems; for example, Prometheus implicitly follows this in its guide to alerting, and the philosophy document it links to assumes that alerts will page people and so should be minimized.
Our alerts in our Prometheus setup don't follow this. I've already written up our reboot notifications, which are implemented as special Prometheus alerts that explicitly call themselves 'notifications' and are handled specially, but the practice extends beyond them. We generate 'alerts' for things that we merely want to keep track of; one example is our automated reboots of hung Dell C6220 blades. These alert as if a machine went down and then came back up (because it did), and the alerts are exactly the same as we would get for any other machine that did so.
(This is also part of why we have set Alertmanager to send us email about resolved alerts as well. Paging people to tell them something is now over would probably not be well received.)
The reason we have this freedom is not that we've done clever design in our Prometheus setup to avoid paging people for such notifications. Instead, it's because no one is on call here and so these notification alerts are not disturbing anyone when they trigger outside of working hours (even inside of working hours, they're just another email message). If we started having people on call, we would have to change this so that only genuine alerts paged people.
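As a rough sketch of the change that having people on call would force, here is the delivery decision written as a small Python function. None of this reflects our actual Alertmanager configuration; the `kind` field and the channel names are hypothetical labels invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alert:
    name: str
    kind: str               # hypothetical label: "alert" or "notification"
    resolved: bool = False  # True for an "it's over now" message


def delivery_channels(alert: Alert, someone_on_call: bool) -> List[str]:
    """Decide where an alert goes under the assumptions described above."""
    # Everything, including resolved alerts, is always emailed.
    channels = ["email"]
    # Only genuine, still-firing alerts would ever page an on-call person.
    if someone_on_call and alert.kind == "alert" and not alert.resolved:
        channels.append("page")
    return channels


if __name__ == "__main__":
    reboot = Alert("blade-rebooted", kind="notification")
    outage = Alert("fileserver-down", kind="alert")
    for on_call in (False, True):
        print(on_call, delivery_channels(reboot, on_call),
              delivery_channels(outage, on_call))
```

With no one on call, the `kind` distinction has no effect on delivery at all, which is exactly the freedom we currently have.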
(This doesn't mean that limiting what we alert on is unimportant. Even for alerts that are just notifying us about things, we have to both genuinely care about the thing and find the notification useful. Generally our notifications are kept down so that they fire only rarely unless we're having real problems, such as a lot of Dell C6220 blades crashing all the time for some reason. And we might filter those out if they started becoming ubiquitous, or perhaps take the blades out of service on the grounds that they're now too unreliable.)
This blurring of alerts and notifications is not without its hazards, most obviously if we become acclimatized to notifications and treat a real problem (an alert) as merely a less important notification that can be left to sit for a bit. But it's also been important for what alerts we actually create; our freedom to 'alert' on things that are perhaps not an immediate crisis allows us to watch more things and to be less cautious and conservative about the levels of things to alert on (and accept some false positives in the name of, say, alerting early about things that are real problems).
(I mentioned our situation with alerts not paging us back in why we generate alert notifications about rebooted machines, but I didn't think or talk about the impact it's had on what we alert about. It's sort of a 'fish in water' thing; I didn't think about how it affected what we alert on until recently.)