Chris's Wiki :: blog/sysadmin/AlertsIncludeObvious Comments
https://utcc.utoronto.ca/~cks/space/blog/sysadmin/AlertsIncludeObvious?atomcomments
2020-08-11T15:57:24Z
Recent comments in Chris's Wiki :: blog/sysadmin/AlertsIncludeObvious.

By -dsr- (https://blog.randomstring.org) on /blog/sysadmin/AlertsIncludeObvious
<div class="wikitext"><p>The standard that my company likes is: "It's 3 AM, you've just been woken up by an alert. The message should be clear about what is happening, why it is serious, and what to do about it. The what-to-do can be a link to the relevant page in the wiki or instructions to call particular people."</p>
<p>When alerts are likely to have false positives, we add a requirement that the test fail twice in a row before paging us, rather than alerting immediately.</p>
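<p>The "fail twice in a row" rule could be sketched roughly as follows. This is only an illustration in Python, not the commenter's actual setup: the <code>ConsecutiveFailureGate</code> class, its <code>observe</code> method, and the <code>threshold</code> parameter are all hypothetical names, and a real monitoring system would normally provide its own equivalent setting.</p>

```python
from dataclasses import dataclass


@dataclass
class ConsecutiveFailureGate:
    """Hypothetical helper: page only after `threshold` failed checks in a row."""

    threshold: int = 2  # assumed default, matching "twice in a row"
    failures: int = 0   # current run of consecutive failures

    def observe(self, check_passed: bool) -> bool:
        """Record one check result; return True if this result should page."""
        if check_passed:
            # Any success resets the run, so isolated blips never page.
            self.failures = 0
            return False
        self.failures += 1
        # Page exactly once, on the check that reaches the threshold.
        return self.failures == self.threshold


gate = ConsecutiveFailureGate(threshold=2)
results = [False, True, False, False, False]  # False means the check failed
pages = [gate.observe(r) for r in results]
```

<p>In this sketch a single failure followed by a success never pages, and a sustained outage pages once rather than on every failed check.</p>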
</div>
2020-08-11T15:57:24Z

By Todd on /blog/sysadmin/AlertsIncludeObvious
<div class="wikitext"><p>On at least one occasion, I've left that kind of information out of an alert to intentionally slow down reporting of the situation, because it's often transient. Some of my teammates at the time were junior and might not have known to wait a minute.</p>
</div>2020-08-01T11:41:38Z