Logging fatal exceptions in my Python programs is not enough
We have a few Python programs which run automatically, need to produce very rigid output (or lack of output) to standard output and even standard error, and are complex enough (and use enough outside code) that they may reasonably run into unhandled exceptions. One example is our program to report on email attachment type information under Exim; this runs a lot of code on untrusted input, and our Exim configuration expects its output to have a pretty rigid format (cf). Allowing Python to dump out the normal unhandled exception to standard error is not what we wanted. So for years that program has had a chunk of top level code to catch and syslog otherwise unhandled exceptions. I wrote it, deployed it, and considered it all good.
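As an illustration only, top level code of this sort might look something like the following minimal sketch. This is not the actual code from the program; the names `run_logged` and `format_exception_lines` are made up for this example, and it assumes a Unix system where Python's standard `syslog` module is available.

```python
import syslog
import traceback


def format_exception_lines(exc):
    """Return the traceback for exc as a list of non-empty single lines."""
    text = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    return [line for line in text.splitlines() if line.strip()]


def run_logged(main, log=None):
    """Run main(); log an otherwise unhandled exception instead of printing it.

    Returns main()'s result, or None if an exception was caught.
    The 'log' parameter is a hypothetical hook so the logger can be swapped out.
    """
    if log is None:
        # Default: one syslog message per traceback line, at error priority.
        log = lambda line: syslog.syslog(syslog.LOG_ERR, line)
    try:
        return main()
    except Exception as exc:
        # Nothing goes to standard output or standard error here.
        for line in format_exception_lines(exc):
            log(line)
        return None
```

Catching `Exception` rather than `BaseException` means that `SystemExit` and `KeyboardInterrupt` still pass through untouched, which is probably what you want in a program that calls `sys.exit()` deliberately.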
The other day I discovered that this program had been periodically experiencing, catching, and dutifully syslogging an exception about an internal error (caused by a package we use), going back months. In fact, more than one error about more than one thing. I hadn't known, because I don't normally go look through the logs for these exception traces. Why would I? They aren't supposed to happen and they mostly don't happen, and humans are very bad at consistently looking for things that don't happen.
Django has a very nice feature where it will email error reports to you, which has periodically been handy here. I'm not sure I trust myself to write that much code that absolutely must run, but I certainly could make my exception logging code also run an external script with very minimal arguments and that script could email me to notify me. Since the exception is being logged, I don't need a copy in email; I just need to know that I should go look at the logs.
(Django emails the whole exception along with a bunch of additional information, but I believe the email is the only place that information is captured. There are various tradeoffs here, but my starting point is that I'm already logging the exception.)
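The 'run an external script' idea could be sketched roughly as follows. Everything specific here is an assumption for illustration: the function name `notify_of_exception`, the script path `/usr/local/sbin/notify-exception`, and the 30 second timeout are all hypothetical. The one real goal is that the notification step can never raise an exception of its own or write anything to standard output or standard error.

```python
import subprocess


def notify_of_exception(program_name,
                        notify_cmd="/usr/local/sbin/notify-exception"):
    """Run an external script that tells a human to go look at the logs.

    The script path is hypothetical. Only the program name is passed,
    since the exception itself has already been logged.
    Returns True if the script ran and exited with status 0, else False.
    """
    try:
        result = subprocess.run(
            [notify_cmd, program_name],
            # Keep the child from reading or writing our standard streams.
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
        return result.returncode == 0
    except (OSError, subprocess.SubprocessError):
        # Missing script, permission problem, or timeout: fail quietly.
        return False
```

The exception handler would call this after it has finished logging, so that a failure to notify can never cost us the logged traceback.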
I could likely benefit from going through PyPI to see how other people have solved this particular problem, and maybe even use their code rather than write my own. I've traditionally avoided outside packages, but we're already using a bunch of them in this program as it is and I should probably get over that hangup in general.
(It helps that I'm slowly acquiring a better understanding of using pip in practice.)