In Python, using the
logging package is part of your API, or should be
We have a Python program for logging email attachment type information. As part of doing this, it wants to peer inside various sorts of archive types to see what's inside of them, because malware puts bad stuff there. One of the Python modules we use for this is the Ubuntu packaged version of libarchive-c, which is a Python API for libarchive. Our program prints out information in a very specific output format, which our Exim configuration then reads and makes use of.
Very recently, I was looking at our logs for an email message and noticed that it had a very unusual status report. Normal status reports look like this:
1kX88D-0004Mb-PR attachment application/zip; MIME file ext: .zip; zip exts: .iso
This message's status report was:
Pathname cannot be converted from UTF-16BE to current locale.
That's not a message that our program emits. It's instead a warning
message from the C libarchive library. However, it is not printed
out directly by the C code; instead this report is passed up as an
additional warning attached to the results of library calls. It is
libarchive-c that is deciding to print it out, in a general
FFI support function.
More specifically, libarchive-c is deciding to 'log' it through the
package; the default logging environment then prints it out to
(Our program does not otherwise use
logging, and I had no
idea it was in use until I tried to track this down.)
A program's output is often part of its API in practice. When code
does things that in default conditions produces output, this alters
the API of the program it is in. This should not be done casually.
If warning information should be exposed, then it should be surfaced
through an actual API (an accessible one), not thrown out randomly.
If your code does use
logging, this should be part of its
documented API, not stuffed away in a corner as an implementation
detail, because people will quite reasonably want to know this
(so they can configure
logging in general) and may want to
turn it off.
In a related issue, notice that libarchive-c constructs the
logger it will use at
import time (here),
before your Python code normally will have had a chance to configure
logging, and will even use it at import time under some circumstances
as it is dynamically building some bindings. I suspect that it is
far from alone as far as constructing and even using its logger at
import time goes.
(It's natural to configure logging as part of program startup, in
main() function or something descending from it, not at program
load time before you start doing
imports. This is especially the
case since how you do logging in a program may depend on command
line arguments or other configuration information.)
(This is the background for this tweet of mine.)