Wandering Thoughts archives

2014-05-09

Some uses for Python's 'named' form of string formatting

I expect that every Python programmer is familiar with Python's normal way of formatting strings with % and 'printf' style format specifications. Let's call this normal way of formatting things a 'positional' way, because it's based on the position of the arguments given to be formatted. But as experienced Python programmers know, this is not the only way you can set up your formatting strings; you can also set them up so that they pick out what to format where based on name instead of argument position. Of course to do this you need to somehow attach names to the arguments, which is done by giving % a dictionary instead of its usual tuple.

Here's what this looks like, for people who haven't seen it before:

print "%(fred)d %(barney)d" % {'fred': 1, 'bob': 2, 'barney': 3}

Note that not all keys in the dictionary need to be used in the format string, unlike with positional arguments.
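As a small sketch of that difference, using the same dictionary as above (the variable names are made up for illustration, and the code is written to run under both Python 2 and Python 3):

```python
values = {'fred': 1, 'bob': 2, 'barney': 3}

# Named form: 'bob' is in the dictionary but is simply never looked up.
named = "%(fred)d %(barney)d" % values

# Positional form: the tuple has to match the format exactly, so an
# extra argument is an error instead of being ignored.
try:
    "%d %d" % (1, 2, 3)
    positional_error = None
except TypeError as e:
    positional_error = e

# The reverse case is still an error in the named form: asking for a
# name that the dictionary doesn't have raises KeyError.
try:
    "%(wilma)d" % values
    missing_error = None
except KeyError as e:
    missing_error = e
```

So extra dictionary keys are harmless, but a format string that names something you didn't supply still fails loudly.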

There are two general uses for named string format specifications, both of which usually start in a situation where the format specification itself is variable. The simple and straightforward use is rearranging the order of what gets printed, which can really come in handy for things like translating messages into different languages (this is apparently a sufficiently common need that it got its own feature in Python 3's new string formatting stuff). The more complex use is to print only a subset of information from a larger collection of available information. Effectively this makes '%' string formatting into a little templating system.
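Both uses can be sketched with one dictionary of available information and several interchangeable format strings. Everything here (the keys, the values, and the format names) is hypothetical and only there to show the shape of the idea:

```python
# All of the information we could print, under hypothetical names.
info = {'user': 'cks', 'host': 'apps0', 'count': 3}

# The first two formats show the same fields in a different order, as
# a translated message might need. The third uses only a subset.
formats = {
    'subject-first': "%(user)s has %(count)d jobs on %(host)s",
    'host-first': "On %(host)s, %(count)d jobs belong to %(user)s",
    'brief': "%(user)s: %(count)d",
}

def render(style):
    """Format the same information with whichever format was picked."""
    return formats[style] % info
```

The code that calls render() never changes; only the choice of format string does, and that choice could just as well come from a configuration file.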

My uses of this have tended to be towards full blown templating where the person configuring my program is trusted to write the formatting strings (note that this can at least throw exceptions if they get it wrong). I can see uses for this in simpler setups, for example to log a number of different messages with somewhat different information depending on some combination of things. Rather than write full blown and repetitive code to explicitly emit N variations of the same logging call, you could just select different name-based formatting strings based on the specific circumstances.
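On the point about exceptions, one way to catch a bad format string early is to try it against sample data when the configuration is loaded. This is only a sketch; check_format and the sample dictionary are hypothetical, not something from an actual program:

```python
def check_format(fmt, sample):
    """Return None if fmt formats cleanly against sample, else a
    short description of what went wrong."""
    try:
        fmt % sample
    except (KeyError, TypeError, ValueError) as e:
        # KeyError: a name that isn't in the dictionary.
        # TypeError: a conversion that doesn't fit the value's type.
        # ValueError: a malformed format specification.
        return "%s: %s" % (type(e).__name__, e)
    return None

# Stand-in values with the same keys and types as the real data.
sample = {'user': 'cks', 'count': 3}
```

This only proves that the format works for the sample's keys and types, so the sample has to look like the real data to be worth much.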

(I'll have to remember to experiment with this idea the next time I have this need. It feels like this might be an interesting new approach to deal with the whole issue of verbosity and including or not including certain bits of information and so on, which can otherwise clutter up the code something awful and be annoying to program.)
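A minimal sketch of what this might look like for verbosity, assuming a made-up event dictionary and made-up verbosity levels (none of this is from real code):

```python
# One format string per verbosity level. Higher levels pull more
# fields out of the same event dictionary.
LOG_FORMATS = {
    0: "%(action)s %(path)s",
    1: "%(action)s %(path)s (%(size)d bytes)",
    2: "%(action)s %(path)s (%(size)d bytes, uid %(uid)d, took %(elapsed).2fs)",
}

def format_event(event, verbosity):
    """Pick a format by verbosity, clamped to the levels we define."""
    level = max(0, min(verbosity, max(LOG_FORMATS)))
    return LOG_FORMATS[level] % event

event = {'action': 'copied', 'path': '/tmp/x', 'size': 1024,
         'uid': 1000, 'elapsed': 0.5}
```

The code always builds the full event dictionary and makes a single formatting call; which bits of information actually show up is decided entirely by the table of format strings.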

PS: Python 3's string formatting does this differently. Following my current policy on Python 3 I'm not thinking about it at all.

python/NamedFormattingUses written at 23:34:26

Operating systems cannot be hermetically sealed environments

There's an idea that you can find rattling around operating systems; the simplest way to describe it is that operating systems and their OS-supplied components should be seen as essentially a black box that's there to provide you certain basic services. In a Unix environment, this would be very little beyond a standard library, standard shell script pieces, and a few similar things. The operating system may have other components but they are for its internal use, not for the use of your programs and systems. In OmniOS this idea is known as 'keep your stuff to yourself' but it's by no means exclusive to OmniOS, partly because it's attractive to many people who want to build a minimal OS.

The problem with this is that like it or not, operating systems are not hermetically sealed environments with minimal and standardized interfaces (libc, basic shell utilities, etc). I don't mean this in the sense that people using an OS will inevitably find it convenient to use the OS's versions of things even though they're not supposed to (which they totally will, by the way). I mean this in the sense that such a minimal interface is too small to be practical.

We saw one point of friction with the mailer dependency issue. MTAs are generally one to a system so the interface to the MTA implicitly becomes an API that the OS both exposes and uses itself. Another example is how you hook yourself into whatever fault monitoring and management system the OS has. How the OS reports faults (and what faults it reports) forms at least an implicit API because you need this information to sanely manage your systems.

('We syslog kernel messages' or 'we write messages to a file' is still an implicit API.)

This is what I mean by the OS not being a hermetically sealed environment in practice. You cannot give people a simple black box OS and have it be useful. All of those implementation details of logging and fault management and mail and so on will inevitably leak outside of the box whether you officially document them or not, because this is what's needed to run real systems.

(I think that we often don't notice this because we take them as 'part of Unix, more or less', even though they aren't standardized across Unixes.)

Sidebar: one diagnostic test of 'is something purely internal'

My test is 'could the basic OS remove this entirely without people exploding'. For things like Perl and Python (when you've been told to not use the OS's versions of them) the answer is theoretically yes. Now imagine a Unix OS that did not log anything at all via syslog (or just at all). Would you accept that or would you immediately rule it out?

(Yes, there are some environments where this wouldn't be a disqualification. I don't think there are very many.)

sysadmin/OSesAreNotClosed written at 01:59:07
