The fun of awk

May 30, 2008

I like awk, I really do, but sometimes it really irritates me. Take, for example, this fun little awk program:

awk 'BEGIN {print "5" == "05"}' /dev/null

You might rationally expect this to print '1' (awk's boolean truth value). As I found out once, you would be sadly mistaken; it prints '0' (false), presumably because awk winds up doing a string comparison instead of a numeric one. Too bad if you're reading one set of fields that are zero-padded and one set that aren't.

(The workaround is to add 0 to the "05" so that it is treated as a number; "5" == ("05"+0) comes out true.)
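For the record, here is a small sketch of the two cases side by side. The shell variable names are just mine for illustration, and I add 0 to both operands rather than one, on the assumption that making both sides numbers is the least ambiguous way to ask for a numeric comparison.

```shell
#!/bin/sh
# Both operands are string constants, so awk compares them as strings.
as_strings=$(awk 'BEGIN { print ("5" == "05") }' /dev/null)

# Adding 0 to each operand turns both sides into numbers,
# so the comparison is numeric.
as_numbers=$(awk 'BEGIN { print (("5" + 0) == ("05" + 0)) }' /dev/null)

echo "string constants: $as_strings"   # 0
echo "with +0 on both:  $as_numbers"   # 1
```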

This shows two drawbacks of the sort of magical conversion between numbers and strings that awk does. First, this sort of stuff involves heuristics, and heuristics are inevitably wrong sooner or later. And second, if you do not have the fine details carefully memorized you can wind up surprised.
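One way to stop depending on the heuristics is to say which comparison you want. The sketch below uses a made-up input line with a plain and a zero-padded field. What the bare comparison prints depends on whether your awk decides that number-looking input fields are numbers, so I make no promise about it here; the two forced versions do not depend on that decision.

```shell
#!/bin/sh
# Hypothetical input: one plain field and one zero-padded field.
line="5 05"

# Bare comparison: the result is whatever awk's conversion rules decide.
bare=$(printf '%s\n' "$line" | awk '{ print ($1 == $2) }')

# Concatenating "" makes both sides strings, forcing a string comparison.
forced_str=$(printf '%s\n' "$line" | awk '{ print (($1 "") == ($2 "")) }')

# Adding 0 makes both sides numbers, forcing a numeric comparison.
forced_num=$(printf '%s\n' "$line" | awk '{ print (($1 + 0) == ($2 + 0)) }')

echo "bare:           $bare"
echo "forced string:  $forced_str"   # 0
echo "forced numeric: $forced_num"   # 1
```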

At the same time such magical conversions live on because they are oh so very handy when you are banging things out in a hurry. Considering the sorts of things that awk was designed for, this is completely the right decision for it; having to write explicit Python-style conversions all the time would probably drive me up the wall, however much I like them in Python.

Written on 30 May 2008.

Last modified: Fri May 30 23:25:47 2008