## The fun of `awk`

May 30, 2008

I like `awk`, I really do, but sometimes it really irritates me. Take, for example, this fun little awk program:

`awk 'BEGIN {print "5" == "05"}' /dev/null`

You might rationally expect this to print '1' (awk's boolean truth value). As I found out once, you would be sadly mistaken; it prints '0', because both operands are string constants and `awk` compares those as strings instead of as numbers. Too bad if you're reading one set of fields that are zero-padded and one set that aren't.

(The workaround is to add 0 to the "05" to force the numeric interpretation; `"5" == ("05"+0)` comes out true.)
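Here is a minimal sketch of the difference that you can paste into a shell. The variable names (`str_cmp`, `one_side`, `both_sides`) are made up for illustration, and the results in the comments are what I would expect from a POSIX-style awk.

```shell
# Two string constants: compared character by character, so "5" != "05".
str_cmp=$(awk 'BEGIN { print ("5" == "05") }')

# Adding 0 turns "05" into the number 5, and the comparison comes out true.
one_side=$(awk 'BEGIN { print ("5" == ("05" + 0)) }')

# Adding 0 to both sides leaves no doubt that two numbers are being compared.
both_sides=$(awk 'BEGIN { print (("5" + 0) == ("05" + 0)) }')

echo "strings: $str_cmp  one side +0: $one_side  both sides +0: $both_sides"
```

Adding 0 on both sides is the belt-and-suspenders version for when you don't want to think about which operand awk considers a string.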

This shows two drawbacks of the sort of magical conversion between numbers and strings that `awk` does. First, this sort of stuff involves heuristics, and heuristics are inevitably wrong sooner or later. And second, if you do not have the fine details carefully memorized you can wind up surprised.

At the same time, such magical conversions live on because they are oh so very handy when you are banging things out in a hurry. Considering the sorts of things that `awk` was designed for, this is completely the right decision for it; having to write explicit Python-style conversions all the time would probably drive me up the wall, however much I like them in Python.

From 71.65.56.124 at 2008-05-31 11:01:34:

You can also take away the quotes if you know your input will be mathematical.

By cks at 2008-05-31 12:06:38:

In this case the real version was more like '`if ($1 == $3) ...`'; the actual values I was comparing were input fields instead of one being a constant.

From 71.65.56.124 at 2008-06-01 16:50:31:

Interesting

At my shell, I get this:

```
Matt-Simmons-Computer:~ mattsimmons$ echo "1 2 3" | awk '{print ($1 == $3)}'
0
Matt-Simmons-Computer:~ mattsimmons$ echo "3 2 3" | awk '{print ($1 == $3)}'
1
Matt-Simmons-Computer:~ mattsimmons$ echo "3 2 03" | awk '{print ($1 == $3)}'
1
Matt-Simmons-Computer:~ mattsimmons$ echo "3 2 03" | awk '{print ($1 == $3)}'
```

Is this consistent with what you see on yours?

By cks at 2008-06-02 00:12:05:

I have managed to find the script with the specific issue. I was comparing an explicitly set awk variable with an input field, roughly:

```
awk 'BEGIN { day = "'$d'" }
/^From / {if (day == $5) [...]
```

If the field value was 0-padded, this comparison is false. In retrospect, I could reasonably count on the 'day' value being a properly formed number and avoid making it into a string (if it's not a properly formed number, the rest of the awk will blow up anyways), which would avoid the whole issue.
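A rough sketch of the two variants, assuming a made-up `From ` line and a made-up value for `d` (neither comes from the real script), with the `if` body replaced by a plain `print` of the comparison result:

```shell
d=2
# Hypothetical mbox-style line; the fifth field is the zero-padded day.
line='From cks@example.com Mon Jun 02 00:12:05 2008'

# day is set from a quoted string constant, so day == $5 is a string comparison.
as_string=$(printf '%s\n' "$line" |
    awk 'BEGIN { day = "'"$d"'" } /^From / { print (day == $5) }')

# day is set from a bare number, so it is compared numerically with the field.
as_number=$(printf '%s\n' "$line" |
    awk 'BEGIN { day = '"$d"' } /^From / { print (day == $5) }')

echo "string day: $as_string  numeric day: $as_number"
```

The first should print 0 and the second 1. As noted, the second form only works if `$d` really is a properly formed number; anything else is a syntax error or worse.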

By cks at 2008-06-02 00:13:50:

PS: I should mention that my versions of awk behave consistently with Matt's for comparing input fields against each other.

Written on 30 May 2008.