An irritating awk limitation: getting a range of fields
Writing things in awk has a number of little irritations that come up
every so often. One of them is that it has no built-in way to retrieve
a range of input fields as a string, i.e. there is no equivalent of what
in Python one could write as something like 'r = " ".join(input[2:])'
(which turns everything from the third field onwards back into a single
string).
Of course, you can do this with an awk function. But it's irritating to
have to keep including that function in my awk programs (especially
when they are tiny programs that are written inline in a larger shell
script), and it points out a deeper weakness in awk, which is that awk
has no really good way to manipulate how lines are split into fields.
Take the example from yesterday and consider the sed invocation, which
only exists because of this awk issue. What we really want to do is
split each line into two fields: the first word of the line, and then
everything else; then we will print the second field and ignore the
first one. However, you can't do this in awk (or at least not very
easily).
(To beat people to the obvious approach: yes, you can assign an empty
string to $1 and then use $0, but that puts a space at the front of
the new $0, which is sometimes important.)
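To make the leading-space problem concrete, here is a small sketch run from a shell. The sample input line is invented for illustration. Assigning to $1 makes awk rebuild $0 from the fields joined with the output field separator, so the now-empty first field still contributes a separator. The second command is only one possible workaround, not something from the original example: it strips the first word with sub() and assumes the default whitespace field splitting.

```shell
line='alpha beta  gamma'   # hypothetical input; note the two spaces before gamma

# Emptying $1 rebuilds $0 with OFS between fields: the result starts with
# a space, and the run of spaces in the original is collapsed to one.
blanked=$(printf '%s\n' "$line" | awk '{ $1 = ""; print }')

# Removing the first word and the whitespace after it with sub() leaves
# the remainder of the line exactly as it was.
stripped=$(printf '%s\n' "$line" | awk '{ sub(/^[ \t]*[^ \t]+[ \t]*/, ""); print }')

printf '[%s]\n[%s]\n' "$blanked" "$stripped"
```

The brackets in the final printf are only there to make the leading space in the first result visible.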
Sidebar: the necessary awk function
Here's the necessary awk function:
    function fieldstr(s, e, i, r) {
        # i and r are only used as local variables
        if (e > NF) e = NF
        r = $(s)
        for (i = s+1; i <= e; i++) r = r " " $(i)
        return r
    }
With no error checking, you can make this a sensible one-liner function
body without being too offensive. (In my version I put the 'return r'
on a second line because otherwise it looked too crammed in.)
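As a usage sketch, this is roughly how fieldstr() might be dropped into an inline awk program in a shell script for the 'first word versus everything else' case. The input line is made up, and the function is spread over several lines here purely for readability.

```shell
# Hypothetical input: three whitespace-separated fields.
rest=$(printf 'one two three\n' | awk '
function fieldstr(s, e,    i, r) {
    if (e > NF) e = NF
    r = $(s)
    for (i = s+1; i <= e; i++) r = r " " $(i)
    return r
}
# Everything from the second field to the last one, joined by single spaces.
{ print fieldstr(2, NF) }')

printf '%s\n' "$rest"
```

Because the fields are rejoined with a single space, any original runs of whitespace between them are not preserved. Whether that matters depends on the input.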