A trick for dealing with irregular multi-word lines in shell scripts
March 1, 2012
Suppose that you have a bunch of lines in what I've sort of described as a 'key=value' format, that look like this:
Also, let's suppose that the fields and their ordering aren't constant; for example, some lines omit key2 and its value. If it weren't for this inconsistency, there are lots of Unix tools that you could use; with it, I can't think of a Unix program that naturally deals with this format (one where you can say 'give me key1 and key7' in the same easy way you can get field 1 and field 7 in awk).
Fortunately, Unix gives us some brute force tricks.
Selecting lines based on field contents is pretty easy:
(The space before the key name may not be necessary depending on what key names your file uses.)
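As a rough sketch of this sort of selection, here is one way it could look. The sample lines, the key names, and the function names are all invented for illustration; they are assumptions, not taken from any real file.

```shell
# Hypothetical sample data; hosts, keys, and values are made up.
sample_lines() {
cat <<'EOF'
host=alpha key1=red key2=10 status=ok
host=beta key1=blue status=fail
host=gamma key2=30 key1=red status=ok
EOF
}

# Select the lines where key1 is exactly 'red'.
# The leading space stops us from matching some other key that merely
# ends in 'key1'; the '( |$)' stops us from matching 'key1=reddish'.
select_red() {
    sample_lines | grep -E ' key1=red( |$)'
}

select_red
```

Note that the leading space means this would miss a key1 that is the very first thing on a line, which is fine for this made-up data but may not be for yours.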
I don't have any clever tricks if you want to aggregate or otherwise process several fields, but if you just want to pull out and analyze one field there is a brute force trick that you can often use. Let me show you a full command example:
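A minimal sketch of what such a command could look like, assuming made-up sample data and a made-up key name (key1). It turns every space into a newline so that each 'key=value' word lands on its own line, keeps only the words for the key we care about, and then counts how often each value occurs.

```shell
# Hypothetical sample data; hosts, keys, and values are made up.
sample_lines() {
cat <<'EOF'
host=alpha key1=red key2=10 status=ok
host=beta key1=blue status=fail
host=gamma key2=30 key1=red status=ok
EOF
}

# Put every word on its own line, keep only the key1 words,
# and count how many times each distinct value shows up.
count_key1() {
    sample_lines | tr ' ' '\n' | grep '^key1=' | sort | uniq -c | sort -rn
}

count_key1
```

Once every word is on its own line, the irregular ordering of fields stops mattering; the grep can anchor on the start of the line and everything after it is ordinary line-oriented processing.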
The important trick is the
Of course, you don't necessarily need the lines to be in 'key=value' format. A variant of this 'split words into separate lines' trick can be applied to any file format where you can somehow match the individual 'words' that you want to further process. And you don't have to split on spaces; any distinguishing character will do.
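For instance, here is a sketch of the same idea applied to lines that use ';' as the separator instead of a space. The data, the 'color' key, and the function names are again hypothetical.

```shell
# Hypothetical semicolon-separated records; all names and values are made up.
semi_lines() {
cat <<'EOF'
alpha;size=10;color=red
beta;color=blue
gamma;color=red;size=30
EOF
}

# Split on ';' instead of on spaces, keep the color words,
# strip the 'color=' part with cut, and count each value.
count_colors() {
    semi_lines | tr ';' '\n' | grep '^color=' | cut -d= -f2 | sort | uniq -c | sort -rn
}

count_colors
```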
(If the field separator is several characters you can split things
I call this brute force because we're not doing anything particularly clever to extract just the words we care about from inside each line. Instead we're slicing up everything and then throwing most of the pieces away.