I really love awk.
You might disagree and call me crazy, but while
awk might be a royal brainfuck at first, here’s a very simple example of its power which should explain my endorsement.
Figuring out space hogs
Every once in a while I run out of disk space on /home. Even though I am the only user on this laptop, I’m always puzzled as to why, and I start running du to figure out which install or program stole my disk space.
Here’s an example of how I start off:
du -h --max-depth 1
If I run the above line in my $HOME directory, I get a pretty list of lies, and thanks to -h this list includes more or less useful SI units, e.g. M for megabytes and G for gigabytes.
However, since I have a gazillion folders in my
$HOME directory, the list is too long to figure out the biggest offenders, so naturally, I pipe my
du command to
sort -n. This doesn’t work for the following reason:
The order of the files is a little screwed up. As you can see, .config ate 3.3 GB and is listed before ubuntu, which is only 3.3 MB in size. The reason is that sort -n (-n is numeric sort) doesn’t take the unit into account. It compares only the leading number, and since both entries start with 3.3, the tie is broken by a plain string comparison. All of a sudden it makes sense why 3.3G is listed before 3.3M.
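To see the effect in isolation, here is a small sketch that feeds three lines in du -h style to sort -n. The sizes for .config and ubuntu are the ones mentioned above; the ./tmp entry and the helper name du_sample are made up for illustration.

```shell
# Hypothetical stand-in for `du -h` output (size, tab, path).
du_sample() {
    printf '3.3G\t./.config\n3.3M\t./ubuntu\n12K\t./tmp\n'
}

# sort -n reads only the leading number (3.3, 3.3, 12) and ignores the unit,
# so the 12K folder ends up last, after the 3.3G one.
du_sample | LC_ALL=C sort -n
```

The two 3.3 entries compare as equal numbers, so sort falls back to comparing the lines as strings, and G comes before M.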
Here is what I tried to fix it:
du --max-depth 1 | sort -n
The above command omits the human-readable SI units (-h), and the list is sorted. Yay. Case closed?
AWK to the rescue
In the end, I’m still human, and therefore I want to see those SI units to make sense of the output, and I want to see them in the correct order:
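A pipeline along the following lines produces that kind of list. Treat it as a sketch based on the explanation that follows, not as a definitive command: the helper name kb_to_mb, the MB label and the tab separator are assumptions, and it relies on du (without -h) reporting sizes in kilobytes, which is the GNU default.

```shell
# Assumed helper: reads "size-in-KB<TAB>path" lines, as printed by du without -h.
kb_to_mb() {
    # sort -n orders by the raw kilobyte count; awk then converts $1 to
    # megabytes, rounds to two decimals and prints the folder name from $2.
    sort -n | awk '{ printf "%.2f MB\t%s\n", $1 / 1024, $2 }'
}

# Example with two made-up lines instead of real du output:
printf '1048576\t./big\n2048\t./small\n' | kb_to_mb
```

On a real system this would be invoked as du --max-depth 1 | kb_to_mb. Folder names containing spaces would be cut off at the first space, because $2 only holds the second whitespace-separated field.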
Let me explain the awk command:
- Whenever you pipe output to awk, it breaks the line into multiple variables. This is incredibly useful as you can avoid grep’ing and parsing the hell out of simple strings.
- $0 is the entire line, then $1, $2, etc. hold the individual pieces, because awk magically divides the string by whitespace. As an example, with “Hello World” piped to awk, $0 equals “Hello World”, $1 equals “Hello” and $2 equals “World”.
- The command takes $1 (which contains the size in raw kilobytes) and divides it by 1024 to get megabytes. No rocket science!
- printf outputs the string, and while outputting we round the number (to two decimals: %.2f) and display the name of the folder, which is still in $2.
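The field splitting described above is easy to try out. A minimal sketch using the same “Hello World” input; the function name show_fields is made up for this example:

```shell
# Print the whole line ($0) and the first two whitespace-separated fields.
show_fields() {
    awk '{ print "$0 = " $0; print "$1 = " $1; print "$2 = " $2 }'
}

echo "Hello World" | show_fields
```

This prints three lines, one for the entire input and one for each of the two words.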
All of the above is not just simple, but it should look somewhat familiar when you have a development background. Even shell allows you to divide a number and offers a
printf function for formatting purposes.
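For comparison, here is a rough sketch of the same calculation in plain shell. Shell arithmetic is integer-only, so this version scales the value by 100 before dividing to keep two decimals, and it truncates rather than rounds. The function name kb_to_mb_sh is an assumption for illustration.

```shell
kb_to_mb_sh() {
    kb=$1
    # $(( )) truncates, so compute hundredths of a megabyte first.
    hundredths=$(( kb * 100 / 1024 ))
    # Split the hundredths back into whole megabytes and two decimals.
    printf '%d.%02d MB\n' $(( hundredths / 100 )) $(( hundredths % 100 ))
}

kb_to_mb_sh 2560   # prints "2.50 MB"
```

It works, but the awk version gets floating-point division and rounding without any of this bookkeeping.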
awk is a little less confusing now. For further reading, I recommend the GNU AWK User Guide. (Or maybe just keep it open next time you think you can put
awk to good use.)