Yet Another CLI Trick: simple awk reports
Songbird - Photo by Rebecca Wilson 2024


If you read more than one or two of my posts, you might get the vibe that I'm a bit of an awk fanboy. You're not wrong. Once a sysadmin picks up one or two aspects of awk, she becomes unstoppable. So many possibilities with just a couple of simple tricks! Let's get into the details! (Requisite link to the online man page. Can't say I'm not consistent!)


substr

If you have experience with C, C++, or any of the scripting languages that lean on C as inspiration for syntax, you will be familiar with the perennial indexing question: start at 0 or start at 1?

Spoiler alert: awk starts at 1.

The substring function substr has two required arguments: the original string and the offset. If the third argument is specified, it defines the maximum length of the return value. If the third argument is not specified, the return value is the remainder of the string starting at the offset, which works out to (length of original - offset + 1) characters.

For example: substr("Yet Another CLI Trick", 5, 7) will return "Another".
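Here's a quick sanity check you can paste into a shell to confirm the 1-based indexing; the string is just the title of this post:

echo "Yet Another CLI Trick" | awk '{ print substr($0, 5, 7) }'
# prints: Another
echo "Yet Another CLI Trick" | awk '{ print substr($0, 13) }'
# prints: CLI Trick   (no third argument, so everything from offset 13 onward)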


split

This is almost a mini-awk within awk. The entire premise of awk is splitting records into fields using some designated separator (typically whitespace, but sometimes a comma or semicolon). The split function does the same thing, but on a designated string. What I've always found odd about this function is that the resulting array comes back through one of the arguments rather than the return value - the return value itself is just the number of pieces.

For example: split("my-log-file.txt",arr,"-") writes the following entries into arr:

  • arr[1] = "my"
  • arr[2] = "log"
  • arr[3] = "file.txt"
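To see that in action, here's the same call with the result printed back out (the number of pieces returned by split is captured as n):

echo "my-log-file.txt" | awk '{
  n = split($0, arr, "-")       # n is the number of pieces, 3 in this case
  for (i = 1; i <= n; i++) {
    print i, arr[i]
  }
}'
# 1 my
# 2 log
# 3 file.txt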


The accumulator pattern

Time and again, I come back to this one aspect of awk. It is shaped a bit differently in perl, but the same feature also exists there. I think of it as the labeled accumulator.

I'm assuming you have a passing understanding of how awk scripts are put together, so I won't review the basics just now.

I want to find lines that match a pattern. If that pattern exists, I want to sum up the quantity in the first column, grouped by the second column as a label.

/pattern to match/ {
  accum[$2] += $1
}
END {
  for (e in accum) {
    print e, accum[e]
  }
}
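One detail worth noting: awk wants the opening brace on the same line as its pattern (or the END keyword), which is how the skeleton above is laid out. Here's a minimal sanity check of the same pattern, fed some made-up two-column data (the fruit lines are purely illustrative):

printf '3 apples\n2 oranges\n4 apples\n' | awk '
  /apples|oranges/ { accum[$2] += $1 }
  END { for (e in accum) print e, accum[e] }'
# apples 7
# oranges 2   (the order of "for (e in accum)" is not guaranteed)

If you'd rather keep the script in its own file - say report.awk, a name I'm making up here - the invocation becomes awk -f report.awk yourdata.txt.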

For example, take this listing of a directory:

[log]$ ls -1s *log*
  648 Nudge.log
    0 alf.log
   16 fsck_apfs.log
    8 fsck_apfs_error.log
    8 fsck_hfs.log
33368 install.log
  392 jamf.log
    8 shutdown_monitor.log
   16 system.log
    8 system.log.0.gz
    8 system.log.1.gz
    8 system.log.2.gz
    8 system.log.3.gz
    8 system.log.4.gz
    8 system.log.5.gz
 1800 wifi.log
  232 wifi.log.0.bz2
  232 wifi.log.1.bz2
  248 wifi.log.2.bz2
  232 wifi.log.3.bz2
  224 wifi.log.4.bz2
  184 wifi.log.5.bz2        

What happens if we pipe this output into awk? With default settings, awk splits each line into two fields. The first field is the "number of blocks used in the file system by each file." The second field is the name of the file.
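If you want to see where the pieces land before doing anything clever with them, echo the fields back out first ($1 is the block count, $NF is the last field, i.e. the filename):

ls -s *log* | awk '{ print "blocks=" $1, "name=" $NF }'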

For some crazy reason, I've decided that the first portion of the log file name has significance, and I will be using that as my label for the accumulator. Here's my approach:

ls -s *log* | awk '
  {
    split($NF, arr, "[^A-Za-z]")
    sum[arr[1]] += $1
  }
  END {
    for (s in sum) {
      print sum[s], s
    }
  }' | sort -rn

A quick refresher: regular expressions allow for the definition of a character class. If I want to define all characters in the alphabet, upper and lower case, I would write that as [A-Za-z]. By preceding the character class within the square brackets with a caret ^, I've negated that class - anything BUT a letter in the alphabet.

By using this ad hoc character class as the separator for my split function, I cleanly peel the leading run of alphabetical characters off of each filename.
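You can test that split in isolation against one of the trickier names from the listing above:

echo "wifi.log.3.bz2" | awk '{ split($0, arr, "[^A-Za-z]"); print arr[1] }'
# prints: wifi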

Here's the output:

33368 install
3152 wifi
648 Nudge
392 jamf
64 system
32 fsck
8 shutdown
0 alf        

This is roughly equivalent to the following SQL query, if you imagine a dir_ls table whose filename column already holds the extracted prefix rather than the full name:

select sum(blocks) as s, filename
from dir_ls
where filename like "%log%"
group by filename 
order by s desc        

Room to grow

This brief tutorial is by no means an exhaustive overview of awk, but it should give you a taste of what's possible. Dive in, play around, and figure out the useful report that's waiting for YOU to craft it out of the log files!

Happy hunting!
