Context grep in IBM AIX - part 2

Two weeks ago I wrote a post about context grep in AIX - https://www.dhirubhai.net/pulse/context-grep-ibm-aix-andrey-klyachkin/ and promised to pack sed in a shell script. Now let's do it!

First, a side note: I use AIX 7.2 TL5 SP4 and KornShell 93. Why KornShell 93? Because I like it and it has some nice features for scripting. I use some of them in the script below.

The first thing I do when I create a new script is to create an empty file, write the 'shebang' line into it and give the file execute permissions. Something like:

[Image: creating the empty script file, writing the shebang line and setting execute permissions]
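
A minimal sketch of those three steps. The path is a hypothetical location chosen for illustration; /bin/ksh93 is the interpreter named in the shebang line of the full script at the end of this post.

```shell
# Hypothetical location for the new script (an assumption, not from the article).
script="${TMPDIR:-/tmp}/cgrep.sh"

# Create the file with nothing but the shebang line in it.
printf '%s\n' '#!/bin/ksh93' >"$script"

# Give the file execute permissions.
chmod +x "$script"

# Show the result.
ls -l "$script"
```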

The script does nothing yet, but I can open vi in one terminal window and edit it there while I test it in another terminal window.

Now the most important decision. I must understand what the script will do and how it will work from a user's perspective.

My script will accept several arguments:

-A <number> - number of lines to show after the match.

-B <number> - number of lines to show before the match.

-C <number> - number of lines to show both before and after the match.

<pattern> - the pattern we search for.

<file> - where we search for the pattern.

More formally, the call to the script can be written like this:

cgrep.sh ([-A<number>] [-B<number>] | -C<number>) pattern file        

Now we understand that we need four variables: the number of lines after the match, the number of lines before the match, the pattern and the file. I like to define important variables before using them, and if I know the type, I declare them with it.

[Image: typeset declarations of the variables after, before, pattern and file]

I know that the number of lines should be an integer, so I declare these variables as integers (typeset -i). The shell then evaluates whatever I assign to them as an arithmetic expression: if I try to store a plain word such as abc, it is read as the name of an unset variable and the variable ends up as 0 - exactly what I need.
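
The effect of typeset -i can be tried out in isolation. This is a small sketch; the helper name to_int is made up for the demonstration and is not part of the script.

```shell
# Hypothetical helper: store the argument in an integer variable and print the result.
to_int() {
  typeset -i n=0
  n=$1                  # the value is evaluated as an arithmetic expression
  printf '%d\n' "$n"
}

to_int 5                # prints 5
to_int not_a_number     # prints 0: the word is read as the name of an unset variable
```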

The next step is to check all the arguments. I will do it in a "classic" way - just iterate over all arguments:


while [[ $1 ]] ; do
  print "Argument $1"
  shift
done        

But we need a little bit more. We want to check the arguments and set our variables. That's why I use case:

while [[ $1 ]] ; do
  case $1 in
    -A*)
     ;;
    -B*)
     ;;
    -C*)
     ;;
    *)
     ;;
  esac
  shift
done        

I use a star after the argument (-A*) because our arguments may have digits directly after them, e.g. -A5. It is a wildcard: everything that starts with -A is caught by this case. The last star (*) catches everything that is not caught by the previous cases.
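
To see which branch a given argument reaches, the same patterns can be wrapped in a small test function. The name classify and the words it prints are invented for this sketch.

```shell
# Hypothetical helper that reports which branch of the case statement an argument reaches.
classify() {
  case $1 in
    -A*) printf 'after\n' ;;
    -B*) printf 'before\n' ;;
    -C*) printf 'both\n' ;;
    *)   printf 'other\n' ;;
  esac
}

classify -A5      # prints after
classify -B       # prints before
classify error    # prints other
```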

Let's make the next step and parse our arguments. For -A:

-A*)
  n=${1#-A}
  if [[ -z $n ]] ; then
    # n is empty, the number should be in the next argument
    shift
    after="${1}"
  else
    after="${n}"
  fi        

There are two ways a user can provide the arguments: either as -A5 (together) or as -A 5 (with a space between -A and the number). In the second line we strip -A from $1. If the resulting variable ($n) is empty, the user chose the second variant with the space. We need to shift our arguments once more and use what comes next. If the variable $n is NOT empty, it holds the number of lines we need to save in our variable.
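
Both spellings can be checked with a small stand-alone function built on the same ${1#-A} expansion. The name after_value is hypothetical and exists only for this sketch.

```shell
# Hypothetical helper: print the number given with -A, in either spelling.
after_value() {
  typeset n
  n=${1#-A}             # "-A5" becomes "5", a bare "-A" becomes ""
  if [[ -z $n ]] ; then
    shift               # the number is the next argument
    n=$1
  fi
  printf '%s\n' "$n"
}

after_value -A5         # prints 5
after_value -A 5        # prints 5
```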

Once we have a value, we have to check that it is a valid number:

if [[ $after -lt 1 ]] ; then
  print -u2 -- "Argument for -A is invalid"
  exit 1
fi
;;        

If the number of lines is less than 1, there was an error. Either the user gave us zero or a negative number - both are incorrect - or it was a string. Because our variable has the integer type (remember typeset -i?), it can't hold strings and the value will be zero in that case. Either way we print an error message to standard error (print -u2) and exit.
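
A short sketch of how the three kinds of bad input all end up below 1. The function valid_count is an invented helper that returns a status instead of exiting, so it can be tried interactively.

```shell
# Hypothetical helper: succeed only if the argument is a usable line count.
valid_count() {
  typeset -i count=0
  count=$1              # a plain word is evaluated as an unset variable, giving 0
  [[ $count -ge 1 ]]
}

for value in 3 0 -2 abc ; do
  if valid_count "$value" ; then
    printf '%s: ok\n' "$value"
  else
    printf '%s: invalid\n' "$value"
  fi
done
```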

Parsing of -B and -C is almost the same. The only difference is the variables we use. With -B we use $before, and with -C we use both $after and $before. Don't worry, the whole script is at the end.

Now we have our star case, when we catch everything that was not caught earlier. First of all we check that some argument like -A, -B or -C is already selected, because all other arguments like pattern and file name come after them:

*)
  if [[ $after -eq 0 ]] && [[ $before -eq 0 ]] ; then
    print -u2 -- "Unknown argument $1"
    exit 1
  fi        

Again, because $after and $before are declared as integers, they hold 0 until they get another value. We need to check whether at least one of them has already been set.

In our case the pattern always comes before the file name. If the variable $pattern is empty, we must fill it first:

if [[ -z $pattern ]] ; then
  # it should be pattern
  pattern="$1"
else        

If $pattern is already set then it should be file name. Let's check if we can read it:

# we have pattern already, it must be a file
file="$1"
# check if it exists
if [[ ! -r "$file" ]] ; then
  print -u2 -- "Can't read file $file. Check its existence or permissions"
  exit 1
fi
fi
;;        

We're almost done with argument checking! But we still need two more checks. It may happen that the user didn't specify a pattern or a file at all. We need to check for these cases after our while loop (after done):

if [[ -z "$pattern" ]] ; then
  print -u2 -- "Req'd parameter missing - pattern"
  exit 1
fi

if [[ -z "$file" ]] ; then
  print -u2 -- "Req'd parameter missing - file name"
  exit 1
fi        

Now we are finished with the first part of our script. It was a classic way of checking arguments. There is a "modern" way - by using getopts. OK, it is not very modern. It is almost as old as the classic way, but KornShell '93 has some very advanced capabilities in getopts. More about it - in one of the next articles.
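
For comparison, a rough sketch of the getopts route using only the plain POSIX form of getopts, with none of the KornShell 93 extensions. The function name parse_opts and its output format are invented for this illustration; this is not the version planned for the later article.

```shell
# Hypothetical sketch: parse -A, -B and -C with POSIX getopts and print
# "after before pattern file" so the result is easy to inspect.
parse_opts() {
  typeset opt
  typeset -i after=0 before=0
  OPTIND=1                          # start parsing from the first argument
  while getopts ':A:B:C:' opt ; do
    case $opt in
      A) after=$OPTARG ;;
      B) before=$OPTARG ;;
      C) after=$OPTARG ; before=$OPTARG ;;
      *) printf 'invalid option or missing number\n' >&2 ; return 1 ;;
    esac
  done
  shift $((OPTIND - 1))             # what is left: pattern and file
  printf '%s %s %s %s\n' "$after" "$before" "$1" "$2"
}

parse_opts -A2 -B 3 error /var/log/messages   # prints 2 3 error /var/log/messages
```

getopts accepts both -A2 and -A 2 on its own, so the prefix stripping and the extra shift are no longer needed.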

Today I continue with sed. sed has two modes of operation. I can write a sed script directly on the command line, and that is what I usually do. In the last article we did exactly that - wrote the whole sed script on the command line. On the other hand, sed can read its script from a file. That is what I do today: create a temporary file, write the sed script into it and run sed. Why? Because it is easier for me :-)

First create a temporary file and check that it was created:

: >/tmp/cgrep.$$

if [[ ! -f /tmp/cgrep.$$ ]] ; then
  print -u2 "Error creating a temporary file"
  exit 2
fi        

$$ is a special Shell variable containing our process ID. It should make our temporary files unique for every call of the script.

Now we open our temporary file for writing and start writing our sed script into it:

exec 5>/tmp/cgrep.$$        

Why does it make sense to open the file for writing once? Suppose you use the standard way of writing into a file, such as:

echo 'hello world' >myfile
echo 'hello again' >>myfile        

The shell will open the file every time you write something into it and close it afterwards. That is not a problem for a single line. If you write 10 000 lines, you pay a performance penalty: the file is opened and closed 10 000 times. My sed script will not have 10 000 lines, but I don't like writing >> every time. I use file descriptors instead.
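
The whole open, write, close cycle in one small sketch. The file name is a hypothetical demo file, and printf with >&5 is used here so the lines also run in shells without print; in ksh93, print -u5 does the same job.

```shell
# Hypothetical output file, used only for this demonstration.
out="${TMPDIR:-/tmp}/fd_demo.$$"

exec 5>"$out"                       # open the file once on descriptor 5

for i in 1 2 3 ; do
  printf 'line %d\n' "$i" >&5       # in ksh93 also possible: print -u5 "line $i"
done

exec 5>&-                           # close descriptor 5

written=$(cat "$out")               # read back what was written
printf '%s\n' "$written"
rm -f "$out"
```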

if [[ $before -ne 0 ]] ; then
  print -u5 "/${pattern}/ !{"
  print -u5 'H'
  print -u5 'x'        

The first part of the sed script is needed only if we have to show lines before the match. The beginning is standard, but then we have to extend the sed substitution with one more .*\n for every additional line of context:

print -u5 -rn 's/^.*\n\('
if [[ $before -ne 1 ]] ; then
  for i in {1..$((before - 1))}; do print -u5 -rn '.*\n' ; done
fi
print -u5 -rn '.*'
print -u5 -r '\)$/\1/'        

print -r means that print must print all the special characters as they are, without interpreting them. We have \n in our sed script and print would usually print a new line if it finds \n. We don't need it, we need \n as \n. That's why we use print -r.

We also want to make our search pattern in one line. That's why we use print -n - don't print a newline character at the end.

In the end the whole job is done by the for loop. It determines how many .*\n should be in the pattern. There can be many:

s/^.*\n\(.*\n.*\n.*\)$/\1/        

or even none (inside the parentheses):

s/^.*\n\(.*\)$/\1/        
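
The construction can be checked on its own with a small function that prints the substitution for a given number of lines. The name build_subst is hypothetical, and an arithmetic for loop is swapped in for the {1..N} range so that the sketch also runs in shells that do not expand variables inside braces.

```shell
# Hypothetical helper: print the sed substitution that keeps the last $1 lines.
build_subst() {
  typeset -i i
  printf '%s' 's/^.*\n\('
  for (( i = 1; i < $1; i++ )) ; do
    printf '%s' '.*\n'              # one extra line of context per iteration
  done
  printf '%s\n' '.*\)$/\1/'
}

build_subst 1     # prints s/^.*\n\(.*\)$/\1/
build_subst 3     # prints s/^.*\n\(.*\n.*\n.*\)$/\1/
```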

Now we can finish with our "before" part:

print -u5 'x'
print -u5 '}'
fi        

And continue with our "normal" and "after" parts:

print -u5 "/${pattern}/ {"
if [[ $after -gt 0 ]] ; then
  for i in {1..${after}}; do print -u5 'N' ; done
fi
if [[ $before -ne 0 ]] ; then
  print -u5 'H'
  print -u5 'x'
fi
print -u5 'p'
print -u5 '}'        

If we have the "after" part, we add as many N's to our sed script as the number of lines the user wants to see after the match.

If we have the "before" part, we need two additional operations before printing: append the pattern space to the hold space (H) and exchange the two (x). Otherwise we can just print the pattern space.
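
To make the result concrete, here is a sketch of what the generated file would contain for -B 1 and the pattern "error", run against a few sample lines. The file names and the sample text are made up for this demonstration.

```shell
# Hypothetical file names, used only for this demonstration.
script="${TMPDIR:-/tmp}/cgrep_demo.sed.$$"
input="${TMPDIR:-/tmp}/cgrep_demo.txt.$$"

# The sed script as it would be generated for -B 1 and the pattern "error".
cat >"$script" <<'EOF'
/error/ !{
H
x
s/^.*\n\(.*\)$/\1/
x
}
/error/ {
H
x
p
}
EOF

# Sample input: the match is on the third line.
printf '%s\n' 'first line' 'second line' 'an error here' 'last line' >"$input"

# Expected: the line before the match, then the matching line.
result=$(sed -n -f "$script" "$input")
printf '%s\n' "$result"

rm -f "$script" "$input"
```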

We have finished writing the sed script. We must now close our temporary file, execute sed and then remove the temporary file:

exec 5>&-

sed -n -f /tmp/cgrep.$$ "$file"
rm -f /tmp/cgrep.$$

We are ready to test it:

[Images: sample runs of cgrep.sh]

Of course I omitted several use cases. You are always welcome to make it better! Next time I will rewrite the script with getopts and show how I document scripts.

Have fun with KornShell!

Andrey

The whole script:

#!/bin/ksh93
# Author: Andrey Klyachkin, eNFence, 2022

typeset -i after=0      # how many lines should we show after the match
typeset -i before=0     # how many lines should we show before the match
typeset pattern
typeset file

# First check arguments
while [[ $1 ]] ; do
  case "$1" in
    -A*)
      n=${1#-A}
      if [[ -z $n ]] ; then
        # n is empty, the number should be in the next argument
        shift
        after="${1}"
      else
        after="${n}"
      fi
      if [[ $after -lt 1 ]] ; then
        print -u2 -- "Argument for -A is invalid"
        exit 1
      fi
      ;;
    -B*)
      n=${1#-B}
      if [[ -z $n ]] ; then
        # n is empty, the number should be in the next argument
        shift
        before="${1}"
      else
        before="${n}"
      fi
      if [[ $before -lt 1 ]] ; then
        print -u2 -- "Argument for -B is invalid"
        exit 1
      fi
      ;;
    -C*)
      n=${1#-C}
      if [[ -z $n ]] ; then
        # n is empty, the number should be in the next argument
        shift
        before="${1}"
        after="${1}"
      else
        before="${n}"
        after="${n}"
      fi
      if [[ $before -lt 1 ]] ; then
        print -u2 -- "Argument for -C is invalid"
        exit 1
      fi
      ;;
    *)
      if [[ $after -eq 0 ]] && [[ $before -eq 0 ]] ; then
        print -u2 -- "Unknown argument $1"
        exit 1
      fi
      if [[ -z $pattern ]] ; then
        # it should be pattern
        pattern="$1"
      else
        # we have pattern already, it must be file
        file="$1"
        # check if it exists
        if [[ ! -r "$file" ]] ; then
          print -u2 -- "Can't read file $file. Check its existence or permissions"
          exit 1
        fi
      fi
      ;;
  esac
  shift
done

if [[ -z "$pattern" ]] ; then
  print -u2 -- "Req'd parameter missing - pattern"
  exit 1
fi

if [[ -z "$file" ]] ; then
  print -u2 -- "Req'd parameter missing - file name"
  exit 1
fi

: >/tmp/cgrep.$$

if [[ ! -f /tmp/cgrep.$$ ]] ; then
  print -u2 "Error creating a temporary file"
  exit 2
fi

exec 5>/tmp/cgrep.$$

if [[ $before -ne 0 ]] ; then
  print -u5 "/${pattern}/ !{"
  print -u5 'H'
  print -u5 'x'
  print -u5 -rn 's/^.*\n\('
  if [[ $before -ne 1 ]] ; then
    for i in {1..$((before - 1))}; do print -u5 -rn '.*\n' ; done
  fi
  print -u5 -rn '.*'
  print -u5 -r '\)$/\1/'
  print -u5 'x'
  print -u5 '}'
fi
print -u5 "/${pattern}/ {"
if [[ $after -gt 0 ]] ; then
  for i in {1..${after}}; do print -u5 'N' ; done
fi
if [[ $before -ne 0 ]] ; then
  print -u5 'H'
  print -u5 'x'
fi
print -u5 'p'
print -u5 '}'

exec 5>&-

sed -n -f /tmp/cgrep.$$ "$file"
rm -f /tmp/cgrep.$$

        