Thursday, February 25, 2010

Debian 8

Awk is a field/column processor that can also produce reports. It splits each line into fields (columns) based on a defined delimiter (whitespace by default) and runs actions on lines that match an optional pattern. Awk loops through the lines of input automatically, so you never write that loop yourself. It supports the same input types as grep, supports different input and output delimiters, and lets you use POSIX and extended regular expressions.
An awk program consists of three blocks:
BEGIN block (optional) - executes once before reading the input stream
MAIN block - executes once per line of input (pattern matching with /pattern/ is also part of this)
END block - executes once after reading input stream
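
As a minimal sketch of the three blocks working together, the snippet below recreates the four-line test.txt used in the rest of this post and counts its lines. The "Start" text and the counter variable n are made up for illustration only.

```shell
# Recreate the sample file (the same four lines used throughout this post)
printf '%s\n' 'Debian Linux' 'SUSE Linux' 'SUSE Linux 9999' 'Debian9 Linux' > test.txt

# BEGIN runs once before any input, the main block once per line,
# and END once after the last line has been read
out=$(awk 'BEGIN { print "Start" }
           { n++ }
           END { print "Read " n " lines" }' test.txt)
printf '%s\n' "$out"
```

Only the main block touches the input lines; BEGIN and END each run exactly once regardless of how many lines the file has.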

We will now look at awk using a file, test.txt, similar to the one used for grep:
Debian Linux
SUSE Linux
SUSE Linux 9999
Debian9 Linux

To print full lines using awk:
awk '{print}' test.txt
cat test.txt | awk '{print}'

If you have a line like this:
SUSE Linux 9999

SUSE is referenced by $1, Linux is $2, and 9999 is $3.

To print only column 1 for the whole file:
awk '{print $1}' test.txt

If you print $2, any lines that don't have a second column will be printed as blank lines.
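
In the sample file every line has a second column, so the blank-line effect is easier to see with $3. The sketch below assumes the same four-line test.txt; the NF test in the second command is one possible way to skip the short lines, not the only one.

```shell
# Recreate the sample file
printf '%s\n' 'Debian Linux' 'SUSE Linux' 'SUSE Linux 9999' 'Debian9 Linux' > test.txt

# Only the third line has a $3, so the other three lines come out empty
all=$(awk '{print $3}' test.txt)

# NF is the number of columns on the current line; use it to skip short lines
only=$(awk 'NF >= 3 {print $3}' test.txt)
printf '%s\n' "$only"
```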

You can print more columns using the syntax:
awk '{print $1,$2}' test.txt

You can also swap the columns if you wish:
awk '{print $2,$1}' test.txt

To search for specific patterns:
awk '/SUSE/{print}' test.txt

Similarly, you can print specific columns:
awk '/SUSE/{print $1}' test.txt
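
The text between the slashes is a regular expression, and it can also be tested against a single column with the ~ operator. A small sketch, again assuming the four-line test.txt from above:

```shell
# Recreate the sample file
printf '%s\n' 'Debian Linux' 'SUSE Linux' 'SUSE Linux 9999' 'Debian9 Linux' > test.txt

# Lines whose first column ends in a digit
a=$(awk '$1 ~ /[0-9]$/ {print}' test.txt)

# Lines that start with SUSE and end in a number; print only the number
b=$(awk '/^SUSE.*[0-9]+$/ {print $3}' test.txt)

printf '%s\n%s\n' "$a" "$b"
```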

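If you want to use different input (FS) and output (OFS) delimiters, such as whitespace for input and colon ":" for output, set OFS in the BEGIN block. A plain print outputs the line unchanged; OFS is only applied when awk rebuilds the line, which assigning a field to itself ($1=$1) forces:
awk 'BEGIN{OFS=":"}{$1=$1; print}' test.txt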

FS stands for field separator, and OFS for output field separator.
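
The input side works the same way through FS, or through the -F command-line option. The sketch below assumes a made-up colon-separated file, users.txt, which is not part of the original example; its contents and the " -> " output separator are purely illustrative.

```shell
# Hypothetical colon-separated input, invented for this example
printf '%s\n' 'kelvin:1000:/home/kelvin' 'root:0:/root' > users.txt

# Split input on ":" and join the printed columns with " -> "
out=$(awk 'BEGIN { FS=":"; OFS=" -> " } { print $1, $3 }' users.txt)

# The same thing using -F for FS and -v to set OFS from the command line
same=$(awk -F: -v OFS=' -> ' '{ print $1, $3 }' users.txt)

printf '%s\n' "$out"
```

OFS is inserted wherever a comma separates the arguments to print, which is why the columns are written as $1, $3 here.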

To use pattern matching together with different delimiters ($1=$1 makes awk rebuild the line using OFS):
awk 'BEGIN{OFS=":"}/SUSE/{$1=$1; print}' test.txt

An END block is typically used to print a summary or to show that a process has been completed:
awk 'BEGIN{OFS=":"}/SUSE/{$1=$1; print}END{print "Process Complete"}' test.txt

Awk also has built-in variables, such as OFS and NR (the number of lines read so far), which you can print like this:
awk 'BEGIN{OFS=":"; print "Output Field Separator is \""OFS"\""}/SUSE/{$1=$1; print}END{print "Parsed "NR" lines"}' test.txt
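
Built-in variables can also be used inside the main block, where they change from line to line. A short sketch, assuming the same four-line test.txt, that prints the line number (NR) and the number of columns (NF) for every line:

```shell
# Recreate the sample file
printf '%s\n' 'Debian Linux' 'SUSE Linux' 'SUSE Linux 9999' 'Debian9 Linux' > test.txt

# NR is the current line number, NF the number of columns on that line
out=$(awk '{ print NR ": " NF " columns" }' test.txt)
printf '%s\n' "$out"
```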

We will now parse /var/log/messages for a real-life example. A line in messages looks like this:
Feb 22 22:32:29 kelvin-debian01 kernel: [ 24.826668] NFSD: starting 90-second grace period

We want to show only the status, which is "NFSD: starting 90-second grace period". To do this, we first have to find out which column NFSD is in. Because the space inside "[ 24.826668]" splits it into two columns, NFSD: is the 8th column. So we want to print from the 8th column onwards. This is the code to do this:
awk '/Feb 22/ {
 for (i=8;i<=NF;i++)
  printf("%s ",$i);
 printf("\n");
}' "$1"

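NF contains the number of columns in the current line. We want to print from the 8th column to the end of the line, so we loop from 8 to NF. After printing each line, we print a newline. The trailing $1 sits outside the single quotes, so it is the shell's first argument (the log file to read), not awk's first column. This effectively returns only the status messages with no other information.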
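
To try the loop without touching /var/log/messages, the sample line can be piped in directly. This is a sketch under that assumption, with one small variation of my own: the separator is chosen per column, so the output has no trailing space.

```shell
# The sample log line quoted above, fed through standard input
line='Feb 22 22:32:29 kelvin-debian01 kernel: [ 24.826668] NFSD: starting 90-second grace period'

out=$(printf '%s\n' "$line" | awk '/Feb 22/ {
  # Print columns 8..NF, with a space between them and a newline after the last
  for (i = 8; i <= NF; i++)
    printf("%s%s", $i, (i < NF ? " " : "\n"))
}')
printf '%s\n' "$out"
```

Matching lines with fewer than 8 columns produce no output at all in this variant, since the loop body never runs for them.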
