Friday, November 15, 2024
Google search engine
HomeGuest BlogsAWK Command in Linux with Examples

AWK Command in Linux with Examples

Introduction

The awk command is a Linux tool and programming language that allows users to process and manipulate data and produce formatted reports. The tool supports various operations for advanced text processing and facilitates expressing complex data selections.

In this tutorial, you will learn what the awk command does and how to use it.

AWK command in Linux with examplesAWK command in Linux with examples

Prerequisites

  • A system running Linux.
  • Access to a terminal window.

AWK Command Syntax

The syntax for the awk command is:

awk [options] 'selection_criteria {action}' input-file > output-file

The available options are:

Option Description
-F [separator] Used to specify a file separator. The default separator is a blank space.
-f [filename] Used to specify the file containing the awk script. Reads the awk program source from the specified file, instead of the first command-line argument.
-v Used to assign a variable.

Note: You might also be interested in learning about the Linux curl command, allowing you to transfer data to and from a server after processing it with awk.

How Does the AWK Command Work?

The awk command’s main purpose is to make information retrieval and text manipulation easy to perform in Linux. This Linux command works by scanning a set of input lines in order and searches for lines matching the patterns specified by the user.

For each pattern, users can specify an action to perform on each line that matches the specified pattern. Thus, using awk, users can easily process complex log files and output a readable report.

Note: The awk command got its name from three people who wrote the original version in 1977 – Alfred Aho, Peter Weinberger, and Brian Kernighan.

AWK Operations

awk allows users to perform various operations on an input file or text. Some of the available operations are:

  • Scan a file line by line.
  • Split the input line/file into fields.
  • Compare the input line or fields with the specified pattern(s).
  • Perform various actions on the matched lines.
  • Format the output lines.
  • Perform arithmetic and string operations.
  • Use control flow and loops on output.
  • Transform the files and data according to a specified structure.
  • Generate formatted reports.

AWK Statements

The command provides basic control flow statements (if-else, while, for, break) and also allows users to group statements using braces {}.

  • if-else

The if-else statement works by evaluating the condition specified in the parentheses and, if the condition is true, the statement following the if statement is executed. The else part is optional.

For example:

awk -F ',' '{if($2==$3){print $1","$2","$3} else {print "No Duplicates"}}' answers.txt
Using the if-else statement in awk.Using the if-else statement in awk.

The output shows the lines in which duplicates exist and states No duplicates if there are no duplicate answers in the line.

  • while

The while statement repeatedly executes a target statement as long as the specified condition is true. That means that it operates like the one in the C programming language. If the condition is true, the body of the loop is executed. If the condition is false, awk continues with the execution.

For example, the following statement instructs awk to print all input fields one per line:

awk '{i=0; while(i<=NF) { print i ":"$i; i++;}}' employees.txt
Using the while statement in awk.Using the while statement in awk.
  • for

The for statement also works like that of C, allowing users to create a loop that needs to execute a specific number of times.

For example:

awk 'BEGIN{for(i=1; i<=10; i++) print "The square of", i, "is", i*i;}'
An example of the for statement in awk.An example of the for statement in awk.

The statement above increases the value of i by one until it reaches ten and calculates the square of i each time.

Note: The expressions in the condition part of if, while or for can include relational operators, such as <, , >=, == (is equal to), and != (not equal to). The expressions can also include regular expression matches with the match operators and !∼, logical operators ||, &&, and !. The operators are grouped with parentheses.

  • break

The break statement immediately exits from an enclosing while or for. To begin the next iteration, use the continue statement.

The next statement instructs awk to skip to the next record and begin scanning for patterns from the top. The exit statement instructs awk that the input has ended.

Following is an example of the break statement:

awk 'BEGIN{x=1; while(1) {print "Example"; if ( x==5 ) break; x++; }}'
Using the break statement in awk.Using the break statement in awk.

The command above breaks the loop after 5 iterations.

Note: The awk tool allows users to place comments in AWK programs. Comments begin with # and end at the end of the line.

AWK Patterns

Inserting a pattern in front of an action in awk acts as a selector. The selector determines whether to perform an action or not. The following expressions can serve as patterns:

  • Regular expressions.
  • Arithmetic relational expressions.
  • String-valued expressions.
  • Arbitrary Boolean combinations of the expressions above.

The following sections explain the above-mentioned expressions and how to use them.

Note: Learn how you can search for strings or patterns with the grep command.

Regular Expression Patterns

Regular expression patterns are the simplest form of expressions containing a string of characters enclosed in slashes. It can be a sequence of letters, numbers, or a combination of both.

In the following example, the program outputs all the lines starting with “A”. If the specified string is a part of a larger word, it is also printed.

awk '$1 ~ /^A/ {print $0}' employees.txt
An example of a regular expression pattern.An example of a regular expression pattern.

Relational Expression Patterns

Another type of awk patterns are relational expression patterns. The relational expression patterns involve using any of the following relational operators: <, <=, ==, !=, >=, and >.

Following is an example of an awk relational expression:

awk 'BEGIN { a = 10; b = 10; if (a == b) print "a == b" }'
An example of a relational expression pattern.An example of a relational expression pattern.

Range Patterns

A range pattern is a pattern consisting of two patterns separated by a comma. Range patterns perform the specified action for each line between the occurrence of pattern one and pattern two.

For example:

awk '/clerk/, /manager/ {print $1, $2}' employees.txt
Example of a range pattern.Example of a range pattern.

The pattern above instructs awk to print all the lines of the input containing the keywords “clerk” and “manager”.

Special Expression Patterns

Special expression patterns include BEGIN and END which denote program initialization and end. The BEGIN pattern matches the beginning of the input, before the first record is processed. The END pattern matches the end of the input, after the last record has been processed.

For example, you can instruct awk to display a message at the beginning and at the end of the process:

awk 'BEGIN { print "List of debtors:" }; {print $1, $2}; END {print "End of the debtor list"}' debtors.txt
An example of a special expression pattern in awk.An example of a special expression pattern in awk.

Combining Patterns

The awk command allows users to combine two or more patterns using logical operators. The combined patterns can be any Boolean combination of patterns. The logical operators for combining patterns are:

  • || (or)
  • && (and)
  • ! (not)

For example:

awk '$3 > 10 && $4 < 20 {print $1, $2}' employees.txt 
Example of a combining pattern in awk.Example of a combining pattern in awk.

The output prints the first and second fields of those records whose third field is greater than ten and the fourth field is less than 20.

AWK Variables

The awk command has built-in field variables, which break the input file into separate parts called fields. The awk assigns the following variables to each data field:

  • $0. Used to specify the whole line.
  • $1. Specifies the first field.
  • $2. Specifies the second field.
  • etc.

Other available built-in awk variables are:

  • NR. Counts the number of input records (usually lines). The awk command performs the pattern/action statements once for each record in a file.

For example:

awk '{print NR,$0}' employees.txt
Example of the NR variable in awk.Example of the NR variable in awk.

The command displays the line number in the output.

  • NF. Counts the number of fields in the current input record and displays the last field of the file.

For example:

awk '{print $NF}' employees.txt
Example of the NF variable in awk.Example of the NF variable in awk.
  • FS. Contains the character used to divide fields on the input line. The default separator is space, but you can use FS to reassign the separator to another character (typically in BEGIN).

For example, you can make the etc/passwd file (user list) more readable by changing the separator from a colon (:) to a dash (/) and print out the field separator as well:

awk -FS 'BEGIN{FS=":"; OFS="-"} {print $0}' /etc/passwd
An example of the FS variable in awk.An example of the FS variable in awk.
  • RS. Stores the current record separator character. The default input line is the input record, which makes a newline the default record separator. The command is useful if the input is a comma-separated file (CSV).

For example:

awk 'BEGIN {FS="-"; RS=","; OFS=" owes Rs. "} {print $1,$2}' debtors.txt
An example of the RS variable in awk.An example of the RS variable in awk.

Note: We first used the cat command to show the file’s contents and then formatted the output with AWK.

  • OFS. Stores the output field separator, which separates the fields when printed. The default separator is a blank space. Whenever the printed file has several parameters separated with commas, the OFS value is printed between each parameter.

For example:

awk 'OFS=" works as " {print $1,$3}' employees.txt
An example of the ofs variable in awk.An example of the ofs variable in awk.

AWK Actions

The awk tool follows rules containing pattern-action pairs. Actions consist of statements enclosed in curly braces {} which contain expressions, control statements, compound statements, input and output statements, and deletion statements. Those statements are described in the sections above.

Create an awk script using the following syntax:

awk '{action}' 

For example:

awk '{print "How to use the awk command"}'
An example of an awk action.An example of an awk action.

This simple command instructs awk to print the specified string each time you run the command. Terminate the program using Ctrl+D.

How to Use the AWK Command – Examples

Apart from manipulating data and producing formatted outputs, awk has other uses as it is a scripting language and not only a text processing command. This section explains alternative use cases for awk .

  • Calculations. The awk command allows you to perform arithmetic calculations. For example:
df | awk '/\/dev\/loop/ {print $1"\t"$2 + $3}'
An example of using awk for performing arithmetic calculations.An example of using awk for performing arithmetic calculations.

In this example, we pipe into the df command and use the information generated in the report to calculate the total memory available and used by the mounted filesystems that contain only /dev and /loop in the name.

The produced report shows the memory sum of the /dev and /loop filesystems in columns two and three in the df output.

  • Filtering. The awk command allows you to filter the output by limiting the length of the lines. For example:
awk 'length($0) > 8' /etc/shells
An example of using awk to filter a command output.An example of using awk to filter a command output.

In this example, we ran the /etc/shells system file through awk and filtered the output to contain only the lines containing more than 8 characters.

  • Monitoring. Check if a certain process is running in Linux by piping into the ps command. For example:
ps -ef | awk '{ if($NF == "clipboard") print $0}'
An example of using awk to check running processes.An example of using awk to check running processes.

The output prints a list of all the processes running on your machine with the last field matching the specified pattern.

  • Counting. You can use awk to count the number of characters in a line and get the number printed in the result. For example:
awk '{ print "The number of characters in line", NR,"=" length($0) }' employees.txt
An example of using awk to count character number in each line of a file.An example of using awk to count character number in each line of a file.

Note: Read also about gawk command, a text-processing and data-manipulating tool.

Conclusion

After reading this tutorial, you know what the awk command is and how you can use it effectively for various use cases.

The awk command is also a scripting language with many uses, and it is essential knowledge for every Linux user. Use it for powerful text manipulations, but also as a scripting language.

Was this article helpful?
YesNo

RELATED ARTICLES

Most Popular

Recent Comments