Filtering with AWK

January 18, 2017 | September 12th, 2022

AWK is a really powerful tool and fits in as the “Daddy” when compared with grep and sed. Each tool has its own specific purpose, and we will still use tools like grep often because they are appropriate. However, when you need to bring in the big guns, AWK is your friend. In this example we will be filtering with AWK to display specific contents from a text file.

The Right Tools For the Job


Although AWK can manage the same filtering tasks as grep, we will still use grep in many instances. If we only need to run a relatively simple check, then grep is the correct tool. You will find that grep has a much smaller footprint than awk: on Ubuntu, about 200 KB for grep compared to 600 KB for awk. The command syntax for grep is simpler too, so it suits everyday use. Compare the following commands; each searches for lines that begin with bob in the users database file. This task is more suited to grep.

$ grep '^bob' /etc/passwd
$ awk '/^bob/ { print }' /etc/passwd

But AWK is a complete language in its own right, so even though it is more complex, we can achieve much more with the bigger tool.
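As a small taste of what the bigger tool adds, here is a sketch that compares an anchored regular expression with an exact comparison on the first field. It works on a temporary sample file rather than the real /etc/passwd, and the users bob and bobby are made up for the illustration.

```shell
# Sample data in passwd format (hypothetical users, not read from /etc/passwd)
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
bob:x:1001:1001:Bob:/home/bob:/bin/bash
bobby:x:1002:1002:Bobby:/home/bobby:/bin/bash
EOF

# Anchored regex: matches both bob and bobby, just as grep '^bob' would
anchored=$(awk '/^bob/ { print }' "$sample")

# Exact comparison on the first field: matches only the user bob
exact=$(awk -F: '$1 == "bob" { print }' "$sample")

printf 'anchored:\n%s\n\nexact:\n%s\n' "$anchored" "$exact"
rm -f "$sample"
```

The anchored pattern only knows about the start of the line, whereas the field comparison knows where the user name ends. That awareness of fields is what the rest of this post builds on.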

Filtering with AWK by Searching Fields

Whilst grep is great at searching for text in a file, it is not very clever with fields in a record. To AWK, though, this is everyday stuff. Let’s say that we want to list the groups that the user tux belongs to. Logged in as tux, we can use the command:

$ id -Gn

This lists the groups that tux belongs to and all is great. Well, perhaps? A tool built to do one single task cannot do everything. The id command is there just to display user and group information and nothing else, so if you have another file that you want to work with, you are stuck. I do apologise for reinventing the wheel, but this learning process will help you adapt the code to work on your own files. The /etc/group file is a great file to work with as it exists on all Linux systems. We will show you how to search specific fields for given text. Additionally, we can display only a limited set of fields from the result set.

Check for tux in the Groups File

Firstly, let’s just search for our user in the /etc/group file with awk. A little warm up exercise:

$ awk -F: '/tux/ { print } ' /etc/group

The output may look something like this depending on the user’s group membership. I have used the user tux which is my user account. You may want to test on your own account or, at least, an account that is on the system.

adm:x:4:syslog,tux
cdrom:x:24:tux
sudo:x:27:tux
dip:x:30:tux
plugdev:x:46:tux
lxd:x:110:tux
tux:x:1000:
lpadmin:x:115:tux
sambashare:x:116:tux

Now, of course, this is the same output that grep would produce. Note, though, that we have added the -F option to set the field delimiter to a colon. Although we are not using the fields yet, this starts to show the power that awk has over grep. Group membership is shown in the 4th field, and we only want to display the groups that tux belongs to and not the tux group itself, which is the primary group for the user.
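To see what -F: actually gives us, the following sketch feeds two records from the output above through awk and prints the field count and the 4th field. The variable names are only there for the illustration.

```shell
# Two sample records in /etc/group format, taken from the output above
member_line='adm:x:4:syslog,tux'
primary_line='tux:x:1000:'

# NF is the number of fields awk found once -F: has split the record
field_count=$(printf '%s\n' "$member_line" | awk -F: '{ print NF }')

# $4 is the membership field: a comma-separated list of user names
members=$(printf '%s\n' "$member_line" | awk -F: '{ print $4 }')

# For the user's primary group the membership field is empty
primary_members=$(printf '%s\n' "$primary_line" | awk -F: '{ print $4 }')

echo "fields: $field_count"
echo "members: $members"
echo "primary group members: [$primary_members]"
```

The empty 4th field on the tux record is the reason a search restricted to that field will leave the primary group out.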

Filter on the Membership Field

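The membership field for the group is the 4th field. If we modify our search with awk, we can look for tux only in this 4th field. We do this using the awk match function, which returns the position of the first match or 0 if there is none, so the pattern is true only when tux appears in the 4th field.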

$ awk -F: 'match($4,/tux/) { print } ' /etc/group

The output is shown here:

adm:x:4:syslog,tux
cdrom:x:24:tux
sudo:x:27:tux
dip:x:30:tux
plugdev:x:46:tux
lxd:x:110:tux
lpadmin:x:115:tux
sambashare:x:116:tux
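One caveat: a regular expression search on the 4th field is a substring search, so it would also pick up a user whose name merely contains tux. The sketch below uses a temporary sample file with a made-up user tuxedo to show the difference, and splits the membership list on commas to compare whole names. This is one possible approach, not the only one.

```shell
# Sample data in /etc/group format; the user tuxedo is hypothetical
sample=$(mktemp)
cat > "$sample" <<'EOF'
adm:x:4:syslog,tux
games:x:60:tuxedo
tux:x:1000:
sudo:x:27:tux
EOF

# Substring search on field 4: also catches the games group via tuxedo
substring=$(awk -F: 'match($4,/tux/) { print }' "$sample")

# Whole-name search: split field 4 on commas and compare each member
exact=$(awk -F: '{
    n = split($4, members, ",")
    for (i = 1; i <= n; i++)
        if (members[i] == "tux") { print; break }
}' "$sample")

printf 'substring:\n%s\n\nexact:\n%s\n' "$substring" "$exact"
rm -f "$sample"
```

On a system where no other user name contains tux, both searches return the same lines, so the simpler match version is fine for a quick check.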

Print Only the Group Name

The next step is to display only the names of the groups that tux belongs to and not the complete line. The group name is represented by $1, the first field. Rather than print the entire record, we can elect to print selected fields only:

$ awk -F: 'match($4,/tux/) { print $1 } ' /etc/group
adm
cdrom
sudo
dip
plugdev
lxd
lpadmin
sambashare

The result is that we have run an efficient search and printed just the information that we needed. Enjoy using grep but also enjoy practicing with awk.
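If you want to adapt this to your own files, one option is to pass the user name in with awk's -v option rather than hard-coding it in the program. The sketch below wraps that in a shell function and prints the groups on one line, much as id -Gn does. The function name groups_for and the sample file are assumptions made for the illustration, and it compares whole member names rather than using match.

```shell
# Sample data in /etc/group format, a subset of the output shown earlier
sample=$(mktemp)
cat > "$sample" <<'EOF'
adm:x:4:syslog,tux
cdrom:x:24:tux
tux:x:1000:
sudo:x:27:tux
EOF

# groups_for USER FILE: hypothetical helper that prints, on one line,
# the groups in FILE whose membership field lists USER
groups_for() {
    awk -F: -v user="$1" '
        {
            n = split($4, members, ",")
            for (i = 1; i <= n; i++)
                if (members[i] == user) {
                    printf "%s%s", sep, $1   # sep is empty for the first group
                    sep = " "
                    break
                }
        }
        END { if (sep != "") print "" }      # finish the line if anything matched
    ' "$2"
}

tux_groups=$(groups_for tux "$sample")
syslog_groups=$(groups_for syslog "$sample")

echo "tux: $tux_groups"
echo "syslog: $syslog_groups"
rm -f "$sample"
```

Pointing the second argument at /etc/group instead of the sample file should list supplementary groups for a real account, although unlike id it will not include the user's primary group.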