LPIC-1 Exam 101

103.2 Process text streams using filters Part 1

February 1, 2014 | September 12th, 2022
  • Weight: 3
  • Description: Candidates should be able to apply filters to text streams.

Key Knowledge Areas

  • Send text files and output streams through text utility filters to modify the output using standard UNIX commands found in the GNU textutils package.

Terms and Utilities

  • cat
  • cut
  • expand
  • fmt
  • head
  • od
  • join
  • nl
  • paste
  • pr
  • sed
  • sort
  • split
  • tail
  • tr
  • unexpand
  • uniq
  • wc

Working with text files, both within scripts and at the command line, is very much part of the Linux administrator’s day-to-day job. Log files are usually text based, and being able to search and sort through their entries is important and will speed up the way you work. Even very simple tools like cat can become powerful diagnostic tools, able to display hidden characters in files that may cause issues. If you have not already made friends with the stream editor, sed, you soon will.

In part 2 we look at split

Expand and Unexpand

The first tools that we will look at are expand (/usr/bin/expand) and unexpand (/usr/bin/unexpand). The expand command replaces the tabs in a file with spaces, and unexpand puts tabs back in. This is needed when you specifically require a tab-separated or a space-separated file and the wrong type is found, or when the same data file contains a mixture of tabs and spaces.

Using the command:

expand f1 > f2

This reads the tab-separated file f1 and writes a space-separated copy, with no tabs, to the new file f2. Conversely, we can use the command:

unexpand -a f2 > f3

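With the -a option, unexpand converts runs of spaces that end at a tab stop back into tabs throughout each line (without -a, only leading spaces are converted) and sends the output to the file f3. For tab-aligned data such as this, f3 should be the same as the original file, f1. We can use the command cat (/bin/cat) with the -A option to show the tabs, which of course are non-printable characters.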

cat -A f1

The output will show ^I wherever there is a tab.
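
As a quick illustration of that output, here is a small sketch that uses made-up sample data piped straight into cat rather than a real f1. It assumes GNU cat, since the -A option is not available in every cat implementation.

```shell
# Hypothetical two-line sample; printf turns \t into a real tab character.
printf 'name\tshell\nroot\tbash\n' | cat -A
# Each tab is displayed as ^I and each line end as $:
# name^Ishell$
# root^Ibash$
```

The trailing $ marks the end of each line, which also makes stray trailing spaces easy to spot.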

cat -A f2

The output will just show the spaces with no tabs. By default, tab stops are set every 8 columns, so each tab is replaced by between 1 and 8 spaces, enough to reach the next tab stop. Reviewing the output of the two commands, it is easy to understand the name expand: we expand f1 to f2 by replacing each TAB with the spaces needed to fill out to the next tab stop.
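
The whole sequence can be tried end to end on throwaway data. The sketch below is only an illustration: the sample contents are made up, the files are created in a temporary directory (the workdir variable is not part of the original example), and it assumes GNU tools with the default tab stops of every 8 columns.

```shell
# Scratch directory so no existing f1, f2 or f3 is overwritten.
workdir=$(mktemp -d)

# Made-up sample data; printf turns \t into a real tab character.
printf 'name\tshell\nroot\tbash\n' > "$workdir/f1"

# Tabs to spaces, then runs of spaces ending on a tab stop back to tabs.
expand "$workdir/f1" > "$workdir/f2"
unexpand -a "$workdir/f2" > "$workdir/f3"

# cmp -s prints nothing and succeeds only when the files are byte-identical.
cmp -s "$workdir/f1" "$workdir/f3" && echo "f3 matches f1"

# A tab pads to the next tab stop, so the number of spaces varies:
# 7 spaces after "a", but only 1 after "abcdefg".
printf 'a\tb\nabcdefg\tb\n' | expand | cat -A
```

If a different tab width is wanted, expand accepts the -t option, for example expand -t 4 f1 to set tab stops every 4 columns.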