{ Cut, Sed, Awk, and Xargs. }

Objectives

By the end of this chapter, you should be able to:

  • Understand what the cut command does and list some use cases for it
  • Understand what the sed command does and list some use cases for it
  • Understand what the awk command does and list some use cases for it
  • Understand what the xargs command does and list some use cases for it

cut

The cut command extracts parts of each line of its input, either by character position or by splitting on a delimiter. If you need to pull a column out of a text file or grab a fixed range of characters, cut is the way to go. Let's see how it works with a few examples using a file called languages.txt:

languages.txt

Java,James
Ruby,Matz
Lisp,John
Bash,Brian
Self,David

Here's an example using cut to grab the first 4 characters of each line (the -c flag indicates that the numerical range coming after the flag is referencing characters, not bytes):

cut -c 1-4 languages.txt

Java
Ruby
Lisp
Bash
Self

Now let's grab the authors' names, but not by counting characters. Instead, let's do this by delimiting (splitting) on the comma:

cut -d, -f2 languages.txt

James
Matz
John
Brian
David

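So what is the -f2? The -f flag selects fields: the portions of each line produced by splitting on the delimiter given with the -d flag. So -f2 is the second field. If we just want the language names, we can ask for the first field: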

cut -d, -f1 languages.txt

Java
Ruby
Lisp
Bash
Self

This is much better because if we had languages of different character lengths, cut -c 1-4 languages.txt would not work!
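To see the difference for yourself, here is a minimal sketch. It assumes a hypothetical extra entry, Python,Guido, which is not in the original languages.txt, and it works in a temporary directory so your real file is left alone.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data: two lines from the chapter plus one hypothetical longer entry.
printf '%s\n' 'Java,James' 'Ruby,Matz' 'Python,Guido' > languages.txt

# Fixed character range: "Python" is cut down to "Pyth".
by_chars=$(cut -c 1-4 languages.txt)

# Field-based split: the whole first field comes back, whatever its length.
by_field=$(cut -d, -f1 languages.txt)

printf 'cut -c 1-4:\n%s\n\n' "$by_chars"
printf 'cut -d, -f1:\n%s\n' "$by_field"
```

The character-based version silently truncates the longer name, while the field-based version keeps working as the data changes.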

Next, try to use cut to print out just the authors. Then sort them and then return the first two authors!

Here's one solution:

cut -d, -f2 languages.txt | sort | head -n 2

sed

Sed, short for Stream EDitor, is one of the more complex terminal commands, so we will only go over some simple use cases. There are many, many things you can do with sed, so try to follow these examples and push yourself to keep learning!

Sed is commonly used for finding and replacing text, editing text in a file, and printing certain parts of a file (though it can do much more). Let's start by finding and replacing each comma in the languages.txt file with a colon.

Here's the command:

sed 's/,/:/g' languages.txt

Let's break this down:

sed - the command
s - substitute
, - our old value, a comma
: - our new value
g - globally, replace every match on each line, not just the first one
languages.txt - the file we are working with

Here's what the command should output:

Java:James
Ruby:Matz
Lisp:John
Bash:Brian
Self:David
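The g flag makes no visible difference on languages.txt, because each line has only one comma. Here is a small sketch of what it changes, using a made-up line with two commas (the sample line is an assumption for illustration, not part of the file):

```shell
# Hypothetical line containing two commas.
line='Java,James,Gosling'

# Without g: only the first comma on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/,/:/')

# With g: every comma on the line is replaced.
every_match=$(printf '%s\n' "$line" | sed 's/,/:/g')

echo "$first_only"    # Java:James,Gosling
echo "$every_match"   # Java:James:Gosling
```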

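Notice that if you cat languages.txt after running the above sed command, the original file is unchanged. To edit the file itself, we need the -i flag, which edits in place. On macOS (BSD sed), -i expects a backup suffix right after it, so running sed -i 's/,/:/g' languages.txt fails with an error about extra characters; GNU sed on Linux accepts -i on its own. You will often see sed -ie used as a workaround, but that only works because the e is taken as the backup suffix, which leaves behind a stray copy named languages.txte. A form that works with both versions is to attach a suffix of your choice directly to the flag, which keeps a copy of the original file (here languages.txt.bak):

sed -i.bak 's/,/:/g' languages.txt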

If you run this command, and then cat languages.txt, you should see that every comma has been replaced with a colon!
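If you would like to try in-place editing without risking a real file, here is a minimal sketch that runs in a temporary directory. It uses the -i.bak form, which both GNU and BSD sed accept as far as I know; the .bak suffix is an arbitrary choice, and the two-line sample file is just a shortened copy of languages.txt.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Java,James' 'Ruby,Matz' > languages.txt

# Edit languages.txt in place and keep the original as languages.txt.bak.
sed -i.bak 's/,/:/g' languages.txt

edited=$(cat languages.txt)
backup=$(cat languages.txt.bak)

printf 'edited file:\n%s\n\n' "$edited"
printf 'backup file:\n%s\n' "$backup"
```

Once you are happy with the result, the backup file can simply be deleted.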

We are just scratching the surface with sed. If you want to learn more about it, its manual page (man sed) is a good next step.

awk

Similar to sed, awk is an incredibly powerful text processing tool. In fact, AWK itself is actually a language that can pretty much do any kind of text manipulation you can think of. We will briefly cover awk and go over a few useful commands you can use with it.

Let's start with the simple command awk '{print}' languages.txt. This simply prints out the languages.txt file. But what if we want to print only a certain part of each line? awk can split each line on a delimiter given with the -F flag. Since languages.txt now uses colons, let's set : as our delimiter and print just the first field, which awk calls $1:

awk -F ':' '{print $1}' languages.txt
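Fields can also be combined or reordered. Here is a small sketch, assuming two sample lines in the colon-separated form that the earlier sed step produces:

```shell
# Two sample lines in colon-separated form.
data=$(printf '%s\n' 'Java:James' 'Ruby:Matz')

# $1 is the language and $2 is the author; print them in reverse order.
# The comma in print joins the fields with a single space by default.
swapped=$(printf '%s\n' "$data" | awk -F ':' '{print $2, $1}')

echo "$swapped"
# James Java
# Matz Ruby
```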

Let's look at another example, using the history command. If you type history into the terminal, you should see a numbered list of your recent commands. Let's use awk to get just the names of the commands we've used, with no further details. By default awk splits each line on whitespace, so when fields are separated by spaces we do not need the -F flag. Since the first field is the entry number and the second is the command name, we can type history | awk '{print $2}'.

We can also use awk to pick out specific rows and columns. For example, if you type df -h into the terminal you'll see a table with some information on your hard drive (df is short for disk free; it displays free disk space). Let's pass the result of this command into awk:

df -h | awk 'FNR == 2 {print $4}'

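FNR is the number of the current record in the current input; by default a record is a line, so FNR == 2 matches the second line (the first line of df output is the header). Essentially we're telling awk to grab the item in the 2nd row and 4th column of the table, which in this case should be your available disk space.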
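Because df output differs from machine to machine, here is a sketch of the same row-and-column selection on a made-up table. The filesystem names and sizes are invented for illustration and are not real df output.

```shell
# Hypothetical df-style table; all values are made up.
table=$(printf '%s\n' \
  'Filesystem Size Used Avail' \
  '/dev/disk1 100G 40G 60G' \
  '/dev/disk2 50G 10G 40G')

# FNR == 2 selects the second line; $4 is its fourth whitespace-separated field.
avail=$(printf '%s\n' "$table" | awk 'FNR == 2 {print $4}')

echo "$avail"   # 60G
```

Changing the condition to FNR == 3 would pick the value from the next row instead.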

If you want to read more about awk, check out its manual page (man awk).

xargs

Sometimes you may want to execute the same command for multiple inputs. For instance, maybe you want to search multiple text files for a certain piece of text. This is a case where the xargs command can be quite handy: it reads items from its input and passes them as arguments to another command. Here are some example use cases for xargs; each one runs a command for a group of files instead of a single one.

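find . -name "*.html" | xargs grep "elie" - look for the text "elie" inside every HTML file in the current folder and its subfolders (the quotes around *.html stop the shell from expanding the pattern before find sees it)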

ls | xargs wc -l - counts the number of lines in each file in a folder and prints a total

find . -name "*" | xargs open - opens everything find locates in the current folder and its subfolders (open is a macOS command)

find . -name "*.css" | xargs open - opens all CSS files in the current folder and its subfolders

find . -name "*.html" | xargs rm - removes all files that end with .html, in the current folder and its subfolders

ls | xargs -t -I {} mv {} {}.md - adds a markdown file extension to all files (the -I flag runs the command once per input item, replacing each occurrence of {} with that item; the -t flag causes the command that gets run for each input to be logged to the terminal before it is executed, which can be helpful for debugging).
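One caveat with the pipelines above: by default xargs splits its input on whitespace, so a file name containing a space gets treated as two separate arguments. A common workaround, sketched here, is to pair find's -print0 with xargs -0 so that names are separated by NUL bytes instead. Both flags are available in GNU and BSD versions as far as I know, and the file names and contents below are made up for the demonstration.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical files; one name contains a space on purpose.
printf 'hello elie\n' > 'home page.html'
printf 'nothing here\n' > about.html
printf 'elie in css\n' > style.css

# -print0 and -0 separate names with NUL bytes, so the space stays inside the name.
# grep -l prints only the names of files that contain a match.
matches=$(find . -name '*.html' -print0 | xargs -0 grep -l 'elie')

echo "$matches"   # ./home page.html
```

The CSS file is skipped because it does not match the name pattern, and about.html is skipped because it does not contain the text.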

You can read more about xargs in its manual page (man xargs).

When you're ready, move on to Shell Scripting and Vim


Creative Commons License