Sort Command in Linux with Examples

Sorting is the process of arranging records into a specified sequence. Examples of sorting would be arranging a list of usernames into alphabetical order or a set of file sizes into numeric order.

In its simplest form, the sort command sorts lines alphabetically (including any whitespace or control characters it encounters). The sort command uses the current locale (language definition) to determine the order of the characters, known as the collating order. In the following example, the user first displays the contents of the file /etc/sysconfig/mouse as is, and then sorts the contents of the file alphabetically.

$ cat /etc/sysconfig/mouse
FULLNAME="Generic - 2 Button Mouse (PS/2)"
MOUSETYPE="ps/2"
XEMU3="yes"
XMOUSETYPE="PS/2"
DEVICE=/dev/psaux
$ sort /etc/sysconfig/mouse
DEVICE=/dev/psaux
FULLNAME="Generic - 2 Button Mouse (PS/2)"
MOUSETYPE="ps/2"
XEMU3="yes"
XMOUSETYPE="PS/2"

If called with arguments, the sort command interprets them as the names of one or more files to be sorted. If called without arguments, the sort command sorts whatever it reads from standard input.
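The two calling styles can be compared with a small sketch. The scratch files and their contents are made up for illustration, and LC_ALL=C is set only so that the result does not depend on the local locale.

```shell
# Hypothetical scratch files, created only for this illustration.
tmpdir=$(mktemp -d)
printf 'pear\napple\n' > "$tmpdir/a.txt"
printf 'mango\nbanana\n' > "$tmpdir/b.txt"

# Several filenames: the lines of all the files are sorted together.
from_files=$(LC_ALL=C sort "$tmpdir/a.txt" "$tmpdir/b.txt")
echo "$from_files"

# No filename: sort reads standard input instead.
from_stdin=$(cat "$tmpdir/a.txt" "$tmpdir/b.txt" | LC_ALL=C sort)
echo "$from_stdin"

rm -r "$tmpdir"
```

Both commands should print the same four lines, since the input is the same in each case.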

Modifying the sort order

By default, the sort command sorts lines alphabetically. The following table lists command line switches which can be used to modify this default sort order.

Switch                       Effect
-b, --ignore-leading-blanks  Ignore spaces and tabs at the beginning of a line.
-d, --dictionary-order       Consider only blanks and alphanumeric characters.
-f, --ignore-case            Treat lowercase characters as uppercase.
-g, --general-numeric-sort   Compare as floating point numbers (including scientific notation).
-n, --numeric-sort           Compare by leading numeric value (integers and decimals).
-r, --reverse                Sort in descending rather than ascending order.
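The difference between some of these switches is easiest to see on a tiny input. The following is a minimal sketch with made-up values, assuming GNU sort (the -g switch is not available in every sort implementation). LC_ALL=C is set only to make the ordering independent of the local locale.

```shell
# Made-up sample values; 1e3 is scientific notation for 1000.
values=$(printf '1e3\n20\n3\n')

# -n reads only the leading "1" of "1e3", so that line sorts first.
numeric=$(printf '%s\n' "$values" | LC_ALL=C sort -n)
echo "$numeric"

# -g parses the whole token as a floating point number (1000).
general=$(printf '%s\n' "$values" | LC_ALL=C sort -g)
echo "$general"

# -f compares lowercase letters as if they were uppercase,
# so "a" lands next to "A" instead of after "B".
folded=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort -f)
echo "$folded"
```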

As an example, the user is examining the sizes of all files in the /var/log directory whose names start with an m.

$ ls -s1 /var/log/m*
20 /var/log/maillog
3104 /var/log/maillog.1
1552 /var/log/maillog.2
1952 /var/log/maillog.3
1236 /var/log/maillog.4
4 /var/log/messages
384 /var/log/messages.1
636 /var/log/messages.2
216 /var/log/messages.3
560 /var/log/messages.4

The user next sorts the output with the sort command.

$ ls -s /var/log/m* | sort
1236 /var/log/maillog.4
1552 /var/log/maillog.2
1952 /var/log/maillog.3
20 /var/log/maillog
216 /var/log/messages.3
3104 /var/log/maillog.1
384 /var/log/messages.1
4 /var/log/messages
560 /var/log/messages.4
636 /var/log/messages.2

Without being told otherwise, the sort command sorted the lines alphabetically (with 1952 coming before 20). Realizing this is not what was intended, the user adds the -n command line switch.

$ ls -s /var/log/m* | sort -n
4 /var/log/messages
20 /var/log/maillog
216 /var/log/messages.3
384 /var/log/messages.1
560 /var/log/messages.4
636 /var/log/messages.2
1236 /var/log/maillog.4
1552 /var/log/maillog.2
1952 /var/log/maillog.3
3104 /var/log/maillog.1

Better, but the user would prefer to reverse the sort order, so that the largest files come first. The user adds the -r command line switch.

$ ls -s /var/log/m* | sort -nr
3104 /var/log/maillog.1
1952 /var/log/maillog.3
1552 /var/log/maillog.2
1236 /var/log/maillog.4
636 /var/log/messages.2
560 /var/log/messages.4
384 /var/log/messages.1
216 /var/log/messages.3
20 /var/log/maillog
4 /var/log/messages

Why ls -1? Why was the -1 command line switch given to the ls command in the first example, but not the others? By default, when the ls command is using a terminal for standard output, it groups the filenames into multiple columns for easy readability. When the ls command is using a pipe or file for standard output, however, it prints one file per line. The -1 command line switch forces this behavior for terminal output as well.
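This behavior can be checked with a short sketch. The scratch directory and the file names in it are hypothetical and exist only for the comparison.

```shell
# Hypothetical scratch directory with three empty files.
tmpdir=$(mktemp -d)
touch "$tmpdir/alpha" "$tmpdir/beta" "$tmpdir/gamma"

# Output goes to a pipe here, so ls prints one name per line on its own.
piped=$(ls "$tmpdir" | cat)

# -1 asks for the same layout explicitly, wherever the output goes.
forced=$(ls -1 "$tmpdir")

echo "$piped"
rm -r "$tmpdir"
```

The two variables should hold identical text, which is why the piped examples above did not need the -1 switch.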

Specifying Sort Keys

In the previous examples, the sort command performed its sort based on the first characters found on a line. Often, formatted data is not arranged so conveniently. Fortunately, the sort command allows users to specify which column of tabular data to use for determining the sort order, or, more formally, which column should be used as the sort key.

The following table of command line switches can be used to determine the sort key.

Switch                      Effect
-k, --key=POS               Use the key at POS to determine sort order.
-t, --field-separator=SEP   Use the character SEP to separate fields (instead of simply whitespace).

Sorting Output by a Particular Column

As an example, suppose the user wants to re-examine the log files, this time using the long format of the ls command. The user first tries to sort the output numerically.

# ls -l /var/log/m* | sort -n
-rw-------. 1 root root   53524 Jun 11 02:37 /var/log/maillog-20201024
-rw-------. 1 root root       0 Oct 24 15:36 /var/log/maillog
-rw-------. 1 root root 3388685 Oct 24 15:35 /var/log/messages-20201024
-rw-------. 1 root root  743976 Oct 30 12:48 /var/log/messages

Now that the sizes are no longer reported at the beginning of the line, the numeric sort no longer produces a useful order. Instead, the user repeats the sort using the -k command line switch to sort the output by the 5th column, producing the desired output.

# ls -l /var/log/m* | sort -n -k5
-rw-------. 1 root root       0 Oct 24 15:36 /var/log/maillog
-rw-------. 1 root root   53524 Jun 11 02:37 /var/log/maillog-20201024
-rw-------. 1 root root  744999 Oct 30 12:49 /var/log/messages
-rw-------. 1 root root 3388685 Oct 24 15:35 /var/log/messages-20201024
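A key given as -k5 starts at the 5th field and runs to the end of the line. That is harmless with -n, which only reads the leading number, but the key can also be limited to a single field by writing it as -k5,5. The sketch below uses a made-up listing in ls -l layout rather than real log files, and sets LC_ALL=C only for reproducible ordering.

```shell
# Made-up listing in ls -l layout (sizes are in the 5th column).
listing=$(printf '%s\n' \
  '-rw------- 1 root root 743976 Oct 30 12:48 messages' \
  '-rw------- 1 root root 0 Oct 24 15:36 maillog' \
  '-rw------- 1 root root 53524 Jun 11 02:37 maillog.old')

# -k5,5 limits the key to field 5 only; the trailing n makes that key numeric.
by_size=$(printf '%s\n' "$listing" | LC_ALL=C sort -k5,5n)
echo "$by_size"

# Adding r to the key reverses it, so the largest file comes first.
largest=$(printf '%s\n' "$listing" | LC_ALL=C sort -k5,5nr | head -n 1)
echo "$largest"
```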

Specifying Multiple Sort Keys

Next, the user examines the file /etc/services, using the grep command to extract the lines where the service name starts with “a”.

# cat /etc/services | grep ^a
auth            113/tcp         authentication tap ident
auth            113/udp         authentication tap ident
at-rtmp         201/tcp                         # AppleTalk routing
at-rtmp         201/udp
at-nbp          202/tcp                         # AppleTalk name binding
at-nbp          202/udp
at-echo         204/tcp                         # AppleTalk echo
at-echo         204/udp
at-zis          206/tcp                         # AppleTalk zone information
at-zis          206/udp
acap            674/tcp
acap            674/udp
afpovertcp      548/tcp                         # AFP over TCP
afpovertcp      548/udp                         # AFP over TCP
afs3-fileserver 7000/tcp                        # file server itself
...

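The user next sorts the data alphabetically, using the 1st column as the start of the key. Because no end position is given, the key runs from the 1st column to the end of the line. When the -k switch is given more than once, later keys are used to break ties left by earlier ones.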

# cat /etc/services | grep ^a | sort -k1
a13-an          3125/tcp                # A13-AN Interface
a13-an          3125/udp                # A13-AN Interface
a14             3597/tcp                # A14 (AN-to-SC/MM)
a14             3597/udp                # A14 (AN-to-SC/MM)
a15             3598/tcp                # A15 (AN-to-AN)
a15             3598/udp                # A15 (AN-to-AN)
a16-an-an       4598/tcp                # A16 (AN-AN)
a16-an-an       4598/udp                # A16 (AN-AN)
....
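The effect of a second key is easier to see on a small, made-up input. In the sketch below the service names and ports are hypothetical, not taken from /etc/services. The first key orders by name only, and the second key orders lines with the same name by numeric port.

```shell
# Hypothetical service entries in "name port/protocol" layout.
entries=$(printf '%s\n' \
  'web 8080/tcp' \
  'db 5432/tcp' \
  'web 80/tcp' \
  'db 3306/tcp' \
  'web 443/tcp')

# One key from field 1 to the end of the line: ports compare as text,
# so 443 sorts before 80.
one_key=$(printf '%s\n' "$entries" | LC_ALL=C sort -k1)
echo "$one_key"

# Two keys: name first (field 1 only), then port as a number (field 2 only).
two_keys=$(printf '%s\n' "$entries" | LC_ALL=C sort -k1,1 -k2,2n)
echo "$two_keys"
```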

Specifying the Field Separator

The above examples have demonstrated how to sort data using a specified field as the sort key. In all of the examples, fields were separated by whitespace (i.e., a series of spaces and/or tabs). Often in Linux (and Unix), some other method is used to separate fields. Consider, for example, the /etc/passwd file.

# head /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin

The lines are structured into seven fields each, but the fields are separated using a “:” instead of whitespace. With the -t command line switch, the sort command can be instructed to use some specified character (such as a “:”) to separate fields.

In the following, the user uses the sort command with the -t command line switch to sort the first 10 lines of the /etc/passwd file by home directory (the 6th field).

# head /etc/passwd | sort -t: -k6
bin:x:1:1:bin:/bin:/sbin/nologin
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
halt:x:7:0:halt:/sbin:/sbin/halt
daemon:x:2:2:daemon:/sbin:/sbin/nologin
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin

The user bin, with a home directory of /bin, is now at the top, and the user mail, with a home directory of /var/spool/mail, is at the bottom.
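The -t switch combines with numeric keys in the same way. As a sketch, the following sorts four of the lines shown above by user ID (the 3rd field). The lines are embedded in the script so that the result does not depend on the local /etc/passwd, and cut is used only to shorten the output to the user names.

```shell
# Four lines copied from the /etc/passwd listing above.
lines=$(printf '%s\n' \
  'operator:x:11:0:operator:/root:/sbin/nologin' \
  'bin:x:1:1:bin:/bin:/sbin/nologin' \
  'lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin' \
  'root:x:0:0:root:/root:/bin/bash')

# Field 3 compared as text: "11" sorts before "4".
as_text=$(printf '%s\n' "$lines" | LC_ALL=C sort -t: -k3,3 | cut -d: -f1)
echo "$as_text"

# Field 3 compared as a number: 0, 1, 4, 11.
as_number=$(printf '%s\n' "$lines" | LC_ALL=C sort -t: -k3,3n | cut -d: -f1)
echo "$as_number"
```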

Summary

In summary, we have seen that the sort command can be used to sort structured data, using the -k command line switch to specify the sort field (perhaps more than once), and the -t command line switch to specify the field delimiter.

The -k command line switch can receive more sophisticated arguments, which serve to specify character positions within a field, or customize sort options for individual fields. See the sort(1) man page for details.
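As a brief sketch of character positions, a key written as F.C starts at character C of field F. The entries below are made up, with a four-digit year at characters 5 through 8 and a letter at character 10.

```shell
# Hypothetical fixed-width entries: "log-", a year, "-", a letter.
entries=$(printf '%s\n' 'log-2021-b' 'log-2019-c' 'log-2020-a')

# Key is characters 5 to 8 of field 1 (the year), compared numerically.
by_year=$(printf '%s\n' "$entries" | LC_ALL=C sort -k1.5,1.8n)
echo "$by_year"

# Key starts at character 10 of field 1 (the trailing letter).
by_letter=$(printf '%s\n' "$entries" | LC_ALL=C sort -k1.10)
echo "$by_letter"
```

Character positions count from the start of the field, so this style works best on fixed-width data like the sample above.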