
`sort` Exercises

Exercises Mar 28, 2025 (Apr 3, 2025)

Subscribers get to work together on these in Workroom PlayTime 011 on 3 April, 3pm London time, Zoom and Miro.

Go to Workroom PlayTime 011 for login etc.

There is a (very draft) info page at https://www.workroom-productions.com/p/e0bf704b-5e17-4828-ace2-fde1f68306bd/

Exercises

Files

ex01 has one single-digit number per line, and is unsorted.

ex03 has data in comma-delimited columns. The first three columns are date, first name, surname.

ex04 contains some month names / abbreviations, in order as far as English months are concerned. ex04a contains variants.

ex05 – one number per line like ex01, and sometimes a letter.

ex06 contains randomly ordered numbers, with some duplicates, and ex06b contains a collection of numbers with some in 1E6 notation.

Go to https://envs.workroomprds.com, pick a user, drop through to VSCode in the browser. The files to sort are in ~/sort_exercises. We'll be working in the terminal, and you should see something like this at the bottom of your window:

Exercise 1: Basic use

Type cat ex01 on the command line to see the contents (or look using the file browser).

  • Type sort ex01 to see the output on the command line.
  • Compare sort ex01 with sort -R ex01 and sort -r ex01
💡
The syntax is sort «option(s)» «file(s)»
Sort can reverse with -r ... and randomise with -R
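For anyone working away from the exercise environment, here is a minimal sketch of the same comparison. The file sample01 is a hypothetical stand-in for ex01, and LC_ALL=C is an assumption added here to keep the ordering independent of locale settings.

```shell
# Hypothetical stand-in for ex01: one single-digit number per line, unsorted.
workdir=$(mktemp -d)
printf '%s\n' 3 1 2 > "$workdir/sample01"

# LC_ALL=C pins the collation order so results do not vary with locale.
ascending=$(LC_ALL=C sort "$workdir/sample01")
descending=$(LC_ALL=C sort -r "$workdir/sample01")

echo "$ascending"    # 1, 2, 3 on separate lines
echo "$descending"   # 3, 2, 1 on separate lines

# -R is not run here: its order changes from run to run, and it is an
# extension that may not exist in every sort implementation.
rm -rf "$workdir"
```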

Exercise 2: Plumbing

  • Compare cat ex01 | sort with sort ex01
  • Use sort ex01 > output_of_ex02 to sort into a file called output_of_ex02
  • Use sort ex01 | less to open the output in the pager less. Use q to exit.
💡
sort is all set up to be used with other commands.
As a standalone tool, with real data, it is a bit unwieldy – it's best used with other tools.
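A small sketch of the plumbing described above, using a hypothetical file sample01 in place of ex01. The final pipeline into head is an illustration added here and is not part of the exercise.

```shell
workdir=$(mktemp -d)
printf '%s\n' 3 1 2 > "$workdir/sample01"

# Reading from a pipe and reading from a named file give the same result.
from_pipe=$(cat "$workdir/sample01" | LC_ALL=C sort)
from_file=$(LC_ALL=C sort "$workdir/sample01")

# Redirect the sorted output into a new file; the input file is unchanged.
LC_ALL=C sort "$workdir/sample01" > "$workdir/sorted_output"
from_redirect=$(cat "$workdir/sorted_output")

# Chain sort with another tool: head keeps only the first (smallest) line.
smallest=$(LC_ALL=C sort "$workdir/sample01" | head -n 1)
echo "$smallest"   # 1

rm -rf "$workdir"
```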

Exercise 3: Columns

Testers need to work with complex data, and need a column sort.

Use sort -t, -k3,3 ex03 to sort it by surname

Use sort -t, -k2,2 ex03 to sort by first name.

Use sort -t, -k3,3 -k2,2 ex03 to sort by surname then first name, and compare with sort -t, -k3,3 -k2,2r ex03 which reverses the sort of the first name.

💡
Plain sort compares whole lines, character by character.

Columns need delimiters: by default sort treats runs of blanks (spaces or tabs) as the separator, and takes the -t option to change that. Specify -t, to use commas and -t$'\t' to use tabs (in bash and similar shells).

Use options twice to sort on two columns. Use modifiers to change the type of sort.
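The tab case is the fiddly one, so here is a hedged sketch. The data is made up for illustration, and the tab is built with printf so that the example does not depend on bash-only quoting.

```shell
# Hypothetical tab-separated data: id, name.
tab=$(printf '\t')
data=$(printf '2\tpear\n1\tfig\n3\tapple\n')

# -t takes a single character. "$tab" works in any POSIX shell;
# in bash, -t$'\t' is an equivalent spelling.
by_name=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k2,2)

# Pull out the name column of the first sorted line.
first_name=$(printf '%s\n' "$by_name" | head -n 1 | cut -f2)
echo "$first_name"   # apple
```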
💡
Use -k2,2 to specify a sort on your data's second column.
Use -k2,4 to sort on a single key that runs from the second column through the fourth.
If you specify -k2 you'll sort on the second column and everything to the right of it, up to the end of the line. It's weird, don't do it.
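A sketch of the column sorts on invented data laid out like ex03 (date, first name, surname). The names and dates are hypothetical. The last part shows why an open-ended -k2 can surprise: its key carries on past the second column.

```shell
# Hypothetical rows in the same shape as ex03: date,first name,surname.
people=$(printf '%s\n' \
  '2025-01-03,Ann,Young' \
  '2025-01-01,Bob,Adams' \
  '2025-01-02,Cat,Young' \
  '2025-01-04,Abe,Adams')

# Surname first, then first name within each surname.
by_surname=$(printf '%s\n' "$people" | LC_ALL=C sort -t, -k3,3 -k2,2)

# Same, but the r modifier reverses only the first-name key.
by_surname_rev=$(printf '%s\n' "$people" | LC_ALL=C sort -t, -k3,3 -k2,2r)

echo "$by_surname"
echo "$by_surname_rev"

# -k2,2 versus -k2 on two rows that share a first name.
pair=$(printf '%s\n' 'x,Ann,Zed' 'y,Ann,Abe')
# Key is just "Ann" for both; the tie is broken by comparing whole lines.
closed_key_first=$(printf '%s\n' "$pair" | LC_ALL=C sort -t, -k2,2 | head -n 1)
# Key runs to end of line: "Ann,Abe" sorts before "Ann,Zed".
open_key_first=$(printf '%s\n' "$pair" | LC_ALL=C sort -t, -k2 | head -n 1)
echo "$closed_key_first"   # x,Ann,Zed
echo "$open_key_first"     # y,Ann,Abe
```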

Exercise 4: Checking

You can check if something is sorted with sort -c – which is handy if you're checking a sort for a test, or pre-qualifying some data.

Use sort -c on any of the earlier files – note the error shows the line and the content of the first non-sorted entry.

Use sort -c ex04 to see that a problem is on line 2.

Use sort -Mc ex04 to see that the check changes if told to expect to sort months, and within that style of sort, it accepts varieties of abbreviation and case.

💡
Use sort -c to check whether data is sorted, in various types of sort.
Options can stack: -Mc means the same as -M -c.

This exercise produces little output, so here are the contents of ex04 for interest.

January
Feb
mar
April
dEcEmBeR
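Because sort -c reports through its exit status, it slots neatly into a test script. The sketch below recreates the ex04 contents listed above in a hypothetical file called months. It assumes GNU sort and LC_ALL=C, where month names are the English ones.

```shell
workdir=$(mktemp -d)
# Same five lines as ex04, written to a scratch file.
printf '%s\n' January Feb mar April dEcEmBeR > "$workdir/months"

# sort -c prints nothing when input is in order. Otherwise it reports the
# first out-of-order line on stderr and exits with a non-zero status.
if LC_ALL=C sort -c "$workdir/months" 2>/dev/null; then
  plain_check=sorted
else
  plain_check=unsorted
fi

# With -M the same lines are compared as months, ignoring case and
# anything after the three-letter abbreviation.
if LC_ALL=C sort -M -c "$workdir/months" 2>/dev/null; then
  month_check=sorted
else
  month_check=unsorted
fi

echo "plain: $plain_check, months: $month_check"
rm -rf "$workdir"
```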

Exercise 5: Reducing

Sort can throw away duplicates. This is handy for seeing what data is in use (e.g. if you want unique account numbers, or a list of this session's error messages), and is handier still with a column selection.

  • Compare sort ex05 and sort -u ex05 – what's thrown away?
  • Compare sort -k1,1 ex05 and sort -uk1,1 ex05 – what's lost now?
  • Weird one: Compare sort -M ex04a and sort -Mu ex04a – what month names are kept?
💡
option -u throws away duplicates
'duplicates' depends on the sort
When stacking, u goes at the start and -k goes last because it takes a value (-uk1,1); modifiers such as n or r attach to the end of the key (-k1,1n).
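A sketch of how the meaning of 'duplicate' shifts with the sort. The data is invented: it only resembles ex05 and ex04a as described in the file list. The month example assumes GNU sort under LC_ALL=C.

```shell
# Hypothetical data: a number per line, sometimes with a letter after it.
data=$(printf '%s\n' '2 b' '1' '2 a' '1' '2 b')

# -u on whole lines: only exact duplicate lines collapse.
whole_line=$(printf '%s\n' "$data" | LC_ALL=C sort -u)
echo "$whole_line"

# -u with a key: lines count as duplicates when the KEY matches, so only
# one line per distinct first column survives and the letters can be lost.
by_key_count=$(printf '%s\n' "$data" | LC_ALL=C sort -u -k1,1 | wc -l)

# Under -M, different spellings of the same month are duplicates too.
month_count=$(printf '%s\n' Jan January JAN feb | LC_ALL=C sort -M -u | wc -l)

echo "$by_key_count" "$month_count"
```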

Exercise 6: Problems and avoidances

Use sort ex06 to see a problem. Try sort -n ex06 to avoid it.

Try sort -g ex06b to see how that works...

💡
sort's default is to sort by character.
option -n sorts by value
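The character-versus-value difference is easy to see with a handful of invented numbers (not the actual contents of ex06):

```shell
# Hypothetical numbers of different lengths, with one duplicate.
numbers=$(printf '%s\n' 10 9 100 9)

# Default: compared character by character, so "10" and "100" come before "9".
by_character=$(printf '%s\n' "$numbers" | LC_ALL=C sort)

# -n: compared by numeric value.
by_value=$(printf '%s\n' "$numbers" | LC_ALL=C sort -n)

echo "$by_character"   # 10, 100, 9, 9 on separate lines
echo "$by_value"       # 9, 9, 10, 100 on separate lines
```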
💡
There are other options for other forms, including
* -d dictionary sort, which considers only letters, digits and blanks – good for names e.g. O'Leary and New York.
* -f caseless e.g. a before B before c.
* -g general numeric e.g. 1E-2 is sorted as 0.01
* -h human numeric sorts 1 before 1K before 1G
* -M English month abbreviation sorts jan before feb.
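A few of these options side by side, as a sketch on invented values. It assumes GNU sort, since -g and -h are extensions that may be missing or behave differently elsewhere.

```shell
# -g understands exponent notation; -n stops reading at the "E".
sci=$(printf '%s\n' 2000000 1E6 5)
general=$(printf '%s\n' "$sci" | LC_ALL=C sort -g)
plain_numeric=$(printf '%s\n' "$sci" | LC_ALL=C sort -n)
echo "$general"         # 5, 1E6, 2000000
echo "$plain_numeric"   # 1E6, 5, 2000000 (1E6 is read as 1)

# -h understands size suffixes such as K and G.
human=$(printf '%s\n' 1G 1 1K | LC_ALL=C sort -h)
echo "$human"           # 1, 1K, 1G

# -f folds case; without it, capitals sort ahead of lower case in the C locale.
cased=$(printf '%s\n' c B a | LC_ALL=C sort)
caseless=$(printf '%s\n' c B a | LC_ALL=C sort -f)
echo "$cased"           # B, a, c
echo "$caseless"        # a, B, c
```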

Testing: look out for the 'wrong' sort: it may only be revealed by novel data. Other systems may break when the 'wrong' sort is corrected.

Exercise 7: Sort and merge

Try sort -g ex06 ex01 ex06b
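When given several files, sort treats them as one combined input. The sketch below shows that with two hypothetical files, and also tries the -m option, which merges inputs that are already sorted. Using -m is a suggestion added here, not something the exercise asks for.

```shell
workdir=$(mktemp -d)
printf '%s\n' 30 4 > "$workdir/first"
printf '%s\n' 1E1 2 > "$workdir/second"

# Several input files are read as one combined input and sorted together.
combined=$(LC_ALL=C sort -g "$workdir/first" "$workdir/second")
echo "$combined"   # 2, 4, 1E1, 30 on separate lines

# -m merges inputs that are ALREADY sorted, without re-sorting them.
LC_ALL=C sort -g "$workdir/first" > "$workdir/first.sorted"
LC_ALL=C sort -g "$workdir/second" > "$workdir/second.sorted"
merged=$(LC_ALL=C sort -g -m "$workdir/first.sorted" "$workdir/second.sorted")
echo "$merged"

rm -rf "$workdir"
```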

Sources

Linux sort Command with Examples

Wikipedia sort (Unix)

Man pages

https://ss64.com/bash/sort.html

sort(1) - Linux manual page



James Lyndsay

Getting better at software testing. Singing in Bulgarian. Staying in. Going out. Listening. Talking. Writing. Making.