Debugging Bash scripts

Tips for linting, tracing and debugging

When I first had to debug a Bash script, I was completely lost and wished I had my usual debugging toolbox at hand. Luckily, Bash supports most of the techniques available in other languages. This post is about how to lint, trace and debug shell scripts.

Linting with ShellCheck

In many cases, a seemingly perfect script does not work because of a single missing semicolon or a misplaced space. On top of that, many constructs in Bash have weird edge cases, and some commands might not work the same in all environments.

Linters statically analyze the code to pinpoint many of these situations. There’s a lot to learn from the almost instant feedback they provide.

ShellCheck is such a tool for Bash.

To demonstrate it, let’s see this (broken) script:

#!/bin/bash

var = 42
echo $var

Executing it results in a rather cryptic error: ./awesome.sh: line 3: var: command not found. Bash treats var as the name of a command and passes = and 42 to it as arguments. Let’s see what ShellCheck has to say about it:

» shellcheck awesome.sh

In awesome.sh line 3:
var = 42
    ^-- SC1068: Don't put spaces around the = in assignments (or quote to make it literal).

For more information:
  https://www.shellcheck.net/wiki/SC1068 -- Don't put spaces around the = in ...

Ah, that’s more helpful! ShellCheck has more than 300 rules to catch issues like this, and it even supports autofixes for some of the cases.
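
For reference, here is a sketch of the script with the suggested fix applied. The quotes around $var are my addition and not something SC1068 asks for; ShellCheck has a separate rule (SC2086) that recommends quoting expansions to avoid word splitting and globbing.

```shell
#!/bin/bash

# An assignment must not have spaces around "=".
var=42

# Quoting the expansion prevents word splitting and globbing.
echo "$var"
```

With this version, running shellcheck awesome.sh should no longer report SC1068.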

You can install it locally, or use the official Docker image.

docker run --rm -it -v "$(pwd):/mnt" koalaman/shellcheck *.sh

It is often not obvious which path a script takes, yet to debug it you need to know which parts of the code were actually executed.

One approach is good old print debugging: inserting echo statements all over the code.

This debugging style is straightforward and easy to use. One possible gotcha is that echoing to the standard output inside a function might change the program’s behavior, because callers often capture that output with command substitution.

The error stream is rarely captured like that, so it’s usually a good idea to send debug messages there.

if [[ some_condition ]]; then
   echo "Branch 1 executed" >&2
   ...
else
   echo "Branch 2 executed" >&2
   ...
fi
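
To make the stdout gotcha concrete, here is a minimal sketch. The debug helper and the two price functions are hypothetical names made up for this illustration; the point is only that whatever a function prints to standard output becomes part of the value captured by $(...), while standard error does not.

```shell
#!/bin/bash

# debug: hypothetical helper that writes its arguments to stderr.
debug() {
    echo "DEBUG: $*" >&2
}

# Noisy variant: the debug line goes to stdout and ends up in the result.
price_noisy() {
    echo "Branch 1 executed"
    echo "100"
}

# Quiet variant: the debug line goes to stderr, so only "100" is captured.
price_quiet() {
    debug "Branch 1 executed"
    echo "100"
}

noisy=$(price_noisy)   # two lines: the debug text plus the price
quiet=$(price_quiet)   # just the price

echo "noisy result: $noisy"
echo "quiet result: $quiet"
```

Any arithmetic on the noisy result would fail, which is exactly the kind of behavior change a stray echo can introduce.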

A more robust approach is to use logger to write to the system log.
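
A minimal sketch of what that could look like, assuming a logger implementation that supports the common -t (tag) and -p (priority) options, as the util-linux and BSD versions do. The log_debug function and the tag are hypothetical names; the fallback to stderr is my own addition for machines where logger is missing or cannot reach the system log.

```shell
#!/bin/bash

# Hypothetical tag used to find the messages in the system log later.
LOG_TAG="ice_cream_price"

# log_debug: send a message to the system log with debug priority.
# If logger is unavailable or fails, fall back to printing on stderr.
log_debug() {
    logger -t "$LOG_TAG" -p user.debug "$*" 2>/dev/null \
        || echo "$LOG_TAG: $*" >&2
}

log_debug "Branch 1 executed"
```

Where the messages end up depends on the system: on machines using systemd, journalctl -t ice_cream_price should show them, while others typically write to a file under /var/log.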

Tracing

For more complex scripts, an easier alternative is to enable tracing. When tracing is enabled, each command is printed with its expanded arguments just before it is executed, so the entire flow of the program can be observed.

Let’s see this in action. For the sake of this example, consider the following script that calculates the price of the ice cream. The base price for each portion is 100, but when you order at least 3 portions, you get a discount on odd days.

#!/bin/bash
function calculatePrice() {
    if [[ $numberOfPortions -lt 3 ]]; then
        echo "100"
    else
        day=$(date -d "$D" '+%d')
        if (( $day % 2 )); then
            # Discount on odd days
            echo "80"
        else
            echo "100"
        fi
    fi
}

numberOfPortions=$1
pricePerPortion=$(calculatePrice $numberOfPortions)
totalPrice=$(( $numberOfPortions * $pricePerPortion ))

echo "Total $totalPrice"

The script expects the number of portions as an argument and prints the total price.

» ./ice_cream_price.sh 4
Total 320

Reading through the script to figure out why it produced such a result might not always be easy.

Let’s see how tracing can help better understand this program.

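To enable it, add set -x to the beginning of the script, or simply pass the -x option to Bash. The trace is written to standard error, and the number of leading + signs shows how deeply a command is nested, for example inside a command substitution.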

» bash -x ./ice_cream_price.sh 4
+ numberOfPortions=4
++ calculatePrice 4
++ [[ 4 -lt 3 ]]
+++ date -d '' +%d
++ day=11
++ ((  11 % 2  ))
++ echo 80
+ pricePerPortion=80
+ totalPrice=320
+ echo 'Total 320'
Total 320
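
A trace like this also helps spot a latent problem: the day reaches the arithmetic test exactly as date printed it, and '+%d' pads single-digit days with a zero. On the 8th or 9th the value is 08 or 09, which Bash arithmetic rejects as an invalid octal number. A possible guard, sketched with a hypothetical is_odd_day helper, is to force base 10 with the 10# prefix:

```shell
#!/bin/bash

# is_odd_day: succeeds (exit status 0) when the given day of the month is odd.
# The 10# prefix forces base 10, so zero-padded values such as "08" and "09"
# are not interpreted as (invalid) octal numbers.
is_odd_day() {
    local day=$1
    (( 10#$day % 2 ))
}

if is_odd_day "09"; then
    echo "Discount applies"
else
    echo "No discount"
fi
```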

Tracing can be enabled and disabled at any given line, so it’s possible to reduce the debug output to certain parts.

this_call_is_not_traced
set -x # enable tracing
tricky_function
set +x # disable tracing
this_call_is_not_traced
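
If you tend to forget the matching set +x, one option is to run the interesting call in a subshell, since shell options set there do not leak back into the caller. The traced wrapper and the body of tricky_function below are hypothetical, made up for this sketch:

```shell
#!/bin/bash

# tricky_function: hypothetical stand-in for the code under investigation.
tricky_function() {
    local sum=$(( $1 + $2 ))
    echo "$sum"
}

# traced: hypothetical helper that runs a command with tracing enabled.
# The function body is a subshell, so set -x ends when the subshell exits.
traced() (
    set -x
    "$@"
)

# The trace goes to stderr, so the captured result stays clean.
result=$(traced tricky_function 2 3)
echo "result: $result"
```

The trade-off is that the subshell also discards any variable changes made by the traced command, so this only fits calls whose result is read from their output or exit status.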

It’s also possible to customize the trace messages by setting the PS4 variable.

This allows getting more information. The following example enhances tracing output to include the name of the file, function and line number:

» export PS4='+(${BASH_SOURCE}:${LINENO}): ${FUNCNAME[0]:+${FUNCNAME[0]}(): }'
» bash -x ./ice_cream_price.sh 4
+(./ice_cream_price.sh:19): numberOfPortions=4
++(./ice_cream_price.sh:20): calculatePrice 4
++(./ice_cream_price.sh:6): calculatePrice(): [[ 4 -lt 3 ]]
+++(./ice_cream_price.sh:9): calculatePrice(): date -d '' +%d
++(./ice_cream_price.sh:9): calculatePrice(): day=11
++(./ice_cream_price.sh:10): calculatePrice(): ((  11 % 2  ))
++(./ice_cream_price.sh:14): calculatePrice(): echo 80
+(./ice_cream_price.sh:20): pricePerPortion=80
+(./ice_cream_price.sh:21): totalPrice=320
+(./ice_cream_price.sh:23): echo 'Total 320'
Total 320
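
The same prompt can also be set inside the script. The single quotes matter: they keep the variables unexpanded at assignment time, so Bash evaluates them anew for every traced command. A small sketch, where the greet function is a hypothetical example and the trace is captured into a variable only to make it easy to inspect:

```shell
#!/bin/bash

# Single quotes defer expansion until each trace line is printed.
PS4='+(${BASH_SOURCE}:${LINENO}): ${FUNCNAME[0]:+${FUNCNAME[0]}(): }'

# greet: hypothetical function to have something to trace.
greet() {
    echo "hello $1"
}

# Trace only the call to greet and collect the trace (stderr) in a variable.
trace_output=$( { set -x; greet world >/dev/null; set +x; } 2>&1 )

printf '%s\n' "$trace_output"
```

With double quotes, ${LINENO} would be expanded once at the assignment, and every trace line would show the same line number.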

Breakpoints

Sometimes it’s handy to stop the program at a given point, execute the commands step by step, and see how they behave. Luckily, this is possible in Bash, to some extent.

As with print debugging, you can add extra read commands to the code to pause the script until you press Enter.

command_1

echo "Press enter to continue!" >&2
read

command_2

This might come in handy if you’d like to examine the effects of a script at a given step.
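
To avoid deleting these lines again after every debugging session, the pause can be wrapped in a small function that only stops when asked to. This is a sketch under my own assumptions: the breakpoint function and the BREAKPOINTS variable are hypothetical names, and the helper reads from standard input, so it is not suitable for scripts that consume piped input themselves.

```shell
#!/bin/bash

# breakpoint: hypothetical helper that pauses until Enter is pressed,
# but only when the BREAKPOINTS environment variable is non-empty.
breakpoint() {
    [ -n "${BREAKPOINTS:-}" ] || return 0
    echo "Breakpoint: ${1:-unnamed}. Press enter to continue!" >&2
    read -r _unused
}

echo "step 1 done"
breakpoint "after step 1"
echo "step 2 done"
```

Running the script as usual skips the pause, while BREAKPOINTS=1 ./script.sh stops at every breakpoint call.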

It’s also possible to step through the whole script (or parts of a script) by installing read as a DEBUG trap, which Bash runs before every simple command. It’s best used with tracing to see which commands were executed.

# Enable debugging
set -x
trap read debug

...

# Disable debugging
set +x
trap - debug
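
A variation that does not need tracing is to let the trap itself announce the next command through the BASH_COMMAND variable. In this sketch the step function is a hypothetical handler, and the check for a terminal is my own addition so that non-interactive runs do not block:

```shell
#!/bin/bash

# step: hypothetical DEBUG trap handler. Prints the command that is about
# to run and waits for Enter, but only when stdin is a terminal.
step() {
    echo "next: $1" >&2
    if [[ -t 0 ]]; then
        read -r _unused
    fi
}

# Single quotes: $BASH_COMMAND is expanded each time the trap fires.
trap 'step "$BASH_COMMAND"' DEBUG

numberOfPortions=4
totalPrice=$(( numberOfPortions * 80 ))

# Remove the trap to stop stepping.
trap - DEBUG

echo "Total $totalPrice"
```

Note that a DEBUG trap is not inherited by functions unless function tracing is turned on with set -T, so stepping into a function needs that option as well.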

Coming back to the ice cream example, this is what it looks like to debug the calculatePrice function:

[Image: Debugging in Bash]

Summary

Debugging a Bash script is not an easy task. This language has more rough edges than the others I’ve used, and typically the tooling is just a text editor. In this context, it’s really important to know the tools available to make this challenging task more efficient.

14 April 2020