A Brief Introduction to Bash Programming

DONG Yuxuan @ May 24, 2020 Asia/Shanghai

A very compact note on Bash programming, meant for looking things up or for experienced programmers who want to learn Bash quickly. It assumes that you have a basic understanding of Bash and are familiar with concepts like pipes and redirection; that you are experienced with another programming language such as C or Python; and that you use Bash to operate Unix systems but don’t program with it.

Escaping

Unlike other programming languages, Bash is all about processes, strings, expansions and replacements.

When Bash reads a command from a source file or the keyboard, it splits the command into tokens at blanks. For example, rm a b means: call the rm program and pass it two arguments, a and b. This causes rm to delete two files, a and b. If what we want is to delete a single file named a b, this fails. To delete that file, the blank must be escaped.

\, '...', and "..." are the escaping mechanisms in Bash. Escaping means treating a meta character, such as a blank, as a normal one.

\ escapes the meta character that follows it. "..." escapes most meta characters in ..., except $, `, and \. '...' escapes all meta characters in ....
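The three escaping forms can be compared in a short sketch. count_args and user are hypothetical names introduced only for illustration, and a harmless function stands in for rm so that nothing gets deleted.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo $#; }

count_args a b      # two tokens: prints 2
count_args a\ b     # the backslash escapes the blank: prints 1
count_args 'a b'    # single quotes escape everything: prints 1
count_args "a b"    # double quotes escape the blank too: prints 1

user=alice          # hypothetical variable, shows what $ does inside quotes
echo "$user's file" # $ is still special in "...": prints alice's file
echo '$user'        # nothing is special in '...': prints $user
```

Any of the three one-token forms could be passed to rm to delete the file a b.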

Bash has a large number of meta characters and you can find them and their meanings here.

The important ones here are the blank and newline. The blank separates arguments (tokens) and the newline ends a command.

Since \ can even escape the newline, we often use it to spread a long command over multiple lines.

% rm a b c \
	d e f \
	g h i

Multiple Commands

If we want to execute more than one program in a single command, we can combine them with ;.

% echo deleting...; rm a

This is equivalent to the following.

% echo deleting...
% rm a

We can also introduce conditions here. To delete file b only if file a was removed successfully, write the following.

% rm a && rm b

As in other programming languages, && is the logical operator AND. A command counts as true if and only if its exit code is 0.

There are other logical operators too: || means OR and ! means NOT. Short-circuit evaluation exists in Bash as well. Thus, deleting b only if deleting a failed can be written in the following two forms.

% rm a || rm b
% ! rm a && rm b

Parentheses control precedence as usual. However, placing commands in a pair of parentheses not only gives them higher precedence but also means they are executed in a subshell.
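A minimal sketch of what the subshell means in practice: an assignment made inside parentheses does not reach the parent shell. The variable x is a hypothetical name.

```shell
x=1
( x=2; echo "inside: $x" )   # runs in a subshell: prints inside: 2
echo "outside: $x"           # the parent shell is unaffected: prints outside: 1

# Parentheses still group: the right-hand side runs as one unit.
false || ( true && echo grouped )   # prints grouped
```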

Here Documents/Strings

When using Bash as a programming language, we frequently want to control what goes to the standard input of a command. We can achieve this with here documents.

cat << EOF
Hello, world.
I hate programming with Shell.
Not just Bash.
EOF

The syntax is command << END_TOKEN. As the example above shows, the lines that follow, up to the end token (EOF in the example), become the input of the command.

If you just want to send a short string to the command, a here document can be too heavy. We can use a here string instead.

% cat <<< "HELLO, WORLD"
HELLO, WORLD

A here string uses <<< to replace << and doesn’t need an end token.
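The sketch below captures the output of both forms in variables. It also relies on one behavior not described above: quoting the end token turns off expansion inside the here document. The variable names are hypothetical.

```shell
name=world   # hypothetical variable for illustration

# Unquoted end token: $name is expanded inside the document.
greeting=$(cat << EOF
Hello, $name.
EOF
)

# Quoted end token: the document is taken literally.
literal=$(cat << 'EOF'
Hello, $name.
EOF
)

# A here string feeds a single string to stdin.
upper=$(tr a-z A-Z <<< "hello")

echo "$greeting"   # Hello, world.
echo "$literal"    # Hello, $name.
echo "$upper"      # HELLO
```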

Expansions

After splitting a command line into tokens, Bash expands the tokens according to the following rules.

The above expansions are path expansions. If no files or directories match, the pattern is not expanded and is passed on unchanged.
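A small sketch of matched and unmatched patterns, assuming Bash's default options (nullglob and failglob off). It works in a scratch directory, and the file names are hypothetical.

```shell
dir=$(mktemp -d)                      # scratch directory, removed at the end
touch "$dir/a.c" "$dir/b.c" "$dir/notes.txt"

matched=$(cd "$dir" && echo *.c)      # expands to the matching names
unmatched=$(cd "$dir" && echo *.py)   # nothing matches, so the pattern stays as typed

echo "$matched"                       # a.c b.c
echo "$unmatched"                     # *.py

rm -r "$dir"
```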

The following are non-path expansions.
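A few non-path expansions, sketched with echo so that the expanded tokens are visible. Unlike path expansions, they do not depend on which files exist. This is an illustrative selection, not a complete list.

```shell
echo file{1,2}.txt       # brace expansion: prints file1.txt file2.txt
echo {1..3}              # sequence form of brace expansion: prints 1 2 3
echo "$(echo nested)"    # command substitution: prints nested
echo $(( 2 + 3 * 4 ))    # arithmetic expansion: prints 14
echo ~                   # tilde expansion: prints the current user's home directory
```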

Variables

Use varname=varval (with no blanks around =) to create or overwrite a shell variable. Use $varname or ${varname} to access it.

% name=Yuxuan
% echo $name
Yuxuan

By default, a shell variable is not an environment variable, so it can’t be accessed by subprocesses.

Use export varname to export a shell variable as an environment variable.

export name
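The effect of export can be checked by asking a child process for the variable before and after. This is a sketch: greeting is a hypothetical variable, and ${greeting:-unset} prints the word unset when the variable is missing.

```shell
unset greeting                                  # start from a clean state
greeting=hello                                  # plain shell variable
before=$(bash -c 'echo "${greeting:-unset}"')   # the child process does not see it
export greeting
after=$(bash -c 'echo "${greeting:-unset}"')    # now the child inherits it

echo "$before"   # unset
echo "$after"    # hello
```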

Useful special variables are the following:

When referencing a variable, $VAR and "$VAR" are different. If the value contains blanks, $VAR is expanded to multiple tokens, but "$VAR" is expanded to a single token that contains the blanks.
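The difference becomes visible when each resulting token is printed in angle brackets. The value of file is hypothetical.

```shell
file="my notes.txt"        # hypothetical value containing a blank

printf '<%s>\n' $file      # two tokens: prints <my> and then <notes.txt>
printf '<%s>\n' "$file"    # one token: prints <my notes.txt>
```

Quoting is usually what we want when the value is a file name.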

Strings

As an example, let’s extract the basename of a path without its extension. Bash cannot nest parameter expansions, so an intermediate variable is needed.

% path=/usr/local/test.txt
% base=${path##*/}
% echo ${base%.*}
test
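For reference, a sketch of the prefix and suffix removal operators and the length operator, applied to a hypothetical path. # and ## strip from the front, % and %% strip from the back, and the doubled form removes the longest match.

```shell
path=/usr/local/archive.tar.gz   # hypothetical path

echo "${path##*/}"   # longest prefix matching */ removed: archive.tar.gz
echo "${path#*/}"    # shortest prefix matching */ removed: usr/local/archive.tar.gz
echo "${path%.*}"    # shortest suffix matching .* removed: /usr/local/archive.tar
echo "${path%%.*}"   # longest suffix matching .* removed: /usr/local/archive
echo "${path%/*}"    # directory part: /usr/local
echo "${#path}"      # length in characters: 25
```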

Script Arguments

To handle complex arguments, we can use getopts. See getopts(1).
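A minimal getopts sketch. The function parse_args and the options -v and -o FILE are made up for illustration. In a real script the loop would typically sit at top level and work on the script's own arguments.

```shell
# parse_args is a hypothetical function; -v and -o FILE are made-up options.
parse_args() {
	local opt OPTIND=1          # reset OPTIND so the function can be called again
	verbose=0
	output=
	while getopts "vo:" opt; do # the colon after o means that -o takes a value
		case $opt in
			v ) verbose=1 ;;
			o ) output=$OPTARG ;;
			* ) return 1 ;;
		esac
	done
	shift $(( OPTIND - 1 ))     # drop the options that were parsed
	rest=("$@")                 # what remains are the positional arguments
}

parse_args -v -o out.txt input.txt
echo "$verbose $output ${rest[0]}"   # 1 out.txt input.txt
```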

Arrays

We can define an array using ARRAY=(val0 val1 ... valn) and reference its elements by ${ARRAY[0]}, ${ARRAY[1]}. To reference the whole array, use ${ARRAY[@]}.

ARRAY=(1 2 3 'hello world' 4 5)

# hello world
echo "${ARRAY[3]}"

ARRAY[0]=9

# 9
echo ${ARRAY[0]}

As introduced above, $VAR and "$VAR" are different. ${ARRAY[@]} and "${ARRAY[@]}" differ in the same way.

# countargs prints the number of arguments it received
function countargs() {
	echo $#
}

ARRAY=(1 2 "hello world")

# 4
countargs ${ARRAY[@]}

# 3
countargs "${ARRAY[@]}"

Read from the Standard Input

read [-options] [variable...]

read reads from stdin and stores values into variables.

% read A B
hello world
% echo $A $B
hello world

Use -a to make read store the words in an array.

% read -a names
DONG Yuxuan
% echo ${names[0]} ${names[1]}
DONG Yuxuan

The $IFS variable specifies the delimiters read uses to split the line.
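A sketch of changing the delimiter for a single read. Writing IFS=... in front of the command changes it for that command only. The sample records are hypothetical, and here strings supply the input.

```shell
# Split a colon-separated record into two variables.
IFS=: read user shell <<< "alice:/bin/bash"
echo "$user $shell"      # alice /bin/bash

# With -a, every field lands in one array.
IFS=, read -a fields <<< "a,b,c"
echo "${#fields[@]}"     # 3
```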

Conditions

if commands; then
	commands
[elif commands; then
	commands...]
[else
	commands]
fi

This works like the usual if-else in other languages. If commands exits with status 0, if treats the condition as true.

The ; tells the shell where the condition ends. If then is written on its own line, the ; can be omitted.

if true
then
	echo yes
fi

If you write the whole statement on one line, you need several ;.

if [ $UID = 501 ]; then echo YES; else echo NO; fi

[ ... ] is just a builtin command that tests a conditional expression. If the expression is true, it exits with status 0; otherwise it exits with a non-zero status. This makes it very handy to use with if.

Be careful, there must be spaces after [ and before ].

[ ... ] is just the simplified form of test .... To see its detailed usage, run man test.
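A few commonly used test operators, as a sketch. The operands are hypothetical, and man test lists the full set.

```shell
file=$(mktemp)    # an existing, empty scratch file
empty=""

[ -f "$file" ]  && echo "regular file exists"
[ -s "$file" ]  || echo "file is empty"
[ -z "$empty" ] && echo "string is empty"
[ 3 -lt 10 ]    && echo "3 is less than 10"
[ abc != abd ]  && echo "strings differ"

test -f "$file" && echo "test is the same command as ["
rm -f "$file"
```

Numbers are compared with -eq, -ne, -lt, -le, -gt and -ge, while = and != compare strings.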

Besides [ ... ] and test, Bash supports [[ ... ]]. It works like [ ... ] but also supports regular expression matching with =~.

URL=http://dongyuxuan.me

if [[ $URL =~ ^http[s]?://.+\..+(/.*)*$ ]]
then
	echo It\'s a valid URL.
else
	echo It\'s an invalid URL.
fi

Another frequently used command for building conditions is (( ... )), which evaluates the arithmetic expression .... If the result is 0, it returns a non-zero exit code; otherwise it returns exit code 0.

if (( 100 > 50 ))
then
	echo YES
fi

An assignment can be used in (( ... )).

if (( x = 100 + 2 ))
then
	echo $x
else
	echo zero
fi

As we can see, no $ is needed when referencing variables in (( ... )).
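The inverted relation between the arithmetic result and the exit code can be seen in a short sketch. count is a hypothetical variable.

```shell
if (( 0 )); then echo true; else echo false; fi       # result 0 means failure: prints false
if (( 5 - 2 )); then echo true; else echo false; fi   # non-zero result means success: prints true

count=3
(( count++ ))     # variables are read and written without $
(( count *= 2 ))
echo "$count"     # 8
```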

Besides if, Bash supports case.

case expression in
	pattern )
		commands ;;
	pattern )
		commands ;;
	...
esac

It is just a simplified form of if ... then ... elif ... then ... elif ... then ... ... else ... fi.

case $1 in
	1 )
		echo Monday ;;
	2 )
		echo Tuesday ;;
	3 )
		echo Wednesday ;;
	4 )
		echo Thursday ;;
	5 )
		echo Friday ;;
	6 )
		echo Saturday ;;
	7 )
		echo Sunday ;;
	* )
		echo Other ;;
esac
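Patterns in case are not limited to literal values. They are matched like path patterns, and | separates alternatives. A sketch with a hypothetical function classify and made-up file names:

```shell
# classify is a hypothetical function that guesses a file type from its name.
classify() {
	case $1 in
		*.c | *.h )
			echo "C source" ;;
		*.py )
			echo "Python source" ;;
		[0-9]* )
			echo "starts with a digit" ;;
		* )
			echo "unknown" ;;
	esac
}

classify main.c     # C source
classify 2020.txt   # starts with a digit
classify README     # unknown
```

The first matching pattern wins, so the catch-all * goes last.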

Loops

While Loops

while condition; do
	commands
done

This works like the usual while in common programming languages. If condition is true, commands are executed. The procedure repeats until condition becomes false.

Conditions are written the same way as in if.

; and line wrapping work the same way as in if.
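Two typical uses, as a sketch: counting with an arithmetic condition, and using read as the condition to consume input line by line. The variable names and the numbers are hypothetical.

```shell
# Count down from 3.
i=3
while (( i > 0 )); do
	echo "$i"
	i=$(( i - 1 ))
done

# read fails at end of input, which ends the loop.
total=0
while read -r n; do
	total=$(( total + n ))
done << EOF
1
2
3
EOF
echo "$total"   # 6
```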

Until Loops

until condition; do
	commands
done

until is the negation of while: commands are executed repeatedly until condition becomes true.
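A common use is retrying until a command succeeds. In this sketch, flaky is a hypothetical command that fails twice and then succeeds.

```shell
attempts=0

# flaky is a made-up command: it succeeds from the third call on.
flaky() {
	attempts=$(( attempts + 1 ))
	[ "$attempts" -ge 3 ]
}

until flaky; do
	echo "attempt $attempts failed, retrying"
done
echo "succeeded after $attempts attempts"   # succeeded after 3 attempts
```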

Python-like For Loops

for variable in list
do
	commands
done

This is the Python-like loop. list can be either an expanded array, such as "${ARRAY[@]}", or a literal list.

# print a, b, and c, one letter per line
for i in a b c
do
	echo $i
done

# print all C source files in the current directory
for i in *.c
do
	echo $i
done
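Looping over an array follows the same quoting rule as before: with "${files[@]}" each element is one iteration, even if it contains blanks. The file names are hypothetical.

```shell
files=("report.txt" "my notes.txt" "todo.md")   # hypothetical file names

count=0
for f in "${files[@]}"; do   # quoted: one iteration per element
	echo "$f"
	count=$(( count + 1 ))
done
echo "$count"                # 3
```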

C-like For Loops

for (( expression1; expression2; expression3 )); do
	commands
done

It’s equivalent to the following while loop.

(( expression1 ))
while (( expression2 )); do
	commands
	(( expression3 ))
done
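A concrete instance of the C-like form, summing the numbers from 1 to 5. sum and i are hypothetical names.

```shell
sum=0
for (( i = 1; i <= 5; i++ )); do
	sum=$(( sum + i ))
done
echo "$sum"   # 15
```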

Functions

There are two ways to define a function.

fn() {
	...
}

function fn() {
	...
}

After a function is defined, it can be used as a command. Inside the function, arguments are available as $1, $2, ..., as in a script. The difference between functions and ordinary commands is that a function runs in the same shell process, so shell variables are available to it.

We use the local keyword to distinguish function-local variables from shell-global variables.

foo=1
bar=2

fn() {
	local bar

	foo=3
	bar=4
}

fn

# 3 2
echo $foo $bar

Using the return command, we can exit a function. The value following return becomes the exit code of the function. As with ordinary commands, the exit code is used to judge whether the function executed successfully.

fn() {
	return 5
}

fn

# 5
echo $?
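Because the exit code works like that of any command, a function can be used directly as a condition. The exit code only holds a small integer (0 to 255), so to hand back data a function usually prints it and the caller captures the output. is_even and double are hypothetical functions.

```shell
# is_even returns 0 (success) for even numbers and 1 otherwise.
is_even() {
	if (( $1 % 2 == 0 )); then
		return 0
	fi
	return 1
}

if is_even 4; then echo "4 is even"; fi
if ! is_even 7; then echo "7 is odd"; fi

# double prints its result; the caller captures it with $(...).
double() { echo $(( $1 * 2 )); }
result=$(double 21)
echo "$result"   # 42
```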