Chapter 12: Bioinformatics Shell Scripting, Writing Pipelines, and Parallelizing Tasks
We’ll see how to write rerunnable Bash shell scripts, automate file-processing tasks with find and xargs, and run pipelines in parallel, and we’ll look at a simple makefile.
Basic Bash Scripting
Writing and Running Robust Bash Scripts
1. A robust Bash header
#!/bin/bash
set -e
set -u
set -o pipefail
(1) This is called the shebang, and it indicates the path to the interpreter used to execute this script.
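(2) By default, when a command in a script fails (exits with a nonzero status), the shell simply moves on to the next line. set -e makes the script terminate as soon as a command fails. A nonzero status that serves as a test result rather than signalling a failure does not trigger this: for example, test -d file.txt will return a nonzero exit status if its argument is not a directory, and when it is used as an if condition, set -e ignores that status.
(3) By default, the shell also runs commands that reference unset variables, e.g. rm -rf $TEMP_DIR/. If the shell variable $TEMP_DIR isn’t set, Bash will still substitute its value (which is nothing) in place of it. The end result is rm -rf /. set -u makes Bash treat an unset variable as an error, so commands like this are never executed.
(4) Even with set -e, a pipeline counts as failed only if its last program fails, so errors earlier in the pipe go unnoticed. set -o pipefail makes the pipeline return a nonzero status if any program in it fails, which, together with set -e, terminates the script.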
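The effect of each option can be checked with a small experiment. The following is a minimal sketch, not part of the original notes: each case runs in its own child Bash process so that a failure there cannot stop the demonstration itself, and demo_tmp_dir is a hypothetical variable name.

```shell
#!/bin/bash
# Each case runs in a child Bash process (bash -c '...') so that a
# failure inside it cannot terminate this demonstration script.

# set -e: the child stops at the failing command, so "reached" is never printed.
out_e=$(bash -c 'set -e; false; echo "reached"' 2>/dev/null) || true
echo "set -e: child printed '${out_e}'"

# set -u: expanding an unset variable is an error, so the echo never runs.
# demo_tmp_dir is a hypothetical name, explicitly unset for the demo.
status_u=0
bash -c 'set -u; unset demo_tmp_dir; echo "would remove: $demo_tmp_dir/"' 2>/dev/null || status_u=$?
echo "set -u: child exited with status ${status_u}"

# Without pipefail, the pipeline's status is that of its last command (true).
status_plain=0
bash -c 'false | true' || status_plain=$?
echo "without pipefail: status ${status_plain}"

# With pipefail, the failure of false makes the whole pipeline fail.
status_pipefail=0
bash -c 'set -o pipefail; false | true' || status_pipefail=$?
echo "with pipefail: status ${status_pipefail}"
```

The pipeline without pipefail reports status 0 even though its first command failed, which is exactly the silent failure that option (4) guards against.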
2. Running Bash scripts
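There are two ways to run a Bash script:
bash script.sh: runs the script with Bash; this works for any script, as long as you have read permission on it
./script.sh: runs the script as a program, using the interpreter named in its shebang line, which might be something other than /bin/bash (e.g., zsh, csh, etc.)
Running a script as a program requires executable permission, which you set with chmod: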
$ chmod u+x script.sh
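As a rough illustration of the two ways of running a script, the sketch below creates a throwaway script and runs it both ways. The name hello.sh and the temporary directory are assumptions made for the demo only; it also assumes the temporary directory allows executing files.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing in the current directory is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# hello.sh is a hypothetical script used only for this demo.
printf '%s\n' '#!/bin/bash' 'echo "hello from $0"' > hello.sh

# Way 1: hand the script to bash; read permission is enough.
out_bash=$(bash hello.sh)
echo "$out_bash"

# Way 2: run it as a program; the executable bit must be set first.
chmod u+x hello.sh
out_direct=$(./hello.sh)
echo "$out_direct"
```

In the second case the kernel reads the shebang line to decide which interpreter to start, which is why the shebang matters only when the script is run as ./hello.sh.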
Variables and Command Arguments
$ results_dir="results/"
$ echo $results_dir
results/
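Wrap the variable name in braces, ${variable}, when the text that follows could be read as part of the name; without the braces, Bash would look up a variable called sample_aln here: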
sample="CNTRL01A"
mkdir ${sample}_aln/
Quoting variables: double quotes prevent the shell from splitting the value on spaces or interpreting other special characters in it
sample="CNTRL01A"
mkdir "${sample}_aln/"
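To see what quoting changes, the following sketch counts how many arguments a command actually receives. count_args is a hypothetical helper, and the sample name with a space in it is made up for the demo.

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

sample="CNTRL 01A"   # made-up sample name containing a space

# Unquoted: the shell splits the value at the space, giving two arguments.
count_args ${sample}_aln/

# Quoted: the value stays together as a single argument.
count_args "${sample}_aln/"
```

With mkdir in place of count_args, the unquoted form would create two directories (CNTRL and 01A_aln/) instead of the one that was intended.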
1. Command-line arguments
#!/bin/bash
echo "script name: $0"
echo "first arg: $1"
echo "second arg: $2"
echo "third arg: $3"
$ bash args.sh arg1 arg2 arg3
script name: args.sh
first arg: arg1
second arg: arg2
third arg: arg3
Bash assigns the number of command-line arguments to the variable $# (this does not count the script name, $0, as an argument). This is useful for user-friendly messages:
#!/bin/bash
if [ "$#" -lt 3 ] # are there fewer than 3 arguments?
then
echo "error: too few arguments, you provided $#, 3 required"
echo "usage: script.sh arg1 arg2 arg3"
exit 1
fi
echo "script name: $0"
echo "first arg: $1"
echo "second arg: $2"
echo "third arg: $3"
Running this with too few arguments prints an error message and makes the script exit with a nonzero exit status:
$ ./script.sh some_arg
error: too few arguments, you provided 1, 3 required
usage: script.sh arg1 arg2 arg3
Conditionals in a Bash Script: if Statements
What makes Bash a bit unique is that a command’s exit status provides the true and false values (remember: contrary to other languages, 0 represents true/success and anything else is false/failure). The general syntax is:
if [commands]
then
[if-statements]
else
[else-statements]
fi
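As a concrete instance of this syntax, the sketch below uses grep's exit status directly as the condition. has_pattern is a hypothetical helper and the file contents are made up; grep -q prints nothing and only reports, through its exit status, whether a match was found.

```shell
#!/bin/bash
# has_pattern is a hypothetical helper: it reports whether a pattern occurs in a file.
has_pattern() {
    local pattern="$1" file="$2"
    # grep -q exits with 0 (true) on a match and nonzero (false) otherwise,
    # so the command itself serves as the condition.
    if grep -q "$pattern" "$file"
    then
        echo "found '$pattern' in $file"
    else
        echo "'$pattern' not found in $file"
    fi
}

# Made-up input file for the demo.
demo_file=$(mktemp)
printf 'ATGC\nTTAA\n' > "$demo_file"

has_pattern "TTAA" "$demo_file"   # takes the then branch
has_pattern "GGGG" "$demo_file"   # takes the else branch
```

No brackets are needed around the command: the [ in if [ ... ] is itself just a command whose exit status is tested in the same way.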
Detailed explanation (in Chinese): https://blog.youkuaiyun.com/zhan570556752/article/details/80399154