Error handling in Bash

Posted 2019-01-01 14:28

Question:

What is your favorite method to handle errors in Bash? The best example of handling errors I have found on the web was written by William Shotts, Jr at http://www.linuxcommand.org.

He suggests using the following function for error handling in Bash:

#!/bin/bash

# A slicker error handling routine

# I put a variable in my scripts named PROGNAME which
# holds the name of the program being run.  You can get this
# value from the first item on the command line ($0).

# Reference: This was copied from <http://www.linuxcommand.org/wss0150.php>

PROGNAME=$(basename "$0")

function error_exit
{

#   ----------------------------------------------------------------
#   Function for exit due to fatal program error
#       Accepts 1 argument:
#           string containing descriptive error message
#   ---------------------------------------------------------------- 

    echo "${PROGNAME}: ${1:-"Unknown Error"}" 1>&2
    exit 1
}

# Example call of the error_exit function.  Note the inclusion
# of the LINENO shell variable.  It contains the current
# line number.

echo "Example of error with line number and message"
error_exit "$LINENO: An error has occurred."

Do you have a better error handling routine that you use in Bash scripts?

Answer 1:

Use a trap!

tempfiles=( )
cleanup() {
  rm -f "${tempfiles[@]}"
}
trap cleanup 0

error() {
  local parent_lineno="$1"
  local message="$2"
  local code="${3:-1}"
  if [[ -n "$message" ]] ; then
    echo "Error on or near line ${parent_lineno}: ${message}; exiting with status ${code}"
  else
    echo "Error on or near line ${parent_lineno}; exiting with status ${code}"
  fi
  exit "${code}"
}
trap 'error ${LINENO}' ERR

...then, whenever you create a temporary file:

temp_foo="$(mktemp -t foobar.XXXXXX)"
tempfiles+=( "$temp_foo" )

$temp_foo will then be deleted on exit, and on an error the current line number will be printed. (set -e likewise gives you exit-on-error behavior, though it comes with serious caveats and weakens the code's predictability and portability.)

You can either let the trap call error for you (in which case it uses the default exit code of 1 and no message) or call it yourself and provide explicit values; for instance:

error ${LINENO} "the foobar failed" 2

will exit with status 2, and give an explicit message.
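To see both traps cooperate, here is a minimal sketch of the same pattern. It assumes nothing beyond bash and mktemp; the heredoc wrapper, the exit status 3 and the variable names are only illustrative. The pattern runs in a child bash so that its exit does not end the demo itself.

```bash
#!/usr/bin/env bash
# The trap pattern runs in a child bash; the parent only inspects the result.
status=0
tmp_path=$(bash <<'EOF'
tempfiles=()
cleanup() { rm -f "${tempfiles[@]}"; }
trap cleanup EXIT
trap 'echo "error on or near line ${LINENO}" >&2; exit 3' ERR

tmp=$(mktemp)
tempfiles+=( "$tmp" )
echo "$tmp"          # tell the parent which file was created
false                # a failing command fires the ERR trap
echo "not reached"
EOF
) || status=$?

echo "child exit status: $status"
if [ -e "$tmp_path" ]; then
    echo "temp file still exists"
else
    echo "temp file was removed"
fi
```

The ERR trap calls exit, and that exit is what triggers the EXIT trap, so cleanup runs on the error path as well as on a normal exit.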



Answer 2:

That's a fine solution. I just wanted to add

set -e

as a rudimentary error mechanism. It will immediately stop your script if a simple command fails. I think this should have been the default behavior: since such errors almost always signify something unexpected, it is not really 'sane' to keep executing the following commands.
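A quick way to get a feel for what set -e does and does not stop is to try it in throwaway child shells. This is just a sketch; the three cases are my own picks, not an exhaustive list of the caveats.

```bash
#!/usr/bin/env bash
# Each case runs in a child bash so this script itself keeps going.

# Case 1: a failing simple command aborts the script; "after" is never printed.
status1=0
out1=$(bash -c 'set -e; false; echo after') || status1=$?

# Case 2: a failure tested by `if` is exempt from errexit.
status2=0
out2=$(bash -c 'set -e; if false; then :; fi; echo after') || status2=$?

# Case 3: inside a function called on the left of ||, errexit is ignored too.
status3=0
out3=$(bash -c 'set -e; f() { false; echo "still running"; }; f || true; echo after') || status3=$?

printf 'case 1: status=%s output=[%s]\n' "$status1" "$out1"
printf 'case 2: status=%s output=[%s]\n' "$status2" "$out2"
printf 'case 3: status=%s output=[%s]\n' "$status3" "$out3"
```

Case 3 is the one that tends to surprise people: the failing command inside f does not stop anything, because the whole call sits in a || list.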



Answer 3:

Reading all the answers on this page inspired me a lot.

So, here's my hint:

file content: lib.trap.sh

lib_name='trap'
lib_version=20121026

stderr_log="/dev/shm/stderr.log"

#
# TO BE SOURCED ONLY ONCE:
#
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~##

if test "${g_libs[$lib_name]+_}"; then
    return 0
else
    if test ${#g_libs[@]} == 0; then
        declare -A g_libs
    fi
    g_libs[$lib_name]=$lib_version
fi


#
# MAIN CODE:
#
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~##

set -o pipefail  # trace ERR through pipes
set -o errtrace  # trace ERR through 'time command' and other functions
set -o nounset   ## set -u : exit the script if you try to use an uninitialised variable
set -o errexit   ## set -e : exit the script if any statement returns a non-true return value

exec 2>"$stderr_log"


###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~##
#
# FUNCTION: EXIT_HANDLER
#
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~##

function exit_handler ()
{
    local error_code="$?"

    test $error_code == 0 && return;

    #
    # LOCAL VARIABLES:
    # ------------------------------------------------------------------
    #
    local i=0
    local regex=''
    local mem=''
    local stderr=''                 # declared up front so nounset cannot trip on it

    local error_file=''
    local error_lineno=''
    local error_message='unknown'

    local lineno=''


    #
    # PRINT THE HEADER:
    # ------------------------------------------------------------------
    #
    # Color the output if it's an interactive terminal
    test -t 1 && { tput bold; tput setf 4; }                            ## red bold
    echo -e "\n(!) EXIT HANDLER:\n"


    #
    # GETTING LAST ERROR OCCURRED:
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #

    #
    # Read last line from the error log
    # ------------------------------------------------------------------
    #
    if test -f "$stderr_log"
        then
            stderr=$( tail -n 1 "$stderr_log" )
            rm "$stderr_log"
    fi

    #
    # Managing the line to extract information:
    # ------------------------------------------------------------------
    #

    if test -n "$stderr"
        then
            # Exploding stderr on :
            mem="$IFS"
            local shrunk_stderr=$( echo "$stderr" | sed 's/\: /\:/g' )
            IFS=':'
            local stderr_parts=( $shrunk_stderr )
            IFS="$mem"

            # Storing information on the error
            error_file="${stderr_parts[0]}"
            error_lineno="${stderr_parts[1]}"
            error_message=""

            for (( i = 3; i <= ${#stderr_parts[@]}; i++ ))
                do
                    error_message="$error_message "${stderr_parts[$i-1]}": "
            done

            # Removing last ':' (colon character)
            error_message="${error_message%:*}"

            # Trim
            error_message="$( echo "$error_message" | sed -e 's/^[ \t]*//' | sed -e 's/[ \t]*$//' )"
    fi

    #
    # GETTING BACKTRACE:
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #
    _backtrace=$( backtrace 2 )


    #
    # MANAGING THE OUTPUT:
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #

    local lineno=""
    regex='^([a-z]{1,}) ([0-9]{1,})$'

    if [[ $error_lineno =~ $regex ]]

        # The error line was found on the log
        # (e.g. type 'ff' without quotes wherever)
        # --------------------------------------------------------------
        then
            local row="${BASH_REMATCH[1]}"
            lineno="${BASH_REMATCH[2]}"

            echo -e "FILE:\t\t${error_file}"
            echo -e "${row^^}:\t\t${lineno}\n"

            echo -e "ERROR CODE:\t${error_code}"
            test -t 1 && tput setf 6                                    ## white yellow
            echo -e "ERROR MESSAGE:\n$error_message"


        else
            regex="^${error_file}\$|^${error_file}\s+|\s+${error_file}\s+|\s+${error_file}\$"
            if [[ "$_backtrace" =~ $regex ]]

                # The file was found on the log but not the error line
                # (could not reproduce this case so far)
                # ------------------------------------------------------
                then
                    echo -e "FILE:\t\t$error_file"
                    echo -e "ROW:\t\tunknown\n"

                    echo -e "ERROR CODE:\t${error_code}"
                    test -t 1 && tput setf 6                            ## white yellow
                    echo -e "ERROR MESSAGE:\n${stderr}"

                # Neither the error line nor the error file was found on the log
                # (e.g. type 'cp ffd fdf' without quotes wherever)
                # ------------------------------------------------------
                else
                    #
                    # The error file is the first on backtrace list:

                    # Exploding backtrace on newlines
                    mem=$IFS
                    IFS=$'\n'
                    local lines=( $_backtrace )

                    IFS=$mem

                    error_file=""

                    if test -n "${lines[1]}"
                        then
                            array=( ${lines[1]} )

                            for (( i=2; i<${#array[@]}; i++ ))
                                do
                                    error_file="$error_file ${array[$i]}"
                            done

                            # Trim
                            error_file="$( echo "$error_file" | sed -e 's/^[ \t]*//' | sed -e 's/[ \t]*$//' )"
                    fi

                    echo -e "FILE:\t\t$error_file"
                    echo -e "ROW:\t\tunknown\n"

                    echo -e "ERROR CODE:\t${error_code}"
                    test -t 1 && tput setf 6                            ## white yellow
                    if test -n "${stderr}"
                        then
                            echo -e "ERROR MESSAGE:\n${stderr}"
                        else
                            echo -e "ERROR MESSAGE:\n${error_message}"
                    fi
            fi
    fi

    #
    # PRINTING THE BACKTRACE:
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #

    test -t 1 && tput setf 7                                            ## white bold
    echo -e "\n$_backtrace\n"

    #
    # EXITING:
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #

    test -t 1 && tput setf 4                                            ## red bold
    echo "Exiting!"

    test -t 1 && tput sgr0 # Reset terminal

    exit "$error_code"
}
trap exit_handler EXIT                                                  # ! ! ! TRAP EXIT ! ! !
trap exit ERR                                                           # ! ! ! TRAP ERR ! ! !


###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~##
#
# FUNCTION: BACKTRACE
#
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~##

function backtrace
{
    local _start_from_=0

    local params=( "$@" )
    if (( "${#params[@]}" >= "1" ))
        then
            _start_from_="$1"
    fi

    local i=0
    local first=false
    while caller $i > /dev/null
    do
        if test -n "$_start_from_" && (( "$i" + 1   >= "$_start_from_" ))
            then
                if test "$first" == false
                    then
                        echo "BACKTRACE IS:"
                        first=true
                fi
                caller $i
        fi
        let "i=i+1"
    done
    done
}

return 0



Example of usage:
file content: trap-test.sh

#!/bin/bash

source 'lib.trap.sh'

echo "doing something wrong now .."
echo "$foo"

exit 0


Running:

bash trap-test.sh

Output:

doing something wrong now ..

(!) EXIT HANDLER:

FILE:       trap-test.sh
LINE:       6

ERROR CODE: 1
ERROR MESSAGE:
foo:   unassigned variable

BACKTRACE IS:
1 main trap-test.sh

Exiting!


In a terminal, the output is colored, and the error message appears in the language of the current locale.



Answer 4:

An equivalent alternative to "set -e" is

set -o errexit

It makes the meaning of the flag somewhat clearer than just "-e".

Random addition: to temporarily disable the flag, and return to the default (of continuing execution regardless of exit codes), just use

set +e
echo "commands run here returning non-zero exit codes will not cause the entire script to fail"
echo "false returns 1 as an exit code"
false
set -e

This precludes proper error handling mentioned in other responses, but is quick & effective (just like bash).
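If the point of switching the flag off is to look at one command's exit status, here is a small sketch of two ways to get at it. The variable names are made up for the example; false stands in for whatever command you expect to fail.

```bash
#!/usr/bin/env bash
set -e

# Approach from above: switch errexit off around the risky command.
set +e
false
status_toggle=$?     # must be read immediately after the command
set -e

# Alternative: a command followed by || is exempt from errexit,
# so the flag can stay on the whole time.
status_or=0
false || status_or=$?

echo "toggle: $status_toggle, or-list: $status_or"
```

Both variables end up holding 1. The second form avoids the risk of forgetting to switch the flag back on.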



Answer 5:

Inspired by the ideas presented here, I have developed a readable and convenient way to handle errors in bash scripts in my bash boilerplate project.

By simply sourcing the library, you get the following out of the box: it halts execution on any error, as if using set -e, thanks to a trap on ERR and some bash-fu.

There are some extra features that help handle errors, such as try and catch, or the throw keyword, that allows you to break execution at a point to see the backtrace. Plus, if the terminal supports it, it spits out powerline emojis, colors parts of the output for great readability, and underlines the method that caused the exception in the context of the line of code.

The downside is that it's not portable: the code works in bash only, probably >= 4 (though I'd imagine it could be ported to bash 3 with some effort).

The code is separated into multiple files for better handling, but I was inspired by the backtrace idea from the answer above by Luca Borrione.

To read more or take a look at the source, see GitHub:

https://github.com/niieani/bash-oo-framework#error-handling-with-exceptions-and-throw



Answer 6:

I prefer something really easy to call. So I use something that looks a little complicated, but is easy to use. I usually just copy-and-paste the code below into my scripts. An explanation follows the code.

# This function is used to cleanly exit any script. It does this by displaying a
# given error message and exiting with an error code.
function error_exit {
    echo
    echo "$@"
    exit 1
}
# Trap the killer signals so that we can exit with a good message.
trap "error_exit 'Received signal SIGHUP'" SIGHUP
trap "error_exit 'Received signal SIGINT'" SIGINT
trap "error_exit 'Received signal SIGTERM'" SIGTERM

# Alias the function so that it will print a message with the following format:
# prog-name(@line#): message
# We have to explicitly allow aliases; we do this because they make calling the
# function much easier (see example).
shopt -s expand_aliases
alias die='error_exit "Error ${0}(@`echo $(( $LINENO - 1 ))`):"'

I usually put a call to the cleanup function inside the error_exit function, but this varies from script to script so I left it out. The traps catch the common terminating signals and make sure everything gets cleaned up. The alias is what does the real magic. I like to check everything for failure, so in general I call programs in an "if !" type statement. By subtracting 1 from the line number, the alias tells me where the failure occurred. It is also dead simple to call, and pretty much idiot proof. Below is an example (just replace /bin/false with whatever you are going to call).

# This is an example usage; it will print out
# Error prog-name(@1): Who knew false is false.
if ! /bin/false ; then
    die "Who knew false is false."
fi


Answer 7:

Another consideration is the exit code to return. Just "1" is pretty standard, although there are a handful of reserved exit codes that bash itself uses, and the page that documents them argues that user-defined codes should be in the range 64-113 to conform to C/C++ standards.

You might also consider the bit vector approach that mount uses for its exit codes:

 0  success
 1  incorrect invocation or permissions
 2  system error (out of memory, cannot fork, no more loop devices)
 4  internal mount bug or missing nfs support in mount
 8  user interrupt
16  problems writing or locking /etc/mtab
32  mount failure
64  some mount succeeded

OR-ing the codes together allows your script to signal multiple simultaneous errors.
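As a rough sketch of how that could look in a script of your own (the flag names and values below are hypothetical, not mount's, and true/false stand in for real steps):

```bash
#!/usr/bin/env bash
# Hypothetical flags for illustration: powers of two, so they can be combined.
E_CONFIG=1
E_NETWORK=2
E_DISK=4

status=0
# Stand-in steps: replace `true` / `false` with real commands.
true  || status=$(( status | E_CONFIG ))
false || status=$(( status | E_NETWORK ))
false || status=$(( status | E_DISK ))

echo "combined status: $status"

# A caller can test individual bits of the combined status.
if [ $(( status & E_NETWORK )) -ne 0 ]; then echo "network step failed"; fi
if [ $(( status & E_CONFIG ))  -ne 0 ]; then echo "config step failed";  fi
```

Here the combined status is 6 (2 | 4), which a real script would hand to exit. Exit statuses only range over 0-255, so this scheme leaves room for at most eight independent flags.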



Answer 8:

I use the following trap code; it also allows errors to be traced through pipes and 'time' commands:

#!/bin/bash
set -o pipefail  # trace ERR through pipes
set -o errtrace  # trace ERR through 'time command' and other functions
function error() {
    JOB="$0"              # job name
    LASTLINE="$1"         # line of error occurrence
    LASTERR="$2"          # error code
    echo "ERROR in ${JOB} : line ${LASTLINE} with exit code ${LASTERR}"
    exit 1
}
trap 'error ${LINENO} ${?}' ERR
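To see what the two options change, here is a small sketch that compares behavior with and without each of them in child shells. The function f and the "trapped" message are only there for the demonstration.

```bash
#!/usr/bin/env bash
# pipefail: without it, a pipeline reports only the status of its last command.
plain=0
bash -c 'false | true' || plain=$?
strict=0
bash -c 'set -o pipefail; false | true' || strict=$?
echo "without pipefail: $plain, with pipefail: $strict"

# errtrace: without it, an ERR trap is not inherited by shell functions.
# f ends with `true`, so the failure inside it is invisible to the caller.
no_errtrace=$(bash -c 'trap "echo trapped" ERR; f() { false; true; }; f')
with_errtrace=$(bash -c 'set -o errtrace; trap "echo trapped" ERR; f() { false; true; }; f')
echo "without errtrace: [$no_errtrace], with errtrace: [$with_errtrace]"
```

Only the errtrace variant prints "trapped", because the failing command sits inside a function that itself returns success.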


Answer 9:

I've used

die() {
        echo "$1"
        kill $$
}

before; I think because 'exit' was failing for me for some reason. The above defaults seem like a good idea, though.
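One common way for exit to appear to fail (a guess on my part, I can't know whether it is what happened here) is calling it from a subshell: it then ends only the subshell, and the script carries on. A minimal sketch:

```bash
#!/usr/bin/env bash
die() { echo "$1" >&2; exit 1; }

# Inside a command substitution, exit ends only the substitution's subshell.
sub_status=0
value=$(die "inside command substitution") || sub_status=$?
echo "still running after substitution, status $sub_status"

# In bash, each part of a pipeline also runs in its own subshell by default.
pipe_status=0
echo data | die "inside pipeline" || pipe_status=$?
echo "still running after pipeline, status $pipe_status"
```

kill $$ gets around this because $$ keeps the process ID of the main shell even inside subshells, at the price of the script dying from a signal rather than exiting with a status of your choice.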



Answer 10:

This has served me well for a while now. It prints error or warning messages in red, one line per parameter, and allows an optional exit code.

# Custom errors
EX_UNKNOWN=1

warning()
{
    # Output warning messages
    # Color the output red if it's an interactive terminal
    # @param $1...: Messages

    test -t 1 && tput setf 4

    printf '%s\n' "$@" >&2

    test -t 1 && tput sgr0 # Reset terminal
    true
}

error()
{
    # Output error messages with optional exit code
    # @param $1...: Messages
    # @param $N: Exit code (optional)

    messages=( "$@" )

    # If the last parameter is a number, it's not part of the messages
    last_parameter="${messages[@]: -1}"
    if [[ "$last_parameter" =~ ^[0-9]+$ ]]
    then
        exit_code=$last_parameter
        unset "messages[$((${#messages[@]} - 1))]"
    fi

    warning "${messages[@]}"

    exit ${exit_code:-$EX_UNKNOWN}
}


Answer 11:

Not sure if this will be helpful to you, but I modified some of the suggested functions here so that the check for the error (the exit code of the prior command) happens inside the function. On each "check" I also pass the "message" describing the error as a parameter, for logging purposes.

#!/bin/bash

error_exit()
{
    if [ "$?" != "0" ]; then
        log.sh "$1"
        exit 1
    fi
}

Now to call it within the same script (or in another one if I use export -f error_exit) I simply write the name of the function and pass a message as parameter, like this:

#!/bin/bash

cd /home/myuser/afolder
error_exit "Unable to switch to folder"

rm *
error_exit "Unable to delete all files"

Using this, I was able to create a really robust bash file for an automated process: it stops in case of errors and notifies me (log.sh does that).



Answer 12:

This function has been serving me rather well recently:

action () {
    # Test if the first parameter is non-zero
    # and return straight away if so
    if test "$1" -ne 0
    then
        return "$1"
    fi

    # Discard the control parameter
    # and execute the rest
    shift 1
    "$@"
    local status=$?

    # Test the exit status of the command run
    # and display an error message on failure
    if test ${status} -ne 0
    then
        echo "Command \"$*\" failed" >&2
    fi

    return ${status}
}

You call it by passing 0 or the last return value ($?) as the first argument, followed by the command to run, so you can chain commands without having to check for error values. With this, this statement block:

command1 param1 param2 param3...
command2 param1 param2 param3...
command3 param1 param2 param3...
command4 param1 param2 param3...
command5 param1 param2 param3...
command6 param1 param2 param3...

Becomes this:

action 0 command1 param1 param2 param3...
action $? command2 param1 param2 param3...
action $? command3 param1 param2 param3...
action $? command4 param1 param2 param3...
action $? command5 param1 param2 param3...
action $? command6 param1 param2 param3...

<<<Error-handling code here>>>

If any of the commands fails, the error code is simply passed to the end of the block. I find it useful when you don't want subsequent commands to execute if an earlier one failed, but you also don't want the script to exit straight away (for example, inside a loop).
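Here is a small runnable sketch of such a chain. It carries a compact copy of action so it runs on its own; step and chain are hypothetical helpers that exist only for the demonstration.

```bash
#!/usr/bin/env bash
# Compact copy of the action function so this sketch is self-contained.
action () {
    if test "$1" -ne 0; then return "$1"; fi
    shift
    "$@"
    local status=$?
    if test "$status" -ne 0; then echo "Command \"$*\" failed" >&2; fi
    return "$status"
}

# Hypothetical helper: record that a step ran, then return the given status.
log=""
step() { log="$log $1"; return "$2"; }

chain() {
    action 0  step one 0
    action $? step two 3       # fails with status 3
    action $? step three 0     # skipped, because step two failed
}

status=0
chain || status=$?
echo "ran:$log, final status: $status"
```

Step three never runs, and the status of the failed step comes out at the end of the chain (here: "ran: one two, final status: 3").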



Answer 13:

This trick is useful for missing commands or functions. The name of the missing function (or executable) will be passed in $_

function handle_error {
    status=$?
    last_call=$1

    # 127 is 'command not found'
    (( status != 127 )) && return

    echo "you tried to call $last_call"
    return
}

# Trap errors.
trap 'handle_error "$_"' ERR


Answer 14:

Using trap is not always an option. For example, if you're writing some kind of reusable function that needs error handling and that can be called from any script (after sourcing the file with helper functions), that function cannot assume anything about the exit time of the outer script, which makes using traps very difficult. Another disadvantage of traps is poor composability, as you risk overwriting a previous trap that might have been set earlier in the caller chain.

There is a little trick that can be used to do proper error handling without traps. As you may already know from other answers, set -e doesn't work inside commands if you use the || operator after them, even if you run them in a subshell; e.g., this wouldn't work:

#!/bin/sh

# prints:
#
# --> outer
# --> inner
# ./so_1.sh: line 16: some_failed_command: command not found
# <-- inner
# <-- outer

set -e

outer() {
  echo '--> outer'
  (inner) || {
    exit_code=$?
    echo '--> cleanup'
    return $exit_code
  }
  echo '<-- outer'
}

inner() {
  set -e
  echo '--> inner'
  some_failed_command
  echo '<-- inner'
}

outer

But the || operator is needed to prevent returning from the outer function before cleanup. The trick is to run the inner command in the background, and then immediately wait for it. The wait builtin will return the exit code of the inner command, and now you're using || after wait, not the inner function, so set -e works properly inside the latter:

#!/bin/sh

# prints:
#
# --> outer
# --> inner
# ./so_2.sh: line 27: some_failed_command: command not found
# --> cleanup

set -e

outer() {
  echo '--> outer'
  inner &
  wait $! || {
    exit_code=$?
    echo '--> cleanup'
    return $exit_code
  }
  echo '<-- outer'
}

inner() {
  set -e
  echo '--> inner'
  some_failed_command
  echo '<-- inner'
}

outer

Here is the generic function that builds upon this idea. It should work in all POSIX-compatible shells if you remove local keywords, i.e. replace all local x=y with just x=y:

# [CLEANUP=cleanup_cmd] run cmd [args...]
#
# `cmd` and `args...` A command to run and its arguments.
#
# `cleanup_cmd` A command that is called after cmd has exited,
# and gets passed the same arguments as cmd. Additionally, the
# following environment variables are available to that command:
#
# - `RUN_CMD` contains the `cmd` that was passed to `run`;
# - `RUN_EXIT_CODE` contains the exit code of the command.
#
# If `cleanup_cmd` is set, `run` will return the exit code of that
# command. Otherwise, it will return the exit code of `cmd`.
#
run() {
  local cmd="$1"; shift
  local exit_code=0

  local e_was_set=1
  if ! is_shell_attribute_set e; then
    set -e
    e_was_set=0
  fi

  "$cmd" "$@" &

  wait $! || {
    exit_code=$?
  }

  if [ "$e_was_set" = 0 ] && is_shell_attribute_set e; then
    set +e
  fi

  if [ -n "$CLEANUP" ]; then
    RUN_CMD="$cmd" RUN_EXIT_CODE="$exit_code" "$CLEANUP" "$@"
    return $?
  fi

  return $exit_code
}


is_shell_attribute_set() { # attribute, like "x"
  case "$-" in
    *"$1"*) return 0 ;;
    *)    return 1 ;;
  esac
}

Example of usage:

#!/bin/sh
set -e

# Source the file with the definition of `run` (previous code snippet).
# Alternatively, you may paste that code directly here and comment the next line.
. ./utils.sh


main() {
  echo "--> main: $@"
  CLEANUP=cleanup run inner "$@"
  echo "<-- main"
}


inner() {
  echo "--> inner: $@"
  sleep 0.5
  if [ "$1" = 'fail' ]; then
    oh_my_god_look_at_this
  fi
  echo "<-- inner"
}


cleanup() {
  echo "--> cleanup: $@"
  echo "    RUN_CMD = '$RUN_CMD'"
  echo "    RUN_EXIT_CODE = $RUN_EXIT_CODE"
  sleep 0.3
  echo '<-- cleanup'
  return $RUN_EXIT_CODE
}

main "$@"

Running the example:

$ ./so_3 fail; echo "exit code: $?"

--> main: fail
--> inner: fail
./so_3: line 15: oh_my_god_look_at_this: command not found
--> cleanup: fail
    RUN_CMD = 'inner'
    RUN_EXIT_CODE = 127
<-- cleanup
exit code: 127

$ ./so_3 pass; echo "exit code: $?"

--> main: pass
--> inner: pass
<-- inner
--> cleanup: pass
    RUN_CMD = 'inner'
    RUN_EXIT_CODE = 0
<-- cleanup
<-- main
exit code: 0

The only thing you need to be aware of when using this method is that any modifications of shell variables made by the command you pass to run will not propagate to the calling function, because the command runs in a subshell.
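That last caveat is easy to see in isolation. A minimal sketch, with counter and bump as made-up names:

```bash
#!/usr/bin/env bash
counter=0
bump() { counter=$(( counter + 1 )); echo "inside: counter=$counter"; }

bump &               # runs in a background subshell, the way run starts its command
wait $!
after_bg=$counter
echo "after background call: counter=$after_bg"   # the parent's copy is unchanged

bump                 # a direct call runs in the current shell
echo "after direct call: counter=$counter"
```

The background call sees and changes only its own copy of counter. If the command has to hand a result back, it needs to go through stdout or a file rather than a variable.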