R: Catch errors and continue execution after loggi

2019-05-22 22:58发布

I have many unattended batch jobs in R running on a server and I have to analyse job failures after they have run.

I am trying to catch errors to log them and recover from the error gracefully but I am not able to get a stack trace (traceback) to log the code file name and line number of the R command that caused the error. A (stupid) reproducible example:

f <- function() {
  1 + variable.not.found    # stupid error
}

tryCatch( f(), error=function(e) {
  # Here I would log the error message and stack trace (traceback)
  print(e)            # error message is no problem
  traceback()         # stack trace does NOT work
  # Here I would handle the error and recover...
})

Running the code above produces this output:

simpleError in f(): object 'variable.not.found' not found

No traceback available

The traceback is not available and the reason is documented in the R help (?traceback):

Errors which are caught via try or tryCatch do not generate a traceback, so what is printed is the call sequence for the last uncaught error, and not necessarily for the last error.

In other words: Catching an error with tryCatch does kill the stack trace!

How can I

  • handle errors and
  • log the stack trace (traceback) for further examination
  • [optionally] without using undocumented or hidden R internal functions that are not guaranteed to work in the future?

THX a lot!

2条回答
Melony?
2楼-- · 2019-05-22 23:45

Sorry for the long answer but I wanted to summarize all knowledge and references in one answer!

Main issues to be solved

  1. tryCatch "unrolls" the call stack to the tryCatch call so that traceback and sys.calls do no longer contain the full stack trace to identify the source code line that causes an error or warning.

  2. tryCatch aborts the execution if you catch a warning by passing a handler function for the warning condition. If you just want to log a warning you cannot continue the execution as normal.

  3. dump.frames writes the evaluation environments (frames) of the stack trace to allow post-mortem debugging (= examining the variable values visible within each function call) but dump.frames "forgets" to save the workspace too if you set the parameter to.file = TRUE. Therefore important objects may be missing.

  4. Find a simple logging framework since R does not support decent logging out of the box

  5. Enrich the stack trace with the source code lines.

Solution concept

  1. Use withCallingHandlers instead of tryCatch to get a full stack trace pointing to the source code line that throwed an error or warning.

  2. Catch warnings only within withCallingHandlers (not in tryCatch) since it just calls the handler functions but does not change the program flow.

  3. Surround withCallingHandlers with tryCatch to catch and handle errors as wanted.

  4. Use dump.frames with the parameter to.file = FALSE to write the dump into global variable named last.dump and save it into a file together with the global environment by calling save.image.

  5. Use a logging framework, e. g. the package futile.logger.

  6. R does track source code references when you set options(keep.source = TRUE). You can add this option to your .Rprofile file or use a startup R script that sets this option and source your actual R script then.

  7. To enrich the stack trace with the tracked source code lines you can use the undocumented (but widely used) function limitedLabels.

  8. To filter out R internal function calls from stack trace you can remove all calls that have no source code line reference.

Implementation

Code template

Instead of using tryCatch you should use this code snippet:

library(futile.logger)

tryCatch(
  withCallingHandlers(<expression>,
    error = function(e) {
      call.stack <- sys.calls() # is like a traceback within "withCallingHandlers"
      dump.frames()
      save.image(file = "last.dump.rda")
      flog.error(paste(e$message, limitedLabels(call.stack), sep = "\n"))
    }
    warning = <similar to error above>
  }
  error = <catch errors and recover as you would do it normally>
  # warning = <...> # never do this here since it stops the normal execution like an error!
  finally = <your clean-up code goes here>
}

Reusable implementation via a package (tryCatchLog)

I have implemented a simple package with all the concepts mentioned above. It provides a function tryCatchLog using the futile.logger package.

Usage:

library(tryCatchLog)  # or source("R/tryCatchLog.R")

tryCatchLog(<expression>,
            error = function(e) {
              <your error handler>
            })

You can find the free source code at github:

https://github.com/aryoda/tryCatchLog

You could also source the tryCatchLog function instead of using a full blown package.

Example (demo)

See the demo file that provides a lot of comments to explain how it works.

References

Other tryCatch replacements

Other helpful links

http://adv-r.had.co.nz/Exceptions-Debugging.html

A Warning About warning() - avoid R's warning feature

In R, why does withCallingHandlers still stops execution?

How to continue function when error is thrown in withCallingHandlers in R

Can you make R print more detailed error messages?

How can I access the name of the function generating an error or warning?

How do I save warnings and errors as output from a function?

options(error=dump.frames) vs. options(error=utils::recover)

General suggestions for debugging in R

Suppress warnings using tryCatch in R

R Logging display name of the script

Background information about the "srcrefs" attribute (Duncan Murdoch)

get stack trace on tryCatch'ed error in R

查看更多
Bombasti
3楼-- · 2019-05-23 00:03

The traceback function can be used to print/save the current stack trace, but you have to specify an integer argument, which is the number of stack frames to omit from the top (can be 0). This can be done inside a tryCatch block or anywhere else. Say this is the content of file t.r:

f <- function() {
    x <- 1
    g()
}
g <- function() {
    traceback(0)
}

When you source this file into R and run f, you get the stack trace:

3: traceback(0) at t.r#7
2: g() at t.r#3
1: f()

which has file name and line number information for each entry. You will get several stack frames originating from the implementation of tryCatch and you can't skip them by specifying a non-zero argument to traceback, yet indeed this will break in case the implementation of tryCatch changes.

The file name and line number information (source references) will only be available for code that has been parsed to keep source references (by default the source'd code, but not packages). The stack trace will always have call expressions.

The stack trace is printed by traceback (no need to call print on it).

For logging general errors, it is sometimes useful to use options(error=), one then does not need to modify the code that causes the errors.

查看更多
登录 后发表回答