Can the R console support background tasks or inte

2020-07-08 06:53发布

问题:

While working in an R console, I'd like to set up a background task that monitors a particular connection and when an event occurs, another function (an alert) is executed. Alternatively, I can set things up so that an external function simply sends an alert to R, but this seems to be the same problem: it is necessary to set up a listener.

I can do this in a dedicated process of R, but I don't know if this is feasible from within a console. Also, I'm not interested in interrupting R if it is calculating a function, but alerting or interrupting if the console is merely waiting on input.

Here are three use cases:

  1. The simplest possible example is watching a file. Suppose that I have a file called "latestData.csv" and I want to monitor it for changes; when it changes, myAlert() is executed. (One can extend it to do different things, but just popping up with a note that a file has changed is useful.)

  2. A different kind of monitor would watch for whether a given machine is running low on RAM and might execute a save.image() and terminate. Again, this could be a simple issue of watching a file produced by an external monitor that saves the output of top or some other command.

  3. A different example is like another recent SO question, about : have R halt the EC2 machine it's running on. If an alert from another machine or process tells the program to save & terminate, then being able to listen for that alert would be great.

At the moment, I suspect there are two ways of handling this: via Rserve and possibly via fork. If anyone has examples of how to do this with either package or via another method, that would be great. I think that solving any of these three use cases would solve all of them, modulo a little bit external code.


Note 1: I realize, per this answer to another SO question that R is single threaded, which is why I suspect fork and Rserve may work. However, I'm not sure about feasibility if one is interfacing with an R terminal. Although R's REPL is attached to the input from the console, I am trying to either get around this or mimic it, which is where fork or Rserve may be the answer.

Note 2: For those familiar with event handling / eventing methods, that would solve everything, too. I've just not found anything about this in R.


Update 1: I've found that the manual for writing R extensions has a section referencing event handling, which mentions the use of R_PolledEvents. This looks promising.

回答1:

It depends whether you want to interrupt idling or working R. If the first, you can think of bypassing the R default REPL loop by some event listener that will queue the incoming events and evaluate them. The common option is to use tcl/tk or gtk loop; I have made something like this around libev in my triggr package, which makes R digest requests coming from a socket.

The latter case is mostly hopeless, unless you will manually make the computational code to execute if(evenOccured) processIt code periodically.

Multithreading is not a real option, because as you know two interpreters in one process will break themselves by using same global variables, while forked processes will have independent memory contents.



回答2:

One more option is the svSocket package. It is non blocking.

Here is an 8 minute video using it, which has over 3,000 views. It shows how to turn an R session into a server and how to send commands to it and receive data back. It demonstrates doing that even while the server is busy; e.g., say you start a long running process and forget to save intermediate results, you can connect to the server and fetch the results (before it has finished) from it.



回答3:

It turns out that the package Rdsm supports this as well.

With this package, one can set up a server/client relationship between different instances of R, each is a basic R terminal, and the server can send messages, including functions, to the clients.

Transformed to the use case I described, the server process can do whatever monitoring is necessary, and then send messages to the clients. The documentation is a little terse, unfortunately, but the functionality seems to be straightforward.

If the server process is, say, monitoring a connection (a file, a pipe, a URL, etc.) on a regular basis and a trigger is encountered, it can then send a message to the clients.

Although the primary purpose of the package is shared memory (which is how I came across it), this messaging works pretty well for other purposes, too.


Update 1: Of course for message passing, one can't ignore MPI and the Rmpi package. That may do the trick, but the Rdsm package launches / works with R consoles, which is the kind of interface I'd sought. I'm not yet sure what Rmpi supports.



回答4:

A few ideas:

  1. Run R from within another language's script (this is possible, for example, in Perl using RSPerl) and use the wrapping script to launch the listener.

  2. Another option may be to run an external (non-R) command (using system()) from within R that will launch a listener in the background.

  3. Running R in batch mode in the background either before launching R or in a separate window.

    For example:

    R --no-save < listener.R > output.out &

    The listener can send an approraite email when the event occurs.