Calling an R script from Java

Posted 2019-01-17 04:17

I would like to call an R script from Java. I have searched Google on the topic, but almost all of the results I have seen would require me to add a dependency on some third-party library. Can anyone show me a good way to accomplish the same thing without adding any dependencies to my code?

I am using a Windows machine, so perhaps I could use the command line to start R (if it is not already open) and run a specific R script. But I have never written command-line code (or called it from Java), so I would need code examples.

Below I am including working sample code that I wrote for one possible approach, using my command-line idea. As the inline comments show, I intentionally left Step Three in AssembleDataFile.java blank. If you think that you can make the command-line idea work, then please show me what code to write in Step Three.

Also, feel free to suggest another approach that, hopefully, does not involve adding any more dependencies to my code.

And, as always, I very much appreciate any links you might post to articles/tutorials/etc related to this question.

Here is what I have so far:

AssembleDataFile.java

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;

public class AssembleDataFile {
static String delimiter;
static String localPath = "C:\\test\\cr\\";
static String[][] myDataArray;

public static void main(String[] args) {
    String inputPath = localPath+"pd\\";
    String fileName = "MSData.txt";
    delimiter = "\\t";

    // Step One: Import data in two parts
    try {
        // 1A: get length of data file
        BufferedReader br1 = new BufferedReader(new FileReader(inputPath+fileName));
        int numRows = 0;
        int numCols = 0;
        String currentRow;
        while ((currentRow = br1.readLine()) != null) {
            numRows += 1;
            numCols = currentRow.split(delimiter).length;}
        br1.close();
        //1B: populate data into array
        myDataArray = new String[numRows][numCols+1];
        BufferedReader br2 = new BufferedReader(new FileReader(inputPath+fileName));
        String eachRow;
        int rowIdx = 0;
        while ((eachRow = br2.readLine()) != null) {
            String[] splitRow = eachRow.split(delimiter);
            for(int z = 0;z < splitRow.length;z++){myDataArray[rowIdx][z] = splitRow[z];}
            rowIdx += 1;}
        br2.close();

        // Step Two: Write data to csv
        String rPath = localPath+"r\\";
        String sFileName = rPath+"2colData.csv";
        PrintWriter outputWriter = new PrintWriter(sFileName);
        for(int q = 0;q < myDataArray.length; q++){
            outputWriter.println(myDataArray[q][8]+", "+myDataArray[q][9]);
        }
        outputWriter.close();

        //Step Three: Call R script named My_R_Script.R that uses 2ColData.csv as input
        // not sure how to write this code.  Can anyone help me write this part?
        // For what it is worth, one of the R scripts that I intend to call is included below
        //
        //added the following lines here, per Vincent's suggestion:
        String rScriptFileName = rPath+"My_R_Script.R";
        Runtime.getRuntime().exec("mypathto\\R\\bin\\Rscript "+rScriptFileName);
        //
        //

        //Step Four: Import data from R and put it into myDataArray's empty last column
        try {Thread.sleep(30000);}//make this thread sleep for 30 seconds while R creates the needed file
        catch (InterruptedException e) {e.printStackTrace();}
        String matchFileName = rPath+"Matches.csv";
        BufferedReader br3 = new BufferedReader(new FileReader(matchFileName));
        String thisRow;
        int rowIndex = 0;
        while ((thisRow = br3.readLine()) != null) {
            String[] splitRow = thisRow.split(delimiter);
            myDataArray[rowIndex][numCols] = splitRow[0];
            rowIndex += 1;}
        br3.close();

        //Step Five: Check work by printing out one row from myDataArray
        //Note that the printout has one more column than the input file had.
        for(int u = 0;u<=numCols;u++){System.out.println(String.valueOf(myDataArray[1][u]));}
    }
    catch (FileNotFoundException e) {e.printStackTrace();}
    catch (IOException ie){ie.printStackTrace();}
}
}

My_R_Script.R

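library(sp)        # provides SpatialPoints and overlay
library(maptools)  # provides readShapeSpatial
myCSV <- read.csv(file="2colData.csv",head=TRUE,sep=",")
pts = SpatialPoints(myCSV)
Codes = readShapeSpatial("mypath/myshapefile.shp")
# Codes (not ZipCodes) is the object defined above
write.csv(Codes$F[overlay(pts,Codes)], "Matches.csv", quote=FALSE, row.names=FALSE)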

EDIT:
Here is the error message that is being thrown when I add Runtime.getRuntime().exec("Rscript "+rScriptFileName); to the code above:

java.io.IOException: Cannot run program "Rscript": CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessBuilder.start(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at AssembleDataFile.main(AssembleDataFile.java:52)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessImpl.create(Native Method)
at java.lang.ProcessImpl.<init>(Unknown Source)
at java.lang.ProcessImpl.start(Unknown Source)
... 5 more    

SECOND EDIT: The code above now works because I followed Vincent's suggestions. However, I had to add a sleep call to give the R script enough time to run. Without it, the Java code above throws an error saying that the Matches.csv file does not exist. I am concerned that a 30-second sleep is too blunt an instrument. Can anyone show me code that makes the Java program wait until the R script has finished creating Matches.csv? I hesitate to use thread tools because I have read that poorly designed threads can cause bugs that are nearly impossible to localize and fix.

5 answers
一夜七次
#2 · 2019-01-17 04:55

...would require me to add a dependency to some third party library...

Why is that so bad? You make it sound like "...would require me to assault a honeybadger with a baseball bat..." I don't see the harm, especially if it works.

Maybe RCaller can help you. No JNI required.

姐就是有狂的资本
#3 · 2019-01-17 04:58

Do not wait for the process to finish with Thread.sleep()...

Use the waitFor() method instead.

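// command, environments and dataDir are placeholders for your own values
Process child = Runtime.getRuntime().exec(command, environments, dataDir);

// Blocks until the child process exits, then returns its exit code
int code = child.waitFor();

switch (code) {
    case 0:
        // normal termination, everything is fine
        break;
    case 1:
        // Read the error stream, then fail.
        // IOUtils is presumably from Apache Commons IO (a third-party library);
        // RExecutionException is a user-defined exception.
        String message = IOUtils.toString(child.getErrorStream());
        throw new RExecutionException(message);
}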
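
The same idea can be sketched with ProcessBuilder, which ships with the JDK and so adds no dependency. This is only a minimal sketch: the class name BlockingRunner, the method runAndWait, and the 120-second timeout are illustrative assumptions rather than part of the answer above, and the timed waitFor needs Java 8 or later.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs an external command and blocks until it exits. */
final class BlockingRunner {

    /**
     * Starts the command in workDir (null means the current directory) and
     * waits up to timeoutSeconds for it to exit. Returns the exit code, or
     * throws IOException if the timeout elapses first.
     */
    static int runAndWait(List<String> command, File workDir, long timeoutSeconds)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.directory(workDir);  // relative paths used by the child resolve here
        builder.inheritIO();         // child output goes to this JVM's console
        Process process = builder.start();

        // waitFor(timeout, unit) returns false if the process is still running
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new IOException("Timed out after " + timeoutSeconds + " s: " + command);
        }
        return process.exitValue();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java BlockingRunner <program> [args...]");
            return;
        }
        int exitCode = runAndWait(Arrays.asList(args), null, 120);
        System.out.println("exit code: " + exitCode);
    }
}
```

In the question's program, Step Three could pass the Rscript path and rScriptFileName as the command and rPath as the working directory, and Step Four could then read Matches.csv only when the returned code is 0, with no Thread.sleep.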
Bombasti
#4 · 2019-01-17 05:00
BufferedReader reader = null;
Process shell = null;
try {
    // Passing the command as an array keeps each argument intact, even if a path contains spaces
    shell = Runtime.getRuntime().exec(new String[] { "/usr/bin/Rscript", "/media/subin/works/subzworks/RLanguage/config/predict.R" });

    // Echo everything the R script writes to standard output
    reader = new BufferedReader(new InputStreamReader(shell.getInputStream()));
    String line;
    while ((line = reader.readLine()) != null) {
        System.out.println(line);
    }
} catch (IOException e) {
    e.printStackTrace();
}
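
A variation on the snippet above, offered as a sketch rather than a definitive implementation: it merges the script's error output into its standard output, collects everything into a string, and then waits for the exit code. The names CapturingRunner, Result, and run are assumptions made for illustration, and the output is decoded with the platform default charset.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: runs a command and captures what it prints. */
final class CapturingRunner {

    /** Exit code plus everything the process printed (stdout and stderr merged). */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true);  // send stderr to the same stream as stdout
        Process process = builder.start();

        StringBuilder output = new StringBuilder();
        // Reading until end of stream also keeps the child from blocking on a full pipe
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append(System.lineSeparator());
            }
        }
        // The child has closed its output; waitFor returns once it has exited
        int exitCode = process.waitFor();
        return new Result(exitCode, output.toString());
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java CapturingRunner <program> [args...]");
            return;
        }
        Result result = run(Arrays.asList(args));
        System.out.print(result.output);
        System.out.println("exit code: " + result.exitCode);
    }
}
```

Merging the two streams means R's error messages show up in the captured text, which is often the quickest way to see why a script failed.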
放荡不羁爱自由
#5 · 2019-01-17 05:10

You just want to call an external application: wouldn't the following work?

Runtime.getRuntime().exec("Rscript myScript.R"); 
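
Note that this one-liner only works when Rscript can be found on the PATH. Otherwise exec fails with "CreateProcess error=2", as reported in the question's first edit. The following sketch checks for the executable before launching it. ExecutableFinder and find are hypothetical names, and the ".exe" suffix handling is an assumption aimed at Windows.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical helper: looks for an executable in a PATH-style directory list. */
final class ExecutableFinder {

    /**
     * Scans each directory in pathValue (separated by File.pathSeparator) for a
     * regular file called name or name + ".exe" and returns the first match.
     */
    static Optional<Path> find(String name, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        String[] suffixes = { "", ".exe" };
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String suffix : suffixes) {
                try {
                    Path candidate = Paths.get(dir, name + suffix);
                    if (Files.isRegularFile(candidate)) {
                        return Optional.of(candidate);
                    }
                } catch (InvalidPathException e) {
                    // Malformed PATH entry: skip it and keep searching
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Path> rscript = find("Rscript", System.getenv("PATH"));
        if (rscript.isPresent()) {
            System.out.println("Found Rscript at " + rscript.get());
        } else {
            System.out.println("Rscript is not on the PATH; use its full path instead");
        }
    }
}
```

If the lookup comes back empty, either add R's bin directory to the PATH or pass the full path to Rscript, which is the fix the asker ended up using.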
我命由我不由天
#6 · 2019-01-17 05:12

You can easily adapt this code: http://svn.rforge.net/org/trunk/rosuda/REngine/Rserve/test/StartRserve.java

Among other things, it finds R and runs a fixed script in R. You can replace that script with your own and ignore the last two methods.
