Google Cloud Storage gsutil tool with Java

Posted 2019-08-06 09:55

Question:

We need to upload around 30 GB of files (ranging from 50 MB to 4 GB each) to Google Cloud Storage every day. According to the Google docs, gsutil might be the only suitable choice, isn't it?

I want to call the gsutil command from Java, and the code below works. But if I delete the while loop, the program exits immediately after Runtime.getRuntime().exec(command); the python process is started but does no uploading and is soon killed. I wonder why.

I read from the stderr stream because of the question "Pipe gsutil output to file".

I decide whether gsutil has finished by reading until the last line of its status output, but is that reliable? Is there a better way to detect from Java that gsutil has finished executing?

// Needs java.io.BufferedReader, java.io.IOException and java.io.InputStreamReader.
String command = "python c:/gsutil/gsutil.py cp C:/SFC_Data/gps.txt"
        + " gs://getest/gps.txt";
try {
    Process process = Runtime.getRuntime().exec(command);
    System.out.println("the error stream is " + process.getErrorStream());
    // gsutil writes its status output to stderr, so read the error stream.
    BufferedReader reader = new BufferedReader(new InputStreamReader(process.getErrorStream()));
    String s;
    // readLine() returns null once the sub-process closes its stderr.
    while ((s = reader.readLine()) != null) {
        System.out.println("stderr: " + s);
    }
} catch (IOException e) {
    e.printStackTrace();
}

Answer 1:

There is certainly more than one way to upload 30 GB of data per day to GCS. Since you are working in Java, have you considered using the Cloud Storage API Java client library? https://developers.google.com/api-client-library/java/apis/storage/v1

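As for the specific question about calling gsutil from Java with Runtime.exec(): I suspect that without the while loop the program exits immediately after creating the sub-process, so nothing is left to wait for it or to read its output, which might be what kills the sub-process. (Garbage collection of the "process" variable alone would not do it: the Process Javadoc says the sub-process is not killed when no references to the Process object remain.)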

I think you should wait for the sub-process to complete, which is effectively what the while loop does. Or, if you don't care about the output, you can just call waitFor() and check exitValue(): http://docs.oracle.com/javase/7/docs/api/java/lang/Process.html
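The waitFor() approach might look roughly like the sketch below. The class name GsutilRunner and its Result holder are made up for illustration; only ProcessBuilder, Process.waitFor() and the stream calls are standard JDK API. The sketch merges stderr into stdout and drains it before calling waitFor(), since a child process can block if its output pipe fills up and nobody reads it. It also assumes gsutil follows the usual convention of returning exit code 0 on success, which is worth verifying for your version.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: runs an external command and waits for it to finish.
public class GsutilRunner {

    /** Exit code plus everything the child wrote to stdout and stderr. */
    public static final class Result {
        public final int exitCode;
        public final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    public static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout: one stream to drain
        Process process = builder.start();

        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            // readLine() returns null once the child closes its output.
            while ((line = reader.readLine()) != null) {
                output.append(line).append(System.lineSeparator());
            }
        }

        // Blocks until the child has exited, then returns its exit code.
        int exitCode = process.waitFor();
        return new Result(exitCode, output.toString());
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java GsutilRunner <command> [args...]");
            return;
        }
        Result result = run(Arrays.asList(args));
        System.out.print(result.output);
        System.out.println("exit code: " + result.exitCode);
    }
}
```

With the command from the question, this would be invoked as: java GsutilRunner python c:/gsutil/gsutil.py cp C:/SFC_Data/gps.txt gs://getest/gps.txt. Checking the exit code is likely more robust than looking for a particular last line in the status output, because the wording of that output can change between gsutil versions.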



Answer 2:

I drew the following picture based on Zhihong Yao's explanation. I hope it helps anyone with the same question.