detectClassPathResourcesToStage - Unable to convert url

Posted 2019-03-04 15:41

Question:

When I run the jar on GCE, I get the following error:

java -jar mySimple.jar --project=myProjcet

Aug 13, 2015 1:22:26 AM com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner detectClassPathResourcesToStage
SEVERE: Unable to convert url (rsrc:./) to file.
Aug 13, 2015 1:22:26 AM simple.SimpleV1 main
SEVERE: Failed to construct instance from factory method com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner#fromOptions

I am developing in Eclipse on Windows, and the Dataflow pipeline runs successfully when launched from Eclipse. I then exported the project as a Runnable JAR file and uploaded it to GCE (Ubuntu). The errors appear when I run the jar file on GCE.

The runner is BlockingDataflowPipelineRunner (batch mode). The other options are set in the source code.

The manifest is as follows:

Manifest-Version: 1.0
Rsrc-Class-Path: ./ httpclient-4.3.6.jar httpcore-4.3.3.jar commons-lo
 gging-1.1.3.jar commons-codec-1.6.jar mybatis-3.2.8.jar mysql-connect
 or-java-5.1.34.jar ibatis2-common-2.1.7.597.jar ibatis2-dao-2.1.7.597
 .jar ibatis2-sqlmap-2.1.7.597.jar geoip-api-1.2.14.jar google-api-cli
 ent-java6-1.20.0.jar google-api-client-1.20.0.jar google-oauth-client
 -1.20.0.jar guava-jdk5-13.0.jar google-oauth-client-java6-1.20.0.jar 
 google-oauth-client-jetty-1.20.0.jar jetty-6.1.26.jar jetty-util-6.1.
 26.jar servlet-api-2.5-20081211.jar google-http-client-jackson2-1.20.
 0.jar google-http-client-1.20.0.jar jsr305-1.3.9.jar joda-time-2.8.1.
 jar slf4j-api-1.7.7.jar slf4j-jdk14-1.7.7.jar commons-csv-1.1.jar aws
 -java-sdk-sqs-1.10.5.1.jar aws-java-sdk-core-1.10.5.1.jar google-clou
 d-dataflow-java-sdk-all-0.4.150710.jar google-api-services-dataflow-v
 1b3-rev4-1.19.1.jar google-cloud-dataflow-java-proto-library-all-0.4.
 150612.jar protobuf-java-2.5.0.jar google-api-services-bigquery-v2-re
 v187-1.19.1.jar google-api-services-compute-v1-rev46-1.19.1.jar googl
 e-api-services-pubsub-v1beta2-rev1-1.19.1.jar google-api-services-sto
 rage-v1-rev25-1.19.1.jar google-api-services-datastore-protobuf-v1bet
 a2-rev1-2.1.2.jar google-http-client-protobuf-1.15.0-rc.jar google-ht
 tp-client-jackson-1.15.0-rc.jar jackson-annotations-2.4.2.jar jackson
 -databind-2.4.2.jar avro-1.7.7.jar jackson-core-asl-1.9.13.jar jackso
 n-mapper-asl-1.9.13.jar paranamer-2.3.jar snappy-java-1.0.5.jar commo
 ns-compress-1.9.jar jetty-server-9.2.10.v20150310.jar javax.servlet-a
 pi-3.1.0.jar jetty-http-9.2.10.v20150310.jar jetty-io-9.2.10.v2015031
 0.jar jetty-jmx-9.2.10.v20150310.jar jetty-util-9.2.10.v20150310.jar 
 jackson-core-2.6.0.jar
Class-Path: .
Rsrc-Main-Class: simple.SimpleV1
Main-Class: org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader

Answer 1:

When exporting a Runnable JAR file using Eclipse, there are three ways to package your project:

  1. Extract required libraries into generated JAR
  2. Package required libraries into generated JAR
  3. Copy required libraries into a sub-folder next to the generated JAR

All three options have the same usage pattern when executing, e.g.

java -jar myrunnable.jar --myCommandLineOption1=...

Currently, only option 1 is compatible with how the Dataflow SDK for Java detects resources to stage, because the SDK depends on the classpath entries being file URIs from a URLClassLoader.
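To make the failure mode concrete, here is a minimal sketch of the kind of check described above. It is not the SDK's actual code: the class and method names are hypothetical, and the classpath is passed in as a list of URIs rather than read from a URLClassLoader. It only shows that file: entries can be turned into File objects while an rsrc: entry cannot.

```java
import java.io.File;
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClasspathStagingSketch {

    // Hypothetical helper, not part of the Dataflow SDK: converts classpath
    // entries to local files and rejects anything that is not a file: URI.
    public static List<File> toStageableFiles(List<URI> classpath) {
        List<File> files = new ArrayList<>();
        for (URI entry : classpath) {
            if (!"file".equalsIgnoreCase(entry.getScheme())) {
                // Message modeled on the error in the question.
                throw new IllegalArgumentException(
                    "Unable to convert url (" + entry + ") to file.");
            }
            files.add(new File(entry));
        }
        return files;
    }

    public static void main(String[] args) {
        // A file-based entry, as produced by export option 1.
        System.out.println(
            toStageableFiles(Arrays.asList(URI.create("file:///tmp/classes/"))));

        // An entry like the one JarRsrcLoader exposes with export option 2.
        try {
            toStageableFiles(Arrays.asList(URI.create("rsrc:./")));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```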

The sections below explain how each kind of runnable JAR is created and why options 2 and 3 are problematic.

An alternative to using a runnable JAR is to execute your project using mvn exec.

Option 1

This creates a single jar by copying all the class files and resources from each individual jar into it. This allows for a manifest where the entire classpath is composed of file-based URIs:

Manifest-Version: 1.0
Main-Class: com.google.cloud.dataflow.starter.StarterPipeline
Class-Path: .

Option 2

This creates a jar file with additional jars embedded within it. It uses a custom main entry point (org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader), which knows how to read the custom manifest entries (Rsrc-Class-Path and Rsrc-Main-Class) and creates a classloader with non-file-based URIs. Since the Dataflow SDK for Java currently only knows how to handle file-based resources and doesn't know how to interpret the rsrc:... URIs, you get the exception that you're seeing.

Manifest-Version: 1.0
Rsrc-Class-Path: ./ httpclient-4.3.6.jar ...
Class-Path: .
Rsrc-Main-Class: simple.SimpleV1
Main-Class: org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader
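If you are unsure which export option produced a given jar, the manifest tells you. The sketch below is an illustration only: the class and method names are hypothetical, and it assumes that the presence of the Rsrc-Class-Path attribute is a sufficient signal for the jar-in-jar layout. It uses the standard java.util.jar.Manifest parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

public class JarInJarCheckSketch {

    // Parses manifest text; in practice you would read META-INF/MANIFEST.MF
    // from the jar instead of a string.
    public static Manifest parse(String manifestText) {
        try {
            return new Manifest(new ByteArrayInputStream(
                manifestText.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumption: a jar exported with option 2 carries Rsrc-Class-Path
    // in its main manifest attributes, as in the manifest shown above.
    public static boolean usesJarInJarLoader(Manifest manifest) {
        return manifest.getMainAttributes().getValue("Rsrc-Class-Path") != null;
    }

    public static void main(String[] args) {
        String option2 = "Manifest-Version: 1.0\n"
            + "Rsrc-Class-Path: ./ httpclient-4.3.6.jar\n"
            + "Class-Path: .\n"
            + "Rsrc-Main-Class: simple.SimpleV1\n"
            + "Main-Class: org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader\n";
        System.out.println(usesJarInJarLoader(parse(option2)));
    }
}
```

A jar flagged this way would need to be re-exported with option 1 before the SDK can stage its classpath.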

Option 3

This creates a jar file which contains your project resources and then creates a folder alongside the runnable jar containing all of your project's dependency jars. This allows for a more complex standard manifest listing all your project dependencies.

Manifest-Version: 1.0
Main-Class: com.google.cloud.dataflow.starter.StarterPipeline
Class-Path: . runnable_lib/google-cloud-dataflow-java-sdk-all-manual_build.jar ...

The Class-Path manifest entries are not returned as part of the URLClassLoader's URLs, and hence these jars are not discoverable. Furthermore, those jars are only meant to be loaded by classes from that jar, which can lead to a jar loading hierarchy. More details are available here: http://docs.oracle.com/javase/7/docs/technotes/tools/findingclasses.html
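For illustration, Class-Path entries are space-separated relative URLs that are resolved against the location of the jar declaring them. The sketch below shows that resolution step in isolation. The class name, method name, jar location and dependency name are all hypothetical, and this is not something the SDK is stated to do.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class ManifestClassPathSketch {

    // Resolves each Class-Path entry relative to the jar that declares it.
    public static List<URI> resolveClassPath(URI jarLocation, String classPathValue) {
        List<URI> resolved = new ArrayList<>();
        for (String entry : classPathValue.trim().split("\\s+")) {
            if (!entry.isEmpty()) {
                resolved.add(jarLocation.resolve(entry));
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Hypothetical jar location and dependency name.
        URI jar = URI.create("file:/opt/app/myrunnable.jar");
        for (URI uri : resolveClassPath(jar, ". runnable_lib/example-dependency.jar")) {
            System.out.println(uri);
        }
    }
}
```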