I am trying to specify my GCS temp location by passing it as an option on the command line, as shown below.
java -jar pipeline-0.0.1-SNAPSHOT.jar --runner=DataflowRunner --project=<my_project> --tempLocation=gs://<my_bucket>/<my_folder>
However, I keep getting the following error:
java.nio.file.InvalidPathException: Illegal char <:> at index 2: gs://<my_bucket>/<my_folder>
I'm referring to the following documentation:
https://cloud.google.com/dataflow/pipelines/specifying-exec-params
I read the options from the command-line arguments like this:
DataflowPipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().as(DataflowPipelineOptions.class);
Update: here is the full stack trace, as requested in the comments below:
org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.nio.file.InvalidPathException: Illegal char <:> at index 2: gs://pipeline-az/staging
at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:342)
at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:312)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:206)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:62)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
at com.autozone.google.pipeline.PipelinePeople.main(PipelinePeople.java:97)
Caused by: java.nio.file.InvalidPathException: Illegal char <:> at index 2: gs://pipeline-az/staging
at sun.nio.fs.WindowsPathParser.normalize(Unknown Source)
at sun.nio.fs.WindowsPathParser.parse(Unknown Source)
at sun.nio.fs.WindowsPathParser.parse(Unknown Source)
at sun.nio.fs.WindowsPath.parse(Unknown Source)
at sun.nio.fs.WindowsFileSystem.getPath(Unknown Source)
at java.nio.file.Paths.get(Unknown Source)
at org.apache.beam.sdk.io.LocalFileSystem.matchNewResource(LocalFileSystem.java:196)
at org.apache.beam.sdk.io.LocalFileSystem.matchNewResource(LocalFileSystem.java:78)
at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:544)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers.resolveTempLocation(BigQueryHelpers.java:325)
at org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$4.getTempFilePrefix(BatchLoads.java:381)
My pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.hendpro.google</groupId>
  <artifactId>pipeline</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>
  <name>pipeline</name>
  <url>http://maven.apache.org</url>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <build>
    <plugins>
      <plugin>
        <!-- Build an executable JAR -->
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <version>3.0.2</version>
        <configuration>
          <archive>
            <manifest>
              <addClasspath>true</addClasspath>
              <classpathPrefix>lib/</classpathPrefix>
              <mainClass>com.hendpro.google.pipeline.PipelinePeople</mainClass>
            </manifest>
          </archive>
        </configuration>
      </plugin>
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.1.0</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  <dependencies>
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.17</version>
    </dependency>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-core</artifactId>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-io-jdbc</artifactId>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-runners-direct-java</artifactId>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.postgresql</groupId>
      <artifactId>postgresql</artifactId>
      <version>42.1.1</version>
    </dependency>
  </dependencies>
</project>
Update:
I've tried both the direct runner and the Dataflow runner, each with and without the following:
.as(DataflowPipelineOptions.class);
.as(DirectOptions.class);
Regardless of the runner or the options class I declare, the error persists.
Adding the shaded jar list:
[INFO] --- maven-shade-plugin:3.1.0:shade (default) @ pipeline ---
[INFO] Including log4j:log4j:jar:1.2.17 in the shaded jar.
[INFO] Including org.apache.beam:beam-sdks-java-core:jar:2.3.0 in the shaded jar.
[INFO] Including com.google.code.findbugs:jsr305:jar:3.0.1 in the shaded jar.
[INFO] Including com.github.stephenc.findbugs:findbugs-annotations:jar:1.3.9-1 in the shaded jar.
[INFO] Including com.fasterxml.jackson.core:jackson-core:jar:2.8.9 in the shaded jar.
[INFO] Including com.fasterxml.jackson.core:jackson-annotations:jar:2.8.9 in the shaded jar.
[INFO] Including com.fasterxml.jackson.core:jackson-databind:jar:2.8.9 in the shaded jar.
[INFO] Including org.slf4j:slf4j-api:jar:1.7.25 in the shaded jar.
[INFO] Including org.apache.avro:avro:jar:1.8.2 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the shaded jar.
[INFO] Including com.thoughtworks.paranamer:paranamer:jar:2.7 in the shaded jar.
[INFO] Including org.apache.commons:commons-compress:jar:1.8.1 in the shaded jar.
[INFO] Including org.tukaani:xz:jar:1.5 in the shaded jar.
[INFO] Including org.xerial.snappy:snappy-java:jar:1.1.4 in the shaded jar.
[INFO] Including joda-time:joda-time:jar:2.4 in the shaded jar.
[INFO] Including org.apache.beam:beam-sdks-java-io-jdbc:jar:2.3.0 in the shaded jar.
[INFO] Including org.apache.commons:commons-dbcp2:jar:2.1.1 in the shaded jar.
[INFO] Including org.apache.commons:commons-pool2:jar:2.4.2 in the shaded jar.
[INFO] Including commons-logging:commons-logging:jar:1.2 in the shaded jar.
[INFO] Including org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.3.0 in the shaded jar.
[INFO] Including org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:jar:2.3.0 in the shaded jar.
[INFO] Including com.google.cloud.bigdataoss:gcsio:jar:1.4.5 in the shaded jar.
[INFO] Including com.google.apis:google-api-services-cloudresourcemanager:jar:v1-rev6-1.22.0 in the shaded jar.
[INFO] Including com.google.apis:google-api-services-storage:jar:v1-rev71-1.22.0 in the shaded jar.
[INFO] Including org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.3.0 in the shaded jar.
[INFO] Including io.grpc:grpc-core:jar:1.2.0 in the shaded jar.
[INFO] Including com.google.errorprone:error_prone_annotations:jar:2.0.11 in the shaded jar.
[INFO] Including io.grpc:grpc-context:jar:1.2.0 in the shaded jar.
[INFO] Including com.google.instrumentation:instrumentation-api:jar:0.3.0 in the shaded jar.
[INFO] Including com.google.apis:google-api-services-bigquery:jar:v2-rev355-1.22.0 in the shaded jar.
[INFO] Including com.google.api:gax-grpc:jar:0.20.0 in the shaded jar.
[INFO] Including io.grpc:grpc-protobuf:jar:1.2.0 in the shaded jar.
[INFO] Including com.google.api:api-common:jar:1.1.0 in the shaded jar.
[INFO] Including com.google.auto.value:auto-value:jar:1.2 in the shaded jar.
[INFO] Including com.google.api:gax:jar:1.3.1 in the shaded jar.
[INFO] Including org.threeten:threetenbp:jar:1.3.3 in the shaded jar.
[INFO] Including com.google.cloud:google-cloud-core-grpc:jar:1.2.0 in the shaded jar.
[INFO] Including com.google.protobuf:protobuf-java-util:jar:3.2.0 in the shaded jar.
[INFO] Including com.google.code.gson:gson:jar:2.7 in the shaded jar.
[INFO] Including com.google.apis:google-api-services-pubsub:jar:v1-rev10-1.22.0 in the shaded jar.
[INFO] Including com.google.api.grpc:grpc-google-cloud-pubsub-v1:jar:0.1.18 in the shaded jar.
[INFO] Including com.google.api.grpc:proto-google-cloud-pubsub-v1:jar:0.1.18 in the shaded jar.
[INFO] Including com.google.api.grpc:proto-google-iam-v1:jar:0.1.18 in the shaded jar.
[INFO] Including com.google.cloud.bigdataoss:util:jar:1.4.5 in the shaded jar.
[INFO] Including com.google.api-client:google-api-client-java6:jar:1.20.0 in the shaded jar.
[INFO] Including com.google.api-client:google-api-client-jackson2:jar:1.20.0 in the shaded jar.
[INFO] Including com.google.oauth-client:google-oauth-client:jar:1.20.0 in the shaded jar.
[INFO] Including com.google.oauth-client:google-oauth-client-java6:jar:1.20.0 in the shaded jar.
[INFO] Including com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0 in the shaded jar.
[INFO] Including com.google.http-client:google-http-client-protobuf:jar:1.20.0 in the shaded jar.
[INFO] Including com.google.http-client:google-http-client-jackson:jar:1.20.0 in the shaded jar.
[INFO] Including com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0 in the shaded jar.
[INFO] Including com.google.api.grpc:grpc-google-common-protos:jar:0.1.0 in the shaded jar.
[INFO] Including io.grpc:grpc-auth:jar:1.2.0 in the shaded jar.
[INFO] Including io.grpc:grpc-netty:jar:1.2.0 in the shaded jar.
[INFO] Including io.netty:netty-codec-http2:jar:4.1.8.Final in the shaded jar.
[INFO] Including io.netty:netty-codec-http:jar:4.1.8.Final in the shaded jar.
[INFO] Including io.netty:netty-handler-proxy:jar:4.1.8.Final in the shaded jar.
[INFO] Including io.netty:netty-codec-socks:jar:4.1.8.Final in the shaded jar.
[INFO] Including io.netty:netty-handler:jar:4.1.8.Final in the shaded jar.
[INFO] Including io.netty:netty-buffer:jar:4.1.8.Final in the shaded jar.
[INFO] Including io.netty:netty-common:jar:4.1.8.Final in the shaded jar.
[INFO] Including io.netty:netty-transport:jar:4.1.8.Final in the shaded jar.
[INFO] Including io.netty:netty-resolver:jar:4.1.8.Final in the shaded jar.
[INFO] Including io.netty:netty-codec:jar:4.1.8.Final in the shaded jar.
[INFO] Including io.grpc:grpc-stub:jar:1.2.0 in the shaded jar.
[INFO] Including io.grpc:grpc-all:jar:1.2.0 in the shaded jar.
[INFO] Including io.grpc:grpc-okhttp:jar:1.2.0 in the shaded jar.
[INFO] Including com.squareup.okhttp:okhttp:jar:2.5.0 in the shaded jar.
[INFO] Including com.squareup.okio:okio:jar:1.6.0 in the shaded jar.
[INFO] Including io.grpc:grpc-protobuf-lite:jar:1.2.0 in the shaded jar.
[INFO] Including io.grpc:grpc-protobuf-nano:jar:1.2.0 in the shaded jar.
[INFO] Including com.google.protobuf.nano:protobuf-javanano:jar:3.0.0-alpha-5 in the shaded jar.
[INFO] Including com.google.cloud:google-cloud-core:jar:1.0.2 in the shaded jar.
[INFO] Including org.json:json:jar:20160810 in the shaded jar.
[INFO] Including com.google.cloud:google-cloud-spanner:jar:0.20.0-beta in the shaded jar.
[INFO] Including com.google.api.grpc:proto-google-cloud-spanner-v1:jar:0.1.11 in the shaded jar.
[INFO] Including com.google.api.grpc:proto-google-cloud-spanner-admin-instance-v1:jar:0.1.11 in the shaded jar.
[INFO] Including com.google.api.grpc:grpc-google-cloud-spanner-v1:jar:0.1.11 in the shaded jar.
[INFO] Including com.google.api.grpc:grpc-google-cloud-spanner-admin-database-v1:jar:0.1.11 in the shaded jar.
[INFO] Including com.google.api.grpc:grpc-google-cloud-spanner-admin-instance-v1:jar:0.1.11 in the shaded jar.
[INFO] Including com.google.api.grpc:grpc-google-longrunning-v1:jar:0.1.11 in the shaded jar.
[INFO] Including com.google.api.grpc:proto-google-longrunning-v1:jar:0.1.11 in the shaded jar.
[INFO] Including junit:junit:jar:4.12 in the shaded jar.
[INFO] Including org.hamcrest:hamcrest-core:jar:1.3 in the shaded jar.
[INFO] Including com.google.cloud.bigtable:bigtable-protos:jar:1.0.0-pre3 in the shaded jar.
[INFO] Including com.google.cloud.bigtable:bigtable-client-core:jar:1.0.0 in the shaded jar.
[INFO] Including com.google.auth:google-auth-library-appengine:jar:0.7.0 in the shaded jar.
[INFO] Including io.opencensus:opencensus-contrib-grpc-util:jar:0.7.0 in the shaded jar.
[INFO] Including io.opencensus:opencensus-api:jar:0.7.0 in the shaded jar.
[INFO] Including io.dropwizard.metrics:metrics-core:jar:3.1.2 in the shaded jar.
[INFO] Including com.google.api-client:google-api-client:jar:1.22.0 in the shaded jar.
[INFO] Including com.google.http-client:google-http-client:jar:1.22.0 in the shaded jar.
[INFO] Including org.apache.httpcomponents:httpclient:jar:4.0.1 in the shaded jar.
[INFO] Including org.apache.httpcomponents:httpcore:jar:4.0.1 in the shaded jar.
[INFO] Including commons-codec:commons-codec:jar:1.3 in the shaded jar.
[INFO] Including com.google.http-client:google-http-client-jackson2:jar:1.22.0 in the shaded jar.
[INFO] Including com.google.auth:google-auth-library-credentials:jar:0.7.1 in the shaded jar.
[INFO] Including com.google.auth:google-auth-library-oauth2-http:jar:0.7.1 in the shaded jar.
[INFO] Including com.google.guava:guava:jar:20.0 in the shaded jar.
[INFO] Including com.google.protobuf:protobuf-java:jar:3.2.0 in the shaded jar.
[INFO] Including io.netty:netty-tcnative-boringssl-static:jar:1.1.33.Fork26 in the shaded jar.
[INFO] Including com.google.api.grpc:proto-google-cloud-spanner-admin-database-v1:jar:0.1.9 in the shaded jar.
[INFO] Including com.google.api.grpc:proto-google-common-protos:jar:0.1.9 in the shaded jar.
[INFO] Including org.apache.beam:beam-runners-direct-java:jar:2.3.0 in the shaded jar.
[INFO] Including org.apache.beam:beam-runners-local-java-core:jar:2.3.0 in the shaded jar.
[INFO] Including org.postgresql:postgresql:jar:42.1.1 in the shaded jar.
[WARNING] grpc-google-common-protos-0.1.0.jar, proto-google-common-protos-0.1.9.jar, proto-google-longrunning-v1-0.1.11.jar define 28 overlapping classes:
[WARNING] - com.google.longrunning.ListOperationsRequestOrBuilder
[WARNING] - com.google.longrunning.ListOperationsRequest$Builder
[WARNING] - com.google.longrunning.OperationsProto$1
[WARNING] - com.google.longrunning.OperationOrBuilder
[WARNING] - com.google.longrunning.ListOperationsResponseOrBuilder
[WARNING] - com.google.longrunning.DeleteOperationRequestOrBuilder
[WARNING] - com.google.longrunning.DeleteOperationRequest$1
[WARNING] - com.google.longrunning.CancelOperationRequest$1
[WARNING] - com.google.longrunning.GetOperationRequest
[WARNING] - com.google.longrunning.Operation$2
[WARNING] - 18 more...
[WARNING] grpc-google-common-protos-0.1.0.jar, proto-google-common-protos-0.1.9.jar define 352 overlapping classes:
[WARNING] - com.google.api.Logging
[WARNING] - com.google.api.Usage$1
[WARNING] - com.google.rpc.ResourceInfoOrBuilder
[WARNING] - com.google.api.AuthProvider$1
[WARNING] - com.google.api.ProjectProperties$Builder
[WARNING] - com.google.api.DocumentationProto
[WARNING] - com.google.type.TimeOfDayOrBuilder
[WARNING] - com.google.api.MonitoringOrBuilder
[WARNING] - com.google.api.Authentication$Builder
[WARNING] - com.google.api.Monitoring
[WARNING] - 342 more...
[WARNING] beam-sdks-java-core-2.3.0.jar, beam-sdks-java-extensions-google-cloud-platform-core-2.3.0.jar define 3 overlapping classes:
[WARNING] - org.apache.beam.sdk.util.AutoValue_DoFnAndMainOutput
[WARNING] - org.apache.beam.sdk.util.package-info
[WARNING] - org.apache.beam.sdk.util.AutoValue_ReleaseInfo
[WARNING] grpc-google-common-protos-0.1.0.jar, grpc-google-longrunning-v1-0.1.11.jar define 7 overlapping classes:
[WARNING] - com.google.longrunning.OperationsGrpc$OperationsStub
[WARNING] - com.google.longrunning.OperationsGrpc$1
[WARNING] - com.google.longrunning.OperationsGrpc$OperationsFutureStub
[WARNING] - com.google.longrunning.OperationsGrpc$OperationsImplBase
[WARNING] - com.google.longrunning.OperationsGrpc$OperationsBlockingStub
[WARNING] - com.google.longrunning.OperationsGrpc
[WARNING] - com.google.longrunning.OperationsGrpc$MethodHandlers
[WARNING] maven-shade-plugin has detected that some class files are
[WARNING] present in two or more JARs. When this happens, only one
[WARNING] single version of the class is copied to the uber jar.
[WARNING] Usually this is not harmful and you can skip these warnings,
[WARNING] otherwise try to manually exclude artifacts based on
[WARNING] mvn dependency:tree -Ddetail=true and the above output.
[WARNING] See http://maven.apache.org/plugins/maven-shade-plugin/
As explained in this answer, when you use the Maven Shade Plugin in conjunction with ServiceLoader for dependency injection, you should specify ServicesResourceTransformer in your pom.xml file.

Even if the plugin is relocating classes, this will ensure that every service file under META-INF/services of your dependencies is merged, without the need to declare them all. Without the transformer, only one copy of each service file ends up in the shaded jar, which is consistent with the stack trace above: the gs:// path is handed to LocalFileSystem, which suggests that the GCS filesystem was never registered.

Note: just posting this as a community wiki answer for now but I'll gladly delete it if @jkff posts his comment as an answer instead. All credit to @Tunaki and @jkff.
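
The pom.xml snippet that the answer refers to did not survive in the text above. As a minimal sketch of what such a configuration could look like, assuming the maven-shade-plugin 3.1.0 block from the question's pom.xml is kept as it is, the transformer can be added under the existing execution. The transformer class name is the one shipped with the shade plugin; where exactly the configuration element is placed is an assumption, not something the original answer confirms:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.1.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <!-- Assumed placement: configuration scoped to the shade execution -->
      <configuration>
        <transformers>
          <!-- Merges META-INF/services files from all dependencies
               instead of keeping only one copy of each file -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After rebuilding with mvn package, each file under META-INF/services in the shaded jar should contain the entries from all dependencies that declare that service, rather than the entries from just one of them.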