AWS Redshift driver in Zeppelin

Posted 2019-07-25 03:07

I want to explore my data in Redshift from a Zeppelin notebook. A small EMR cluster running Spark sits behind it. I am loading Databricks' spark-redshift library:

%dep
z.reset()
z.load("com.databricks:spark-redshift_2.10:0.6.0")

and then

import org.apache.spark.sql.DataFrame

val query = "..."

val url = "..."
val port=5439
val table = "..."
val database = "..."
val user = "..."
val password = "..."

val df: DataFrame = sqlContext.read
  .format("com.databricks.spark.redshift")
  .option("url", s"jdbc:redshift://${url}:$port/$database?user=$user&password=$password")
  .option("query",query)
  .option("tempdir", "s3n://.../tmp/data")
  .load()

df.show

but I get the error

java.lang.ClassNotFoundException: Could not load an Amazon Redshift JDBC driver; see the README for instructions on downloading and configuring the official Amazon driver

I added the option

option("jdbcdriver", "com.amazon.redshift.jdbc41.Driver")

but that did not help. I think I need to put Redshift's JDBC driver on the classpath, the way I would with --driver-class-path for spark-shell, but how do I do that with Zeppelin?

1 Answer
我只想做你的唯一 · answered 2019-07-25 03:58

You can add external jars such as the JDBC driver, together with their dependencies, either through Zeppelin's dependency-loading mechanism (the Dependencies section of the interpreter settings) or, in the case of Spark, through the %dep dynamic dependency loader.
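To address the --driver-class-path part of the question directly: when Zeppelin launches Spark it reads SPARK_SUBMIT_OPTIONS from conf/zeppelin-env.sh, so the driver jar can be passed there. A minimal sketch, assuming the jar has already been downloaded to /opt/jars/RedshiftJDBC41.jar (a hypothetical path):

# in conf/zeppelin-env.sh
# --jars ships the jar to the executors; --driver-class-path puts it on the driver's classpath
export SPARK_SUBMIT_OPTIONS="--jars /opt/jars/RedshiftJDBC41.jar --driver-class-path /opt/jars/RedshiftJDBC41.jar"

Restart the Spark interpreter after changing this file so the options take effect.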

When your code requires an external library, instead of downloading it, copying it into Zeppelin, and restarting, you can do the following with the %dep interpreter:

  • Load libraries recursively from a Maven repository
  • Load libraries from the local filesystem
  • Add an additional Maven repository
  • Automatically add libraries to the Spark cluster (this can be turned off)

The latter would look something like:

%dep
// loads with all transitive dependencies from Maven repo
z.load("groupId:artifactId:version")

// or add artifact from filesystem
z.load("/path/to.jar")

Note that the %dep paragraph has to be the first paragraph of the note, so that it runs before the Spark interpreter starts.
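For this particular case, note that Amazon's Redshift JDBC driver is not published to Maven Central; Amazon hosts it in its own Maven repository. Here is a sketch of a %dep paragraph that adds that repository and loads both the driver and the connector (the repository URL and the driver version below are assumptions; check the AWS documentation for the current values):

%dep
z.reset()
// Amazon hosts the Redshift JDBC driver in its own Maven repository
// (the URL is an assumption; verify it against the AWS docs)
z.addRepo("redshift-maven").url("https://s3.amazonaws.com/redshift-maven-repository/release")
// the driver version is an assumption; pick whichever the AWS docs list as current
z.load("com.amazon.redshift:redshift-jdbc41:1.2.10.1009")
z.load("com.databricks:spark-redshift_2.10:0.6.0")

With the driver on the classpath, the original sqlContext.read paragraph should resolve com.amazon.redshift.jdbc41.Driver without needing the extra jdbcdriver option.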
