I'm trying to store a stream of data coming in from a Kafka topic into a Hive partitioned table. I was able to convert the DStream to a DataFrame and created a HiveContext. My code looks like this:
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
hiveContext.setConf("hive.exec.dynamic.partition", "true")
hiveContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")
newdf.registerTempTable("temp") // newdf is my DataFrame
newdf.write.mode(SaveMode.Append).format("osv").partitionBy("date").saveAsTable("mytablename")
But when I deploy the app on the cluster, it says:
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: file:/tmp/spark-3f00838b-c5d9-4a9a-9818-11fbb0007076/scratch_hive_2016-10-18_23-18-33_118_769650074381029645-1, expected: hdfs://
When I try to save it as a normal table and comment out the Hive configurations, it works. But with the partitioned table, it gives me this error.
I also tried registering the DataFrame as a temp table and then writing that table into the partitioned table. Doing that gave me the same error.
Can someone please tell me how I can solve this? Thanks.
With saveAsTable, the default location that Spark saves to is controlled by the Hive metastore (based on the docs). Another option would be to use saveAsParquetFile and specify the path, then later register that path with your Hive metastore, OR use the newer DataFrameWriter interface and specify the path option:

write.format(source).mode(mode).options(options).saveAsTable(tableName)
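For example, a minimal sketch of the second approach, writing to an explicit HDFS path via the DataFrameWriter (the path and table name here are assumptions, not from the question):

```scala
import org.apache.spark.sql.SaveMode

// Write the partitioned data to an explicit HDFS location and register it
// as a table in the Hive metastore in one call. The "path" option makes
// Spark use this location instead of the metastore's default warehouse dir.
newdf.write
  .mode(SaveMode.Append)
  .format("parquet")
  .partitionBy("date")
  .option("path", "hdfs:///user/myuser/mytablename") // hypothetical path
  .saveAsTable("mytablename")
```

Because the location is on HDFS, this sidesteps the "Wrong FS: file:/... expected: hdfs://" mismatch.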
I figured it out. In the code for the Spark app, I declared the scratch dir location as below and it worked.
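The original snippet was not included with this answer; a likely form, assuming the fix was to point Hive's scratch directory (`hive.exec.scratchdir`, which the error shows defaulting to a local `file:/tmp/...` path) at HDFS instead:

```scala
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
// Redirect Hive's intermediate scratch dir to HDFS so staging files are
// created on the filesystem the cluster expects (the exact path is an
// assumption; use a location writable by your user on your cluster).
hiveContext.setConf("hive.exec.scratchdir", "hdfs:///tmp/hive/spark-scratch")
hiveContext.setConf("hive.exec.dynamic.partition", "true")
hiveContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")
```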