Installed Spark 1.5 (spark-1.5.0-bin-hadoop2.6) on my local machine.
Ran $ ./bin/spark-shell
Tried, following the docs, to create a table, and got this:
> SQL context available as sqlContext.
>
> scala> sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");
> 15/09/22 22:18:13 ERROR DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException:
> MetaException(message:file:/user/hive/warehouse/src is not a directory or unable to create one)
> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:720)
Tried passing the Hive parameter for this, but it didn't work:
> $ ./bin/spark-shell --conf hive.metastore.warehouse.dir=./
> Warning: Ignoring non-spark config property: hive.metastore.warehouse.dir=./
Finally tried the CLI itself, but got the same issue.
Where do I change the Hive warehouse location parameter? I don't have Hadoop installed at the moment, nor Hive.
thanks,
Matt
Metadata of Hive tables is stored in the metastore; HiveContext adds support for finding tables in the metastore.
import org.apache.spark.sql.hive.HiveContext
val hiveContext = new HiveContext(sc)
val myDF = hiveContext.sql("select * from mytable")
You will get a DataFrame as the result:
myDF: org.apache.spark.sql.DataFrame = [.....]
I met this problem when spark-shell didn't have the access rights to write to /user/hive/warehouse:
- Run sudo spark-shell to have another try. If it works, do the second step.
- Change the access rights of the directory so the user running spark-shell can write to it.
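A minimal sketch of that second step, using a hypothetical /tmp/hive-warehouse path as a stand-in for /user/hive/warehouse (for the real path you will usually need sudo in front of these commands):

```shell
# Hypothetical warehouse location; substitute /user/hive/warehouse on your machine.
WAREHOUSE=/tmp/hive-warehouse

mkdir -p "$WAREHOUSE"        # create the directory if it is missing
chmod -R u+rwx "$WAREHOUSE"  # let the owner (the user running spark-shell) write to it
ls -ld "$WAREHOUSE"          # verify the owner and permissions
```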
Actually you don't really have to get Hive installed (nor Hadoop), but you do need a hive-site.xml present on your Spark classpath (the simplest way is to add hive-site.xml to your Spark conf directory).
Here is a simple default hive-site.xml:
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=/PATH/TO/YOUR/METASTORE/DIR/metastore_db;create=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/PATH/TO/YOUR/WAREHOUSE/DIR/</value>
    <description>location of default database for the warehouse</description>
  </property>
</configuration>
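To wire this in, drop the file into Spark's conf directory. A sketch, assuming SPARK_HOME points at your unpacked spark-1.5.0-bin-hadoop2.6 directory and using /tmp/hive/warehouse as a placeholder warehouse path (both are assumptions; adjust them), generating a minimal file that only relocates the warehouse:

```shell
# Assumed locations; adjust both to your setup.
SPARK_HOME=${SPARK_HOME:-$HOME/spark-1.5.0-bin-hadoop2.6}
WAREHOUSE=${WAREHOUSE:-/tmp/hive/warehouse}

mkdir -p "$SPARK_HOME/conf" "$WAREHOUSE"

# Write a minimal hive-site.xml that only sets the warehouse directory.
cat > "$SPARK_HOME/conf/hive-site.xml" <<EOF
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>$WAREHOUSE</value>
  </property>
</configuration>
EOF
```

spark-shell picks up any hive-site.xml found in $SPARK_HOME/conf the next time it starts.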
Sometimes, when the metastore is a local Derby database, it can be left with stale locks. If you are experiencing a problem with metastore locks, you can delete the lock files (make sure it is just you who is using the metastore first ;) ):
$ rm /PATH/TO/YOUR/METASTORE/DIR/metastore_db/*.lck
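For illustration, the same cleanup against a hypothetical metastore directory; the stand-in files mimic the db.lck and dbex.lck locks Derby creates, and deleting them is only safe once nothing else is using this metastore:

```shell
# Hypothetical metastore location; substitute your own metastore_db path.
METASTORE_DB=/tmp/metastore/metastore_db

# Simulate the stale locks a crashed session can leave behind.
mkdir -p "$METASTORE_DB"
touch "$METASTORE_DB/db.lck" "$METASTORE_DB/dbex.lck"

# Remove the lock files; the real database files are left untouched.
rm -f "$METASTORE_DB"/*.lck
ls "$METASTORE_DB"
```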