We are trying to use ThriftServer to query data from Spark temp tables in Spark 2.0.0.
First, we created a SparkSession with Hive support enabled. Currently, we start ThriftServer with the session's sqlContext like this:
HiveThriftServer2.startWithContext(spark.sqlContext());
We have a Spark stream that registers the temp table "spark_temp_table":
StreamingQuery streamingQuery = streamedData.writeStream()
.format("memory")
.queryName("spark_temp_table")
.start();
With beeline we are able to see the temp tables (by running SHOW TABLES).
With this approach, when we want to run a second job (with a second SparkSession), we have to start a second ThriftServer on a different port.
I have two questions here:
1. Is there any way to have one ThriftServer on one port with access to all temp tables across different SparkSessions?
2. HiveThriftServer2.startWithContext(spark.sqlContext()); is annotated with @DeveloperApi. Is there any way to start the Thrift server with a context other than programmatically in the code?
I saw that there is a configuration option, --conf spark.sql.hive.thriftServer.singleSession=true, that can be passed to ThriftServer on startup (sbin/start-thriftserver.sh), but I don't understand how to define it for a job. I tried to set this configuration property in the SparkSession builder, but beeline didn't display the temp tables.
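For reference, here is a minimal sketch of what that attempt looks like, assuming the property is passed through the builder's config(...) call before the session is created. The class name and app name are hypothetical placeholders, and the spark-sql and spark-hive-thriftserver artifacts are assumed to be on the classpath. It only illustrates the setup described above; whether the setting actually takes effect this way is the open question.

```java
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2;

// Hypothetical class name, used only for illustration.
public class SingleSessionThriftServer {
    public static void main(String[] args) {
        // Build a Hive-enabled session and set the singleSession property
        // on the builder instead of passing it to start-thriftserver.sh.
        SparkSession spark = SparkSession.builder()
            .appName("thrift-single-session") // hypothetical app name
            .config("spark.sql.hive.thriftServer.singleSession", "true")
            .enableHiveSupport()
            .getOrCreate();

        // Same call as in the question: start the Thrift server
        // against this session's SQLContext.
        HiveThriftServer2.startWithContext(spark.sqlContext());
    }
}
```

The master URL is not set here; it is assumed to come from spark-submit.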