I am new to Spark SQL but familiar with Hive's query execution framework. I would like to understand, at a technical level, how Spark executes SQL queries.
If I run the commands below:
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlContext.sql("select count(distinct(id)) from test.emp").collect
In Hive this would be converted into a MapReduce job, but how does it get executed in Spark?
How does the Hive metastore come into the picture?
Thanks in advance.
To answer your question briefly: no, HiveContext will not start a MapReduce job. Your SQL query will still use the Spark engine.
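You can verify this yourself by asking Spark for the query plan instead of executing the query: the physical plan is built from Spark operators, not map/reduce stages. A minimal sketch, assuming a Spark shell where `sc` is already defined and a `test.emp` table exists in the metastore:

```scala
// Sketch for a Spark 1.x shell session; assumes `sc` (SparkContext)
// exists and that test.emp is registered in the Hive metastore.
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

val df = sqlContext.sql("select count(distinct(id)) from test.emp")

// Print the logical and physical plans without running the query.
// The physical plan consists of Spark operators (aggregates, shuffle
// exchanges, table scans) -- there is no map-reduce stage anywhere.
df.explain(true)

// Only an action such as collect() actually triggers execution,
// which runs as Spark jobs, stages, and tasks.
df.collect()
```

In other words, HiveContext changes how the query is parsed and where table metadata comes from, not which engine runs it.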
As the Spark documentation explains, HiveContext enhances Spark's query parsing (HiveQL support) and gives access to existing Hive tables, and it can even persist your result DataFrames as Hive tables. Going further, Hive itself can use Spark as its execution engine instead of MR or Tez.
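Persisting a result back into the metastore can be sketched as follows (the output table name `test.emp_ids` is made up for illustration; assumes a HiveContext as above):

```scala
// Assumes sqlContext is a HiveContext and test.emp exists in the metastore.
val ids = sqlContext.sql("select distinct id from test.emp")

// saveAsTable writes the DataFrame out and registers it in the Hive
// metastore, so plain Hive clients can also query it afterwards.
ids.write.saveAsTable("test.emp_ids")  // hypothetical table name
```

After this, `test.emp_ids` behaves like any other Hive table, regardless of which engine later reads it.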
The Hive metastore holds metadata about Hive tables (schemas, storage locations, partitions). When you use HiveContext, Spark connects to this metastore service to resolve table names like test.emp. Refer to the documentation: http://spark.apache.org/docs/latest/sql-programming-guide.html
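Concretely, Spark locates the metastore through a hive-site.xml on its classpath (typically Spark's conf/ directory). A minimal sketch pointing Spark at an existing remote metastore service, with a placeholder host and port you would adjust to your deployment:

```xml
<!-- conf/hive-site.xml: thrift URI below is a placeholder -->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>
```

Without this file, Spark falls back to creating a local metastore (metastore_db) in the current directory.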