I am trying to create a Hive table from a list of case class instances, but it does not let me specify the database name. The error below is thrown.
Spark Version: 1.6.2
Error: diagnostics: User class threw exception: org.apache.spark.sql.AnalysisException: Table not found: mytempTable; line 1 pos 58
Please let me know how to save the output of the map method to a Hive table with the same structure as the case class.
Note: the recordArray list is populated in the map method (in the getElem() method, in fact) for the given input.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext

object testing extends Serializable {
var recordArray=List[Record]();
def main(args:Array[String])
{
val inputpath = args(0).toString();
val outputpath=args(1).toString();
val conf = new SparkConf().setAppName("jsonParsing")
val sc = new SparkContext(conf)
val sqlContext= new SQLContext(sc)
val hsc = new HiveContext(sc)
val input = sc.textFile(inputpath)
//val input=sc.textFile("file:///Users/Documents/Work/data/mydata.txt")
// input.collect().foreach(println)
val parsed = input.map(data => getElem(parse(data, false))) // populates recordArray as a side effect
val recordRDD = sc.parallelize(recordArray)
//
val recordDF=sqlContext.createDataFrame(recordRDD)
recordDF.registerTempTable("mytempTable")
hsc.sql("create table dev_db.ingestion as select * from mytempTable")
}
case class Record(summary_key: String, key: String,array_name_position:Int,Parent_Level_1:String,Parent_level_2:String,Parent_Level_3:String,Parent_level_4:String,Parent_level_5:String,
param_name_position:Integer,Array_name:String,paramname:String,paramvalue:String)
}
You need to have/create a HiveContext and register the temp table on it. In your code, mytempTable is registered through sqlContext but queried through hsc; in Spark 1.6, SQLContext and HiveContext keep separate temporary-table catalogs, which is why the table is not found. Then either save the DataFrame directly or select the columns to store as a Hive table (recordDF is the DataFrame).
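A minimal sketch of that fix, reusing the names from the question (everything else, such as the surrounding setup, is assumed to be in place):

```scala
// Build and register the temp table through the HiveContext so that
// hsc.sql can resolve it (SQLContext and HiveContext do not share
// temporary tables in Spark 1.6).
val recordRDD = sc.parallelize(recordArray)
val recordDF = hsc.createDataFrame(recordRDD)   // use hsc, not sqlContext
recordDF.registerTempTable("mytempTable")       // registered with hsc's catalog
hsc.sql("create table dev_db.ingestion as select * from mytempTable")
```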
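Alternatively, skip the temp table entirely and write the DataFrame as a Hive table. A sketch (the table name dev_db.ingestion and the column names come from the question; the choice of SaveMode is illustrative):

```scala
import org.apache.spark.sql.SaveMode

// Write the whole DataFrame as a Hive table:
recordDF.write.mode(SaveMode.Overwrite).saveAsTable("dev_db.ingestion")

// Or store only selected columns:
recordDF.select("summary_key", "key", "paramname", "paramvalue")
  .write.mode(SaveMode.Append)
  .saveAsTable("dev_db.ingestion")
```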
The available SaveModes are Append, Ignore, Overwrite, and ErrorIfExists.
To add the definition of HiveContext from the Spark documentation: in addition to the basic SQLContext, you can also create a HiveContext, which provides a superset of the functionality of the basic SQLContext, including the ability to write queries using the more complete HiveQL parser, access to Hive UDFs, and the ability to read data from Hive tables.
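For reference, creating one only requires an existing SparkContext (this assumes hive-site.xml is on the classpath so the Hive metastore can be located):

```scala
import org.apache.spark.sql.hive.HiveContext

// A HiveContext wraps an existing SparkContext; the metastore location
// is picked up from hive-site.xml (deployment-specific assumption).
val hsc = new HiveContext(sc) // sc: an existing SparkContext
```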