I need to implement an auto-increment column in my Spark SQL table. How could I do that? Kindly guide me. I am using PySpark 2.0.
Thank you Kalyan
I would write (or reuse) a stateful Hive UDF and register it with PySpark, since Spark SQL has good support for Hive. Check this line
@UDFType(deterministic = false, stateful = true)
in the code below to make sure it's a stateful UDF. Then build the jar and add its location when pyspark starts.
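A sketch of such a UDF, modeled on Hive's well-known `UDFRowSequence` from hive-contrib (the package name here is an assumption; compiling it requires hive-exec and hadoop-common on the classpath):

```java
package com.example.udf; // assumed package name

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.ql.udf.UDFType;
import org.apache.hadoop.io.LongWritable;

// stateful = true tells Hive the UDF keeps state across rows, so the
// counter is advanced exactly once per row; deterministic = false
// prevents the optimizer from constant-folding the call.
@Description(name = "row_seq",
    value = "_FUNC_() - Returns a generated row sequence number starting from 1")
@UDFType(deterministic = false, stateful = true)
public class UDFRowSequence extends UDF {
  private final LongWritable result = new LongWritable();

  public UDFRowSequence() {
    result.set(0);
  }

  public LongWritable evaluate() {
    result.set(result.get() + 1); // increment once per input row
    return result;
  }
}
```

Package this class into a jar and pass it with `pyspark --jars /path/to/your-udf.jar` when starting the shell.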
Then register it with sqlContext and use row_seq() in your SELECT query.

See: Project to use Hive UDFs in pySpark