How to convert a JSON array to CSV in Spark

Posted 2019-05-30 05:22

Question:

I have tried this query to get the required experience from LinkedIn data.

Dataset<Row> filteredData = spark
        .sql("select full_name, experience from (select *, explode(experience['title']) exp from tempTable) a "
                + "where lower(exp) like '%developer%'");

But I got an error with that query.

Finally, I tried the following, but I got multiple rows with the same name:

Dataset<Row> filteredData = spark
        .sql("select full_name, explode(experience) from (select *, explode(experience['title']) exp from tempTable) a "
                + "where lower(exp) like '%developer%'");

Please give me a hint on how to convert an array of strings to a comma-separated string in the same column.

Answer 1:

You can apply a UDF to build a comma-separated string.

Create a UDF like this:

import scala.collection.mutable.WrappedArray
def mkString(value: WrappedArray[String]): String = value.mkString(",")

Register the UDF with the SQLContext:

sqlContext.udf.register("mkstring", mkString _)

Apply it in a Spark SQL query:

sqlContext.sql("select mkstring(columnName) from tableName")

It will return the array's values as a single comma-separated string.
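
For reference, here is a minimal, self-contained sketch of the same approach end to end. The data, the column names (full_name, titles), and the local SparkSession setup are hypothetical, and it assumes Spark 2.x, where array columns arrive in a Scala UDF as a WrappedArray:

import org.apache.spark.sql.SparkSession
import scala.collection.mutable.WrappedArray

object MkStringExample {
  // Same UDF as above: joins an array of strings with commas.
  def mkString(value: WrappedArray[String]): String = value.mkString(",")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("mkstring-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical data: one row per person with an array of job titles.
    Seq(
      ("Alice", Seq("Java Developer", "Team Lead")),
      ("Bob",   Seq("QA Engineer"))
    ).toDF("full_name", "titles").createOrReplaceTempView("tempTable")

    // Register the UDF and flatten the array column into a single CSV-friendly string.
    spark.udf.register("mkstring", mkString _)
    spark.sql("select full_name, mkstring(titles) as titles from tempTable").show(false)
    // Alice -> "Java Developer,Team Lead", Bob -> "QA Engineer"
  }
}

Depending on your Spark version, the built-in concat_ws function (for example concat_ws(',', titles)) may produce the same comma-separated string without a custom UDF.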