This command works with HiveQL:
insert overwrite directory '/data/home.csv' select * from testtable;
But with Spark SQL I'm getting an error with an org.apache.spark.sql.hive.HiveQl
stack trace:
java.lang.RuntimeException: Unsupported language features in query:
insert overwrite directory '/data/home.csv' select * from testtable
Please guide me on how to export query results to CSV in Spark SQL.
You can use the statement below to write the contents of a DataFrame in CSV format:
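A minimal sketch of such a statement, assuming Spark 2.x or later with Hive support available. The application name and the output path /data/home/csv are placeholders rather than values from the original answer. Spark treats the path as a directory and writes one part file per partition into it.

```scala
import org.apache.spark.sql.SparkSession

// Reuses the active session if one exists (e.g. in spark-shell).
val spark = SparkSession.builder()
  .appName("csv-export")          // placeholder name
  .enableHiveSupport()            // assumes testtable lives in the Hive metastore
  .getOrCreate()

val df = spark.sql("SELECT * FROM testtable")

// Writes a directory of CSV part files; fails if the directory already exists.
df.write.csv("/data/home/csv")    // placeholder output directory
```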
If you need to write the whole DataFrame into a single CSV file, reduce it to one partition before writing:
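One way to sketch the single-file case, again assuming Spark 2.x or later; the output path is a placeholder. coalesce(1) funnels all rows through one task, so this is only sensible when the result fits comfortably on a single executor.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("csv-export-single")   // placeholder name
  .enableHiveSupport()
  .getOrCreate()

val df = spark.sql("SELECT * FROM testtable")

// One partition means one part file inside the output directory.
// The path is still a directory, not the CSV file itself.
df.coalesce(1).write.csv("/data/home/sample_csv")   // placeholder path
```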
For Spark 1.x, you can use spark-csv to write the results into CSV files.
The Scala snippet below would help:
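A sketch of what such a snippet might look like, assuming Spark 1.4 or later (for the df.write API) and the spark-csv package on the classpath, for instance added with the --packages option of spark-shell or spark-submit. The output path is a placeholder.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Reuses the existing context if there is one (e.g. `sc` in spark-shell).
val sc = SparkContext.getOrCreate(new SparkConf().setAppName("csv-export"))
val sqlContext = new HiveContext(sc)

val df = sqlContext.sql("SELECT * FROM testtable")

// spark-csv registers itself under this data source name.
df.write
  .format("com.databricks.spark.csv")
  .save("/data/home/csv")         // placeholder output directory
```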
To write the contents into a single file:
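For Spark 1.x the single-file variant could be sketched like this, under the same assumptions (Spark 1.4 or later, spark-csv on the classpath, placeholder output path):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = SparkContext.getOrCreate(new SparkConf().setAppName("csv-export-single"))
val sqlContext = new HiveContext(sc)

val df = sqlContext.sql("SELECT * FROM testtable")

// Collapse to one partition so spark-csv emits a single part file.
df.coalesce(1)
  .write
  .format("com.databricks.spark.csv")
  .save("/data/home/sample_csv")  // placeholder output directory
```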
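Since Spark 2.0, CSV support (previously the separate spark-csv package) is integrated as a native data source. Therefore, the necessary statement simplifies to a single write call, and only the output path differs between Windows and UNIX.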
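As an illustration of the native data source, the sketch below writes a header row and picks an output location per platform. Both paths are hypothetical examples, and the OS check is only there to show the two path styles side by side.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("csv-export-native")   // placeholder name
  .enableHiveSupport()
  .getOrCreate()

val df = spark.sql("SELECT * FROM testtable")

// Hypothetical output locations; local Windows paths need the file:/// scheme.
val isWindows = sys.props("os.name").toLowerCase.contains("win")
val outPath = if (isWindows) "file:///C:/tmp/out_csv" else "/tmp/out_csv"

// "header" makes Spark write the column names as the first line of each part file.
df.write.option("header", "true").csv(outPath)
```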
With the help of spark-csv we can write to a CSV file.
The answer above with spark-csv is correct, but there is an issue: the library creates several files based on the DataFrame partitioning, which is usually not what we need. So you can combine all partitions into one before writing, and then rename the output of the library (a file named "part-00000") to the desired filename.
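The rename step can be sketched with the Hadoop FileSystem API, assuming the DataFrame was already written with a single partition into a temporary directory. The helper name moveSinglePart and both paths are hypothetical, and the default Configuration is assumed to point at the same file system Spark wrote to.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper: moves the only part file out of `outputDir` to `target`
// and removes the now-unneeded directory. Returns false if no part file is found.
def moveSinglePart(outputDir: String, target: String,
                   conf: Configuration = new Configuration()): Boolean = {
  val fs = FileSystem.get(conf)
  // globStatus can return null, so wrap it before looking at the result.
  val parts = Option(fs.globStatus(new Path(outputDir, "part-*"))).getOrElse(Array.empty)
  parts.headOption match {
    case Some(status) =>
      val moved = fs.rename(status.getPath, new Path(target))
      if (moved) fs.delete(new Path(outputDir), true)   // recursive delete
      moved
    case None => false
  }
}

// Example call with placeholder paths:
// moveSinglePart("/data/home/csv_tmp", "/data/home/home.csv")
```

The glob "part-*" is used rather than the exact name because newer Spark versions append a suffix to the part file name.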
This blog post provides more details:
The error message suggests that this is not a supported feature of the query language. But you can save a DataFrame in any format as usual through the RDD interface. Alternatively, here is code that works on the DataFrame:
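The RDD route mentioned above might look roughly like the following, assuming Spark 1.3 or later. It joins each row's values with commas and does no quoting or escaping, so it is only suitable for data without embedded commas, quotes, or newlines. The output path is a placeholder.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = SparkContext.getOrCreate(new SparkConf().setAppName("csv-export-rdd"))
val sqlContext = new HiveContext(sc)

val df = sqlContext.sql("SELECT * FROM testtable")

// Turn every Row into one comma-separated line; nulls become empty fields.
val lines = df.rdd.map { row =>
  row.toSeq.map(v => if (v == null) "" else v.toString).mkString(",")
}

// Writes a directory of text part files, one per partition.
lines.saveAsTextFile("/data/home/csv_rdd")   // placeholder output directory
```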