JavaPairRDD has a saveAsTextFile function, with which you can save data in a text format.
However, what I need is to save the data as a CSV file so that I can use it later with Neo4j.
My question is:
How can I save the JavaPairRDD's data in CSV format? Or is there a way to transform the RDD from:
Key Value
Jack [a,b,c]
to:
Key Value
Jack a
Jack b
Jack c
You should use the flatMapValues function on your JavaPairRDD. Its documentation describes it as follows:
"Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning."
If the function you pass simply returns the value (the list itself), the result contains one record per element of each input list, and every record keeps its original key.
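
To make the transformation concrete, here is a minimal sketch in plain Java, with no Spark dependency, that mimics what flatMapValues does to the example data and then formats each pair as a "key,value" line. The class name FlattenPairs and the helper toCsvLines are made up for illustration and are not part of Spark. The sketch also assumes that keys and values contain no commas, quotes or newlines, so it does no CSV escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlattenPairs {

    // Hypothetical helper: emits one "key,value" line per list element,
    // which mirrors flatMapValues followed by a formatting step.
    // Assumption: no field needs CSV quoting or escaping.
    static List<String> toCsvLines(Map<String, List<String>> pairs) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : pairs.entrySet()) {
            for (String value : entry.getValue()) {
                lines.add(entry.getKey() + "," + value);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Same data as in the question: Jack -> [a, b, c]
        Map<String, List<String>> pairs = new LinkedHashMap<>();
        pairs.put("Jack", Arrays.asList("a", "b", "c"));

        // Print one CSV line per (key, element) pair.
        toCsvLines(pairs).forEach(System.out::println);
    }
}
```

In Spark itself the same two steps would be flatMapValues on the JavaPairRDD, then a map that joins tuple._1() and tuple._2() with a comma, then saveAsTextFile on the resulting RDD. Two things to watch: depending on the Spark version, the function passed to flatMapValues must return either an Iterable (the list itself) or an Iterator (list.iterator()), and saveAsTextFile writes a directory of part files rather than a single CSV file.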