I have [~] as the delimiter for some CSV files I am reading.
1[~]a[~]b[~]dd[~][~]ww[~][~]4[~]4[~][~][~][~][~]
I have tried this
val rddFile = sc.textFile("file.csv")
val rddTransformed = rddFile.map(eachLine=>eachLine.split("[~]"))
val df = rddTransformed.toDF()
display(df)
However, the issue with this is that each line comes back as a single array value, with stray [ and ] characters left in each field. So the array would be
["1[","]a[","]b[",...]
I can't use
val df = spark.read.option("sep", "[~]").csv("file.csv")
because a multi-character separator is not supported. What other approach can I take?
1[~]a[~]b[~]dd[~][~]ww[~][~]4[~]4[~][~][~][~][~]
2[~]a[~]b[~]dd[~][~]ww[~][~]4[~]4[~][~][~][~][~]
3[~]a[~]b[~]dd[~][~]ww[~][~]4[~]4[~][~][~][~][~]
Edit: this is not a duplicate. The linked thread is about multiple delimiters; this question is about a single multi-character delimiter.
try below
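The stray brackets most likely come from split treating its argument as a regular expression: "[~]" is a character class that matches only ~, so the [ and ] stay in the fields. Here is a minimal sketch of one way around that, assuming the delimiter is always the literal three characters [~]. The object name SplitOnTildeDelimiter and the helper splitLine are made up for illustration; Pattern.quote and String.split are the standard JDK calls.

```scala
import java.util.regex.Pattern

object SplitOnTildeDelimiter {
  // Pattern.quote escapes the brackets, so "[~]" is matched literally
  // instead of being read as a regex character class containing "~".
  private val delimiter: String = Pattern.quote("[~]")

  // A limit of -1 keeps the trailing empty fields; the default split
  // would silently drop them.
  def splitLine(line: String): Array[String] = line.split(delimiter, -1)

  def main(args: Array[String]): Unit = {
    val line = "1[~]a[~]b[~]dd[~][~]ww[~][~]4[~]4[~][~][~][~][~]"
    val fields = splitLine(line)
    println(fields.length)        // prints 14
    println(fields.mkString("|")) // prints 1|a|b|dd||ww||4|4|||||
  }
}
```

In the code from the question this would slot into the existing map, e.g. rddFile.map(line => SplitOnTildeDelimiter.splitLine(line)). The result is still one array per line, so if you need separate columns you would have to pull the elements out by index afterwards.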
For your other requirement: