I am reading a file into a dataframe like this
val df = spark.read
.option("sep", props.inputSeperator)
.option("header", "true")
.option("badRecordsPath", "/mnt/adls/udf_databricks/error")
.csv(inputLoc)
The file is set up like this:
col_a|col_b|col_c|col_d
1|first|last|
2|this|is|data
3|ok
4|more||stuff
5|||
Now, Spark will read all of this as acceptable data. However, I want 3|ok
to be marked as a bad record because its size does not match the header size. Is this possible?
The badRecordsPath option is supported by the Databricks implementation of Spark. I don't see a schema mapping in your code; could you map one and try?
Change your code like below:
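Here is a minimal sketch of that change, assuming the four columns from your header (col_a read as an integer, the rest as strings; adjust the types to match your actual data):

import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Explicit schema instead of letting Spark infer one from the header
val schema = StructType(Seq(
  StructField("col_a", IntegerType, nullable = true),
  StructField("col_b", StringType, nullable = true),
  StructField("col_c", StringType, nullable = true),
  StructField("col_d", StringType, nullable = true)
))

val df = spark.read
  .schema(schema)  // map the schema before reading
  .option("sep", props.inputSeperator)
  .option("header", "true")
  .option("badRecordsPath", "/mnt/adls/udf_databricks/error")
  .csv(inputLoc)

The idea is that an explicit schema gives Spark something to validate each row against, so rows that don't conform to it, like 3|ok, can be routed to badRecordsPath rather than silently padded with nulls.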
For more details, you can refer to the Databricks documentation on badRecordsPath.
Thanks, Karthick