I have the following scenario:
I have two DataFrames, each containing only one column. Let's say:
DF1=(1,2,3,4,5)
DF2=(3,6,7,8,9,10)
Basically, those values are keys, and I want to create a Parquet file from DF1 only if none of the keys in DF1 are in DF2 (in the current example the check should return false). My current way of achieving this is:
val df1count = DF1.count
val df2count = DF2.count
val diffDF = DF2.except(DF1)
val diffCount = diffDF.count
if (diffCount == (df2count - df1count)) true
else false
The problem with this approach is that I am calling actions 4 times, which is surely not the best way. Can someone suggest a more efficient way of achieving this?
You can use the function below:
In your case, run this:
Example:
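As a minimal sketch of what such a function could look like (not necessarily the exact one meant above), the check can be done with a single action by intersecting the two DataFrames and fetching at most one row. The names `hasNoCommonKeys`, the column name `key`, and the output path are hypothetical, and the sketch assumes both DataFrames have the same single-column schema and that a local Spark session is acceptable for testing.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DisjointKeysExample {
  // Hypothetical helper: true when no row of `left` also appears in `right`.
  // head(1) is the only action, so Spark can stop at the first common row.
  def hasNoCommonKeys(left: DataFrame, right: DataFrame): Boolean =
    left.intersect(right).head(1).isEmpty

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")            // assumption: local run for illustration
      .appName("disjoint-keys")
      .getOrCreate()
    import spark.implicits._

    // Sample data from the question; "key" is an assumed column name.
    val df1 = Seq(1, 2, 3, 4, 5).toDF("key")
    val df2 = Seq(3, 6, 7, 8, 9, 10).toDF("key")

    val disjoint = hasNoCommonKeys(df1, df2)
    println(disjoint) // false, because key 3 is present in both

    // Write DF1 only when it shares no keys with DF2 (hypothetical path).
    if (disjoint) df1.write.parquet("/tmp/df1_keys.parquet")

    spark.stop()
  }
}
```

Compared with the three `count` calls in the question, this triggers one job for the check, plus the write when the check passes.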
Here is a way to get the uncommon rows between two DataFrames:
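One possible sketch, assuming "uncommon rows" means rows that appear in exactly one of the two DataFrames, is to union the two `except` results. The helper name `uncommonRows` and the column name `key` are hypothetical, and both DataFrames are assumed to share the same schema.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object UncommonRowsExample {
  // Hypothetical helper: rows present in exactly one of the two DataFrames.
  // `except` removes duplicates, so each uncommon row appears once.
  def uncommonRows(a: DataFrame, b: DataFrame): DataFrame =
    a.except(b).union(b.except(a))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")            // assumption: local run for illustration
      .appName("uncommon-rows")
      .getOrCreate()
    import spark.implicits._

    // Sample data from the question; "key" is an assumed column name.
    val df1 = Seq(1, 2, 3, 4, 5).toDF("key")
    val df2 = Seq(3, 6, 7, 8, 9, 10).toDF("key")

    // Contains 1, 2, 4, 5, 6, 7, 8, 9, 10 (row order is not guaranteed).
    uncommonRows(df1, df2).show()

    spark.stop()
  }
}
```

If only the yes/no answer from the question is needed, computing the full difference is more work than necessary; it is mainly useful when the uncommon keys themselves are wanted.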