I want to add a "Total" column to a dataframe.
It should hold the sum of every column in each row EXCEPT the uid column. Starting from this:
uid val1 val2 val3
3213 1 2 3
I want to produce this:
uid val1 val2 val3 Total
3213 1 2 3 6
So, I need to filter out the UID, then sum. However, if I drop the UID before summing, then I won't be able to rejoin the tables after summing (as the join would have to be on UID).
I was experimenting with filter, but I cannot find a way to get the column name inside the filter function.
So what I have so far is:
val dfvReducedTotalled = dfvReduced.withColumn("TOTAL", dfvReduced.columns
.filter(col=> !col.?????? == "UID")
.map(c => col(c)).reduce((c1, c2) => c1 + c2))
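For reference, here is a minimal sketch of one way the gap above might be filled. It relies on the fact that `columns` on a DataFrame returns an `Array[String]`, so each element passed to `filter` is already the column name and no accessor is needed. The local `SparkSession`, the sample data, and the name `valueColumns` are assumptions made only for illustration; `dfvReduced` here is a stand-in for the real dataframe.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumption: a local session purely for this sketch; a real job already has one.
val spark = SparkSession.builder().master("local[1]").appName("total-sketch").getOrCreate()
import spark.implicits._

// Assumption: stand-in data shaped like the example in the question.
val dfvReduced = Seq((3213, 1, 2, 3)).toDF("uid", "val1", "val2", "val3")

// columns is an Array[String], so `name` is the column name itself.
// The comparison ignores case, so both "uid" and "UID" are excluded.
val valueColumns = dfvReduced.columns.filter(name => !name.equalsIgnoreCase("uid"))

// Convert the remaining names to Column objects and add them together.
val dfvReducedTotalled =
  dfvReduced.withColumn("Total", valueColumns.map(col).reduce(_ + _))

dfvReducedTotalled.show()
```

Because the uid column is only excluded from the sum and never dropped, no rejoin is needed. Note that `+` yields null for a row if any of the summed values is null, so nullable columns may need handling first.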