Original DataFrame:
0.2 0.3
+------+---------+
| name | country |
+------+---------+
| Raju | UAS     |
| Ram  | Pak.    |
| null | China   |
| null | null    |
+------+---------+
I need this:
+------+--------+
| Nwet | wetCon |
+------+--------+
| 0.2  | 0.3    |
| 0.2  | 0.3    |
| 0.0  | 0.3    |
| 0.0  | 0.0    |
+------+--------+
I want to create a single UDF that works for both columns.
Applied to the name column, it should return 0.2 if the value is not null and 0.0 otherwise.
Applied to the country column, the same UDF should return 0.3 if the value is not null and 0.0 if it is null.
Using StringUtils from Apache Commons Lang (isBlank is true for null, empty, and whitespace-only strings):
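// Assumes Apache Commons Lang 3 is on the classpath
import org.apache.commons.lang3.StringUtils
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, udf}
import org.apache.spark.sql.types.DoubleType

// 0.2 when name is present, 0.0 when it is null or blank
val transcodificationName: UserDefinedFunction =
  udf { (name: String) =>
    if (StringUtils.isBlank(name)) 0.0
    else 0.2
  }

// 0.3 when country is present, 0.0 when it is null or blank
val transcodificationCountry: UserDefinedFunction =
  udf { (country: String) =>
    if (StringUtils.isBlank(country)) 0.0
    else 0.3
  }

// cast applies to the Column returned by the UDF, not to the DataFrame
dataframe
  .withColumn("Nwet", transcodificationName(col("name")).cast(DoubleType))
  .withColumn("wetCon", transcodificationCountry(col("country")).cast(DoubleType))
  .select("Nwet", "wetCon")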
Edit: a single UDF for both columns, taking the column name as a second argument:
// Also needs: import org.apache.spark.sql.functions.lit
val transcodificationColumns: UserDefinedFunction =
  udf { (input: String, columnName: String) =>
    if (StringUtils.isBlank(input)) 0.0
    else if (columnName == "name") 0.2
    else if (columnName == "country") 0.3
    else 0.0
  }

// The column name must be passed as a literal Column, hence lit(...)
dataframe
  .withColumn("Nwet", transcodificationColumns(col("name"), lit("name")).cast(DoubleType))
  .withColumn("wetCon", transcodificationColumns(col("country"), lit("country")).cast(DoubleType))
  .select("Nwet", "wetCon")
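Another way to get "one UDF for both columns" is to parameterize it by the weight instead of the column name. The sketch below is only an illustration under some assumptions: the names NullWeights, weightOf, weightUdf and weighted are made up for this example, the sample rows are copied from the question, and it checks only for null (not blank strings), as the question describes. It also shows a UDF-free variant built on Spark's when/otherwise, which may be preferable because Spark can optimize built-in expressions.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, lit, udf, when}

// Hypothetical helper object, not part of the original answer.
object NullWeights {
  // Rule from the question: null -> 0.0, anything else -> the column's weight.
  def weightOf(value: String, weight: Double): Double =
    if (value == null) 0.0 else weight

  // One UDF definition, reused for any column by passing its weight.
  def weightUdf(weight: Double): UserDefinedFunction =
    udf((value: String) => weightOf(value, weight))

  // Same rule as a plain column expression, with no UDF involved.
  def weighted(columnName: String, weight: Double): Column =
    when(col(columnName).isNotNull, lit(weight)).otherwise(lit(0.0))

  // UDF-based version.
  def withUdf(df: DataFrame): DataFrame =
    df.select(
      weightUdf(0.2)(col("name")).as("Nwet"),
      weightUdf(0.3)(col("country")).as("wetCon")
    )

  // Built-in version.
  def withBuiltins(df: DataFrame): DataFrame =
    df.select(
      weighted("name", 0.2).as("Nwet"),
      weighted("country", 0.3).as("wetCon")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("null-weights")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the question.
    val df = Seq[(String, String)](
      ("Raju", "UAS"),
      ("Ram", "Pak."),
      (null, "China"),
      (null, null)
    ).toDF("name", "country")

    withUdf(df).show()
    withBuiltins(df).show()

    spark.stop()
  }
}
```

Both versions should give the same two columns; the when/otherwise form also avoids the extra cast, since lit(0.2) and lit(0.0) are already doubles.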