UDF taking two parameters on a Spark DataFrame [duplicate]

Posted 2019-09-02 11:05


姐就是有狂的资本


Question:

This question is an exact duplicate of:

spark data frame operation row and column level useing scala (1 answer)

Original DataFrame:

+------+--------+
|  name| country|
+------+--------+
|Raju  |UAS     |
|Ram   |Pak     |
|null  |China   |
|null  |null    |
+------+--------+

I need this:
+-------+-------+
|Namewet|wet Con|
+-------+-------+
|0.2    |0.3    |
|0.2    |0.3    |
|0.0    |0.3    |
|0.0    |0.0    |
+-------+-------+

I want to create one UDF that works for both columns. Applied to the name column, it should return 0.2 if the value is not null and 0.0 otherwise. Applied to the country column, the same UDF should return 0.3 if the value is not null and 0.0 otherwise.

Answer 1:

You don't need a UDF.

You can do something like this:

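import org.apache.spark.sql.functions.when
import spark.implicits._ // enables $"col"; assumes a SparkSession named spark

df
  .select(
    // 0.2 when name is present, 0.0 when it is null
    when($"name".isNotNull, 0.2).otherwise(0.0).as("Namewet"),
    // 0.3 when country is present, 0.0 when it is null
    when($"country".isNotNull, 0.3).otherwise(0.0).as("wet Con")
    // Select more columns as required
  )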
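
If the goal is still a single reusable piece of logic for both columns, as the question asks, one option is to wrap the when/otherwise expression in an ordinary Scala function that takes the column and the weight. This is only a minimal sketch, not the answerer's code: `WeightExample`, `weightIfPresent`, `sampleWeights`, and the local SparkSession are names assumed here for illustration, and the sample rows are copied from the question's table.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, when}

object WeightExample {
  // Hypothetical helper: yields `weight` where the column is non-null, 0.0 otherwise.
  def weightIfPresent(c: Column, weight: Double): Column =
    when(c.isNotNull, lit(weight)).otherwise(lit(0.0))

  // Rebuilds the question's sample data and applies the helper to both columns.
  def sampleWeights(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Option[String] models the nullable cells in the original table.
    val df = Seq[(Option[String], Option[String])](
      (Some("Raju"), Some("UAS")),
      (Some("Ram"), Some("Pak")),
      (None, Some("China")),
      (None, None)
    ).toDF("name", "country")

    df.select(
      weightIfPresent(col("name"), 0.2).as("Namewet"),
      weightIfPresent(col("country"), 0.3).as("wet Con")
    )
  }

  def main(args: Array[String]): Unit = {
    // Local session for demonstration only; a real job would reuse its own session.
    val spark = SparkSession.builder().master("local[1]").appName("weights").getOrCreate()
    sampleWeights(spark).show()
    spark.stop()
  }
}
```

Because the helper returns a plain Column expression rather than a registered UDF, Spark can still optimize it like any other built-in expression, and the same function serves both columns with a different weight argument.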