I have the query below, which is supposed to compute the average of a column and return a single number as the result.
val avgVal = hiveContext.sql("select round(avg(amount), 4) from users.payment where dt between '2018-05-09' and '2018-05-09'").first().getDouble(0)
I'm seeing inconsistent behavior with this statement. It often fails with the error below, even though the same query returns non-NULL results when executed directly through Hive.
18/05/10 11:01:12 ERROR ApplicationMaster: User class threw exception: java.lang.NullPointerException: Value at index 0 in null
java.lang.NullPointerException: Value at index 0 in null
at org.apache.spark.sql.Row$class.getAnyValAs(Row.scala:475)
at org.apache.spark.sql.Row$class.getDouble(Row.scala:243)
at org.apache.spark.sql.catalyst.expressions.GenericRow.getDouble(rows.scala:192)
The reason I use HiveContext instead of SQLContext is that the latter doesn't support some of the aggregation functions that I use extensively in my code.
Could you please help me understand why this problem occurs and how to solve it?
You need to split the query execution and the value retrieval into two parts:
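A minimal sketch of that split, using the query from the question (this assumes `hiveContext` is already in scope, as in the original snippet):

```scala
// Hypothetical sketch: run the query first, then check the row for null
// before extracting the value.
val row = hiveContext.sql(
  "select round(avg(amount), 4) from users.payment " +
  "where dt between '2018-05-09' and '2018-05-09'").first()

// avg() evaluates to NULL when no rows match the filter (or all values
// are NULL), so guard the extraction instead of calling getDouble(0) directly.
val avgVal: Option[Double] =
  if (row.isNullAt(0)) None else Some(row.getDouble(0))
```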
This avoids the NPE. The same null check is needed when working with a List or an array of values.
For insert or update queries, you should additionally surround the call with a `try...catch` block to handle runtime exceptions. Now let's analyze the case and the possible reasons why this exception can be thrown.
If the row returned above contains a null at index 0, i.e. something like `[null]` (which happens when `avg()` runs over zero matching rows, or over values that are all NULL), then trying to call `getDouble(0)` will lead to:

java.lang.NullPointerException: Value at index 0 in null
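This failure mode can be reproduced without a cluster, because the check lives in `Row` itself (the same class that appears in the stack trace above):

```scala
import org.apache.spark.sql.Row

// A row whose only column is null — the shape avg() produces
// when there is no input. Extracting it as a Double throws.
val row = Row(null)

row.isNullAt(0)   // true
row.getDouble(0)  // throws java.lang.NullPointerException: Value at index 0 in null
```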
You can try the following approach:
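One safe variant is to collect the result and pattern-match on it, falling back to a default. The `0.0` fallback here is an assumption; pick whatever "no data" value fits your use case:

```scala
// Hypothetical safer retrieval: treat both "no rows" and "NULL average"
// as the absence of a result.
val result: Double = hiveContext.sql(
    "select round(avg(amount), 4) from users.payment " +
    "where dt between '2018-05-09' and '2018-05-09'")
  .collect()
  .headOption match {
    case Some(row) if !row.isNullAt(0) => row.getDouble(0)
    case _                             => 0.0 // assumed default
  }
```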
If you check the Spark sources, you'll see that the library does the opposite check internally: instead of returning null, it explicitly throws when the slot is null.
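The `Row.getDouble` call in your stack trace goes through a private helper roughly like the following (paraphrased from `Row.scala`; the exception message matches the one you reported, so `getDouble(0)` can never succeed on a NULL aggregate result):

```scala
// Paraphrase of the helper behind Row.getDouble (see Row.scala in the
// stack trace): a null slot is turned into an NPE rather than returned.
private def getAnyValAs[T <: AnyVal](i: Int): T =
  if (isNullAt(i)) throw new NullPointerException(s"Value at index $i in null")
  else getAs[T](i)
```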