SMB join not working over Hive Tables

Posted 2019-09-02 05:59

Question:

When I run an SMB (sort-merge-bucket) join over two ORC tables, both bucketed and sorted on subscription_id, the join fails with the error below:

Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:210)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.joinFinalLeftData(SMBMapJoinOperator.java:345)
    at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.closeOp(SMBMapJoinOperator.java:610)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSMBMapJoinOperator.closeOp(VectorSMBMapJoinOperator.java:275)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:617)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:192)
    ... 8 more

The task tracker URL doesn't give much detail either.

The query is:

SELECT * FROM 
user_plays_buck
INNER JOIN small_user_subscription_buck
ON user_plays_buck.subscription_id = small_user_subscription_buck.subscription_id
LIMIT 1;

Answer 1:

I hit exactly the same issue in Hive 1.1. The same query works in Hive 2.1, so upgrading Hive should fix it.
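
If upgrading is not possible right away, one stopgap that may be worth trying is to keep Hive from planning the operator that fails. This is an assumption drawn from the stack trace (the NullPointerException is raised in SMBMapJoinOperator, reached through VectorSMBMapJoinOperator), not something confirmed in this thread. The property names below are standard Hive settings, but whether they avoid the error on Hive 1.1 is untested here.

```sql
-- Option 1 (assumption): turn off vectorized execution so the vectorized
-- SMB operator from the trace is not used. The NPE is raised in the
-- non-vectorized parent class, so this alone may not be enough.
SET hive.vectorized.execution.enabled=false;

-- Option 2 (assumption): stop Hive from converting the join into an SMB join,
-- so the planner falls back to another join strategy.
SET hive.auto.convert.sortmerge.join=false;
SET hive.optimize.bucketmapjoin.sortedmerge=false;

-- Re-run the query from the question with the settings above in effect.
SELECT *
FROM user_plays_buck
INNER JOIN small_user_subscription_buck
  ON user_plays_buck.subscription_id = small_user_subscription_buck.subscription_id
LIMIT 1;
```

Option 2 gives up the SMB optimization entirely, so treat it as a temporary workaround rather than a fix. Running EXPLAIN on the query before and after changing the settings shows which join operator Hive actually plans.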



Tags: hive hiveql