Error while trying to save the data to Hive tables

Posted 2019-09-19 12:50

We are hitting the following error when we try to insert data into a Hive table.

Job aborted due to stage failure: Task 5 in stage 65.0 failed 4 times, most recent failure: Lost task 5.3 in stage 65.0 (TID 987, tnblf585.test.sprint.com): java.lang.ArrayIndexOutOfBoundsException: 45
    at org.apache.spark.sql.catalyst.expressions.GenericMutableRow.genericGet(rows.scala:254)
    at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getAs(rows.scala:35)
    at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.isNullAt(rows.scala:36)
    at org.apache.spark.sql.catalyst.expressions.GenericMutableRow.isNullAt(rows.scala:248)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1$1.apply(InsertIntoHiveTable.scala:107)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1$1.apply(InsertIntoHiveTable.scala:104)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1(InsertIntoHiveTable.scala:104)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:84)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:84)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:

1 Answer

萌系小妹纸
2019-09-19 13:15

I figured out that one of the column names in the DataFrame did not match the corresponding column name in the Hive table. After correcting the column name, the data loaded correctly.
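In case it helps anyone hitting the same ArrayIndexOutOfBoundsException, below is a minimal Spark/Scala sketch of how one might verify that the DataFrame column names line up with the target Hive table before inserting. The database, table, and column names here (mydb.staging_table, mydb.target_table, custid/cust_id) are hypothetical placeholders, and the API assumes Spark 2.x with Hive support enabled.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("hive-insert-check")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical source DataFrame
    val df = spark.table("mydb.staging_table")

    // Column names of the target Hive table, in table order
    val tableCols = spark.table("mydb.target_table").columns

    // Report any table columns that have no matching name in the DataFrame
    val missing = tableCols.filterNot(c => df.columns.contains(c))
    if (missing.nonEmpty)
      println(s"Columns missing or misnamed in the DataFrame: ${missing.mkString(", ")}")

    // Fix the offending column name (hypothetical rename), then select the
    // columns in the table's order before the position-based insert
    val renamed = df.withColumnRenamed("custid", "cust_id")
    val aligned = renamed.select(tableCols.map(renamed.col): _*)

    aligned.write.mode("append").insertInto("mydb.target_table")

Note that insertInto resolves columns by position rather than by name, so selecting the DataFrame columns in the table's column order (as above) also avoids silently shifting values once the names match.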
