I'm trying to run SQL with Hive on MR, and it fails halfway through with the errors below:
Application application_1570514228864_0001 failed 2 times due to AM Container for appattempt_1570514228864_0001_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: [2019-10-08 13:57:49.272]Failed to download resource { { s3a://tpcds/tmp/hadoop-yarn/staging/root/.staging/job_1570514228864_0001/libjars, 1570514262820, FILE, null },pending,[(container_1570514228864_0001_02_000001)],1132444167207544,DOWNLOADING} java.io.IOException: Resource s3a://tpcds/tmp/hadoop-yarn/staging/root/.staging/job_1570514228864_0001/libjars changed on src filesystem (expected 1570514262820, was 1570514269265
From my perspective, the key message in the error log is `libjars changed on src filesystem (expected 1570514262820, was 1570514269265`. There are several threads about this issue on SO, such as thread1 and thread2, but none of them has been answered yet.
I found some useful information in the Apache JIRA and Red Hat Bugzilla. I synced the clocks with NTP across all related nodes, but the same issue is still there.
Any comments are welcome, thanks.
I still don't know why the timestamp of the resource file is inconsistent, and AFAIK there is no way to fix it through configuration.
However, I managed to find a workaround that skips the issue. Let me summarize it here for anyone who might run into the same issue.
By searching the Hadoop source code for the error message, we can trace the issue to `hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java`.
Just remove the exception-throwing statements there.
Build `hadoop-yarn-project` and copy `hadoop-yarn-common-x.x.x.jar` to `$HADOOP_HOME/share/hadoop/yarn`.

I'll leave this thread here. Thanks for any further explanation of how to fix it without changing the Hadoop source.
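
For anyone who wants to see what the workaround disables, here is a minimal standalone sketch of the kind of check that produces this message. It is not the actual Hadoop code: the class name `TimestampCheckSketch` and the method `verifyUnchanged` are made up for illustration, and the real logic in `FSDownload.java` may differ between Hadoop versions. The assumption is only that the modification time recorded at job submission is compared with the one seen at download time, which is what the error text suggests.

```java
import java.io.IOException;

public class TimestampCheckSketch {

    // Hypothetical stand-in for the check in FSDownload.java: the timestamp
    // recorded when the job was submitted must equal the timestamp seen when
    // the resource is downloaded, otherwise the download is rejected.
    static void verifyUnchanged(String path, long expectedMtime, long actualMtime)
            throws IOException {
        if (actualMtime != expectedMtime) {
            // Message format copied from the error log in the question.
            throw new IOException("Resource " + path
                    + " changed on src filesystem (expected " + expectedMtime
                    + ", was " + actualMtime);
        }
    }

    public static void main(String[] args) {
        // Path and timestamps taken from the error log in the question.
        String path = "s3a://tpcds/tmp/hadoop-yarn/staging/root/.staging/"
                + "job_1570514228864_0001/libjars";
        long expected = 1570514262820L;
        long actual = 1570514269265L;

        // How far apart the two timestamps are, in milliseconds.
        System.out.println("Difference in ms: " + (actual - expected));

        try {
            verifyUnchanged(path, expected, actual);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the values from the log, the two timestamps are 6445 ms apart, so the check fails even though nothing was deliberately modified. Removing the `throw` is what the workaround above amounts to.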