My query fails with java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.1.0.
Here's the query:
WITH
t1 as
(select * from browserdata join citydata on cityid=id),
t2 as
(select uap.device as device, uap.os as os, uap.browser as browser, name as cityname
from t1
lateral view ParseUserAgentUDTF(UserAgent) uap as device, os, browser),
t3 as
(select t2.cityname as cityname, t2.device as device, t2.browser as browser, t2.os as os, count(*) as count from t2 group by t2.cityname, t2.os, t2.device, t2.browser)
select cityname, maximum, device, os, browser
from
(select cityname, device, browser, os,
max(count) over(partition by cityname) as maximum,
dense_rank() over (partition by cityname order by count desc ) as rnk
from t3
) s where rnk =1
;
And here's the log from my container:
Log Type: stdout
Log Upload Time: Mon Dec 24 16:21:37 +0000 2018
Log Length: 5529
Showing 4096 bytes of 5529 total.
.8.0_171]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) [udf.jar:?]
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185) [tez-runtime-internals-0.7.0.2.6.5.0-292.jar:0.7.0.2.6.5.0-292]
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181) [tez-runtime-internals-0.7.0.2.6.5.0-292.jar:0.7.0.2.6.5.0-292]
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) [tez-common-0.7.0.2.6.5.0-292.jar:0.7.0.2.6.5.0-292]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.1.0
at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:178) ~[hive-exec-1.2.1000.2.6.5.0-292-d249a9484f801bbb96f01e7bbd357a58127aaca3e59c783a90c062bf99c9310d.jar:1.2.1000.2.6.5.0-292]
at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:143) ~[hive-exec-1.2.1000.2.6.5.0-292-d249a9484f801bbb96f01e7bbd357a58127aaca3e59c783a90c062bf99c9310d.jar:1.2.1000.2.6.5.0-292]
at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:102) ~[hive-exec-1.2.1000.2.6.5.0-292-d249a9484f801bbb96f01e7bbd357a58127aaca3e59c783a90c062bf99c9310d.jar:1.2.1000.2.6.5.0-292]
at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:452) ~[hive-exec-1.2.1000.2.6.5.0-292-d249a9484f801bbb96f01e7bbd357a58127aaca3e59c783a90c062bf99c9310d.jar:1.2.1000.2.6.5.0-292]
... 16 more
I haven't been able to work out what the problem is: everything works on MapReduce
but fails on Tez.
I use a user defined function for this query to parse the user agent string in one of the columns.
Thanks to @leftjoin, my problem is now resolved. It turns out that I was using this library in the Maven project for my udf.jar (a jar with my custom user-defined function):
But my Hive version is 1.2.1. So, adding this:
fixed everything for me.
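In the stack trace above, org.apache.hadoop.security.UserGroupInformation is reported as coming from udf.jar, which suggests that the jar bundles its own copy of the Hadoop classes. One way to check this kind of thing is to ask the JVM where a class was loaded from. The following is only a minimal sketch: the class name ClassOrigin and the helper originOf are made up for illustration, and you would have to run it with the same classpath as the failing task for the answer to mean anything.

```java
import java.security.CodeSource;

public class ClassOrigin {

    /**
     * Returns the location (usually a jar URL) a class was loaded from,
     * or "bootstrap/unknown" when the JVM reports no code source,
     * which is the case for core JDK classes such as java.lang.String.
     */
    static String originOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "bootstrap/unknown";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, e.g.
        // org.apache.hadoop.security.UserGroupInformation (needs Hadoop on the classpath).
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " <- " + originOf(Class.forName(name)));
    }
}
```

If a Hadoop class resolves to your own UDF jar rather than to the cluster's Hadoop jars, the UDF jar is most likely shipping dependencies that the cluster already provides.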
You can see udf.jar being mentioned in my logs. This is the jar that I added with the add jar /path/to/jar command, but the error is really cryptic... Also, I want to mention that even if you just add udf.jar to your session and don't use it in your query, you will still get this error.
You're running the Hortonworks sandbox for HDP 2.6.5, which has Hadoop 2.7.x and Hive 1.2, not Hadoop/Hive 3.
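To make the error message less cryptic: the exception is thrown from ShimLoader.getMajorVersion, which picks a compatibility shim based on the Hadoop version it finds on the classpath. The sketch below is a simplified imitation of that kind of check, not the actual Hive source. The class name, method name, and shim labels are placeholders, and it assumes that this Hive 1.2 build only recognizes Hadoop major versions 1 and 2.

```java
public class ShimVersionSketch {

    /**
     * Maps a Hadoop version string to a (placeholder) shim label.
     * Assumption: only major versions 1 and 2 are recognized.
     */
    static String shimLabelFor(String hadoopVersion) {
        String[] parts = hadoopVersion.split("\\.");
        if (parts.length < 2) {
            throw new RuntimeException(
                "Illegal Hadoop Version: " + hadoopVersion + " (expected A.B.* format)");
        }
        switch (Integer.parseInt(parts[0])) {
            case 1:
                return "hadoop-1-shim"; // placeholder label
            case 2:
                return "hadoop-2-shim"; // placeholder label
            default:
                // Same message as in the stack trace from the question.
                throw new IllegalArgumentException(
                    "Unrecognized Hadoop major version number: " + hadoopVersion);
        }
    }

    public static void main(String[] args) {
        // Version the cluster itself provides (Hadoop 2.7.x on HDP 2.6.5).
        System.out.println(shimLabelFor("2.7.3"));
        try {
            // Version that reaches the check when Hadoop 3 classes win on the classpath.
            shimLabelFor("3.1.0");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under that assumption, the query itself is not at fault. Once Hadoop 3.1.0 classes from the added jar are picked up ahead of the cluster's Hadoop 2.7.x classes, the version check fails as soon as the Hive configuration classes are initialized, which is consistent with the error appearing even when the UDF is never called.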