I need to include newer protobuf jar (newer than 2.5.0) in Hive. Somehow no matter where I put the jar - it's being pushed to the end of the classpath. How can I make sure that the jar is in the beginning of the classpath of Hive?
可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
回答1:
To add your own jar to the Hive classpath so that it's included in the beginning of the classpath and not overloaded by some hadoop jar you need to set the following Env variable -
export HADOOP_USER_CLASSPATH_FIRST=true
This indicates that the HADOOP_CLASSPATH will gain priority over general hadoop jars.
At Amazon emr instances you can add this to /home/hadoop/conf/hadoop-env.sh, and modify the classpath in this file also.
This is useful when you want to overload jars like protobuf that come with the hadoop general classpath.
回答2:
The other thing you might consider doing is including the protobuf classes in your jar. You would need to build your jar with the assembly plugin, which will those classes. Its an option.