How to use external jars in Cloudera Hadoop?

Posted 2019-07-29 10:00

I have Cloudera Hadoop version 4 installed on my cluster. It comes packaged with the Google Protocol Buffers (protobuf) jar, version 2.4. In my application code I use protobuf classes compiled with protobuf version 2.5.

This causes "unresolved compilation problem" errors at run time. Is there a way to run the MapReduce jobs with an external jar, or am I stuck until Cloudera upgrades their distribution?

Thanks.

1 answer
做自己的国王
#2 · 2019-07-29 10:44

Yes, you can run MapReduce jobs with external jars.

When you submit a job, be sure to add every dependency both to HADOOP_CLASSPATH and to the -libjars option, as in the examples below.

To add all the jars from the current directory and the lib directory to the classpath:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`echo *.jar lib/*.jar | sed 's/ /:/g'`

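Bear in mind that when you start a job through hadoop jar, you also need to pass the dependency jars with -libjars. Because -libjars is a generic option, it only takes effect if the driver class parses generic options (for example by running through ToolRunner). I like to use: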

hadoop jar <jar> <class> -libjars `echo ./lib/*.jar | sed 's/ /,/g'` [args...]

NOTE: The two sed commands use different separators: HADOOP_CLASSPATH entries are separated by colons (:), while the -libjars list is separated by commas (,).
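
Since the two one-liners differ only in the separator, the joining step can be factored into a single helper. This is a minimal sketch assuming a POSIX shell; join_jars is a hypothetical name introduced here for illustration and is not part of Hadoop. Unlike the echo approach, it yields an empty string when a directory contains no jars instead of passing the literal pattern through.

```shell
#!/bin/sh
# join_jars SEP DIR...
# Print every *.jar found in the given directories, joined by SEP.
# (Hypothetical helper, not a Hadoop command.)
join_jars() {
    sep=$1
    shift
    out=
    for dir in "$@"; do
        for jar in "$dir"/*.jar; do
            # When nothing matches, the shell leaves the pattern as-is; skip it.
            [ -e "$jar" ] || continue
            # Prepend the separator only when out already has an entry.
            out="${out:+$out$sep}$jar"
        done
    done
    printf '%s\n' "$out"
}

# Usage, assuming the jars live in . and ./lib as in the answer:
#   export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$(join_jars : . lib)"
#   hadoop jar <jar> <class> -libjars "$(join_jars , lib)" [args...]
```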

EDIT: If your jar (and not the pre-packaged one) must be the one that gets used, set the following so that your classpath entries are searched first:

export HADOOP_USER_CLASSPATH_FIRST=true
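
To check which jar would win, one option is to look at the order of entries on the client classpath. The sketch below is only an illustration: first_match is a hypothetical helper, and it assumes the classpath lists jar files by name. Entries that are directory wildcards (such as a lib/* entry) will not match a jar name, so an empty result is not conclusive.

```shell
#!/bin/sh
# first_match CLASSPATH SUBSTRING
# Print the first colon-separated classpath entry containing SUBSTRING.
# (Hypothetical helper, not a Hadoop command.)
first_match() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -F -- "$2" | head -n 1
}

# Usage, assuming the hadoop launcher is on PATH:
#   first_match "$(hadoop classpath)" protobuf
```

If the entry printed is your own protobuf jar, the client-side classpath order is what you intended.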
