how to kill hadoop jobs

2019-01-30 02:58发布

问题:

I want to kill all my hadoop jobs automatically when my code encounters an unhandled exception. I am wondering what is the best practice to do it?

Thanks

回答1:

Depending on the version, do:

version <2.3.0

Kill a hadoop job:

hadoop job -kill $jobId

You can get a list of all jobId's doing:

hadoop job -list

version >=2.3.0

Kill a hadoop job:

yarn application -kill $ApplicationId

You can get a list of all ApplicationId's doing:

yarn application -list


回答2:

Use of folloing command is depreciated

hadoop job -list
hadoop job -kill $jobId

consider using

mapred job -list
mapred job -kill $jobId


回答3:

Run list to show all the jobs, then use the jobID/applicationID in the appropriate command.

Kill mapred jobs:

mapred job -list
mapred job -kill <jobId>

Kill yarn jobs:

yarn application -list
yarn application -kill <ApplicationId>


回答4:

An unhandled exception will (assuming it's repeatable like bad data as opposed to read errors from a particular data node) eventually fail the job anyway.

You can configure the maximum number of times a particular map or reduce task can fail before the entire job fails through the following properties:

  • mapred.map.max.attempts - The maximum number of attempts per map task. In other words, framework will try to execute a map task these many number of times before giving up on it.
  • mapred.reduce.max.attempts - Same as above, but for reduce tasks

If you want to fail the job out at the first failure, set this value from its default of 4 to 1.



回答5:

Simply forcefully kill the process ID, the hadoop job will also be killed automatically . Use this command:

kill -9 <process_id> 

eg: process ID no: 4040 namenode

username@hostname:~$ kill -9 4040


标签: hadoop kill jobs