I'm running a Hadoop job with, say, 1000 tasks. I need the job to attempt every task, but many of the tasks will not complete and will instead throw an exception. I cannot change this behavior, but I still need the data produced by the tasks that did not fail.
How can I make sure Hadoop goes through with all 1000 tasks instead of aborting the job after encountering a large number of failed tasks?