Error during benchmarking Sort in Hadoop2 - Partit

Published 2019-09-18 08:11

Question:

I am trying to benchmark the Hadoop2 MapReduce framework. I am not using TeraSort; I am using testmapredsort.

step-1 Create random data:

hadoop jar hadoop/ randomwriter -Dtest.randomwrite.bytes_per_map=100 -Dtest.randomwriter.maps_per_host=10 /data/unsorted-data

step-2 sort the random data created in step-1:

hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar sort /data/unsorted-data /data/sorted-data

step-3 check if the sorting by MR works:

hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar testmapredsort -sortInput /data/unsorted-data -sortOutput /data/sorted-data

I get the following error during step-3. I want to know how to fix this error.

java.lang.Exception: java.io.IOException: Partitions do not match for record# 0 ! - '0' v/s '5'
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
Caused by: java.io.IOException: Partitions do not match for record# 0 ! - '0' v/s '5'
    at org.apache.hadoop.mapred.SortValidator$RecordStatsChecker$Map.map(SortValidator.java:266)
    at org.apache.hadoop.mapred.SortValidator$RecordStatsChecker$Map.map(SortValidator.java:191)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:695)
14/08/18 11:07:39 INFO mapreduce.Job: Job job_local2061890210_0001 failed with state FAILED due to: NA
14/08/18 11:07:39 INFO mapreduce.Job: Counters: 23
    File System Counters
        FILE: Number of bytes read=1436271
        FILE: Number of bytes written=1645526
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1077294840
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=13
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=1
    Map-Reduce Framework
        Map input records=102247
        Map output records=102247
        Map output bytes=1328251
        Map output materialized bytes=26
        Input split bytes=102
        Combine input records=102247
        Combine output records=1
        Spilled Records=1
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=22
        Total committed heap usage (bytes)=198766592
    File Input Format Counters 
        Bytes Read=1077294840
java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
    at org.apache.hadoop.mapred.SortValidator$RecordStatsChecker.checkRecords(SortValidator.java:367)
    at org.apache.hadoop.mapred.SortValidator.run(SortValidator.java:579)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.mapred.SortValidator.main(SortValidator.java:594)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:115)
    at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:123)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

EDIT:

hadoop fs -ls /data/unsorted-data
-rw-r--r--   3 david supergroup          0 2014-08-14 12:45 /data/unsorted-data/_SUCCESS
-rw-r--r--   3 david supergroup 1077294840 2014-08-14 12:45 /data/unsorted-data/part-m-00000

hadoop fs -ls /data/sorted-data
-rw-r--r--   3 david supergroup          0 2014-08-14 12:55 /data/sorted-data/_SUCCESS
-rw-r--r--   3 david supergroup  137763270 2014-08-14 12:55 /data/sorted-data/part-m-00000
-rw-r--r--   3 david supergroup  134220478 2014-08-14 12:55 /data/sorted-data/part-m-00001
-rw-r--r--   3 david supergroup  134219656 2014-08-14 12:55 /data/sorted-data/part-m-00002
-rw-r--r--   3 david supergroup  134218029 2014-08-14 12:55 /data/sorted-data/part-m-00003
-rw-r--r--   3 david supergroup  134219244 2014-08-14 12:55 /data/sorted-data/part-m-00004
-rw-r--r--   3 david supergroup  134220252 2014-08-14 12:55 /data/sorted-data/part-m-00005
-rw-r--r--   3 david supergroup  134224231 2014-08-14 12:55 /data/sorted-data/part-m-00006
-rw-r--r--   3 david supergroup  134210232 2014-08-14 12:55 /data/sorted-data/part-m-00007

Answer 1:

One side issue first: the keys test.randomwrite.bytes_per_map and test.randomwriter.maps_per_host were renamed to mapreduce.randomwriter.bytespermap and mapreduce.randomwriter.mapsperhost, so the settings you passed never reach randomwriter (your listing shows a single input file of about 1 GB despite bytes_per_map=100).

The core of the problem is indicated by the filenames you listed under /data/sorted-data: they are all part-m-* files, so your sorted data consists of map outputs, whereas correctly sorted output only comes from reduce outputs. Essentially, your sort command is only performing the map portion of the sort and never performing the merge in a subsequent reduce stage. Because of this, your testmapredsort command is correctly reporting that the sort did not work.
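The filename check described above can be sketched in plain Java. This is only an illustration: the class and method names are hypothetical, and it assumes the part-m-NNNNN / part-r-NNNNN naming seen in the listing from the question, where "m" marks a map output and "r" a reduce output.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: inspects output file names only.
class OutputNameCheck {

    // True when the name looks like a map-task output, e.g. part-m-00003.
    static boolean isMapOutput(String name) {
        return name.matches("part-m-\\d+");
    }

    // True when the name looks like a reduce-task output, e.g. part-r-00000.
    static boolean isReduceOutput(String name) {
        return name.matches("part-r-\\d+");
    }

    // A directory "looks map-only" if it holds map outputs and no reduce outputs.
    static boolean looksMapOnly(List<String> names) {
        boolean anyMap = false;
        for (String name : names) {
            if (isReduceOutput(name)) {
                return false;
            }
            if (isMapOutput(name)) {
                anyMap = true;
            }
        }
        return anyMap;
    }

    public static void main(String[] args) {
        // File names taken from the /data/sorted-data listing in the question.
        List<String> sorted = Arrays.asList(
            "_SUCCESS", "part-m-00000", "part-m-00001", "part-m-00002",
            "part-m-00003", "part-m-00004", "part-m-00005", "part-m-00006",
            "part-m-00007");
        System.out.println("sorted-data looks map-only: " + looksMapOnly(sorted));
    }
}
```

For the listing in the question the check reports a map-only directory, which matches the diagnosis that no reduce stage ran.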

Checking the code of Sort.java you can see that there is in fact no protection against num_reduces somehow getting set to 0; the typical behavior of Hadoop MR is that setting the number of reduces to 0 indicates a "map only" job, where the map outputs go directly to HDFS rather than being intermediate outputs passed to reduce tasks. Here are the relevant lines:

    int num_reduces = (int) (cluster.getMaxReduceTasks() * 0.9);
    String sort_reduces = conf.get(REDUCES_PER_HOST);
    if (sort_reduces != null) {
      num_reduces = cluster.getTaskTrackers() *
                      Integer.parseInt(sort_reduces);
    }

Now, in a normal setup, all of that logic using "default" settings should provide a nonzero number of reduces, such that the sort works. I was able to reproduce your problem by running:

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar sort -r 0 /data/unsorted-data /data/sorted-data

using the -r 0 to force 0 reduces. In your case, more likely cluster.getMaxReduceTasks() is returning 1 (or possibly even 0 if your cluster is broken). I don't know off the top of my head all the ways that method could return 1; it appears that simply setting mapreduce.tasktracker.reduce.tasks.maximum to 1 doesn't apply to that method. Other factors that go into task capacity include numbers of cores and the amount of memory available.
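To see why a value of 1 from cluster.getMaxReduceTasks() is enough to trigger the problem, here is a minimal plain-Java sketch of the arithmetic in the quoted lines. It uses no Hadoop classes: the int parameters stand in for cluster.getMaxReduceTasks() and cluster.getTaskTrackers(), and atLeastOne is a hypothetical guard that does not exist in Sort.java.

```java
// Plain-Java sketch of the reduce-count logic quoted from Sort.java.
class ReduceCountSketch {

    // Mirrors: int num_reduces = (int) (cluster.getMaxReduceTasks() * 0.9);
    // The cast truncates toward zero, so 1 * 0.9 = 0.9 becomes 0.
    static int defaultReduces(int maxReduceTasks) {
        return (int) (maxReduceTasks * 0.9);
    }

    // Mirrors the REDUCES_PER_HOST branch: task trackers times the configured value.
    static int reducesPerHostOverride(int taskTrackers, String sortReduces) {
        return taskTrackers * Integer.parseInt(sortReduces);
    }

    // Hypothetical guard (not in Sort.java): never fall back to a map-only job.
    static int atLeastOne(int numReduces) {
        return Math.max(1, numReduces);
    }

    public static void main(String[] args) {
        for (int max : new int[] {0, 1, 2, 10}) {
            int reduces = defaultReduces(max);
            System.out.println("maxReduceTasks=" + max
                + " -> num_reduces=" + reduces
                + ", guarded=" + atLeastOne(reduces));
        }
    }
}
```

With a capacity of 0 or 1 the computed value is 0, which Hadoop treats as a map-only job; a capacity of 2 already yields 1 reduce.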

Assuming your cluster is at least capable of 1 reduce task per TaskTracker, you can retry your sort step using -r 1:

hadoop fs -rm -r /data/sorted-data
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar sort -r 1 /data/unsorted-data /data/sorted-data