How can I access the Mapper/Reducer counters on th

Posted 2019-02-15 12:07

I have some counters that I created in my Mapper class:

(example written using the appengine-mapreduce Java library v.0.5)

@Override
public void map(Entity entity) {
    getContext().incrementCounter("analyzed");
    if (isSpecial(entity)){
        getContext().incrementCounter("special");
    }
}

(The method isSpecial just returns true or false depending on the state of the entity, not relevant to the question)

I want to read those counters once all processing has finished, in the finish method of the Output class:

@Override
public Summary finish(Collection<? extends OutputWriter<Entity>> writers) {
    //get the counters and save/return the summary
    int analyzed = 0; //getCounter("analyzed");
    int special = 0; //getCounter("special");
    Summary summary = new Summary(analyzed, special);
    save(summary);
    return summary;
}

... but the getCounter method is only available on the MapperContext class, which can only be reached through the getContext() method of a Mapper or Reducer.

How can I access my counters at the Output stage?

Side note: I can't pass the counter values along in my output class, because the whole Map/Reduce is about transforming one set of Entities into another (in other words, the counters are not the main purpose of the Map/Reduce). The counters are just for control, so it makes sense to compute them here instead of creating another process just to do the counting.

Thanks.

1 answer
Bombasti
#2 · 2019-02-15 12:11

There is currently no way to do this inside the Output class, but feel free to request it here: https://code.google.com/p/appengine-mapreduce/issues/list

What you can do, however, is chain a job to run after your map-reduce; that job will receive its output and counters. There is an example of this here: https://code.google.com/p/appengine-mapreduce/source/browse/trunk/java/example/src/com/google/appengine/demos/mapreduce/entitycount/ChainedMapReduceJob.java

The example above runs 3 MapReduce jobs in a row. Note that these don't have to be MapReduce jobs: you can create your own class that extends Job and has a run method that creates your Summary object.
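
To make the shape of that follow-up step concrete, here is a minimal, self-contained sketch in plain Java. It does not use the appengine-mapreduce or Pipeline classes: StageResult, runMapStage and summarize are hypothetical stand-ins invented for illustration, and the real library's result type and Job signatures should be taken from the linked example rather than from this code. The only idea it demonstrates is that the counters are read in a step that runs after the map-reduce has finished, from the result handed to that step, instead of inside Output.finish.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CounterSummarySketch {

    /** Hypothetical stand-in for the result a finished map-reduce passes to the next job. */
    static final class StageResult<O> {
        final O output;
        final Map<String, Long> counters;

        StageResult(O output, Map<String, Long> counters) {
            this.output = output;
            this.counters = counters;
        }
    }

    /** Same shape as the Summary in the question (fields widened to long for this sketch). */
    static final class Summary {
        final long analyzed;
        final long special;

        Summary(long analyzed, long special) {
            this.analyzed = analyzed;
            this.special = special;
        }
    }

    /** Simulates the question's map stage: one boolean per entity, true meaning isSpecial. */
    static StageResult<List<Boolean>> runMapStage(List<Boolean> specialFlags) {
        Map<String, Long> counters = new HashMap<>();
        for (boolean isSpecial : specialFlags) {
            counters.merge("analyzed", 1L, Long::sum);
            if (isSpecial) {
                counters.merge("special", 1L, Long::sum);
            }
        }
        return new StageResult<>(specialFlags, counters);
    }

    /** Hypothetical chained step: runs after the map stage and builds the Summary. */
    static Summary summarize(StageResult<?> result) {
        // A counter that was never incremented is treated as zero.
        long analyzed = result.counters.getOrDefault("analyzed", 0L);
        long special = result.counters.getOrDefault("special", 0L);
        return new Summary(analyzed, special);
    }

    public static void main(String[] args) {
        StageResult<List<Boolean>> result =
                runMapStage(Arrays.asList(true, false, true, false, false));
        Summary summary = summarize(result);
        System.out.println(summary.analyzed + " analyzed, " + summary.special + " special");
    }
}
```

In a real chained job, the body of summarize would live in the run method of your own Job subclass, and saving the Summary would happen there too.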
