How to share a variable in Mapper and Reducer class

Posted 2019-04-16 00:48

I have a requirement to share a variable between the Mapper and Reducer classes. The scenario is as follows:

Suppose my input records are of type A, B and C. I process these records and generate the key and value for output.collect in the map function. At the same time, I have declared three static int variables in the mapper class to keep a count of each record type (A, B and C). These variables are updated by the various map threads. When all the map tasks are done, I want to pass these three values to the reduce function.

How can this be achieved? I tried overriding the close() method, but it is called as each map task finishes, not once after all the map tasks are done. Is there any other way to share variables? I want to output the total count of each record type along with the processed output I'm already producing.

2 answers
兄弟一词,经得起流年.
#2 · 2019-04-16 01:32

Counters exist for a specific purpose: keeping count of some particular state, for example "NUMBER_OF_RECORDS_DISCARDED". I believe they can only be incremented, not set to an arbitrary value (I may be wrong here). They can certainly be used to pass messages, but for passing a fixed value there is a better way: set a variable in the job configuration and read it back in the tasks. Note that this can only pass a custom message from the driver to the mapper or reducer; changes made in the mapper will not be visible in the reducer.

Setting the message/variable using the old mapred API:

JobConf job = (JobConf) getConf();
job.set("messageToBePassed-OR-anyValue", "123-awesome-value :P");

Setting the message/variable using the new mapreduce API:

Configuration conf = new Configuration();
conf.set("messageToBePassed-OR-anyValue", "123-awesome-value :P");
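Job job = new Job(conf); // on Hadoop 2.x and later this constructor is deprecated; Job.getInstance(conf) is preferred there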

Getting the message/variable using the old API in the Mapper and Reducer: configure() has to be implemented in the Mapper and Reducer classes, and the value can then be assigned to a class member so that it can be used inside map() or reduce().

...
private String awesomeMessage;
public void configure(JobConf job) {
    // The field is a String and the stored value is not numeric, so read it as-is.
    awesomeMessage = job.get("messageToBePassed-OR-anyValue");
}
...

The variable awesomeMessage can then be used inside the map and reduce functions.

Getting the message/variable using the new API in the Mapper and Reducer: the same thing is done in setup().

Configuration conf = context.getConfiguration();
String param = conf.get("messageToBePassed-OR-anyValue");
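
The limitation mentioned above (a value changed in the mapper is not seen by the reducer) can be pictured with a small plain-Java model. This is only a sketch under assumptions: it does not use Hadoop at all, the class name ConfigSketch and the helper copyForTask are made up for illustration, and a HashMap stands in for the job configuration. It assumes each task works on its own copy of the settings written by the driver.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigSketch {
    // Hypothetical stand-in for task launch: each task gets its own copy of the job settings.
    public static Map<String, String> copyForTask(Map<String, String> jobConf) {
        return new HashMap<>(jobConf);
    }

    public static void main(String[] args) {
        // The driver sets the value before the job is submitted.
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put("messageToBePassed-OR-anyValue", "123-awesome-value :P");

        Map<String, String> mapperConf = copyForTask(jobConf);
        Map<String, String> reducerConf = copyForTask(jobConf);

        // A change made inside the "mapper" only touches the mapper's own copy.
        mapperConf.put("messageToBePassed-OR-anyValue", "changed-in-mapper");

        // The "reducer" still sees the value the driver set.
        System.out.println(reducerConf.get("messageToBePassed-OR-anyValue"));
    }
}
```

So the configuration works well for read-only parameters decided before the job starts, but not for totals computed while the map tasks run.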
Evening l夕情丶
#3 · 2019-04-16 01:52

Got the solution.

I used Counters, which are accessible through the Reporter in both the Mapper and the Reducer.
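
To show why counters fit this case where static variables do not, here is a rough plain-Java model of the idea. It is a sketch, not Hadoop code: the names CounterSketch, runMapTask and aggregate are hypothetical, and the lists stand in for input splits. Each task only counts what it sees, and a separate step (the framework's job in Hadoop) sums the per-task counts into job-wide totals. In the old mapred API the increment would be done with reporter.incrCounter(...) on an enum such as the RecordType below.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class CounterSketch {
    // Record types from the question.
    public enum RecordType { A, B, C }

    // Hypothetical stand-in for one map task: it counts only the records in its own split.
    public static Map<RecordType, Long> runMapTask(List<RecordType> split) {
        Map<RecordType, Long> local = new EnumMap<>(RecordType.class);
        for (RecordType type : split) {
            local.merge(type, 1L, Long::sum); // increment-only, like a counter
        }
        return local;
    }

    // Hypothetical stand-in for the framework: sums the per-task counts into job totals.
    public static Map<RecordType, Long> aggregate(List<Map<RecordType, Long>> perTask) {
        Map<RecordType, Long> totals = new EnumMap<>(RecordType.class);
        for (Map<RecordType, Long> local : perTask) {
            local.forEach((type, n) -> totals.merge(type, n, Long::sum));
        }
        return totals;
    }

    public static void main(String[] args) {
        // Two "map tasks", each with its own split of the input.
        Map<RecordType, Long> task1 =
                runMapTask(Arrays.asList(RecordType.A, RecordType.B, RecordType.A));
        Map<RecordType, Long> task2 =
                runMapTask(Arrays.asList(RecordType.C, RecordType.A));

        // Totals across all tasks, one entry per record type.
        System.out.println(aggregate(Arrays.asList(task1, task2)));
    }
}
```

A static int in the mapper class behaves like one task's local map here: it never sees the counts from tasks running in other JVMs, which is why the per-type totals have to come from an aggregation step such as counters.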
