How can I build a key with two components? I have an undirected graph: there is an edge between two nodes A and B if A and B were associated through a communication (the direction is irrelevant). Each communication has a numerical parameter. I would like a key that combines A and B as a set, so that the communication from A to B and the one from B to A are treated as equivalent and their values can be summed to get stats.
Say:
A B 5
B A 10
The key then should be semantically "A or B together", so that the set containing A and B as key should have the value 5+10=15.
The wordcount example uses the individual words as keys. In my case, I want the key to be a set with two components. During the map and reduce phases, I would like the values to be summed whenever a record is either A to B or B to A.
Thx!
You need a custom key with its own comparison rules. You do this by implementing WritableComparable in your class that holds the graph link information. Here is an example with explanations: http://developer.yahoo.com/hadoop/tutorial/module5.html#keytypes
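A minimal sketch of such a key, under some assumptions: the class name EdgeKey, the helper EdgeKeyDemo, and the whitespace-separated "A B 5" record layout are made up for illustration. To keep it compilable without Hadoop on the classpath, the class implements only Comparable; in a real job you would declare it as `implements WritableComparable<EdgeKey>`, which requires the same write(DataOutput) and readFields(DataInput) methods shown here. The idea is to store the two node names in a fixed order, so that (A, B) and (B, A) become the same key.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical key for an undirected edge. In a Hadoop job this would be
// declared "implements WritableComparable<EdgeKey>"; here it implements only
// Comparable so the sketch compiles with the JDK alone.
class EdgeKey implements Comparable<EdgeKey> {
    private String first = "";
    private String second = "";

    // Hadoop creates key objects reflectively, so keep a no-arg constructor.
    EdgeKey() {}

    EdgeKey(String a, String b) {
        // Canonical order: the lexicographically smaller name always goes first,
        // so (A, B) and (B, A) end up with identical fields.
        if (a.compareTo(b) <= 0) {
            first = a;
            second = b;
        } else {
            first = b;
            second = a;
        }
    }

    // Same signature as Writable.write: serialize both node names.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(first);
        out.writeUTF(second);
    }

    // Same signature as Writable.readFields: restore both node names.
    public void readFields(DataInput in) throws IOException {
        first = in.readUTF();
        second = in.readUTF();
    }

    @Override
    public int compareTo(EdgeKey other) {
        int c = first.compareTo(other.first);
        return c != 0 ? c : second.compareTo(other.second);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof EdgeKey)) {
            return false;
        }
        EdgeKey k = (EdgeKey) o;
        return first.equals(k.first) && second.equals(k.second);
    }

    // Must agree with equals; Hadoop's default HashPartitioner relies on it.
    @Override
    public int hashCode() {
        return 31 * first.hashCode() + second.hashCode();
    }

    @Override
    public String toString() {
        return first + "\t" + second;
    }
}

class EdgeKeyDemo {
    // Stand-in for the map and reduce steps: parse each record, build the
    // canonical key, and sum the values per key in memory.
    static Map<EdgeKey, Integer> sum(String[] records) {
        Map<EdgeKey, Integer> totals = new HashMap<>();
        for (String record : records) {
            String[] f = record.trim().split("\\s+");
            totals.merge(new EdgeKey(f[0], f[1]), Integer.parseInt(f[2]), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        System.out.println(sum(new String[] {"A B 5", "B A 10"}));
    }
}
```

With the two records from the question, both lines map to the same key and the total for that key is 15. In a real job the mapper would emit the EdgeKey with the value, and the reducer would sum the values it receives for each key, just like in wordcount.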
In addition to the (correct) answer by David: if your problem has to do with graphs, then also have a look at http://incubator.apache.org/giraph/.