Couchbase rereduce questions

Posted 2019-06-09 13:58

Here is some code from the Couchbase documentation that I don't understand:

function(key, values, rereduce) {
  var result = {total: 0, count: 0};
  for (var i = 0; i < values.length; i++) {
    if (rereduce) {
      // values[i] is a {total, count} object returned by an earlier reduce call
      result.total = result.total + values[i].total;
      result.count = result.count + values[i].count;
    } else {
      // values holds the raw values emitted by the map function
      result.total = sum(values);
      result.count = values.length;
    }
  }
  return result;
}
  1. Does rereduce indicate whether the reduce function has already been run on these values?
  2. When is the first argument of the reduce function, key, used? I have seen a number of examples, and key seems to be unused.
  3. When is rereduce true with an array of more than one element?
  4. Likewise, when is rereduce false with an array of more than one element?

Tags: couchbase
2 answers
Deceive 欺骗
#2 · 2019-06-09 14:17
  1. Rereduce means that the reduce function has already been called and is now being called again with the values returned by the first reduce calls. If we split it into two functions, it would look like this:

    function reduce(k, v) {
      // ... compute `result` from the map results
      // conceptually, instead of returning `result`, pass it on to rereduce:
      rereduce(null, result)
    }
    function rereduce(k, v) {
      // do something with the results of the first reduce calls
    }
    

    In most cases a rereduce happens when you have two or more servers in the cluster, or when you have so many items in the database that the calculation is spread over multiple nodes of the B-tree. The two-server case is easier to follow. Suppose your map function emitted the pairs [key1-1, key2-2, key6-6] on the first server and [key5-5, key7-7] on the second. You get two reduce calls: reduce([key1,key2,key6],[1,2,6],false) and reduce([key5,key7],[5,7],false). If the reduce function does nothing and simply returns values, it is then called again as reduce(null, [[1,2,6],[5,7]], true). Here values is an array of the results returned by the first reduce calls.

  2. On rereduce, key is null and values is an array of the values returned by previous reduce() calls.

  3. The array size depends only on your data, not on the rereduce flag. The same applies to question 4.
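The two-server example above can be simulated in plain JavaScript outside Couchbase. This is only a rough sketch: `sum` is a stand-in for the helper the view engine provides, `reduceFn` is the function from the question, and the key lists are simplified the same way as in the example. It does not model how Couchbase actually partitions the data. Because `reduceFn` returns an object rather than the raw values, the rereduce call receives two `{total, count}` objects.

```javascript
// Stand-in for the sum() helper used by the function in the question.
function sum(values) {
  return values.reduce(function (acc, v) { return acc + v; }, 0);
}

// The reduce function from the question.
function reduceFn(key, values, rereduce) {
  var result = { total: 0, count: 0 };
  for (var i = 0; i < values.length; i++) {
    if (rereduce) {
      result.total = result.total + values[i].total;
      result.count = result.count + values[i].count;
    } else {
      result.total = sum(values);
      result.count = values.length;
    }
  }
  return result;
}

// First pass: one reduce call per server, over raw map values.
var partialA = reduceFn(["key1", "key2", "key6"], [1, 2, 6], false);
var partialB = reduceFn(["key5", "key7"], [5, 7], false);

// Second pass: rereduce over the two partial results, with a null key.
var combined = reduceFn(null, [partialA, partialB], true);

console.log(partialA, partialB, combined);
```

Both first-pass calls have rereduce set to false and more than one value, and the second-pass call has rereduce set to true with two values, which covers the situations asked about in questions 3 and 4.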

You can try running the examples from Views basics and Views with reduce. For example, you can modify the reduce function to see what it receives at each step:

function reduce(k, v, r) {
  if (!r) {
    // first pass: make the reduce function return a single value
    return 1;
  } else {
    // rereduce: return the incoming values so you can inspect them
    return v;
  }
}
啃猪蹄的小仙女
#3 · 2019-06-09 14:17

I am also confused by the example on the official Couchbase website. Below is my understanding.

Confusion: the reduce method signature

1) Sometimes it is written as function(keys, values, rereduce)

2) Sometimes it is written as function(key, values, rereduce)

What exactly is the first parameter: key or keys?

As far as I understand from my previous experience with map/reduce, the key is the key emitted by the map function, and a hidden shuffle step aggregates the values for the same key into a value list. So the key parameter can be an array if you emit an array as the key (in which case you can use the group level to control the level of aggregation).

So I do not agree with the example given by @m03geek: it should not be a list of different keys. Correct me if I am wrong.

My assumption: Both reduce and rereduce work on the SAME key only.

For example, reduce works like this:

1) reduce(keyA, [1,2,3]): this is precalculated and stored in the B-tree structure.

2) rereduce(keyA, [6, reduce(keyA, [4,5,6])]): 6 is the sum of [1,2,3] from the first reduce call. When we add a new doc to Couchbase, the reduce method is triggered again. Instead of recalculating everything as the original map/reduce would, Couchbase fetches the precalculated value (6) from the B-tree, runs reduce on the key-value pairs produced by the map method for the new doc, and then runs rereduce on the precalculated value plus the new value.
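The incremental update described in this answer can be sketched in plain JavaScript. This is an illustration of the answer's assumption only, not of Couchbase internals: `reduceSum`, `cached`, `fresh`, and `updated` are hypothetical names, and passing `"keyA"` on the rereduce call follows this answer's notation (the earlier answer states that key is null on rereduce).

```javascript
// Hypothetical sum-style reduce. For a plain sum, the same addition works
// for both the first pass and the rereduce pass.
function reduceSum(key, values, rereduce) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Step 1: the result for [1, 2, 3] is computed once and kept.
var cached = reduceSum("keyA", [1, 2, 3], false);

// Step 2: new values arrive; only they are reduced from scratch.
var fresh = reduceSum("keyA", [4, 5, 6], false);

// Step 3: the cached result and the new result are combined by a rereduce.
var updated = reduceSum("keyA", [cached, fresh], true);

console.log(cached, fresh, updated);
```

The point of the sketch is that step 3 never touches the original values [1, 2, 3] again; it only combines two already-reduced numbers.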
