Couchbase, reduction too large error

Published 2019-07-08 17:15

Question:

At work I use Couchbase and I have run into a problem. Data comes into Couchbase from a number of devices, and afterwards I call an aggregating view. This view must aggregate values by two keys: timestamp and deviceId. Everything was fine until I tried to aggregate more than 10k values. In that case I get a reduction error.

Map function:

function(doc, meta)
{
  if (doc.type == "PeopleCountingIn" && doc.undefined != true)
  {
    emit(doc.id+"@"+doc.time, [doc.in, doc.out, doc.id, doc.time, meta.id]);
  }
}

Reduce function:

function(key, values, rereduce)
{
  var result = 
      { 
        "id":0,
        "time":0,
        "in" : 0, 
        "out" : 0,
        "docs":[]
      };
  if (rereduce)
  {
    result.id=values[0].id;
    result.time = values[0].time;
    for (var i = 0; i < values.length; i++)
    {
      result.in = result.in + values[i].in;
      result.out = result.out + values[i].out;
      for (var j = 0; j < values[i].docs.length; j++)
      {
        result.docs.push(values[i].docs[j]);
      }        
    }
  }
  else
  {
    result.id = values[0][2];
    result.time = values[0][3];
    for (var i = 0; i < values.length; i++)
    {
      result.docs.push(values[i][4]);
      result.in = result.in + values[i][0];
      result.out = result.out + values[i][1];
    }
  }
  return result;
}

Document sample:

{
   "id": "12292228@0",
   "time": 1401431340,
   "in": 0,
   "out": 0,
   "type": "PeopleCountingIn"
}

UPDATE

Output document:

{"rows":[
{"key":"12201774@0@1401144240","value":{"id":"12201774@0","time":1401144240,"in":0,"out":0,"docs":["12231774@0@1401546080@1792560127"]}},
{"key":"12201774@0@1401202080","value":{"id":"12201774@0","time":1401202080,"in":0,"out":0,"docs":["12201774@0@1401202080@1792560840"]}}
]
}


The error occurs when the "docs" array length is more than 100, and I think that is when the rereduce branch runs. Is there some way to fix this error other than making this array smaller?

Answer 1:

There are a number of limits on the output of map and reduce functions, to prevent indexes from taking too long to build and/or growing too large.

These are in the process of being added to the official documentation, but in the meantime, quoting from the issue (MB-11668) tracking the documentation update:

1) indexer_max_doc_size - documents larger than this value are skipped by the indexer. A message is logged (with document ID, its size, bucket name, view name, etc.) when such a document is encountered. A value of 0 means no limit (as was the case before). Current default value is 1048576 bytes (1 MB).

2) max_kv_size_per_doc - maximum total size (in bytes) of the KV pairs that can be emitted for a single document for a single view. When this limit is exceeded, a message is logged (with document ID, its size, bucket name, view name, etc.). A value of 0 means no limit (as was the case before). Current default value is 1048576 bytes (1 MB).
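
For a rough sense of how close a single document's emits are to these limits, you can approximate the serialized size of an emitted key/value pair. This is only a back-of-the-envelope sketch with illustrative values based on the document sample above; the indexer's exact byte accounting may differ:

// Rough estimate of one emitted KV pair's size (assumption: the indexer's
// accounting is close to the JSON-serialized length; it may differ in detail).
var key = "12292228@0@1401431340";                        // doc.id + "@" + doc.time
var value = [0, 0, "12292228@0", 1401431340, "meta-id"];  // shape emitted by the map function
var approxBytes = JSON.stringify(key).length + JSON.stringify(value).length;
// Sum this over every emit from one document and compare the total
// against max_kv_size_per_doc (default 1048576 bytes).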

Edit: Additionally, there is a limit of 64 kB on the size of a single reduction (the output of the reduce() function). I suggest you re-work your reduce function to return data within this limit. See MB-7952 for a technical discussion of why this is the case.
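
Since it is the unbounded "docs" array that pushes your reduction past the 64 kB limit, one way to rework the reduce function is to return only a count of the contributing documents instead of the full ID list. A minimal sketch along those lines, keeping the rest of your logic unchanged (the "count" field is my substitution for "docs"):

function (key, values, rereduce)
{
  var result = { "id": 0, "time": 0, "in": 0, "out": 0, "count": 0 };
  if (rereduce)
  {
    // Inputs are previous reduce outputs: sum the partial counts
    // instead of concatenating document IDs.
    result.id = values[0].id;
    result.time = values[0].time;
    for (var i = 0; i < values.length; i++)
    {
      result.in = result.in + values[i].in;
      result.out = result.out + values[i].out;
      result.count = result.count + values[i].count;
    }
  }
  else
  {
    // Inputs are the raw [in, out, id, time, meta.id] arrays from the map.
    result.id = values[0][2];
    result.time = values[0][3];
    for (var i = 0; i < values.length; i++)
    {
      result.in = result.in + values[i][0];
      result.out = result.out + values[i][1];
      result.count = result.count + 1;
    }
  }
  return result;
}

If you actually need the full list of document IDs, you can fetch it separately by querying the view with reduce=false over the relevant key range and assembling the list on the client, so the reduction itself stays bounded.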