Floating Point Numbers Do Not Match In Aggregation

Published 2020-03-26 20:47

Question:

I have these sample documents in a test collection:

/* 1 */
{
    "_id" : 1.0,
    "value" : 10.7
}

/* 2 */
{
    "_id" : 2.0,
    "value" : 10.8
}

/* 3 */
{
    "_id" : 3.0,
    "value" : 10.7
}

When I use $addFields in an aggregation pipeline to add a new "result" field to each document with this query:

db.test.aggregate([{$addFields:{result:{ $add : ["$value",  .10]}}}]);

It gives the result below:

/* 1 */
{
    "_id" : 1.0,
    "value" : 10.7,
    "result" : 10.8
}

/* 2 */
{
    "_id" : 2.0,
    "value" : 10.8,
    "result" : 10.9
}

/* 3 */
{
    "_id" : 3.0,
    "value" : 10.7,
    "result" : 10.8
}

Now I want to compare against this newly added field using this query:

db.test.aggregate([
  {$addFields:{result:{ $add : ["$value",  .10]}}}, 
  { $match : { result : { $eq : 10.8}}}
]);

As far as I can see the query is correct, but it returns no matching documents and I am not sure why.

Am I doing anything wrong here?

Answer 1:

The default (and currently, only) native numeric type in JavaScript is a double precision floating point Number. As noted in other discussion on this question, binary floating point arithmetic is subject to rounding error because some decimal fractions cannot be represented exactly in binary floating point. A common workaround for this is a data model approach using a Scale Factor to avoid storing fractional values.
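As a minimal sketch of both points, the plain JavaScript below (no MongoDB required) shows the rounding error on the values from the question and how a scale factor sidesteps it. The factor of 10 and the names SCALE, storedValue and storedIncrement are assumptions chosen for illustration, not something the question prescribes.

```javascript
// Plain JavaScript; runs in Node.js or a browser console.

// Doubles: 10.7 and 0.1 are stored as nearby binary fractions,
// so their sum is not the same double as the literal 10.8.
const sum = 10.7 + 0.1;
const sumEquals = sum === 10.8;           // false
const gap = Math.abs(sum - 10.8);         // tiny but non-zero

// Scale factor (assumed here: 10, i.e. one decimal place).
// Storing 107 instead of 10.7 keeps the arithmetic in exact integers.
const SCALE = 10;                         // hypothetical constant
const storedValue = 107;                  // represents 10.7
const storedIncrement = 1;                // represents 0.1
const scaledSum = storedValue + storedIncrement;
const scaledEquals = scaledSum === 108;   // true

console.log(sum, sumEquals, scaledSum / SCALE, scaledEquals);
```

With this model the comparison is done on the stored integers, and the division by the scale factor is only needed for display.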

However, if you are using MongoDB 3.4+ there is a native Decimal BSON type you can use when you need exact decimal precision, for example with monetary values. It implements the IEEE 754 decimal128 floating-point format, which represents decimal fractions exactly and is supported by the arithmetic operators in MongoDB's aggregation pipeline.

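The mongo shell includes a NumberDecimal helper for passing decimal values through to aggregation queries or CRUD commands. Passing the value as a string, such as NumberDecimal("10.7"), is the recommended form. The helper also accepts doubles, as in the example here, but they are converted with 15 significant digits, which is why the sample output shows trailing zeros.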

Setting up some example data:

db.test.insert([
    { "_id" : 1.0, "value" : NumberDecimal(10.7) },
    { "_id" : 2.0, "value" : NumberDecimal(10.8) }
])

... and using aggregation to compare against the original value and an incremented result:

db.test.aggregate([
    { $addFields: {
        "result"  : { "$add": [ "$value",  NumberDecimal(0.10) ] },
    }},

    // Compare value and result against expected value of 10.8
    { $addFields: {
        "matches_value":   { $eq: ["$value",  NumberDecimal(10.8)] },
        "matches_result":  { $eq: ["$result", NumberDecimal(10.8)] }
    }},
])

The decimal type correctly maintains precision and can be used for exact matches. Sample output of this aggregation:

{
  "result": [
    {
      "_id": 1,
      "value": NumberDecimal("10.7000000000000"),
      "result": NumberDecimal("10.800000000000000"),
      "matches_value": false,
      "matches_result": true
    },
    {
      "_id": 2,
      "value": NumberDecimal("10.8000000000000"),
      "result": NumberDecimal("10.900000000000000"),
      "matches_value": true,
      "matches_result": false
    }
  ],
  "ok": 1
}


Answer 2:

The output shown in your question is not quite the whole truth. If I take the initial "value" fields and add 0.1 to them, the result is exactly what I would expect from floating point math.

Insert the documents:

db.numbers.insert([
  { "_id" : 1.0, "value" : 10.7  },
  { "_id" : 2.0, "value" : 10.8  },
  { "_id" : 3.0, "value" : 10.7  }
])

Then run the same initial aggregation statement:

db.numbers.aggregate([
  { "$addFields":{
    "result": { "$add": [ "$value",  0.10 ] }
  }}
]);

The result:

{ "_id" : 1, "value" : 10.7, "result" : 10.799999999999999 }
{ "_id" : 2, "value" : 10.8, "result" : 10.9 }
{ "_id" : 3, "value" : 10.7, "result" : 10.799999999999999 }

Welcome to computer science. This is floating point math: values such as 10.7 and 0.1 have no exact binary representation, so arithmetic on them can carry rounding error. For the full read, get into this:

What Every Computer Scientist Should Know About Floating-Point Arithmetic.
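To make the failed match concrete, here is a rough plain JavaScript stand-in for the pipeline from the question. It is only a sketch: map and filter are used here as analogues of $addFields and $match, on the assumption that the server compares doubles by value in the same way as ===.

```javascript
// Plain JavaScript stand-in for the pipeline; no MongoDB server involved.
const docs = [
  { _id: 1, value: 10.7 },
  { _id: 2, value: 10.8 },
  { _id: 3, value: 10.7 },
];

// Analogue of { $addFields: { result: { $add: ["$value", 0.10] } } }
const withResult = docs.map((doc) => ({ ...doc, result: doc.value + 0.10 }));

// Analogue of { $match: { result: { $eq: 10.8 } } }
const matched = withResult.filter((doc) => doc.result === 10.8);

console.log(withResult);
console.log(matched.length); // no document has a result equal to 10.8
```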

We can correct this by keeping the addition out of fractional numbers: scale the value up to an integer, add, then scale back down. In this case the factor is 10:

db.numbers.aggregate([
  { "$addFields": {
    "result": {
      "$divide": [
        { "$add": [
          { "$multiply": [ "$value", 10 ] },
          1
        ]},
        10
      ]
    }
  }}
])

Where the "result" comes out like this:

{ "_id" : 1, "value" : 10.7, "result" : 10.8 }
{ "_id" : 2, "value" : 10.8, "result" : 10.9 }
{ "_id" : 3, "value" : 10.7, "result" : 10.8 }

And with the final $match only on the requested value:

db.numbers.aggregate([
  { "$addFields": {
    "result": {
      "$divide": [
        { "$add": [
          { "$multiply": [ "$value", 10 ] },
          1
        ]},
        10
      ]
    }
  }},
  { "$match": { "result": 10.8 } }
])

This returns the correct result:

{ "_id" : 1, "value" : 10.7, "result" : 10.8 }
{ "_id" : 3, "value" : 10.7, "result" : 10.8 }
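The same arithmetic can be checked outside MongoDB. The sketch below mirrors the $multiply / $add / $divide expression in plain JavaScript. The helper names addTenth and addTenthRounded are hypothetical, and the Math.round variant is an extra safeguard that is not part of the pipeline above, for inputs where the multiplication itself might not land exactly on an integer.

```javascript
// Plain JavaScript check of the scaled arithmetic; no MongoDB server involved.
const docs = [
  { _id: 1, value: 10.7 },
  { _id: 2, value: 10.8 },
  { _id: 3, value: 10.7 },
];

// Same steps as the pipeline: multiply by 10, add 1, divide by 10.
const addTenth = (value) => (value * 10 + 1) / 10;

// Assumed safeguard (not in the pipeline): round the scaled value first.
const addTenthRounded = (value) => (Math.round(value * 10) + 1) / 10;

const matched = docs
  .map((doc) => ({ ...doc, result: addTenth(doc.value) }))
  .filter((doc) => doc.result === 10.8);

console.log(matched.map((doc) => doc._id));
```

For these values both helpers give the same results, so the match picks up documents 1 and 3.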