Sorting MongoDB documents by Reddit's ranking algorithm

Posted 2019-01-24 17:41

Question:

Here is some JavaScript code that ranks items according to Reddit's ranking algorithm.

My question is: how do I use this code to rank my MongoDB documents?

(Reddit's ranking algorithm)

function hot(ups,downs,date){
    var score = ups - downs;
    var order = log10(Math.max(Math.abs(score), 1));
    var sign = score>0 ? 1 : score<0 ? -1 : 0;
    var seconds = epochSeconds(date) - 1134028003;
    var product = order + sign * seconds / 45000;
    return Math.round(product*10000000)/10000000;
}
function log10(val){
  return Math.log(val) / Math.LN10;
}
function epochSeconds(d){
    return (d.getTime() - new Date(1970,1,1).getTime())/1000;
}

Answer 1:

Well, you can use mapReduce:

var mapper = function() {

    function hot(ups,downs,date){
        var score = ups - downs;
        var order = log10(Math.max(Math.abs(score), 1));
        var sign = score>0 ? 1 : score<0 ? -1 : 0;
        var seconds = epochSeconds(date) - 1134028003;
        var product = order + sign * seconds / 45000;
        return Math.round(product*10000000)/10000000;
    }

    function log10(val){
        return Math.log(val) / Math.LN10;
    }

    function epochSeconds(d){
        // getTime() already returns milliseconds since the Unix epoch (UTC)
        return d.getTime() / 1000;
    }

    emit( hot(this.ups, this.downs, this.date), this );

};

And then run the mapReduce (with an empty reducer):

db.collection.mapReduce(
    mapper,
    function(){},
    {
        "out": { "inline": 1 }
    }
)

This presumes that your "collection" has the fields ups, downs and date. Note that the "rankings" need to be emitted as "unique" keys: if two documents produce the same score, the reducer is called for that key, so you would need a real "reducer" to sort out those results.

But generally speaking that should do the job.
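As a minimal sketch of consuming that output: an inline mapReduce returns its emitted pairs in a `results` array of `{ _id, value }` objects, where `_id` is the emitted key (here the hot score). The helper name `hottestFirst` and the sample data below are hypothetical, and the sort is done explicitly rather than relying on the order of the returned array.

```javascript
// Hypothetical helper: order inline mapReduce results so the hottest come first.
function hottestFirst(results) {
  return results
    .slice() // copy, so the original array is left untouched
    .sort(function (a, b) { return b._id - a._id; }) // descending by hot score
    .map(function (pair) { return pair.value; });    // keep only the documents
}

// Made-up stand-in for db.collection.mapReduce(...).results
var sampleResults = [
  { _id: 1.5, value: { title: "b" } },
  { _id: -0.25, value: { title: "c" } },
  { _id: 3.0, value: { title: "a" } }
];

var ranked = hottestFirst(sampleResults);
console.log(ranked.map(function (doc) { return doc.title; }).join(","));
```

With the sample data this lists the titles from highest to lowest score.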



Answer 2:

There's a problem with your function:

new Date(1970, 1, 1) // Sun Feb 01 1970 00:00:00 GMT-0300 (BRT)

Yep, month 1 is February (months are zero-based), and the constructor uses the system's time zone as well. The epoch in JavaScript is

var epoch = new Date(Date.UTC(1970, 0, 1))

Since

epoch.getTime() // 0

The function

function epochSeconds(d){
    return (d.getTime() - new Date(1970,1,1).getTime())/1000;
}

should be just

function epochSeconds(d){
    return d.getTime()/1000;
}
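To make the size of the error concrete, here is a small sketch comparing the two versions. The names `epochSecondsBuggy`, `epochSecondsFixed` and `skewSeconds` are made up for illustration; the exact skew depends on the local time zone of the machine running it.

```javascript
// Reference point used in the question: month index 1 is February, in local time.
var localFeb1 = new Date(1970, 1, 1);
// Actual Unix epoch: month index 0 is January, built in UTC.
var epoch = new Date(Date.UTC(1970, 0, 1));

// Offset the question's version subtracts by mistake, in seconds
// (about 31 days, plus or minus the local UTC offset).
var skewSeconds = (localFeb1.getTime() - epoch.getTime()) / 1000;

function epochSecondsBuggy(d) {
  return (d.getTime() - localFeb1.getTime()) / 1000;
}

function epochSecondsFixed(d) {
  return d.getTime() / 1000;
}

var when = new Date(1134028003 * 1000); // the constant used in hot()
console.log("skew in days:", skewSeconds / 86400);
console.log("fixed - buggy:", epochSecondsFixed(when) - epochSecondsBuggy(when));
```

Both printed differences describe the same skew, which the ranking formula then divides by 45000.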

Compressing a bit, this returns exactly the same results as the Python function in http://amix.dk/blog/post/19588:

function hot (ups, downs, date){
  var score = ups - downs;
  var order = Math.log(Math.max(Math.abs(score), 1)) / Math.LN10;
  var sign = score > 0 ? 1 : score < 0 ? -1 : 0;
  var seconds = (date.getTime()/1000) - 1134028003;
  var product = order + sign * seconds / 45000;
  return Math.round(product*10000000)/10000000;
}
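As a quick sanity check, here is a sketch that calls the function with made-up vote counts. The variable names `base` and `later` are hypothetical; `base` is simply the instant matching the constant 1134028003, and `later` is 45000 seconds (12.5 hours) after it. The function is repeated so the snippet runs on its own.

```javascript
function hot(ups, downs, date) {
  var score = ups - downs;
  var order = Math.log(Math.max(Math.abs(score), 1)) / Math.LN10;
  var sign = score > 0 ? 1 : score < 0 ? -1 : 0;
  var seconds = (date.getTime() / 1000) - 1134028003;
  var product = order + sign * seconds / 45000;
  return Math.round(product * 10000000) / 10000000;
}

var base = new Date(1134028003 * 1000);            // seconds term is 0 here
var later = new Date((1134028003 + 45000) * 1000); // 12.5 hours later

// A net score of 10 at the base instant...
var olderPopular = hot(10, 0, base);
// ...ties with a net score of 1 posted 45000 seconds later.
var newerQuiet = hot(1, 0, later);
// For a negative net score the time term counts against the item.
var newerDownvoted = hot(1, 11, later);

console.log(olderPopular, newerQuiet, newerDownvoted);
```

The first two values come out equal: in this formula, each 45000 seconds of recency is worth as much as a tenfold increase in net votes.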