Merging two collections using Underscore.JS

Posted 2019-01-14 15:37

Question:

Provided I have two collections:

c1 - [{a:1},{a:2},{a:3}]

and

c2 - [{a:1},{a:7},{a:8}]

What's the fastest way to add the unique items from c2 into c1 using Underscore.JS? In practice c1 holds about 2,000 items and c2 about 500, and the operation runs often, so it must be performant!

Update 1 - I've only been using Underscore.JS for a few days, and I couldn't find a way to add one collection to another (I can filter c2 myself). Is that trivial in Underscore.JS?

Answer 1:

The following will:

  • create a new array that contains all the elements of c1 and c2. See union.
  • from that mix, create a new array that contains only the unique elements. See uniq.

Note that this would work only if all your objects have the property a.

_.uniq(_.union(c1, c2), false, function(item, key, a){ return item.a; });

You can find other options in this question.



Answer 2:

Try:

_.uniq(_.union(c1, c2), false, _.property('a'))

In detail:

  1. _.union(*arrays)

    Computes the union of the passed-in arrays.

  2. _.property(key) (since Version 1.6.0)

    Returns a function that will itself return the key property of any passed-in object.

  3. _.uniq(array, [isSorted], [iteratee])

    Produces a duplicate-free version of the array, using === to test object equality. If you know in advance that the array is sorted, passing true for isSorted will run a much faster algorithm. If you want to compute unique items based on a transformation, pass an iteratee function.



Answer 3:

The documentation for the uniq() function mentions that it runs much faster if the list is sorted. Chained calls can also improve readability, so you can do:

_.chain(c1).union(c2).sortBy("a").uniq(true, function(item){ return item.a; }).value();

Or if you prefer the unchained version (which is 11 characters shorter but less readable):

_.uniq(_.sortBy(_.union(c1,c2),"a"),true, function(item){ return item.a; });

The documentation and examples for uniq() don't make it clear how the callback function works. uniq() calls this function on every element of the combined list. When the function returns the same value for two elements, they are treated as duplicates and only the first one is kept.
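As a rough illustration of that behaviour, here is a plain-JavaScript sketch of what uniq() does with a callback. uniqByKey is a hypothetical helper written for this explanation, not Underscore's actual implementation, and it assumes an environment with Set and a callback that returns primitive values.

```javascript
// Hypothetical helper: keeps the first element seen for each distinct key.
function uniqByKey(list, keyOf) {
    var seen = new Set();
    var result = [];
    for (var i = 0; i < list.length; i++) {
        var key = keyOf(list[i]);
        if (!seen.has(key)) {
            seen.add(key);
            result.push(list[i]);
        }
    }
    return result;
}

var c1 = [{a: 1}, {a: 2}, {a: 3}];
var c2 = [{a: 1}, {a: 7}, {a: 8}];
var merged = uniqByKey(c1.concat(c2), function (item) { return item.a; });
// merged holds the items with a = 1, 2, 3, 7, 8; the {a: 1} that survives is the one from c1
```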

union() itself removes duplicates, but it compares items with ===, so this only helps for arrays of primitive values, not for arrays of distinct objects. We can use this fact too:

_.map(_.union(_.pluck(c1,"a"),_.pluck(c2,"a")),function (item) {return {a:item};});

The above line first converts the lists of objects to simple arrays of values (pluck()), then combines them using union(), and finally uses map() to build a list of objects again. Note that this creates new objects that contain only the a property.
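The detour through plain values is needed because two distinct objects are never === to each other, even when their contents match. A minimal plain-JavaScript sketch of the same pluck/union/map idea, assuming an ES2015 environment with Set (the variable names are only for illustration):

```javascript
var c1 = [{a: 1}, {a: 2}, {a: 3}];
var c2 = [{a: 1}, {a: 7}, {a: 8}];

// Distinct objects with equal contents are not identical.
var sameObject = ({a: 1} === {a: 1}); // false

// Extract the primitive values (the pluck() step).
var values = c1.concat(c2).map(function (item) { return item.a; });

// A Set keeps each primitive once, in insertion order (the union() step).
var uniqueValues = Array.from(new Set(values));

// Rebuild objects from the values (the map() step).
var rebuilt = uniqueValues.map(function (value) { return {a: value}; });
```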

Reference: uniq()



Answer 4:

Since there is a large number of items in both collections and this algorithm runs often, it would be better to use core JavaScript instead of any library:

// Appends to dst every item of src whose `a` value is not already present in dst.
// dst and src are arrays of objects; dst is modified in place.
function extendNumberSet( dst, src ) {
    var allVals = [];
    var i;
    for ( i = 0; i < dst.length; i++ ) {
        allVals.push( dst[i].a );
    }
    for ( i = 0; i < src.length; i++ ) {
        if ( allVals.indexOf( src[i].a ) === -1 ) {
            allVals.push( src[i].a ); // remember it so duplicates within src are skipped too
            dst.push( src[i] );
        }
    }
}

Here is a JSFiddle to test it.
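If the indexOf scan turns out to be the bottleneck, one possible variation keeps the known values in a Set, so each lookup no longer walks the whole list. This is only a sketch: extendNumberSetFast is a hypothetical name, and it assumes an ES2015 environment and primitive a values.

```javascript
// Hypothetical variant: appends to dst every item of src whose `a` value is not in dst yet.
function extendNumberSetFast(dst, src) {
    var seen = new Set();
    for (var i = 0; i < dst.length; i++) {
        seen.add(dst[i].a);
    }
    for (var j = 0; j < src.length; j++) {
        if (!seen.has(src[j].a)) {
            seen.add(src[j].a); // also guards against duplicates inside src
            dst.push(src[j]);
        }
    }
    return dst;
}

var c1 = [{a: 1}, {a: 2}, {a: 3}];
var c2 = [{a: 1}, {a: 7}, {a: 8}, {a: 7}];
extendNumberSetFast(c1, c2);
// c1 now holds the items with a = 1, 2, 3, 7, 8
```

Like the original function, this modifies dst in place and keeps the objects from src as they are instead of copying them.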