JavaScript: optimal way to join 2 sets of objects

Posted 2019-02-17 23:08

Question:

Let's assume that we have 2 sets of objects

set1 = [{'id':'1', 'x':'1', 'y':'2'}, {'id':'2', 'x':'2', 'y':'2'}]
set2 = [{'id':'1', 'z':'1'}, {'id':'2', 'z':'2'}]

We want:

set3 = set1.join(set2).on('id'); 

>> set3 
[{'id':'1', 'x':'1', 'y':'2', 'z':'1'},{'id':'2', 'x':'2', 'y':'2', 'z':'2'}]

What are the right tools for achieving this functionality? Could underscore help here?

Answer 1:

OPTION 1, plain js

I would suggest transforming each of the lists into a dictionary keyed by id, e.g.

{1: {x: '1', y: '2'}, 2: {x: '2', y: '2'}}

Then loop over one (or both) of the dictionaries and build a new dictionary that merges the attributes from both. Whether you loop over one or both depends on whether you want an inner or an outer join. This should run in roughly linear time, since property lookups on JavaScript objects are efficient.
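A minimal sketch of this approach for the inner-join case, in plain JavaScript. The helper names indexBy and innerJoin are made up for illustration, and the sketch assumes that ids are unique within each array (with duplicates, the last record in the right-hand array wins).

```javascript
// Index an array of records by one key (hypothetical helper name).
function indexBy(records, key) {
  const index = new Map();
  for (const record of records) {
    index.set(record[key], record);
  }
  return index;
}

// Inner join: keep only records whose key appears in both arrays.
function innerJoin(left, right, key) {
  const rightIndex = indexBy(right, key);
  const joined = [];
  for (const leftRecord of left) {
    const rightRecord = rightIndex.get(leftRecord[key]);
    if (rightRecord !== undefined) {
      // Copy into a new object so the input records are not mutated.
      joined.push({ ...leftRecord, ...rightRecord });
    }
  }
  return joined;
}

const set1 = [{ id: '1', x: '1', y: '2' }, { id: '2', x: '2', y: '2' }];
const set2 = [{ id: '1', z: '1' }, { id: '2', z: '2' }];
const set3 = innerJoin(set1, set2, 'id');
console.log(set3);
```

Each array is traversed once, so the work grows linearly with the total number of records. For a left outer join, the same loop could push a copy of leftRecord even when no match is found.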

OPTION 2, underscore, for dense id sets, using _.zip()

If the ids are relatively dense and you want an outer join, or you know in advance that both sets contain exactly the same ids, another option is to put the data into three arrays, one per attribute, and then use underscore's zip() method.
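A rough sketch of the zip idea. To keep it runnable without the library, it defines a local zip function as a stand-in for underscore's _.zip(); with underscore loaded, _.zip(xs, ys, zs) would take its place. The sketch assumes ids are the numeric strings '1', '2', ... with no large gaps.

```javascript
// Stand-in for underscore's _.zip(...arrays): merges values found at the
// same position of each array (assumption: used here instead of the library).
function zip(...arrays) {
  const length = Math.max(...arrays.map((a) => a.length));
  return Array.from({ length }, (_, i) => arrays.map((a) => a[i]));
}

const set1 = [{ id: '1', x: '1', y: '2' }, { id: '2', x: '2', y: '2' }];
const set2 = [{ id: '1', z: '1' }, { id: '2', z: '2' }];

// One array per attribute, positioned by id (id '1' goes to slot 0).
const xs = [];
const ys = [];
const zs = [];
for (const r of set1) {
  const i = Number(r.id) - 1;
  xs[i] = r.x;
  ys[i] = r.y;
}
for (const r of set2) {
  zs[Number(r.id) - 1] = r.z;
}

// Zip the columns back into rows and rebuild the id from the position.
const joined = zip(xs, ys, zs).map(([x, y, z], i) => ({ id: String(i + 1), x, y, z }));
console.log(joined);
```

An id that is missing from one set simply leaves undefined in that column, which is why this behaves like an outer join.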

OPTION 3, underscore, using _.groupBy()

Yet another possibility is to run _.groupBy() on the concatenation of the two lists with a custom key function (iteratee), which also allows joining on multiple keys. Some simple postprocessing is required, though, since the direct result is a dictionary of the form

{1: [{'id':'1', 'x':'1', 'y':'2'}, {'id':'1', 'z':'1'}],
 2: [{'id':'2', 'x':'2', 'y':'2'}, {'id':'2', 'z':'2'}]}

Inner-join behaviour can then be achieved by filtering out the entries of the resulting dictionary whose list does not have the maximum number of items (2 in the example).
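The grouping and postprocessing steps can be sketched as follows. The local groupBy function is a stand-in for _.groupBy(list, iteratee) so the example runs without underscore, and joinByGroups is a hypothetical name. The length check assumes each key occurs at most once per input list.

```javascript
// Stand-in for _.groupBy(list, iteratee): buckets items by the computed key.
function groupBy(list, keyOf) {
  const groups = {};
  for (const item of list) {
    const k = keyOf(item);
    (groups[k] = groups[k] || []).push(item);
  }
  return groups;
}

// Inner join of several lists: group everything by key, keep only the groups
// that received one record from every list, then merge each group.
function joinByGroups(lists, keyOf) {
  const groups = groupBy([].concat(...lists), keyOf);
  return Object.values(groups)
    .filter((group) => group.length === lists.length)
    .map((group) => Object.assign({}, ...group));
}

const set1 = [{ id: '1', x: '1', y: '2' }, { id: '2', x: '2', y: '2' }];
const set2 = [{ id: '1', z: '1' }, { id: '2', z: '2' }];
const set3 = joinByGroups([set1, set2], (record) => record.id);
console.log(set3);
```

To join on several keys, the key function can return a composite string built from the relevant fields. Dropping the filter step would give outer-join behaviour instead.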



Answer 2:

OPTION 4: Alasql library

Alasql can join two arrays as if they were SQL tables:

var set1 = [{'id':'1', 'x':'1', 'y':'2'}, {'id':'2', 'x':'2', 'y':'2'}];
var set2 = [{'id':'1', 'z':'1'}, {'id':'2', 'z':'2'}];

var res = alasql('SELECT * FROM ? set1 JOIN ? set2 USING id',[set1, set2]);

It gives exactly what you need:

[{"z":"1","id":"1","x":"1","y":"2"},{"z":"2","id":"2","x":"2","y":"2"}]

Try this example in jsFiddle.



Answer 3:

Another option using Ramda:

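const r = require('ramda')

// Left outer join: attach the matching set2 record to each set1 record
// under the property `relationName` (undefined when there is no match).
const outerJoin = r.curry(function(relationName, set1, keyName1, set2, keyName2) {
    const processRecord = function(record1) {
        const key1 = record1[keyName1]
        // A plain predicate is used instead of r.propEq, whose argument order
        // differs between Ramda versions.
        const record2 = r.find(function(candidate) { return candidate[keyName2] === key1 }, set2)
        // r.assoc returns a shallow copy, so the records in set1 are not mutated.
        return r.assoc(relationName, record2, record1)
    }
    return r.map(processRecord, set1)
})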

Assumptions

  //set1 is an array of objects
  set1 : [{}]
  //each object in set1 has a property for the key, of type T
  set1[i][keyName1] : T
  //set2 is an array of objects
  set2 : [{}]
  //each object in set2 has a property for the key, which is also of type T
  set2[i][keyName2] : T

Output

 [{
     ...set1 members...
     , relationName: ...set2 members...
  }]

I guess a better output might be (shouldn't be hard to get here):

 [{
     leftObj: ...set1 members...
     , rightObj: ...set2 members...
  }]

It could also add support for inner joins. But I was replacing some crappy code and needed to replicate its object hierarchy.
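One way the leftObj/rightObj output with optional inner-join support might look, sketched in plain JavaScript rather than Ramda so that it runs without dependencies. The function name joinPairs and the inner option are made up for illustration and are not part of the answer's code.

```javascript
// Pair each set1 record with its first match in set2.
// With { inner: true }, pairs without a match are dropped.
function joinPairs(set1, keyName1, set2, keyName2, { inner = false } = {}) {
  // Build a lookup once; the first record seen for a key wins.
  const index = new Map();
  for (const record2 of set2) {
    if (!index.has(record2[keyName2])) {
      index.set(record2[keyName2], record2);
    }
  }
  const pairs = set1.map((leftObj) => ({
    leftObj,
    rightObj: index.get(leftObj[keyName1]),
  }));
  return inner ? pairs.filter((pair) => pair.rightObj !== undefined) : pairs;
}

const left = [{ id: '1', x: '1' }, { id: '9', x: '5' }];
const right = [{ id: '1', z: '1' }];
console.log(joinPairs(left, 'id', right, 'id'));
console.log(joinPairs(left, 'id', right, 'id', { inner: true }));
```

The lookup table avoids scanning set2 once per set1 record, which matters when both arrays are large.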