Graph/Gremlin for social media use case

Posted 2019-01-20 17:35

Consider an Instagram-style feed. I want to get all the posts 'posted' by the people I follow. For each of these posts, I want to know whether I have liked it and which of the other people I follow have liked it (if any). What is the best way to get this in Gremlin, ideally in a single query and without duplicate results?

Image for clarity

The following only returns the posts 'posted' by USER 2. How can I get the other information in the same query?

g.V().has('ID','USER 2').out('posted')

1 answer
家丑人穷心不美
#2 · 2019-01-20 18:32

When you ask questions about Gremlin, especially one of this complexity, it is always best to include a Gremlin script that provides some sample data, like this:

g.addV('user').property('id',1).as('1').
  addV('user').property('id',2).as('2').
  addV('user').property('id',3).as('3').
  addV('user').property('id',4).as('4').
  addV('post').property('postId','post1').as('p1').
  addV('post').property('postId','post2').as('p2').
  addE('follow').from('1').to('2').
  addE('follow').from('1').to('3').
  addE('follow').from('1').to('4').
  addE('posted').from('2').to('p1').
  addE('posted').from('2').to('p2').
  addE('liked').from('1').to('p2').
  addE('liked').from('3').to('p2').
  addE('liked').from('4').to('p2').iterate()

As for the answer, I would probably do something like this:

gremlin> g.V().has('id',1).as('me').
......1>   out('follow').
......2>   aggregate('followers').
......3>   out('posted').
......4>   group().
......5>     by('postId').
......6>     by(project('likedBySelf','likedByFollowing').
......7>          by(__.in('liked').where(eq('me')).count()).
......8>          by(__.in('liked').where(within('followers')).values('id').fold()))
==>[post2:[likedBySelf:1,likedByFollowing:[3,4]],post1:[likedBySelf:0,likedByFollowing:[]]]

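You find the user, label it 'me', and walk out('follow') to the users they follow, holding them in a side-effect collection with aggregate(). Despite its name, the 'followers' key holds the accounts the user follows, not the accounts following the user. Then you find their posts with out('posted'). To get your Map structure for your output you can group() on those posts, keyed by 'postId'. The second by() modulator uses project() to build your inner Map from two traversals. The first counts the 'liked' edges that come from 'me', so it yields zero or one to represent your boolean value. The second goes back to the 'followers' collection we aggregated earlier, keeps only the likers found in it, and returns their ids. Note the important use of fold() at the end there to reduce the result of that inner traversal to a list: a by() modulator takes only the first result of its traversal, so without fold() you would get at most one liker per post.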
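If you would rather have a real boolean than a zero-or-one count for 'likedBySelf', one possible variation is to wrap the first inner traversal in coalesce(). This is an untested sketch rather than part of the original answer: it assumes the sample graph above is loaded, that `g` is bound to it, and that the usual Gremlin Console static imports (project, coalesce, constant, eq, within) are available.

```groovy
// Sketch only: same traversal as above, but 'likedBySelf' becomes true/false.
g.V().has('id',1).as('me').
  out('follow').
  aggregate('followers').              // users that 'me' follows (name kept from the answer)
  out('posted').
  group().
    by('postId').
    by(project('likedBySelf','likedByFollowing').
         // true if a 'liked' edge from 'me' exists, otherwise fall through to false
         by(coalesce(__.in('liked').where(eq('me')).limit(1).constant(true),
                     constant(false))).
         // ids of followed users who liked the post, folded into a list
         by(__.in('liked').where(within('followers')).values('id').fold()))
```

With the sample data, post2 should come back with likedBySelf set to true and post1 with false, while the likedByFollowing lists stay the same as in the original query.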
