Using MongoDB as a graph database for finding “friends” of “friends”

Posted 2019-04-08 05:41

I have been investigating graph databases and found Neo4j. Although it seems ideal, I have also come across MongoDB.

MongoDB is not officially a graph database, but I wondered whether it could be used for my scenario.

I am writing an application where users can have friends, those friends can have friends, and so on: the typical social part of a social network.

I was wondering whether MongoDB might suffice in my situation. How easy would it be to implement, or do I really need to focus on REAL graph databases?

I notice that Foursquare uses MongoDB, so I presume it supports their infrastructure.

But how easy would it be to find all friends of my friends that also have friends in common, for example?

2 Answers
看我几分像从前
#2 · 2019-04-08 06:01

You likely want an actual graph database as opposed to MongoDB. Try using the TinkerPop graph technology stack to get started. Using Blueprints (which is like JDBC for graphs) you can see the performance of MongoDB as a graph (using the Blueprints MongoDB implementation) versus Neo4j, Titan, or any number of other graph implementations.

何必那么认真
#3 · 2019-04-08 06:06

Although it wouldn't be impossible, MongoDB would not be a good fit for this scenario.

The reason is that MongoDB does not do JOINs in a plain query. When you need a query that spans multiple documents, you need a separate query for each step.

In your example, each user document would have an array with the _ids of their friends. Finding "all friends of the friends of UserA who are also friends of UserB" would mean that you:

  1. find UserA and get their friends array
  2. find all users in that array and get their friends arrays
  3. find all users in these arrays who have UserB in their friends array

That is three queries you have to perform. Between each of them, the result set has to be sent to the application, which has to formulate a new query and send it back to the database. The result set returned by the second query can be quite large, which means that the third query could take a while.
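To make the round trips concrete, here is a minimal Python sketch of the three steps. It uses an in-memory dict as a stand-in for a users collection, so the data, the `friends` field name, and the helper functions are all hypothetical. With a real driver such as pymongo, the helpers would presumably become `find_one` / `find` calls using `$in`, but that mapping is an assumption and is not shown here.

```python
# In-memory stand-in for a MongoDB "users" collection (hypothetical data).
# Each document stores the _ids of that user's friends in a "friends" array.
USERS = {
    "UserA": {"_id": "UserA", "friends": ["Bob", "Carol"]},
    "UserB": {"_id": "UserB", "friends": ["Dave", "Erin"]},
    "Bob":   {"_id": "Bob",   "friends": ["UserA", "Dave", "Frank"]},
    "Carol": {"_id": "Carol", "friends": ["UserA", "Erin"]},
    "Dave":  {"_id": "Dave",  "friends": ["Bob", "UserB"]},
    "Erin":  {"_id": "Erin",  "friends": ["Carol", "UserB"]},
    "Frank": {"_id": "Frank", "friends": ["Bob"]},
}

round_trips = 0  # counts simulated application <-> database round trips


def find_by_ids(ids):
    """One simulated query: fetch every document whose _id is in `ids`."""
    global round_trips
    round_trips += 1
    return [USERS[i] for i in ids if i in USERS]


def find_by_ids_with_friend(ids, friend_id):
    """One simulated query: _id is in `ids` AND `friend_id` is in friends."""
    global round_trips
    round_trips += 1
    return [
        USERS[i]
        for i in ids
        if i in USERS and friend_id in USERS[i]["friends"]
    ]


def friends_of_friends_who_know(user_id, other_id):
    # Query 1: find the user and read their friends array.
    user = find_by_ids([user_id])[0]
    # Query 2: find all users in that array and collect their friends arrays.
    friends = find_by_ids(user["friends"])
    candidate_ids = sorted({fid for f in friends for fid in f["friends"]})
    # Query 3: keep the candidates who have `other_id` in their friends array.
    matches = find_by_ids_with_friend(candidate_ids, other_id)
    return [m["_id"] for m in matches]


result = friends_of_friends_who_know("UserA", "UserB")
print(result, round_trips)
```

Note that `candidate_ids` is built in the application between the second and third query. That intermediate set is the part that can grow large on a real social graph.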

tl;dr: Use the right tool for the job. When your data is graph-based and you want to do graph-based queries on it, use a graph database.
