I am starting a new project and am considering MongoDB as the document store and Neo4j as the mechanism to map relationships between documents, with the results of my queries exposed through a REST API. What are the advantages and disadvantages of doing it this way? Is there a better way to achieve this, perhaps with a different NoSQL document store? Are there any examples online of someone trying something similar?
We ended up using Neo4j as an "index" for routing calculations (in a bus/train search). The bulk of the data was stored in MongoDB, and we used Mongo Connector to sync the two databases. Mongo was superior for manipulating raw JSON data.
We initially tried to store "everything" in Neo4j, but queries started to take more than 2 minutes, so afterwards we stored only the minimal data necessary. In addition, Neo4j has limitations on what you can index. For example, when we used it there was no "date" type, so range queries on dates were cumbersome. You also run into problems when you have a "super node", a node with thousands or hundreds of thousands of links (relationships). Relationships are stored as a linked list in Neo4j, so random access (looking up a particular relationship) can be very slow.
You have to be picky about how you use Neo4j. In the end we used it for shortest-path calculations and search, which is Neo4j's strength.
For more details, check out a video and presentation of our findings at GraphConnect NY 2013: https://vimeo.com/79477603
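The split described above (the graph holds only a minimal routing index, the document store holds the full records) can be sketched roughly as follows. This is an illustration under assumptions, not our production code: plain Python dictionaries stand in for both databases, the stop IDs and fields are made up, and a breadth-first search stands in for the shortest-path query that Neo4j would run.

```python
from collections import deque

# Stand-in for a MongoDB collection: full documents keyed by _id (made-up data).
STOPS = {
    "A": {"_id": "A", "name": "Central", "facilities": ["tickets", "wifi"]},
    "B": {"_id": "B", "name": "Harbour", "facilities": []},
    "C": {"_id": "C", "name": "Airport", "facilities": ["tickets"]},
    "D": {"_id": "D", "name": "Stadium", "facilities": []},
}

# Stand-in for the graph "index": only IDs and connections, nothing else.
CONNECTIONS = {
    "A": ["B", "D"],
    "B": ["C"],
    "C": [],
    "D": ["C"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search: return the path with the fewest hops, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour in previous:
                continue  # already visited
            previous[neighbour] = node
            if neighbour == goal:
                # Walk back from the goal to the start, then reverse.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


def hydrate(path, documents):
    """Look up the full document for every ID on the path."""
    return [documents[stop_id] for stop_id in path]
```

The graph side only ever returns IDs; everything a client needs to display comes from the document lookup in `hydrate`, which is what keeps the graph small.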
In Neo4j you should model the relationships directly. For example, if you have two users in MongoDB and one follows the other, create a relationship between their nodes in Neo4j instead of creating another collection to hold that information. Treat the two databases as one.
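A minimal sketch of that idea: a helper that builds a Cypher statement linking two users by their MongoDB IDs. The `User` label, the `mongo_id` property, and the `FOLLOWS` relationship type are names I made up for illustration, and the `$name` parameter syntax assumes a reasonably recent Neo4j version. Sending the statement to the server is left out.

```python
def follow_statement(follower_id, followee_id):
    """Return (cypher, params) that records 'follower FOLLOWS followee'.

    MERGE creates each node and the relationship only if missing,
    so running the statement twice does not produce duplicates.
    """
    cypher = (
        "MERGE (a:User {mongo_id: $follower}) "
        "MERGE (b:User {mongo_id: $followee}) "
        "MERGE (a)-[:FOLLOWS]->(b)"
    )
    # IDs are passed as strings so a MongoDB ObjectId can be used directly.
    params = {"follower": str(follower_id), "followee": str(followee_id)}
    return cypher, params
```

The user documents themselves stay in MongoDB; Neo4j only needs the ID to point back at them.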
I would take a look at Gremlin.
I personally find the Groovy syntax awesome when working with data. Check out this article: http://thinkaurelius.com/2013/02/04/polyglot-persistence-and-query-with-gremlin/
You might be interested in the Neo4j Doc Manager for Mongo Connector. It's an extension to the Mongo Connector project that allows real-time, one-way synchronization of data from MongoDB to Neo4j. Documents inserted into MongoDB are converted to a property graph and automatically inserted into Neo4j. The collections and fields to be synced from Mongo to Neo4j can be configured.
The idea here is to facilitate using Neo4j and MongoDB together in a single application without having to write code in the application layer to sync data.
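To give a feel for what "converted to a property graph" can mean, here is a small sketch of one plausible convention: scalar fields become node properties, and each nested sub-document becomes its own node linked from its parent. This is not the Doc Manager's code, and its actual mapping rules may differ; the function name and output shape are assumptions made for illustration.

```python
def document_to_graph(doc, label, nodes=None, rels=None):
    """Flatten a nested document into (root_id, nodes, relationships).

    Assumed convention: scalars and lists become properties on the node,
    nested dicts become child nodes connected by a relationship whose
    type is the upper-cased field name.
    """
    nodes = [] if nodes is None else nodes
    rels = [] if rels is None else rels
    node_id = len(nodes)  # simple sequential ID
    node = {"id": node_id, "label": label, "properties": {}}
    nodes.append(node)
    for key, value in doc.items():
        if isinstance(value, dict):
            child_id, _, _ = document_to_graph(value, key, nodes, rels)
            rels.append({"from": node_id, "to": child_id, "type": key.upper()})
        else:
            node["properties"][key] = value
    return node_id, nodes, rels
```

For example, a `person` document with a nested `address` sub-document would yield two nodes and one `ADDRESS` relationship between them.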
I have been thinking about using these two together for a while because my data is already in MongoDB. But I don't want to add one more database on top of the existing architecture, because adding Neo4j will require more resources (memory, disk space), not to mention the time invested in maintaining two databases.
Another problem I can think of: when you shard your data in MongoDB, you will also have to manage your Neo4j data with respect to these new shards. Scaling in Neo4j is done through clustering, which is part of the commercial Enterprise Edition.
I did further research and found that OrientDB can store data as documents and is also a graph database.
Another way is to build the relationships in MongoDB itself, write your logic on top of that, and expose this logic through a REST API.
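A rough sketch of that last option: each user document carries an array of the IDs it points to, and the traversal logic lives in the application. A dictionary stands in for the MongoDB collection here, and `find_by_ids` is a hypothetical stand-in for a query that fetches many documents by `_id`; the field names are assumptions.

```python
# Stand-in for a MongoDB collection; relationships live in the "follows" array.
USERS = {
    "u1": {"_id": "u1", "name": "Ann", "follows": ["u2"]},
    "u2": {"_id": "u2", "name": "Bob", "follows": ["u3"]},
    "u3": {"_id": "u3", "name": "Cy", "follows": ["u1"]},
    "u4": {"_id": "u4", "name": "Dee", "follows": []},
}


def find_by_ids(ids):
    """Hypothetical stand-in for one database query fetching documents by _id."""
    return [USERS[i] for i in ids if i in USERS]


def reachable(start_id, max_depth):
    """Return the IDs reachable from start_id within max_depth 'follows' hops."""
    seen = {start_id}
    frontier = [start_id]
    for _ in range(max_depth):
        next_frontier = []
        # One query per hop: fetch every document on the current frontier.
        for doc in find_by_ids(frontier):
            for target in doc["follows"]:
                if target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
        if not frontier:
            break  # nothing new to explore
    seen.discard(start_id)
    return seen
```

This costs one round trip per hop, which is fine for shallow lookups (who do my followees follow?) but becomes the main expense for deep traversals; that trade-off is what you weigh against running a second database.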
If you like Neo4j, you should have a look at Structr (https://github.com/structr/structr, http://structr.org).
With Structr, you can define a custom schema on top of Neo4j (in Java, or starting with 0.7, even through the UI), and it will create a (nearly) production-ready RESTful JSON API for you.
The JSON "documents" are created by Structr on the fly, in real time, as any sort of aggregation or mapping of a subgraph in Neo4j. That allows you to define an arbitrary number of different views on the same data.
Structr has built-in functionality such as search (full-text, keyword/exact, location range with Neo4j Spatial), paging, sorting, constraints, users/groups, access control, cron-like background jobs, maintenance commands, and a supplemental (beta) UI for CRUD operations with basic CMS functionality.
Disclaimer: I'm the founder of Structr.