Full-text search in NoSQL databases

Posted 2019-01-06 13:11

  • Does anyone here have experience deploying a real online system that used full-text search in any of the NoSQL databases?
  • For example, how does full-text search compare in MongoDB, Riak and CouchDB?
  • Some of the metrics I am looking for are ease of deployment and maintenance and, of course, speed.
  • How mature are they? Are they a viable replacement for the Lucene infrastructure?

Thanks.

12 answers
仙女界的扛把子
Reply #2 · 2019-01-06 13:59

Couchbase 5.0 is releasing full-text search capabilities built on the open-source Bleve engine. You enable full-text indexing and start using it against existing JSON documents in the database.

Some slides and presentation video covering the topic, mentioning Elasticsearch and Lucene as well... https://www.slideshare.net/Couchbase/fulltext-search-how-it-works-and-what-it-can-do

小情绪 Triste *
Reply #3 · 2019-01-06 14:01

For MongoDB, there isn't a complete full-text indexing feature yet; however, there is possibly one in the pipeline, perhaps due in v2.2.

In the meantime, you can create a simple inverted index by using a string array field, and putting an index on it, as described here: Full Text Search in Mongo

Or, you could maintain a parallel full-text index in a dedicated Solr or Lucene index and, if you're feeling really ambitious, replicate directly to your full-text store from the Mongo oplog. Otherwise, populate both and keep them in sync from your application logic.
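The keyword-array approach mentioned above can be sketched in plain Python, with no MongoDB driver involved. This is only an illustration under stated assumptions: the `_keywords` field name and the helpers `tokenize`, `with_keywords` and `find_all` are hypothetical, and the in-memory list stands in for a collection.

```python
import re

def tokenize(text):
    """Lowercase the text and return a sorted list of its unique word tokens."""
    return sorted(set(re.findall(r"[a-z0-9]+", text.lower())))

def with_keywords(doc, field="body"):
    """Return a copy of doc with a hypothetical `_keywords` array field added."""
    enriched = dict(doc)
    enriched["_keywords"] = tokenize(doc[field])
    return enriched

def find_all(docs, query):
    """Return the docs whose `_keywords` contain every term of the query."""
    terms = set(tokenize(query))
    return [d for d in docs if terms.issubset(d["_keywords"])]

# An in-memory list standing in for a collection (assumption for the sketch).
docs = [
    with_keywords({"_id": 1, "body": "Full-text search in MongoDB"}),
    with_keywords({"_id": 2, "body": "Riak search and CouchDB views"}),
]
matches = find_all(docs, "search mongodb")
```

In a real deployment the same array would presumably be stored on each document and indexed, with the `find_all` step replaced by a query on that field (for example with the `$all` operator). There is no stemming, ranking or phrase matching here, which is why a dedicated engine is the alternative.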

做自己的国王
Reply #4 · 2019-01-06 14:05

I'm involved in the development of an application using Solandra (Cassandra-based Apache Solr). In my experience the system is quite stable and able to handle TB+ data. I'm personally quite happy with the software for the following reasons:

  1. Automated partitioning of data due to the Cassandra backend.
  2. Rich querying capabilities (due to Solr and Lucene).
  3. Fast reads and writes (writes significantly faster than reads).

However, I believe Solandra does not currently support batch mutations. That is, I can insert 100 columns in a single insertion into Cassandra, but Solandra does not support this.

唯我独甜
Reply #5 · 2019-01-06 14:08

None of the existing "NoSQL" databases provides a reasonable implementation of something that could be called "full-text search". MongoDB in particular has almost nothing so far (matching with regular expressions is not full-text search, and searching with the $in or $all operators on a keyword list is just a very poor implementation of "full-text search"). Using Solr, ElasticSearch or Sphinx is straightforward: an implementation and integration at the application level. Your choice largely depends on your requirements and current setup.
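A small plain-Python sketch of why neither shortcut amounts to full-text search. The sample strings and the helpers `regex_match` and `keyword_match` are made up for illustration and are not part of any database API.

```python
import re

# Hypothetical sample documents.
docs = ["Concatenate the strings", "My cat sleeps", "Searching large indexes"]

def regex_match(docs, pattern):
    """Substring-style matching, similar in spirit to a regex query."""
    return [d for d in docs if re.search(pattern, d, re.IGNORECASE)]

def keyword_match(docs, term):
    """Exact match of the term against each document's word tokens."""
    return [d for d in docs if term.lower() in re.findall(r"[a-z0-9]+", d.lower())]

# The regex also hits "Concatenate", a false positive for the word "cat".
regex_hits = regex_match(docs, "cat")

# Exact keyword matching avoids that false positive...
keyword_hits = keyword_match(docs, "cat")

# ...but without stemming, "search" does not find "Searching".
stem_miss = keyword_match(docs, "search")
```

Engines such as Solr or ElasticSearch address both problems with analyzers (tokenizing and stemming) and relevance ranking, which this sketch deliberately omits.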

啃猪蹄的小仙女
Reply #6 · 2019-01-06 14:10

Definitely Solr. It is NoSQL.

It has:

  • awesome performance
  • awesome storage options
  • stemmers
  • highlighting
  • faceting
  • distributed search (SolrCloud)
  • perfect API
  • web admin
  • HTML, PDF, DOC indexing
  • many other features
成全新的幸福
Reply #7 · 2019-01-06 14:12

Yes. See CouchDB-Lucene, a CouchDB extension that supports full Lucene queries of the data.
