CouchDB Full Text Search

Posted 2019-07-22 13:31

Question:

I need some direction about full-text search in CouchDB. Is it already enabled by default in 2.0, or do we have to rebuild CouchDB to enable it?

Scenario: it is a document management system, and documents are shown in a grid. I need to be able to sort the results. After several changes, from map/reduce views to Elasticsearch indexes, I am now trying to use Mango queries. The problem is that sorting does not give the expected results.

{
  "selector": {
    "directoryName": {
      "$eq": "mail\\test\\inbox"
    }
  },
  "sort": [{"subject": "asc"}]
}

Sorting by "subject" or another text field mixes the results according to, I suppose, some "index logic" (e.g. the returned subjects come back as "This email...", "Hello...", "This email...", definitely not what I need). I don't remember whether analyzers, tokens, etc. have something to do with these "weird" search results. A descending sort on date fields, for example, works much better, but I still get an "intruder": a document from 2014 shows up when listing documents from 2017 downward, in a result set that also contains 2016 and 2015 documents.

I have created indexes of type json for a few of the possible sort fields. Creating an index of type text does not work. I do not know if full-text search will solve my sorting problems, but with all the references to Cloudant Query and full-text search, I thought this feature was included in 2.0.

Answer 1:

CouchDB itself doesn't have a full-text indexer built in. You can do a lot with mango, but you'll probably be much better served by a dedicated full-text indexer.

The two most common options are couchdb-lucene and Elasticsearch.
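On the Mango side, one thing worth checking for the sort in the question: as far as I know, Mango only sorts on fields covered by a json index, listed in the same order, and at least one sort field must also appear in the selector. The sketch below is an assumption about how that could look for the question's fields, not a confirmed fix. The index name and the helper `sortCoveredByIndex` are made up for illustration; the two objects would be sent as the bodies of POST requests to `/{db}/_index` and `/{db}/_find`.

```javascript
// Index body: put the selector field first, then the field to sort by.
var indexDefinition = {
  index: { fields: ["directoryName", "subject"] },
  name: "directory-subject-index", // hypothetical name
  type: "json"
};

// Query body: the sort lists the indexed fields in the same order as the index.
var query = {
  selector: { directoryName: { $eq: "mail\\test\\inbox" } },
  sort: [{ directoryName: "asc" }, { subject: "asc" }]
};

// Hypothetical local sanity check (not something CouchDB runs):
// the sort fields must be a prefix of the index fields, in order.
function sortCoveredByIndex(indexFields, sort) {
  if (sort.length > indexFields.length) {
    return false;
  }
  return sort.every(function (entry, i) {
    var keys = Object.keys(entry);
    return keys.length === 1 && keys[0] === indexFields[i];
  });
}

console.log(sortCoveredByIndex(indexDefinition.index.fields, query.sort));
```

Since every matching document has the same directoryName, the results should effectively come back ordered by subject. This is still plain key ordering, not full-text search.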



Answer 2:

After searching for a while without success, I finally got something working. Comments are welcome. If you want full-text search, you can try the map function below (delete all the comments before you copy and paste it).

function (doc) {
  var prefix = ""; // start from an empty string, otherwise the text begins with "undefined"
  for (var prop in doc) {
    if (prop === "_id" || prop === "_rev") // ignore _id, _rev or any unwanted properties
      continue;
    var value = doc[prop];
    if (typeof value === "number") // accept if it's a number type
      prefix += value;
    else if (typeof value === "string" && isNaN(Date.parse(value))) // accept strings, ignore if it's a date
      prefix += value;
    // booleans and all other types are ignored
    // note: what Date.parse accepts for non-ISO strings depends on the JavaScript engine
  }
  // emit every suffix of the collected text so a key-range query can match substrings
  for (var i = 0; i < prefix.length; i += 1) {
    emit([prefix.slice(i)], doc);
  }
}
//searchText?startkey=["abc"]&endkey=["abc\u9999"]&reduce=false&skip=0&limit=3
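
To see why the startkey/endkey query above behaves like a substring search, here is a small standalone sketch that can be run outside CouchDB. The names `suffixes`, `inRange` and `matches` are made up for illustration, and the range check uses plain JavaScript string comparison, which is only an approximation of CouchDB's view collation.

```javascript
// Every suffix of a string, the same keys the map function emits.
function suffixes(text) {
  var out = [];
  for (var i = 0; i < text.length; i += 1) {
    out.push(text.slice(i));
  }
  return out;
}

// Simplified stand-in for startkey=[term]&endkey=[term + "\u9999"].
// Assumption: plain code-unit comparison instead of CouchDB's collation.
function inRange(key, term) {
  return key >= term && key <= term + "\u9999";
}

// A document matches if any of its emitted suffixes starts with the term.
function matches(text, term) {
  return suffixes(text).some(function (s) {
    return inRange(s, term);
  });
}

console.log(suffixes("abc"));
console.log(matches("hello world", "wor"));
console.log(matches("hello world", "xyz"));
```

The trade-off is index size: a text of n characters produces n rows, and each row here also carries the whole document as its value.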