Desire feature of searching for part of word in El

Posted 2020-05-09 09:39

I tried two different approaches to creating the index, and neither returns anything when I search for part of a word. Basically, if I search for the first letters or for letters in the middle of a word, I want to get all the matching documents.

FIRST ATTEMPT: CREATING THE INDEX THIS WAY (based on an older Stack Overflow question):

POST correntistas/correntista
{
  "index": {
    "index": "correntistas",
    "type": "correntista",
    "analysis": {
      "index_analyzer": {
        "my_index_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "mynGram"
          ]
        }
      },
      "search_analyzer": {
        "my_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "standard",
            "lowercase",
            "mynGram"
          ]
        }
      },
      "filter": {
        "mynGram": {
          "type": "nGram",
          "min_gram": 2,
          "max_gram": 50
        }
      }
    }
  }
}

SECOND ATTEMPT: CREATING THE INDEX THIS WAY (based on a more recent Stack Overflow question):

PUT /correntistas
{
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 20
                }
            },
            "analyzer": {
                "autocomplete_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase"
                    ]
                },
                "autocomplete_index": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "autocomplete_filter"
                    ]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "nome": {
                "type": "text",
                "analyzer": "autocomplete_index",
                "search_analyzer": "autocomplete_search"
            }
        }
    }
}

This second attempt failed with:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Root mapping definition has unsupported parameters:  [nome : {search_analyzer=autocomplete_search, analyzer=autocomplete_index, type=text}]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [properties]: Root mapping definition has unsupported parameters:  [nome : {search_analyzer=autocomplete_search, analyzer=autocomplete_index, type=text}]",
    "caused_by": {
      "type": "mapper_parsing_exception",
      "reason": "Root mapping definition has unsupported parameters:  [nome : {search_analyzer=autocomplete_search, analyzer=autocomplete_index, type=text}]"
    }
  },
  "status": 400
}


Although the index was created without an exception using the first approach, the search doesn't work when I type part of the property "nome".

I added one document this way:

POST /correntistas/correntista/1
    {
        "conta": "1234",
        "sobrenome": "Carvalho1",
        "nome": "Demetrio1"
    }

Now I want to retrieve the above document either by typing the first letters (e.g. De) or by typing part of the word from the middle (e.g. met). But neither of the two searches below retrieves the document.

Simple way to query:

GET correntistas/correntista/_search
{
    "query": {
        "match": {
            "nome": {
                "query": "De" #### "met" should also work from my perspective
            }
        }
    }
}

A more elaborate query also fails:

GET correntistas/correntista/_search
{
    "query": {
        "match": {
            "nome": {
                "query": "De",  #### "met" should also work from my perspective
                "operator": "OR",
                "prefix_length": 0,
                "max_expansions": 50,
                "fuzzy_transpositions": true,
                "lenient": false,
                "zero_terms_query": "NONE",
                "auto_generate_synonyms_phrase_query": true,
                "boost": 1
            }
        }
    }
}

I don't think it is relevant, but here are the versions (I am using these versions because this is intended to run in production with spring-data, and there is some "delay" before spring-data supports newer Elasticsearch versions):

Elasticsearch and Kibana 6.8.4

PS: please don't suggest using regular expressions or wildcards (*).

1 answer
萌系小妹纸
#2 · 2020-05-09 10:09

In another SO answer of mine, the requirement was a prefix search: for the text Demetrio1, only searches such as de or demet needed to match, and that worked because I used an edge-ngram tokenizer. In this question the requirement is infix search, so the custom analyzer below uses an ngram token filter instead.
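To make the prefix versus infix difference concrete, here is a minimal Python sketch that imitates what the two filter types emit for a single lowercased token. It is only an illustration: the helper names `ngrams` and `edge_ngrams` are hypothetical, and token order and positions (which Elasticsearch also tracks) are ignored.

```python
def ngrams(token, min_gram, max_gram):
    """Every substring of length min_gram..max_gram, roughly what an
    ngram token filter emits for one token (hypothetical helper)."""
    return {
        token[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(token) - n + 1)
    }


def edge_ngrams(token, min_gram, max_gram):
    """Only the leading substrings, roughly what an edge_ngram token
    filter emits for one token (hypothetical helper)."""
    return {token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)}


token = "Demetrio1".lower()          # the lowercase filter runs first
infix = ngrams(token, 2, 8)          # settings used in this answer
prefix = edge_ngrams(token, 1, 20)   # settings from the question's second attempt

print("met" in infix)    # True: "met" is a substring of "demetrio1"
print("met" in prefix)   # False: "met" is not a leading substring
print("de" in infix, "de" in prefix)  # True True: prefixes work with both
```

So an edge_ngram filter can only ever satisfy the "first letters" half of the requirement, while the ngram filter covers both halves.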

Below is a step-by-step example.

Index definition (note the "ngram" filter type, as opposed to "edge_ngram"):

{
  "settings": {
    "index.max_ngram_diff": 10,
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 8
        }
      },
      "analyzer": {
        "autocomplete": { 
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete", 
        "search_analyzer": "standard" 
      }
    }
  }
}
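One caveat for the 6.8.4 setup mentioned in the question: as far as I know, Elasticsearch 6.x expects the mapping to be nested under a type name unless the index is created with `include_type_name=false`, which would explain the "Root mapping definition has unsupported parameters" error. The sketch below only builds the request body as a Python dict; the helper name `wrap_mapping_for_6x` is hypothetical, and the type name "correntista" is taken from the question.

```python
import json


def wrap_mapping_for_6x(body, type_name):
    """Return a copy of an index body whose mappings are nested under
    type_name (hypothetical helper, assumes 6.x typed mappings)."""
    wrapped = dict(body)
    wrapped["mappings"] = {type_name: body["mappings"]}
    return wrapped


# Typeless mapping, as written in the index definition above.
typeless = {
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "analyzer": "autocomplete",
                "search_analyzer": "standard",
            }
        }
    }
}

typed = wrap_mapping_for_6x(typeless, "correntista")
print(json.dumps(typed, indent=2))
```

The settings part of the index definition stays unchanged; only the mappings section gains one extra level.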

Index a sample doc

{
    "title" : "Demetrio1"
}

Search query

{
    "query" :{
        "match" :{
            "title" :"met"
        }
    }
}

The search result brings back the sample doc :)

"hits": [
    {
        "_index": "ngram",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.47766083,
        "_source": {
            "title": "Demetrio1"
        }
    }
]
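Why keep "search_analyzer": "standard" instead of running the ngram filter at search time as well? The following Python sketch imitates the matching logic under simplifying assumptions: a regex stands in for the standard tokenizer, scoring is ignored, and a match means that at least one query term equals an indexed term (the default OR behaviour of a match query). All function names are hypothetical.

```python
import re


def standard_tokens(text):
    """Rough stand-in for the standard tokenizer plus lowercase filter."""
    return [t.lower() for t in re.findall(r"\w+", text)]


def ngrams(token, min_gram=2, max_gram=8):
    """Every substring of length min_gram..max_gram of one token."""
    return {
        token[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(token) - n + 1)
    }


def index_terms(text):
    """Terms stored for a field analyzed with the 'autocomplete' analyzer."""
    terms = set()
    for token in standard_tokens(text):
        terms |= ngrams(token)
    return terms


def matches(query, doc, ngram_at_search=False):
    """True if any query term equals an indexed term."""
    query_terms = index_terms(query) if ngram_at_search else set(standard_tokens(query))
    return bool(query_terms & index_terms(doc))


print(matches("met", "Demetrio1"))    # True: infix search
print(matches("De", "Demetrio1"))     # True: prefix search
print(matches("metal", "Demetrio1"))  # False: whole query term is not a substring
print(matches("metal", "Demetrio1", ngram_at_search=True))  # True: shares "me", "et", "met"
```

With ngrams applied at search time, an unrelated query such as "metal" still matches because it shares short fragments with the document, which is usually not what a partial-word search should return.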