ElasticSearch - JavaApi searching by each character

Posted 2019-08-29 07:51

Question:

I am fetching documents from Elasticsearch using the Java API. My documents contain the following code, and I am trying to search for it with the inputs below.

code : MS-VMA1615-0D

Input : MS-VMA1615-0D   -- I get the result (MS-VMA1615-0D).
Input : VMA1615         -- I get the result (MS-VMA1615-0D).
Input : VMA             -- I get the result (MS-VMA1615-0D).

But with inputs like the ones below, I get no results.

Input : V       -- no results.
Input : MS      -- no results.
Input : -V      -- no results.
Input : 615     -- no results.

I expect each of these to return the code MS-VMA1615-0D. In short, I am trying to search character by character (any substring of the code) instead of by term (word).

It should not return the code MS-VMA1615-0D in the following cases, because the input does not occur anywhere in my code.

Input : VK      -- should not return any results.
Input : MS3     -- should not return any results.

Here is the Java code I am using:

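import java.io.IOException;
import java.io.UncheckedIOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryStringQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

private final String INDEX = "products";
private final String TYPE = "doc";

SearchRequest searchRequest = new SearchRequest(INDEX);
searchRequest.types(TYPE);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// `code` holds the user's input, e.g. "VMA1615"
QueryStringQueryBuilder qsQueryBuilder = new QueryStringQueryBuilder(code);

qsQueryBuilder.defaultField("code");
searchSourceBuilder.query(qsQueryBuilder);

searchSourceBuilder.size(50);
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse;
try {
    searchResponse = SearchEngineClient.getInstance().search(searchRequest);
} catch (IOException e) {
    // Surface the failure instead of discarding the message; otherwise
    // searchResponse stays null and getHits() below throws a NullPointerException.
    throw new UncheckedIOException("Search on index " + INDEX + " failed", e);
}
Item item = null;
SearchHit[] searchHits = searchResponse.getHits().getHits();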

Here are my mapping details:

PUT products
{
"settings": {
"analysis": {
  "analyzer": {
    "custom_analyzer": {
      "type": "custom",
      "tokenizer": "my_pattern_tokenizer",
      "char_filter": [
        "html_strip"
      ],
      "filter": [
        "lowercase",
        "asciifolding"
      ]
    }
   },
   "tokenizer": {
     "my_pattern_tokenizer": {
          "type": "pattern",
          "pattern": "-|\\d"
        }
   }
  }
},
"mappings": {
"doc": {
  "properties": {
    "code": {
      "type": "text",
       "analyzer": "custom_analyzer"
      }
    }
  }
 }
}

Update after trying the new answer:

This is the request I send via the Java API:

SearchRequest{searchType=QUERY_THEN_FETCH, indices=[products], indicesOptions=IndicesOptions[id=38, ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[doc], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=128, source={"size":50,"query":{"match_phrase":{"code":{"query":"1615","slop":0,"boost":1.0}}}}}

But the response I get is null.

Answer 1:

Follow up: ElasticSearch - JavaApi searching not happening without (*) in my input query

Your mapping should look like:

PUT products
{
"settings": {
"analysis": {
  "analyzer": {
    "custom_analyzer": {
      "type": "custom",
      "tokenizer": "ngram",
      "char_filter": [
        "html_strip"
      ],
      "filter": [
        "lowercase",
        "asciifolding"
      ]
    }
  }
}
},
"mappings": {
"doc": {
  "properties": {
    "code": {
      "type": "text",
       "analyzer": "custom_analyzer"
      }
    }
  }
 }
}

And you should be using a match_phrase query.

In Kibana:

GET products/_search
{
  "query": {
    "match_phrase": {
      "code": "V"
    }
  }
}

will return the result:

"hits": [
      {
        "_index": "products",
        "_type": "doc",
        "_id": "EoGtdGQBqdof7JidJkM_",
        "_score": 0.2876821,
        "_source": {
          "code": "MS-VMA1615-0D"
        }
      }
    ]

But this:

GET products/_search
{
  "query": {
    "match_phrase": {
      "code": "VK"
    }
  }
}

won't:

{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
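
To see why "V" matches while "VK" does not, it can help to imitate the analysis outside Elasticsearch. The sketch below is only a simplified model, not Elasticsearch code: it assumes the ngram tokenizer's default settings (min_gram 1, max_gram 2), assumes each gram occupies its own consecutive position, and treats match_phrase as "the query's grams must appear as one contiguous run in the document's grams". The class and method names (NgramPhraseSketch, ngrams, phraseMatches) are made up for illustration, and the lowercase step stands in for the lowercase filter in the mapping.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: a plain-Java model of "ngram tokenizer + match_phrase".
class NgramPhraseSketch {

    // Emit grams of length 1 and 2 for every start offset, in offset order,
    // mirroring the assumed defaults min_gram=1 and max_gram=2.
    static List<String> ngrams(String text) {
        String s = text.toLowerCase(Locale.ROOT); // stands in for the lowercase filter
        List<String> tokens = new ArrayList<>();
        for (int start = 0; start < s.length(); start++) {
            for (int len = 1; len <= 2 && start + len <= s.length(); len++) {
                tokens.add(s.substring(start, start + len));
            }
        }
        return tokens;
    }

    // Phrase semantics in this model: the query grams must occur in the
    // indexed grams as a contiguous sublist, in the same order.
    static boolean phraseMatches(String indexed, String query) {
        List<String> queryTokens = ngrams(query);
        return !queryTokens.isEmpty()
                && Collections.indexOfSubList(ngrams(indexed), queryTokens) >= 0;
    }

    public static void main(String[] args) {
        String code = "MS-VMA1615-0D";
        for (String input : new String[] {"V", "MS", "-V", "615", "VK", "MS3"}) {
            System.out.println(input + " -> " + phraseMatches(code, input));
        }
    }
}
```

In this model the inputs V, MS, -V and 615 all match, because their grams line up with a run of the document's grams, while VK and MS3 fail because the grams "vk" and "s3" never occur in the indexed code. The real index only behaves this way for documents indexed after the new mapping is in place.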

Based on your comment:

Instead of using a query string query:

QueryStringQueryBuilder qsQueryBuilder = new QueryStringQueryBuilder(code); 
qsQueryBuilder.defaultField("code");
searchSourceBuilder.query(qsQueryBuilder);
searchSourceBuilder.size(50);
searchRequest.source(searchSourceBuilder);

Use a match phrase query:

QueryBuilder query = QueryBuilders.matchPhraseQuery("code", code);
searchSourceBuilder.query(query);
searchSourceBuilder.size(50);
searchRequest.source(searchSourceBuilder);