Azure-search: How to get documents which exactly c

Posted 2019-09-14 17:12

Question:

An earlier question and answer dealt with a very similar topic, but I couldn't find the solution I was looking for there: How to practically use a keywordanalyzer in azure-search?

Starting situation:

I created a resource with multiple indexes. One of these indexes contains a Collection(Edm.String) field. For this field I only want to get documents whose values contain the search term exactly as typed, as one contiguous string. For example, the field contains values like these: "Hovercraft zero", "Hovercraft one", "Hovercraft two".

If the search term is "Hover", all three documents should be returned. If the search term is "craft zer", only the document "Hovercraft zero" should be returned. It is not enough for that document to get a higher score; the desired behaviour is that "Hovercraft zero" is the only result.
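
The desired matching behaviour can be stated as a small local sketch in Python. This is only an illustration of the requirement, not Azure Search code; the function name `matching_entries` is made up for this example, and case-insensitive matching is an assumption.

```python
def matching_entries(entries, search_term):
    """Return the entries that contain search_term as one contiguous substring.

    Matching is case-insensitive here, which is an assumption about the
    desired behaviour rather than something stated in the question.
    """
    needle = search_term.lower()
    return [entry for entry in entries if needle in entry.lower()]


entries = ["Hovercraft zero", "Hovercraft one", "Hovercraft two"]

# "Hover" is contained in every entry, "craft zer" in only one of them.
print(matching_entries(entries, "Hover"))
print(matching_entries(entries, "craft zer"))
```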

Further information:

It is not possible to set searchMode to all (as recommended in the question mentioned above), because I want this behaviour only for this specific field and not for all search queries. It is also not an option to leave it to the user to enter the search term in quotes.

What I have tried so far:

  • Use the keyword analyzer as described in the question mentioned above: no success
  • Use an indexAnalyzer with specific token filters (ngram, lowercase) and a keyword analyzer as the searchAnalyzer: no success
  • Use char filters to manipulate the search term and add quotes at the first and last position (craft zer -> "craft zer"). As Yahnoosh explained in the question mentioned above, the query parser processes the query string before the analyzers are applied, so the added quotes arrive too late to be read as phrase syntax. So: no success

Is there any solution for this issue? Or is there another approach to achieve the desired behaviour?

Hopefully someone can help.

Thanks in advance!

Answer 1:

Using your example with three documents: "Hovercraft zero", "Hovercraft one", "Hovercraft two"

  1. Issue a prefix query to find all documents that contain terms that start with "Hover"

    search=Hover*

  2. To match the term "craft zer", you need to use the keyword analyzer (or the keyword tokenizer with the lowercase token filter) at indexing time, so that the elements of your string collection are not tokenized and each element is indexed as a single term. Then, at query time, you can issue a regex query (note that regex queries are much slower than term or prefix queries):

    search=/.*craft zer.*/&queryType=full
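
A regex query of this kind could be assembled from raw user input roughly as follows. This is a minimal sketch: `build_contains_query` is a hypothetical helper, the list of characters to escape is an assumption that should be checked against the Lucene query syntax reference for your service version, and lowercasing the term only makes sense if the index analyzer lowercases terms (for example the keyword tokenizer with the lowercase token filter).

```python
import re

# Characters escaped here: common Lucene regex operators plus "/", which
# delimits a regex query. This list is an assumption, not an official one.
_SPECIAL = re.compile(r'([.?+*|{}\[\]()"\\/#@&<>~])')


def build_contains_query(term, lowercase=True):
    """Build query parameters for a 'term occurs anywhere in the value' regex search."""
    if lowercase:
        # Assumes the indexed terms were lowercased at indexing time.
        term = term.lower()
    # Prefix every special character with a backslash so it is matched literally.
    escaped = _SPECIAL.sub(r"\\\1", term)
    return {"search": f"/.*{escaped}.*/", "queryType": "full"}


params = build_contains_query("craft zer")
print(params["search"])
```

The returned dictionary still has to be URL-encoded (for a GET request) or sent as a JSON body (for a POST request) by whatever client you use.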

Also, please use the Analyze API to test your custom analyzer configurations. It will help you make sure the analyzer produces the terms you expect.
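
As a rough sketch, a request to the Analyze API could be prepared like this in Python. The service name, index name, key, and the `api_version` default are placeholders and assumptions for illustration; `build_analyze_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_analyze_request(service, index, api_key, text, analyzer,
                          api_version="2019-05-06"):
    """Prepare a POST request that asks the service how `analyzer` tokenizes `text`.

    The api_version default is an assumption; use the version your service supports.
    """
    url = (f"https://{service}.search.windows.net/indexes/{index}/analyze"
           f"?api-version={api_version}")
    body = json.dumps({"text": text, "analyzer": analyzer}).encode("utf-8")
    headers = {"Content-Type": "application/json", "api-key": api_key}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values; replace them with your own service, index, and admin key.
req = build_analyze_request("my-service", "my-index", "<admin-key>",
                            "Hovercraft zero", "keyword")
print(req.full_url)
# To send it: urllib.request.urlopen(req), then inspect the tokens in the JSON response.
```

With the keyword analyzer you would expect a single token for the whole string; if the response shows several tokens, the field is still being tokenized and the regex query above will not match across words.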



Answer 2:

Thanks @Yahnoosh for your answer, I found a solution that worked for me.

Short example: I have an index with three fields (field1, field2, field3). From field3 I want only documents that exactly contain the search term. From field1 and field2 I want to get a "standard" result.

Solution: I rewrote the search query as follows:

field1:{searchterm} || field2:{searchterm} || field3:"{searchterm}" &queryType=full

With this search query, field1 and field2 are queried in the "standard" way, and field3 is queried with the behaviour I was looking for. There are certainly more efficient and elegant ways to solve this issue, but it worked for me.
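
For illustration, the query string could be built from the user's input like this in Python. This is a sketch under assumptions: `build_fielded_query` is a hypothetical helper, only quotes and backslashes are escaped, and the standard fields use the grouped form `field:(...)` so that a multi-word term stays scoped to the field, which differs slightly from the query shown above.

```python
def build_fielded_query(term, standard_fields, exact_field):
    """Combine 'standard' fielded clauses with a quoted phrase clause for one field."""
    # Escape backslashes and double quotes so the term cannot end the phrase early.
    # Other Lucene special characters are not handled in this sketch.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    clauses = [f"{field}:({escaped})" for field in standard_fields]
    clauses.append(f'{exact_field}:"{escaped}"')
    return {"search": " || ".join(clauses), "queryType": "full"}


query = build_fielded_query("craft zer", ["field1", "field2"], "field3")
print(query["search"])
```

Because the quotes are added by the application before the query is sent, the user does not have to type them.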

If anybody has a better solution let me know ;)