ElasticSearch - Searching with hyphens in name

2019-01-26 13:48发布

问题:

I have a product catalog which I am indexing in ElasticSearch using the Elastica client. I am very new to ElasticSearch BTW.

There are products in my catalog which have 't-shirt' in their names. But, they won't appear in search results if I type 'tshirt'.

What can I do so that 't-shirt' can also pop-up in results?

I have followed this tutorial and implemented the following for indexes:

'analysis' => array(
    'analyzer' => array(
        'indexAnalyzer' => array(
            'type' => 'custom',
            'tokenizer' => 'whitespace',
            'filter' => array('lowercase', 'mySnowball')
        ),
        'searchAnalyzer' => array(
            'type' => 'custom',
            'tokenizer' => 'whitespace',
            'filter' => array('lowercase', 'mySnowball')
        )
    ),
    'filter' => array(
        'mySnowball' => array(
            'type' => 'snowball',
            'language' => 'English'
        )
    )
)

回答1:

You can try removing the hyphen using a mapping char filter:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-mapping-charfilter.html

something like this would remove the hyphen:

{
    "index" : {
        "analysis" : {
            "char_filter" : {
                "my_mapping" : {
                    "type" : "mapping",
                    "mappings" : ["-=>"]
                }
            },
            "analyzer" : {
                "custom_with_char_filter" : {
                    "tokenizer" : "standard",
                    "char_filter" : ["my_mapping"]
                }
            }
        }
    }
}

it's something of a blunt force instrument as it will strip all hyphens but it should make "t-shirt" and "tshirt" match