How to not-analyze in ElasticSearch?

2019-01-08 14:38发布

I've got a field in an ElasticSearch field which I do not want to have analyzed, i. e. it should be stored and compared verbatim. The values will contain letters, numbers, whitespace, dashes, slashes and maybe other characters.

If I do not give an analyzer in my mapping for this field, the default still uses a tokenizer which hacks my verbatim string into chunks of words. I don't want that.

Is there a super simple analyzer which, basically, does not analyze? Or is there a different way of denoting that this field shall not be analyzed?

I only create the index, I don't do anything else. I can use analyzers like "english" for other fields which seems to be built-in names for pre-configured analyzers. Is there a list of other names? Maybe there's one fitting my needs (namely doing nothing with the input).

This is my mapping currently:

{
  "my_type": {
    "properties": {
      "my_field1": { "type": "string", "analyzer": "english" },
      "my_field2": { "type": "string" }
    }
  }
}

my_field1 is language-dependent; this seems to work. my_field2 shall be verbatim. I'd like to give an analyzer there which simply does not do anything.

A sample value for my_field2 would be "B45c 14/04".

3条回答
地球回转人心会变
2楼-- · 2019-01-08 15:15

keyword analyser can be also used.

// don't actually use this, use "index": "not_analyzed" instead
{
  "my_type": {
    "properties": {
      "my_field1": { "type": "string", "analyzer": "english" },
      "my_field2": { "type": "string", "analyzer": "keyword" }
    }
  }
}

As noted here: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-keyword-analyzer.html, it makes more sense to mark those fields as not_analyzed.

But keyword analyzer can be useful when it is set by default for whole index.

UPDATE: As it said in comments, string is no longer supported in 5.X

查看更多
我欲成王,谁敢阻挡
3楼-- · 2019-01-08 15:21

This is no longer true due to the removal of the string (replaced by keyword and ``) type as described here. Instead you should use "index": true | false. For Example OLD:

{
  "foo": {
    "type" "string",
    "index": "not_analyzed"
  }
}

becomes NEW:

{
  "foo": {
    "type" "keyword",
    "index": true
  }
}

This means the field is indexed but as it is typed as keyword not analyzed implicitly.

查看更多
We Are One
4楼-- · 2019-01-08 15:37
"my_field2": {
    "properties": {
        "title": {
            "type": "string",
            "index": "not_analyzed"
        }
    }
}

Check you here, https://www.elastic.co/guide/en/elasticsearch/reference/1.4/mapping-core-types.html, for further info.

查看更多
登录 后发表回答