I have been playing around with ElasticSearch for a new project of mine. I have set the default analyzers to use the nGram token filter. This is my elasticsearch.yml file:
index:
  analysis:
    analyzer:
      default_index:
        tokenizer: standard
        filter: [standard, stop, mynGram]
      default_search:
        tokenizer: standard
        filter: [standard, stop]
    filter:
      mynGram:
        type: nGram
        min_gram: 1
        max_gram: 10
I created a new index and added the following document to it:
$ curl -XPUT http://localhost:9200/test/newtype/3 -d '{"text": "one two three four five six"}'
{"ok":true,"_index":"test","_type":"newtype","_id":"3"}
However, when I search using the query text:hree or text:ive or any other partial term, ElasticSearch does not return this document. It returns the document only when I search for an exact term (like text:two).
I have also tried changing the config file such that default_search also uses the ngram token filter, but the result was the same. What am I doing wrong here and how do I correct it?
Not sure about the default_* settings.
But applying a mapping that specifies index_analyzer and search_analyzer works:
curl -XDELETE localhost:9200/twitter
curl -XPOST localhost:9200/twitter -d '
{"index":
  { "number_of_shards": 1,
    "analysis": {
      "filter": {
        "mynGram" : {"type": "nGram", "min_gram": 2, "max_gram": 10}
      },
      "analyzer": { "a1" : {
          "type":"custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "mynGram"]
        }
      }
    }
  }
}'
curl -XPUT localhost:9200/twitter/tweet/_mapping -d '{
  "tweet" : {
    "index_analyzer" : "a1",
    "search_analyzer" : "standard",
    "date_formats" : ["yyyy-MM-dd", "dd-MM-yyyy"],
    "properties" : {
      "user": {"type":"string", "analyzer":"standard"},
      "message" : {"type" : "string" }
    }
}}'
curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
  "user" : "kimchy",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elastic Search"
}'
curl -XGET 'localhost:9200/twitter/_search?q=ear'
curl -XGET 'localhost:9200/twitter/_search?q=sea'
curl -XGET localhost:9200/twitter/_mapping
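To see why q=ear and q=sea can match the tweet above, here is a rough Python sketch of what an nGram token filter with min_gram 2 and max_gram 10 does to each indexed token. This is not ElasticSearch code: the function names ngrams and index_terms are made up for illustration, and a plain whitespace split stands in for the standard tokenizer. The point is that n-grams are produced at index time only, while the query term is left whole by the standard search analyzer and is looked up among the indexed grams.

```python
def ngrams(token, min_gram, max_gram):
    """Return every substring of `token` with length min_gram..max_gram."""
    grams = []
    for size in range(min_gram, max_gram + 1):
        # No iterations happen once `size` exceeds the token length.
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


def index_terms(text, min_gram=2, max_gram=10):
    """Rough stand-in for analyzer a1: split, lowercase, then n-gram.

    Assumption: whitespace splitting approximates the standard tokenizer
    well enough for this simple example sentence.
    """
    terms = set()
    for token in text.lower().split():
        terms.update(ngrams(token, min_gram, max_gram))
    return terms


# Terms that would end up in the index for the "message" field.
terms = index_terms("trying out Elastic Search")

# The search side is not n-grammed, so each query is one lowercase term
# that either is or is not among the indexed grams.
for query in ("ear", "sea", "xyz"):
    print(query, query in terms)
```

With the sample message, "ear" and "sea" are both grams of "search", so they are found, while "xyz" is not. If the search analyzer also produced n-grams, a query would be broken into many short grams and could match far more documents than intended.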
You should check the get mapping API to see if your mapping has been applied:
http://www.elasticsearch.org/guide/reference/api/admin-indices-get-mapping.html
Btw it has been said on the mailing list that when an index already contains documents, the mappings you put in elasticsearch.yml are not applied. You need to clean your index first.
I've tried ngrams with ES and it works fine for me.