I am trying to figure out how Lucene's analyzers work. Specifically, how does Lucene handle synonyms? Here is the situation: we have both single-word and multi-word synonyms.
single word: foo = bar
multi-word: foo bar = foobar
For single-word synonyms:
- Does Lucene expand the indexed documents, or only the query? My guess is that when a query contains a word like "foo", Lucene adds "bar" to the query as well, but I don't know whether the same expansion happens at index time.
For multi-word synonyms:
- Does Lucene expand at both query time and index time? For example, if the text contains "foo bar", does it add "foobar" to the index, to the query, or to both?
My second question: Lucene passes a stream of tokens through filters such as the lowercase filter. How does Lucene detect multi-word synonyms in that stream? In other words, how does it recognize that the consecutive tokens "foo" and "bar" together form the phrase "foo bar"?
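To make the second question concrete, here is a toy sketch in plain Java (no Lucene classes) of the kind of lookahead I imagine a filter would need in order to spot "foo bar" in a token stream. The class name ToySynonymExpander, the expand method, and the rule map are all hypothetical, and the greedy longest-match strategy is only my guess, not necessarily what Lucene does internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy illustration only: plain Java, no Lucene classes involved.
class ToySynonymExpander {

    // Keeps every input token. When a run of consecutive tokens matches a
    // rule, the rule's synonym is emitted right after the matched run.
    static List<String> expand(List<String> tokens, Map<List<String>, String> rules) {
        // The longest phrase in the rules bounds how far we must look ahead.
        int maxLen = 1;
        for (List<String> phrase : rules.keySet()) {
            maxLen = Math.max(maxLen, phrase.size());
        }

        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int matched = 0;
            String synonym = null;
            // Look ahead, trying the longest candidate phrase first.
            for (int len = Math.min(maxLen, tokens.size() - i); len >= 1; len--) {
                String hit = rules.get(tokens.subList(i, i + len));
                if (hit != null) {
                    matched = len;
                    synonym = hit;
                    break;
                }
            }
            if (matched == 0) {
                // No rule starts here: pass the token through unchanged.
                out.add(tokens.get(i));
                i++;
            } else {
                // Emit the original tokens, then the synonym, and skip the run.
                out.addAll(tokens.subList(i, i + matched));
                out.add(synonym);
                i += matched;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<List<String>, String> rules = Map.of(
            List.of("foo"), "bar",
            List.of("foo", "bar"), "foobar");
        System.out.println(expand(List.of("a", "foo", "bar", "b"), rules)); // [a, foo, bar, foobar, b]
        System.out.println(expand(List.of("foo", "x"), rules));             // [foo, bar, x]
    }
}
```

In this sketch the filter has to hold back "foo" until it has seen the next token, because it cannot know whether the single-word rule or the two-word rule applies. Is that roughly how Lucene's synonym handling buffers tokens, or does it work differently?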
Thanks.