I have "documents" (ActiveRecord models) with an attribute called deviation. The attribute has values like "Bin X", "Bin $", "Bin q", "Bin %", etc.
I am trying to use Tire/Elasticsearch to search the attribute. I am using the whitespace analyzer to index the deviation attribute. Here is my code for creating the indexes:
    settings :analysis => {
      :filter => {
        :ngram_filter => {
          :type => "nGram",
          :min_gram => 2,
          :max_gram => 255
        },
        :deviation_filter => {
          :type => "word_delimiter",
          :type_table => ['$ => ALPHA']
        }
      },
      :analyzer => {
        :ngram_analyzer => {
          :type => "custom",
          :tokenizer => "standard",
          :filter => ["lowercase", "ngram_filter"]
        },
        :deviation_analyzer => {
          :type => "custom",
          :tokenizer => "whitespace",
          :filter => ["lowercase"]
        }
      }
    } do
      mapping do
        indexes :id, :type => 'integer'
        [:equipment, :step, :recipe, :details, :description].each do |attribute|
          indexes attribute, :type => 'string', :analyzer => 'ngram_analyzer'
        end
        indexes :deviation, :analyzer => 'whitespace'
      end
    end
The search seems to work fine when the query string contains no special characters. For example, "Bin X" will return only those records that have the words "Bin" AND "X" in them. However, searching for something like "Bin $" or "Bin %" shows all results that have the word "Bin", almost ignoring the symbol (results with the symbol do show up higher in the search than results without).
Here is the search method I have created:
    def self.search(params)
      tire.search(load: true) do
        query { string "#{params[:term].downcase}:#{params[:query]}", default_operator: "AND" }
        size 1000
      end
    end
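To make the behaviour easier to reason about, here is a minimal sketch of the string this method hands to the query parser, assuming the form submits the term "Deviation" and the query "Bin $". Both helper names are hypothetical and are not part of the model above. The second helper is only an illustrative variant that uses Lucene's field-grouping parentheses.

```ruby
# Hypothetical helper (not in the original model): mirrors the string
# interpolation used in self.search so the result can be inspected.
def build_query_string(term, query)
  "#{term.downcase}:#{query}"
end

# Hypothetical variant: parentheses group every term under the same field.
# It does not escape reserved query-string characters.
def build_grouped_query_string(term, query)
  "#{term.downcase}:(#{query})"
end

puts build_query_string("Deviation", "Bin $")         # deviation:Bin $
puts build_grouped_query_string("Deviation", "Bin $") # deviation:(Bin $)
```

In Lucene query-string syntax, a field prefix applies only to the term directly after it. In the first form, only "Bin" is scoped to the deviation field and "$" is matched against the default field. The grouped form is one possible way to keep both terms on the same field, offered here as an assumption rather than a verified fix for this index.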
Here is how I am building the search form:
    <div>
      <%= form_tag issues_path, :class => "formtastic issue", method: :get do %>
        <fieldset class="inputs">
          <ol>
            <li class="string input medium search query optional stringish inline">
              <% opts = ["Description", "Detail", "Deviation", "Equipment", "Recipe", "Step"] %>
              <%= select_tag :term, options_for_select(opts, params[:term]) %>
              <%= text_field_tag :query, params[:query] %>
              <%= submit_tag "Search", name: nil, class: "btn" %>
            </li>
          </ol>
        </fieldset>
      <% end %>
    </div>
You can sanitize your query string. Here is a sanitizer that works for everything that I've tried throwing at it: