Elasticsearch with Tire: edgeNgram with multiple w

Posted 2019-05-11 12:22

Let's say I have 5 film titles:

  • Sans Soleil
  • Sansa
  • So Is This
  • Sol Goode
  • Sole Survivor

I want to implement an auto-complete search field with this expected behavior:

  • "Sans" > Sans Soleil, Sansa
  • "Sans so" > Sans Soleil
  • "So" > So Is This, Sol Goode, Sole Survivor
  • "So Is" > So Is This
  • "Sol" > Sol Goode, Sole Survivor, Sans Soleil

This use case seems obvious and must be a common one, but I can't get it to work properly and I can't find any answer or documentation that helps. This is my current model:

class Film < Media
  include Tire::Model::Search
  include Tire::Model::Callbacks

  settings  :analysis => {
              :filter => {
                :title_ngram  => {
                  "type"      => "edgeNGram",
                  "min_gram"  => 2,
                  "max_gram"  => 8,
                  "side"      => "front" }
              },
              :analyzer => {
                :title_analyzer => {
                  "tokenizer"    => "lowercase",
                  "filter"       => ["title_ngram"],
                  "type"         => "custom" }
              }
            } do
    mapping do
      indexes :title, :type => 'string', :analyzer => 'title_analyzer'
      indexes :int_english_title, :type => 'string', :analyzer => 'title_analyzer'
    end
  end
end

And this is how the query is handled in my search_controller:

search = Tire.search ['books', 'films', 'shows'], :load => true, :page => 1, :per_page => 10 do |s|
  s.query do |query|
    query.string "title:#{params[:search]}"
  end
end
@results = search.results

This produces some strange behavior:

  • "Sans so" returns "Sansa, Sans Soleil, So Is This" in that order.
  • "So is" returns "Sol Goode, Sans Soleil, Sole Survivor, So Is This" in that order.

2 answers
女痞
Reply #2 · 2019-05-11 13:10

I think you might achieve what you want with the match query set to type:"phrase_prefix". Most, but not all, of your examples would work.
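As a rough sketch of what such a request could look like, the snippet below builds the raw Elasticsearch request body as a plain Ruby hash rather than going through Tire. The helper name `phrase_prefix_body` is hypothetical, and the sketch assumes the older match query syntax that accepts a `type` option (newer Elasticsearch versions expose this as a separate `match_phrase_prefix` query). It also assumes the field is searched with a plain analyzer rather than the n-gram one.

```ruby
require "json"

# Hypothetical helper: builds the body of a match query of type
# "phrase_prefix" for one field. Assumes the older Elasticsearch
# syntax where `type` is an option of the match query.
def phrase_prefix_body(field, text)
  { query: { match: { field => { query: text, type: "phrase_prefix" } } } }
end

# Print the JSON that would be sent as the search request body.
puts JSON.pretty_generate(phrase_prefix_body("title", "Sans so"))
```

With a phrase prefix, the last term ("so") is treated as a prefix and the preceding terms must appear before it in order, which is why "Sans so" would be expected to match only "Sans Soleil".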

With n-grams you have much finer control over the process, but they have rather high recall (they usually return more results than you want), and you have to fight that. This is the "strange behaviour" you observe with multiple query terms ("Sans so"): they are effectively executed as a Sans OR so query.
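To make the OR effect concrete, here is a small in-memory simulation in plain Ruby. It is not Elasticsearch or Tire code: `edge_ngrams` and `search` are hypothetical helpers that only mimic the analyzer from the question (split on non-letters, lowercase, front edge n-grams of 2 to 8 characters), and it assumes the query text goes through the same analyzer as the indexed titles.

```ruby
# Simplified stand-in for the question's analyzer: split on non-letters,
# lowercase, then emit front edge n-grams between min and max characters.
def edge_ngrams(text, min = 2, max = 8)
  text.downcase.scan(/[a-z]+/).flat_map do |word|
    (min..[max, word.length].min).map { |n| word[0, n] }
  end.uniq
end

TITLES = ["Sans Soleil", "Sansa", "So Is This", "Sol Goode", "Sole Survivor"]

# title => list of indexed n-grams
INDEX = TITLES.map { |title| [title, edge_ngrams(title)] }.to_h

# operator :or  -> a title matches if it contains ANY query n-gram
# operator :and -> a title matches only if it contains ALL query n-grams
def search(query, operator)
  grams = edge_ngrams(query)
  INDEX.select do |_title, indexed|
    if operator == :and
      grams.all? { |g| indexed.include?(g) }
    else
      grams.any? { |g| indexed.include?(g) }
    end
  end.keys
end

p search("Sans so", :or)   # all five titles: each one contains "sa" or "so"
p search("Sans so", :and)  # ["Sans Soleil"]
p search("So Is", :and)    # ["So Is This"]
```

The simulation ignores scoring, so it shows only which titles match, not the order in which Elasticsearch would rank them.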

Try using the default_operator: "AND" option (see Tire's query_string_test.rb), or rather the match query (see Tire's match_query_test.rb) with the operator: "AND" option.
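For reference, the two options could translate into request bodies along these lines. This is only a sketch of the raw Elasticsearch JSON built as Ruby hashes; the helper names `query_string_and_body` and `match_and_body` are hypothetical, and how exactly Tire serialises its own `string` and `match` calls is best checked against the test files mentioned above.

```ruby
require "json"

# Hypothetical helper: query_string query where all terms are required.
def query_string_and_body(field, text)
  { query: { query_string: { query: "#{field}:(#{text})", default_operator: "AND" } } }
end

# Hypothetical helper: match query where all analyzed terms are required.
def match_and_body(field, text)
  { query: { match: { field => { query: text, operator: "and" } } } }
end

puts JSON.pretty_generate(query_string_and_body("title", "Sans so"))
puts JSON.pretty_generate(match_and_body("title", "Sans so"))
```

The match variant avoids query_string syntax altogether, so characters typed by the user (such as ":" or "*") are not interpreted as operators.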

There are some articles about autocomplete, Tire and Ngrams available:

来,给爷笑一个
Reply #3 · 2019-05-11 13:18

Try the following:

search = Tire.search ['books', 'films', 'shows'], :load => true, :page => 1, :per_page => 10 do |s|
  s.query do |q|
    q.boolean do |b|
      b.must { |m| m.string params[:search] }
    end
  end
end