Using crc32 tweak on has_many relations in Thinkin

2019-05-24 07:07发布

It's weird actually. I have two models that have has_many relation each other, here are my models

#model city
class City < ActiveRecord::Base
belong_to :state
end

#model state
class State < ActiveRecord::Base
has_many :city
end

and I have state index

ThinkingSphinx::Index.define 'state', :with => :active_record do
  indexes state_name, :sortable => true

  #here is the problem
  has "CRC32(cities.city_name)", :as => :city_name, :type => :integer
end

I want to use city_name as a filter. My code above doesn't work and i got an error message when run

rake ts:index

here is the error message

ERROR: index 'state_core': sql_range_query: Unknown column 'cities.city_name' in 'field list'

but, when i put city_name in indexes block like below, the indexer runs well!

ThinkingSphinx::Index.define 'state', :with => :active_record do
  indexes state_name, :sortable => true
  indexes cities.city_name
  has "CRC32(cities.city_name)", :as => :city_name, :type => :integer
end

any suggestions ?

1条回答
贪生不怕死
2楼-- · 2019-05-24 07:56

Thinking Sphinx can't tell if you're referring to association tables within SQL snippets - so in your first example, there's nothing indicating that it needs to join on cities.

The join method within an index definition exists for this very purpose - so, try the following:

ThinkingSphinx::Index.define 'state', :with => :active_record do
  indexes state_name, :sortable => true

  has "CRC32(cities.city_name)", :as => :city_name, :type => :integer

  join cities
end

However, it's worth noting a few things: firstly, you may also need to add cities.city_name to the GROUP BY clause, since it's not part of any aggregate values:

# within index definition
group_by 'cities.city_name

But also: your State model has many cities, not just one, so it should actually be aggregated into a set of integer values, not just one. This means you don't need the group_by call, but you do need to add the aggregate behaviour yourself. This is done differently depending on whether you're using PostgreSQL or MySQL:

# PostgreSQL
has "array_to_string(array_agg(crc32(cities.name)), ',')",
  :as => :city_names, :type => :integer, :multi => true

# MySQL
has "GROUP_CONCAT(CRC32(cities.name) SEPARATOR ',')",
  :as => :city_names, :type => :integer, :multi => true

CRC32 is not a native function in PostgreSQL, and so you may need to add it yourself. Thinking Sphinx prior to v3 did this for you, but I've rewritten it so the CRC32 function is no longer required. This is largely due to the fact that CRC32 can result in collisions, and it can't be reversed, and so it's an inelegant and imperfect solution. Hence, I think using fields for string comparison is better, but it's up to you for whether this is preferred in your app.

I would recommend this approach instead:

ThinkingSphinx::Index.define :state, :with => :active_record do
  indexes state_name, :sortable => true

  has cities.id, :as => :city_ids
end

city = City.find_by_name('Melbourne')
State.search :with => {:city_ids => city.id}

It's accurate and elegant.

查看更多
登录 后发表回答