I have a metric table that I expect to be very large. It has a polymorphic association so that it can belongs_to other models that want to record some metric. I typically index association columns like this to speed up association loading. I've heard people talking about joint-indexing this association. This looks like:
add_index :comments, [:commentable_type, :commentable_id]
But I've also heard counsel against creating indexes of low-cardinality, because the payoff of the index doesn't offset the overhead of maintaining it. Since the _type half of my polymorphic association will probably only have 4-5 values across the millions of rows, I'm inclined to only index on the _id portion of the polymorphic association. I will probably create some additional joint-indexes using the _id column and some other unmentioned integer and datetime columns, but I won't include the _type in these indexes either.
Is this what you would do/recommend?
It is best practice to put the most selective field first when creating an index on multiple fields. Since you only have 4-5 values of
commentable_type
, you would be better off doing:Ultimately, this is worth benchmarking before and after adding the index, on a realistic data set - realistic in size and data.
However, you're not creating an index on a field with just a few values. The index is on the combination of the two fields, which is likely to have a lot of different value combinations. The index on the combined fields is a smart idea.