I need to join a table to the a select/group-by query (which includes the same table), and I'd like to do it using Arel.
I have a table of :phenotypes
which are has_and_belongs_to_many :genes
, which are themselves has_and_belongs_to_many :orthogroups
. As a result, the relationship between phenotypes and orthogroups are many-to-many.
I have two scopes (on Orthogroup) which get all orthogroups associated with a specific phenotype:
scope :with_phenotype, lambda { |phenotype_id|
where("observations.phenotype_id = ?", phenotype_id).
joins("inner join orthologies on (orthologies.orthogroup_id = orthogroups.id) inner join observations on (observations.gene_id = orthologies.gene_id)")
}
scope :with_associated_gene_ids_for_phenotype, lambda { |phenotype_id|
with_phenotype(phenotype_id).
select("orthogroups.id, array_agg(distinct observations.gene_id) as associated_gene_ids").
group("orthogroups.id")
}
Thus, doing Orthogroup.with_associated_gene_ids_for_phenotype(48291)
should return a table of orthogroup IDs and the genes which link them to the phenotypes.
That stuff all works fine.
The issue is that I'd like to get the rest of orthogroups.*
and join it to the results of the second scope, so that the list of genes is basically like an extra field on my Orthogroup ActiveRecord model.
Roughly, something like this:
SELECT o1.*, o_genes.associated_gene_ids
FROM orthogroups o1
INNER JOIN (
SELECT o2.id, array_agg(DISTINCT obs.gene_id) AS associated_gene_ids
FROM orthogroups o2
INNER JOIN orthologies ortho ON (ortho.orthogroup_id = o2.id)
INNER JOIN observations obs ON (ortho.gene_id = obs.gene_id)
WHERE obs.phenotype_id = ? GROUP BY o2.id
) AS o_genes
ON (o1.id = o_genes.id);
Now, that query appears to work. But I'd much rather find a way to join the Orthogroup table directly to its own scope to get those genes.
Perhaps it'd be simpler to use SQL, but it seems like there should be an easy way with Arel. I've found several similar questions, but none seem to have answers.
The closest solution I've found is this:
def self.orthogroups phenotype_id
Orthogroup.select("orthogroups.*, o_genes.associated_gene_ids").
joins(Arel.sql("inner join (" + Orthogroup.with_associated_gene_ids_for_phenotype(phenotype_id).to_sql + ") AS o_genes ON (o_genes.id = orthogroups.id)"))
end
The outputted SQL uses the table "orthogroups" in two contexts, and this worried me; however, a spot check of results suggests the query is correct.
Still, this is not the elegant solution for which I might have hoped. Is it possible to do this without the awkward "inner join (...)"
?