I am trying to change my sub-query in to a join where it selects only one record in the sub-query. It seems to run the sub-query for each found record, taking over a minute to execute:
select afield1, afield2, (
select top 1 b.field1
from anothertable as b
where b.aForeignKey = a.id
order by field1
) as bfield1
from sometable as a
If I try to only select related records, it doesn't know how to bind a.id in the nested select.
select afield1, afield2, bfield1
from sometable a left join (
select top 1 id, bfield, aForeignKey
from anothertable
where anothertable.aForeignKey = a.id
order by bfield) b on
b.aForeignKey = a.id
-- Results in the multi-part identifier "a.id" could not be bound
If I hard code values in the nested where clause, the select duration drops from 60 seconds to under five. Anyone have any suggestions on how to join the two tables while not processing every record in the inner table?
EDIT:
I ended up adding
left outer join (
select *, row_number() over (partition by / order by) as rank) b on
b.aforeignkey = a.id and b.rank = 1
went from ~50 seconds to 8 for 22M rows.