-->

SQLAlchemy ORM select multiple entities from subqu

2020-06-18 10:28发布

问题:

I need to query multiple entities, something like session.query(Entity1, Entity2), only from a subquery rather than directly from the tables. The docs have something about selecting one entity from a subquery but I can't find how to select more than one, either in the docs or by experimentation.

My use case is that I need to filter the tables underlying the mapped classes by a window function, which in PostgreSQL can only be done in a subquery or CTE.

EDIT: The subquery spans a JOIN of both tables so I can't just do aliased(Entity1, subquery).

回答1:

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class A(Base):
    __tablename__ = "a"

    id = Column(Integer, primary_key=True)
    bs = relationship("B")

class B(Base):
    __tablename__ = "b"

    id = Column(Integer, primary_key=True)

    a_id = Column(Integer, ForeignKey('a.id'))

e = create_engine("sqlite://", echo=True)
Base.metadata.create_all(e)
s = Session(e)
s.add_all([A(bs=[B(), B()]), A(bs=[B()])])
s.commit()

# with_labels() here is to disambiguate A.id and B.id.
# without it, you'd see a warning
# "Column 'id' on table being replaced by another column with the same key."
subq = s.query(A, B).join(A.bs).with_labels().subquery()


# method 1 - select_from()
print s.query(A, B).select_from(subq).all()

# method 2 - alias them both.  "subq" renders
# once because FROM objects render based on object
# identity.
a_alias = aliased(A, subq)
b_alias = aliased(B, subq)
print s.query(a_alias, b_alias).all()


回答2:

I was trying to do something like the original question: join a filtered table with another filtered table using an outer join. I was struggling because it's not at all obvious how to:

  • create a SQLAlchemy query that returns entities from both tables. @zzzeek's answer showed me how to do that: get_session().query(A, B).
  • use a query as a table in such a query. @zzzeek's answer showed me how to do that too: filtered_a = aliased(A).
  • use an OUTER join between the two entities. Using select_from() after outerjoin() destroys the join condition between the tables, resulting in a cross join. From @zzzeek answer I guessed that if a is aliased(), then you can include a in the query() and also .outerjoin(a), and it won't be joined a second time, and that appears to work.

Following either of @zzzeek's suggested approaches directly resulted in a cross join (combinatorial explosion), because one of my models uses inheritance, and SQLAlchemy added the parent tables outside the inner SELECT without any conditions! I think this is a bug in SQLAlchemy. The approach that I adopted in the end was:

filtered_a = aliased(A, A.query().filter(...)).subquery("filtered_a")
filtered_b = aliased(B, B.query().filter(...)).subquery("filtered_b")
query = get_session().query(filtered_a, filtered_b)
query = query.outerjoin(filtered_b, filtered_a.relation_to_b)
query = query.order_by(filtered_a.some_column)

for a, b in query:
    ...