Good behaviour for eager loading multiple siblings

2019-05-08 19:23发布

问题:

I'm trying to do the following with NHibernate 3.0's LINQ interface. I want to query for an object (using some Where clause), and load some children and grandchildren. Currently I'm doing it like so:

var results = session.Query<Thing>()
                     .Where(...)
                     .Fetch(x => x.SubThingA)
                     .ThenFetch(st => st.SubSubThingA)

                     .Fetch(x => x.SubThingB)
                     .ThenFetch(st => st.SubSubThingB)

                     // etc...

However, this results in a Cartesian product between all grandchildren (every result row contains many, many columns). This is discussed by "ayende" here. On the other hand I get a single round-trip, unlike splitting the query and then combining it.

How can I do it in a better (SQL and performance-wise) way, still using NHibernate's LINQ interface?

(For one thing, I've noticed that currently the ToFuture methods don't work when you use Fetch)

Thanks a lot!

回答1:

In most cases, you'll get better performance by using batch-size in entities and collections instead of creating a mega-query.

Best case scenario, it's a query by id per root entity type.


Let's say you have a root entity Customer, which has a collection of Orders, which have a collection of OrderItems, which reference Products, and all batch-size properties are set to 1000.

Say you retrieve a list of 10 customers, which have in average 10 orders with 10 products each:

var results = session.Query<Customer>().Where(...).Take(10).ToList();
  • The first query will fetch just the customers.
  • When you start iterating the first customer.Orders collection, one query will be used to load all of them (for all the customers)
  • When you start iterating the first order.OrderItems collection, one query will be used to load all of them (for all the orders and all the customers)
  • When you read a property from the first product, one query will be used to load all of them

So, you have just 4 queries, with no joins at all, retrieving everything by PK. It's easy and efficient.



回答2:

Although Diego's answer is the accepted method for doing these things in NHibernate, I really don't feel comfortable with that approach. I don't want to have to define explicit contracts for my objects just because I may need to retrieve them in certain ways. Also, I don't always want to serialize them at all. Furthermore, in many cases I know that the best performance will always be a single round-trip to fetch all the data.

The solution I ended up using was to implement a function that takes a list of (type safe) expressions on the root object, such as

x => x.Child.GrandChild1
x => x.Child.GrandChild2Collection.SubInclude(c => c.GreatGrandChild)

Where SubInclude is an extension method for IEnumerable that is used when parsing these expressions.

I parse this list of expressions and build, for each subpath of each expression (x, x.Child, x.Child.GrandChild1) an NHibernate criteria query on the root type:

var queryOver = session.QueryOver<T>().Where( ...expression to select root objects... );
for every subpath in the current expression:
    queryOver.RootCriteria.SetFetchMode(subPath, FetchMode.Eager)

queryOver.RootCriteria
         .SetResultTransformer(new DistinctRootEntityResultTransformer())

queryOver.Future()

This is repeated for every expression in the list. The last line makes sure that this eager fetch will be included in whatever round-trip happens next. Then I make the actual query on the root object T, and the session automatically performs in that same round-trip all the queries required to fetch each of the paths I've passed in the expressions.

The queries are performed separately for each expression path, so there is no cartesian-product problem.

Bottom line is that this is no simple feat. There's a little too much code for me to publish as-is. I prefer EF4.1's Include(expression) API, which does all of this automagically.