I'm trying to do the following with NHibernate 3.0's LINQ interface. I want to query for an object (using some Where clause), and load some children and grandchildren. Currently I'm doing it like so:
var results = session.Query<Thing>()
.Where(...)
.Fetch(x => x.SubThingA)
.ThenFetch(st => st.SubSubThingA)
.Fetch(x => x.SubThingB)
.ThenFetch(st => st.SubSubThingB)
// etc...
However, this results in a Cartesian product between all grandchildren (every result row contains many, many columns). This is discussed by "ayende" here. On the other hand I get a single round-trip, unlike splitting the query and then combining it.
How can I do it in a better (SQL and performance-wise) way, still using NHibernate's LINQ interface?
(For one thing, I've noticed that currently the ToFuture methods don't work when you use Fetch)
Thanks a lot!
In most cases, you'll get better performance by using batch-size
in entities and collections instead of creating a mega-query.
Best case scenario, it's a query by id per root entity type.
Let's say you have a root entity Customer, which has a collection of Orders, which have a collection of OrderItems, which reference Products, and all batch-size
properties are set to 1000.
Say you retrieve a list of 10 customers, which have in average 10 orders with 10 products each:
var results = session.Query<Customer>().Where(...).Take(10).ToList();
- The first query will fetch just the customers.
- When you start iterating the first customer.Orders collection, one query will be used to load all of them (for all the customers)
- When you start iterating the first order.OrderItems collection, one query will be used to load all of them (for all the orders and all the customers)
- When you read a property from the first product, one query will be used to load all of them
So, you have just 4 queries, with no joins at all, retrieving everything by PK. It's easy and efficient.
Although Diego's answer is the accepted method for doing these things in NHibernate, I really don't feel comfortable with that approach. I don't want to have to define explicit contracts for my objects just because I may need to retrieve them in certain ways. Also, I don't always want to serialize them at all. Furthermore, in many cases I know that the best performance will always be a single round-trip to fetch all the data.
The solution I ended up using was to implement a function that takes a list of (type safe) expressions on the root object, such as
x => x.Child.GrandChild1
x => x.Child.GrandChild2Collection.SubInclude(c => c.GreatGrandChild)
Where SubInclude is an extension method for IEnumerable that is used when parsing these expressions.
I parse this list of expressions and build, for each subpath of each expression (x, x.Child, x.Child.GrandChild1) an NHibernate criteria query on the root type:
var queryOver = session.QueryOver<T>().Where( ...expression to select root objects... );
for every subpath in the current expression:
queryOver.RootCriteria.SetFetchMode(subPath, FetchMode.Eager)
queryOver.RootCriteria
.SetResultTransformer(new DistinctRootEntityResultTransformer())
queryOver.Future()
This is repeated for every expression in the list.
The last line makes sure that this eager fetch will be included in whatever round-trip happens next. Then I make the actual query on the root object T, and the session automatically performs in that same round-trip all the queries required to fetch each of the paths I've passed in the expressions.
The queries are performed separately for each expression path, so there is no cartesian-product problem.
Bottom line is that this is no simple feat. There's a little too much code for me to publish as-is. I prefer EF4.1's Include(expression) API, which does all of this automagically.