I have this query which runs a join on Books, TradingDesks and ProductInfos. The data is huge in each of this collection.
var queryJoin = from b in books.Values
join d in tradingDesks.Values
on b.TradingDeskId equals d.Id
join p in ProductInfos.Values
**on b.Id equals p.RiskBookId**
select new { p, Book = b.Name, TradingDeskName = d.Name };
In the highlighted line (on b.Id equals p.RiskBookId)
, I also want to add another condition like, (on b.Id equals p.RiskBookId || p.RiskBookId == 0)
. How do I do this in this linq syntax.
I tried querying like this
var queryJoin = from b in books.Values
from d in tradingDesks.Values.Where(x => x.Id == b.TradingDeskId)
from p in cachedProductInfos.Values.Where(y => y.RiskBookId == b.Id)
select new { p, Book = b.Name, TradingDeskName = d.Name };
But in this case, the query runs forever and I run out of memory. So I guess framing it in this fashion does something crazy :(
Any help is appreciated.
Thanks Mani
Actually :
or
Should work perfectly fine.
The join requires a true condition:
will return true or false, so the join SHOULD evaluate that just so.
In the original query, those calls to Enumerable.Join are using a hashtable behind the scenes to make things fast. If you switch to .Where, you don't get those hash benefits. You can use hashing explicitly to get the same effect.
try this, a little verbatim but should work as you described it:
you could try framing it as a union instead of a single join:
equals
can only be used to do a single comparison.Someone else may be able to give you an optimised approach to the query for your circumstance, but adding to the
equals
condition isn't the solution here. What might be best though is to approach it in a different fashion - rather than yanking at all the data from within the body of a single query - by some sort of bespoke loading and caching mechanism, since the size of the data is obviously problematic.