EntityFramework - contains query of composite key

2019-01-01 06:10发布

given a list of ids, I can query all relevant rows by:

context.Table.Where(q => listOfIds.Contains(q.Id));

But how do you achieve the same functionality when the Table has a composite key?

6条回答
看风景的人
2楼-- · 2019-01-01 06:14

I tried this solution and it worked with me and the output query was perfect without any parameters

using LinqKit; // nuget     
   var customField_Ids = customFields?.Select(t => new CustomFieldKey { Id = t.Id, TicketId = t.TicketId }).ToList();

    var uniqueIds1 = customField_Ids.Select(cf => cf.Id).Distinct().ToList();
    var uniqueIds2 = customField_Ids.Select(cf => cf.TicketId).Distinct().ToList();
    var predicate = PredicateBuilder.New<CustomFieldKey>(false); //LinqKit
    var lambdas = new List<Expression<Func<CustomFieldKey, bool>>>();
    foreach (var cfKey in customField_Ids)
    {
        var id = uniqueIds1.Where(uid => uid == cfKey.Id).Take(1).ToList();
        var ticketId = uniqueIds2.Where(uid => uid == cfKey.TicketId).Take(1).ToList();
        lambdas.Add(t => id.Contains(t.Id) && ticketId.Contains(t.TicketId));
    }

    predicate = AggregateExtensions.AggregateBalanced(lambdas.ToArray(), (expr1, expr2) =>
     {
         var invokedExpr = Expression.Invoke(expr2, expr1.Parameters.Cast<Expression>());
         return Expression.Lambda<Func<CustomFieldKey, bool>>
               (Expression.OrElse(expr1.Body, invokedExpr), expr1.Parameters);
     });


    var modifiedCustomField_Ids = repository.GetTable<CustomFieldLocal>()
         .Select(cf => new CustomFieldKey() { Id = cf.Id, TicketId = cf.TicketId }).Where(predicate).ToArray();
查看更多
浪荡孟婆
3楼-- · 2019-01-01 06:18

This is a nasty problem for which I don't know any elegant solution.

Suppose you have these key combinations, and you only want to select the marked ones (*).

Id1  Id2
---  ---
1    2 *
1    3
1    6
2    2 *
2    3 *
... (many more)

How to do this is a way that Entity Framework is happy? Let's look at some possible solutions and see if they're any good.

Solution 1: Join (or Contains) with pairs

The best solution would be to create a list of the pairs you want, for instance Tuples, (List<Tuple<int,int>>) and join the database data with this list:

from entity in db.Table // db is a DbContext
join pair in Tuples on new { entity.Id1, entity.Id2 }
                equals new { Id1 = pair.Item1, Id2 = pair.Item2 }
select entity

In LINQ to objects this would be perfect, but, too bad, EF will throw an exception like

Unable to create a constant value of type 'System.Tuple`2 (...) Only primitive types or enumeration types are supported in this context.

which is a rather clumsy way to tell you that it can't translate this statement into SQL, because Tuples is not a list of primitive values (like int or string).1. For the same reason a similar statement using Contains (or any other LINQ statement) would fail.

Solution 2: In-memory

Of course we could turn the problem into simple LINQ to objects like so:

from entity in db.Table.AsEnumerable() // fetch db.Table into memory first
join pair Tuples on new { entity.Id1, entity.Id2 }
             equals new { Id1 = pair.Item1, Id2 = pair.Item2 }
select entity

Needless to say that this is not a good solution. db.Table could contain millions of records.

Solution 3: Two Contains statements

So let's offer EF two lists of primitive values, [1,2] for Id1 and [2,3] for Id2. We don't want to use join (see side note), so let's use Contains:

from entity in db.Table
where ids1.Contains(entity.Id1) && ids2.Contains(entity.Id2)
select entity

But now the results also contains entity {1,3}! Well, of course, this entity perfectly matches the two predicates. But let's keep in mind that we're getting closer. In stead of pulling millions of entities into memory, we now only get four of them.

Solution 4: One Contains with computed values

Solution 3 failed because the two separate Contains statements don't only filter the combinations of their values. What if we create a list of combinations first and try to match these combinations? We know from solution 1 that this list should contain primitive values. For instance:

var computed = ids1.Zip(ids2, (i1,i2) => i1 * i2); // [2,6]

and the LINQ statement:

from entity in db.Table
where computed.Contains(entity.Id1 * entity.Id2)
select entity

There are some problems with this approach. First, you'll see that this also returns entity {1,6}. The combination function (a*b) does not produce values that uniquely identify a pair in the database. Now we could create a list of strings like ["Id1=1,Id2=2","Id1=2,Id2=3]" and do

from entity in db.Table
where computed.Contains("Id1=" + entity.Id1 + "," + "Id2=" + entity.Id2)
select entity

(This would work in EF6, not in earlier versions).

This is getting pretty messy. But a more important problem is that this solution is not sargable, which means: it bypasses any database indexes on Id1 and Id2 that could have been used otherwise. This will perform very very poorly.

Solution 5: Best of 2 and 3

So the only viable solution I can think of is a combination of Contains and a join in memory: First do the contains statement as in solution 3. Remember, it got us very close to what we wanted. Then refine the query result by joining the result as an in-memory list:

var rawSelection = from entity in db.Table
                   where ids1.Contains(entity.Id1) && ids2.Contains(entity.Id2)
                   select entity;

var refined = from entity in rawSelection.AsEnumerable()
              join pair in Tuples on new { entity.Id1, entity.Id2 }
                              equals new { Id1 = pair.Item1, Id2 = pair.Item2 }
              select entity;

It's not elegant, messy all the same maybe, but so far it's the only scalable2 solution to this problem I found, and applied in my own code.

Solution 6: Build a query with OR clauses

Using a Predicate builder like Linqkit or alternatives, you can build a query that contains an OR clause for each element in the list of combinations. This could be a viable option for really short lists. With a couple of hundreds of elements, the query will start performing very poorly. So I don't consider this a good solution unless you can be 100% sure that there will always be a small number of elements. One elaboration of this option can be found here.


1As a funny side note, EF does create a SQL statement when you join a primitive list, like so

from entity in db.Table // db is a DbContext
join i in MyIntegers on entity.Id1 equals i
select entity

But the generated SQL is, well, absurd. A real-life example where MyIntegers contains only 5(!) integers looks like this:

SELECT 
    [Extent1].[CmpId] AS [CmpId], 
    [Extent1].[Name] AS [Name], 
    FROM  [dbo].[Company] AS [Extent1]
    INNER JOIN  (SELECT 
        [UnionAll3].[C1] AS [C1]
        FROM  (SELECT 
            [UnionAll2].[C1] AS [C1]
            FROM  (SELECT 
                [UnionAll1].[C1] AS [C1]
                FROM  (SELECT 
                    1 AS [C1]
                    FROM  ( SELECT 1 AS X ) AS [SingleRowTable1]
                UNION ALL
                    SELECT 
                    2 AS [C1]
                    FROM  ( SELECT 1 AS X ) AS [SingleRowTable2]) AS [UnionAll1]
            UNION ALL
                SELECT 
                3 AS [C1]
                FROM  ( SELECT 1 AS X ) AS [SingleRowTable3]) AS [UnionAll2]
        UNION ALL
            SELECT 
            4 AS [C1]
            FROM  ( SELECT 1 AS X ) AS [SingleRowTable4]) AS [UnionAll3]
    UNION ALL
        SELECT 
        5 AS [C1]
        FROM  ( SELECT 1 AS X ) AS [SingleRowTable5]) AS [UnionAll4] ON [Extent1].[CmpId] = [UnionAll4].[C1]

There are n-1 UNIONs. Of course that's not scalable at all.

Later addition:
Somewhere along the road to EF version 6.1.3 this has been greatly improved. The UNIONs have become simpler and they are no longer nested. Previously the query would give up with less than 50 elements in the local sequence (SQL exception: Some part of your SQL statement is nested too deeply.) The non-nested UNION allow local sequences up to a couple of thousands(!) of elements. It's still slow though with "many" elements.

2As far as the Contains statement is scalable: Scalable Contains method for LINQ against a SQL backend

查看更多
大哥的爱人
4楼-- · 2019-01-01 06:20

in Case of composite key you can use another idlist and add a condition for that in your code

context.Table.Where(q => listOfIds.Contains(q.Id) && listOfIds2.Contains(q.Id2));

or you can use one another trick create a list of your keys by adding them

listofid.add(id+id1+......)
context.Table.Where(q => listOfIds.Contains(q.Id+q.id1+.......));
查看更多
浮光初槿花落
5楼-- · 2019-01-01 06:29

In the absence of a general solution, I think there are two things to consider:

  1. Avoid multi-column primary keys (will make unit testing easier too).
  2. But if you have to, chances are that one of them will reduce the query result size to O(n) where n is the size of the ideal query result. From here, its Solution 5 from Gerd Arnold above.

For example, the problem leading me to this question was querying order lines, where the key is order id + order line number + order type, and the source had the order type being implicit. That is, the order type was a constant, order ID would reduce the query set to order lines of relevant orders, and there would usually be 5 or less of these per order.

To rephrase: If you have a composite key, changes are that one of them have very few duplicates. Apply Solution 5 from above with that.

查看更多
路过你的时光
6楼-- · 2019-01-01 06:32

You need a set of objects representing the keys you want to query.

class Key
{
    int Id1 {get;set;}
    int Id2 {get;set;}

If you have two lists and you simply check that each value appears in their respective list then you are getting the cartesian product of the lists - which is likely not what you want. Instead you need to query the specific combinations required

List<Key> keys = // get keys;

context.Table.Where(q => keys.Any(k => k.Id1 == q.Id1 && k.Id2 == q.Id2)); 

I'm not completely sure that this is valid use of Entity Framework; you may have issues with sending the Key type to the database. If that happens then you can be creative:

var composites = keys.Select(k => p1 * k.Id1 + p2 * k.Id2).ToList();
context.Table.Where(q => composites.Contains(p1 * q.Id1 + p2 * q.Id2)); 

You can create an isomorphic function (prime numbers are good for this), something like a hashcode, which you can use to compare the pair of values. As long as the multiplicative factors are co-prime this pattern will be isomorphic (one-to-one) - i.e. the result of p1*Id1 + p2*Id2 will uniquely identify the values of Id1 and Id2 as long as the prime numbers are correctly chosen.

But then you end up in a situation where you're implementing complex concepts and someone is going to have to support this. Probably better to write a stored procedure which takes the valid key objects.

查看更多
梦醉为红颜
7楼-- · 2019-01-01 06:34

You can create a collection of strings with both keys like this (I am assuming that your keys are int type):

var id1id2Strings = listOfIds.Select(p => p.Id1+ "-" + p.Id2);

Then you can just use "Contains" on your db:

using (dbEntities context = new dbEntities())
            {
                var rec = await context.Table1.Where(entity => id1id2Strings .Contains(entity.Id1+ "-" + entity.Id2));
                return rec.ToList();
            }
查看更多
登录 后发表回答