Linq .Contains with large set causes TDS error

2019-07-02 02:06发布

问题:

I've oversimplified this a bit, because I'm looking for a general-purpose answer. Let's say I've got a table setup like this:

Parent
recno    int (unique, pk)
date     datetime
stuff    varchar(50)

Child
parentrecno (int, fk)   --- PK
sequence    (int)       --- PK
data     varchar(50)

And in my C# program I've gone through a lot of hassle to find the Parent records that I'm interested in and stuff them into a list. Parent is a really large table, and I'd rather not query it more than necessary. So I squirrel away the keys:

List<int> recs = (from d in Parent where [.....] select d.recno).ToList();

Later on in Linq I can say, find all of the child records for the associated parents:

var kids = from k in database.Childs
      where recs.Contains(k.parentrecno)
      select new { k };

This is all great until recs contains more than 2100 entries. Then I get a TDS RPC error (too many parameters).

The way I see it I can:

  • Do the whole thing in straight up SQL (didn't really want to do go through the hassle with a DataReader, etc...). There was an external system involved in qualifying the records, so I don't know if that's entirely possible either. Plus, I'd be generating that list twice -- once when I need to use it in .Contains(), and again for other purposes.

  • Break the list (recs) up, and then read Child in chunks.

If I break it up in chunks, then my pretty Linq a little farther down won't work at all:

var kids2 = (from kid in paydb.Childs
         where
             recs.Contains(kid.parentrecno)
         group pay by kid.parentrecno into kgroup
         select new { ParentRecNo = kgroup.Key, KidRecords = kgroup })
              .ToDictionary(kx => kx.ParentRecNo);

Because the List recs will contain things that needed to be grouped together, but necessarily split apart for the Linq query.

回答1:

We use a SQL function taking a varchar(max) delimited list of values as an argument and returning a table variable. The linq looks like this:

from a in Context.SomeTable    
join nl in Context.ParseDelimited(nodeIdList) on a.NodeId.ToString() equals nl.FieldValue

where nodeIdList is a string variable containing the list of ids from a previous query. Ugly, but it does get around the 2100 parameter limit.

create function dbo.ParseDelimited(@delimitedList NVARCHAR(MAX)) returns @tblSample table(counter int, fieldValue NVARCHAR(100)) 
WITH SCHEMABINDING
as
begin
  declare @counter    NVARCHAR(  4)
  declare @fieldValue NVARCHAR(100)

  declare @tmpTable table(counter int primary key, fieldValue NVARCHAR(100))

  set @counter = 1

  while charindex(',', @delimitedList) > 0
  begin
    set @fieldValue = ltrim(rtrim(substring(@delimitedList, 1, charIndex(',', @delimitedList)-1)))

    insert into @tmpTable select @counter, @fieldValue

    set @delimitedList = ltrim(rtrim(substring(@delimitedList, (charindex(',', @delimitedList) + 1), len(@delimitedList))))

    set @counter = @counter + 1
  end

  if ltrim(rtrim(@delimitedList)) != ''
  begin
    insert into @tmpTable select @counter, @delimitedList
  end

  insert into @tblSample select counter, fieldValue from @tmpTable

  return
end


回答2:

This looks like a job for Linq .Join() . I've used objects below instead of a database as the data source but the concept is the same if your LinqToSql provider supports it.

List<int> recs = new List<int>(Enumerable.Range(1, 2500));
List<Child> children = new List<Child>(Enumerable.Range(2000, 1000)
    .Select(x => new Child
    {
        ParentRecNo = x
    }));

var kids = children.Join(recs, x => x.ParentRecNo, y => y, (x, y) => x);

Console.WriteLine(kids.First().ParentRecNo);
Console.WriteLine(kids.Last().ParentRecNo);

output:

2000
2500

You can use the same Join in your Dictionary creation code as follows:

var kids2 = children
    .Join(recs, x => x.ParentRecNo, y => y, (x, y) => x)
    .GroupBy(x => x.ParentRecNo)
    .Select(kgroup => new
        {
            ParentRecNo = kgroup.Key,
            KidRecords = kgroup
        })
    .ToDictionary(kx => kx.ParentRecNo);


回答3:

Instead of using SQL and a DataReader you could also write two stored procedures and use those via LINQ. Or even read the list of id's via LINQ and feed that as input to your stored procedures.

You probably won't be able to solve the too many parameters issue (do you have access to the database) and the chunk solution isn't very nice and doesn't solve the whole problem because of the second query.

EDIT: Since the recs collection is not entirely database-generated, you need some way to tell your database the contents of this collection. I think your best option would be to use a stored procedure (or two) that accepts the collection as a large comma-separated string. Inside the stored procedure you split the string into ids again.

Some links that explain how to write and use a split string function.

By the way, if you are using SQL Server 2008, there is a far better approach than string parsing: table-valued parameters. You can pass a table as parameter to your stored procedure.



回答4:

I think this is what you're looking for:

List<Child> children = 
database.Parents.Where( 'your favourite lambda expression').Select(x=>x.Childs).ToList();

So... I don't know what condition you're using for getting the parents, but hopefully it can be done with a lambda expression, such as:

List<Child> children = database.Parents
     .Where(p=>p.recno > 10 && p.recno < 40)
     .Select(x=>x.Childs).ToList();


标签: linq tsql