Is there a more efficient method to find dupes in

Posted 2019-07-22 14:06

Question:

I have this question for which I provided a solution. However, I don't feel that is as efficient as it could be:

using (DataContext context = new DataContext(SqlConnection))
{
    var custInfo = context.GetTable<tbl_CustomerInfo>();

    string compID = ImportCust.Rows[0]["CompanyID"].ToString();

    // Customer IDs from the imported DataTable
    var imports = from cust in ImportCust.AsEnumerable()
                  select cust.Field<int>("CustomerID");

    // Imported IDs that already exist for this company
    var dupes = from import in imports
                join cust in custInfo
                on import equals cust.CustomerID
                where cust.CompanyID == compID
                select cust;

    var records = dupes.GetEnumerator();

    while (records.MoveNext())
    {
        custInfo.DeleteOnSubmit(records.Current);
    }

    context.SubmitChanges();
}

I have used Stopwatch to measure the elapsed time from the start of iterating the records through the completion of SubmitChanges. The elapsed times seem to have no rhyme or reason:

10666 records completed in 20 seconds
15425 records completed in 12 seconds
289 records completed in 21 seconds
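
Roughly, the measurement brackets the delete loop and SubmitChanges like this (a sketch of the timing code, not the verbatim code I ran; the exact Stopwatch placement is approximate):

// Sketch: time the span from iterating the dupes through SubmitChanges.
var sw = System.Diagnostics.Stopwatch.StartNew();

int recordCount = 0;
var records = dupes.GetEnumerator();
while (records.MoveNext())
{
    custInfo.DeleteOnSubmit(records.Current);
    recordCount++;
}

context.SubmitChanges();
sw.Stop();

Console.WriteLine("{0} records completed in {1:F0} seconds",
                  recordCount, sw.Elapsed.TotalSeconds);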

Obviously, one thing that would speed things up is if I were to drop indexes. Can that be done programmatically? In addition, is there a better method than what I have provided?

Answer 1:

You could use a SQL statement:

-- T-SQL (SQL Server 2005/2008)

select CompanyID from tbl_CustomerInfo
    group by CompanyID
    having COUNT(*) > 1
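
If you would rather run that query through the existing LINQ to SQL DataContext than in Management Studio, ExecuteQuery can map the result set back to a simple type. This is a sketch based on the table and connection from the question; it assumes CompanyID maps to an int (use string instead if the column is varchar):

// Requires System.Data.Linq and System.Linq.
// ExecuteQuery runs the raw SQL and maps each returned CompanyID to an int.
using (DataContext context = new DataContext(SqlConnection))
{
    var dupeCompanyIds = context.ExecuteQuery<int>(
        @"select CompanyID from tbl_CustomerInfo
          group by CompanyID
          having COUNT(*) > 1").ToList();

    foreach (int companyId in dupeCompanyIds)
    {
        Console.WriteLine(companyId);
    }
}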


Answer 2:

The LINQ version of Tefod's SQL:

from ci in dc.tbl_CustomerInfo
group ci by ci.CompanyID into g
where g.Count() > 1
select g.Key
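
For completeness, here is the same query placed in the context of the question's DataContext, using GetTable the way the original code does (a sketch; the variable names are illustrative):

// Sketch: the grouping query against the question's DataContext.
// LINQ to SQL translates this to GROUP BY ... HAVING COUNT(*) > 1 on the server,
// so only the duplicate CompanyID keys come back over the wire.
using (DataContext dc = new DataContext(SqlConnection))
{
    var dupeCompanyIds = from ci in dc.GetTable<tbl_CustomerInfo>()
                         group ci by ci.CompanyID into g
                         where g.Count() > 1
                         select g.Key;

    foreach (var companyId in dupeCompanyIds)
    {
        Console.WriteLine(companyId);
    }
}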