Join two tables using LINQ, and fill a Dictionary

Posted 2020-03-04 03:14

Question:

I've been searching for how to join two tables (Data and DataValues, one-to-many) and fill a dictionary of type Dictionary<Data, List<DataValue>>.

There may be hundreds of thousands of Data records (e.g. 500,000 or more), and each Data may have 10 to 20 DataValues, which makes this a much heavier query, so performance is really important here.

Here is the code I've written:

// Passed via the arguments, for example, sensorIDs would contain:
int[] sensorIDs = { 0, 1, 2, 3, 4, 5, 6, 17, 18 };
Dictionary<Data, List<DataValue>> dict = new Dictionary<Data, List<DataValue>>();

foreach (Data Data in dt.Datas)
{
    var dValues = from d in dt.Datas
                        join dV in dt.DataValues on d.DataID equals dV.DataID
                        where (sensorIDs.Contains(dV.SensorID))
                        select dV;
    dict.Add(Data, dValues.ToList<DataValue>());
}

But this approach has a significant performance issue and takes a long time to execute. I'm not sure whether I need to use SQL views. Any suggestions?

Answer 1:

You're querying way too many times. You can do this in one query.

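// One query returns (Data, DataValue) pairs; the pairs are then grouped by Data in memory.
var dict = (from d in dt.Datas
            join dV in dt.DataValues on d.DataID equals dV.DataID
            where sensorIDs.Contains(dV.SensorID)
            select new { d, dV })
           .ToLookup(o => o.d, o => o.dV)
           .ToDictionary(g => g.Key, g => g.ToList());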

In your foreach loop, you run the same join once for every Data row, and the query never filters on the current row, so every iteration fetches the same full result set.

Edit: The question wasn't very clear, but I think you want each Data paired only with its own DataValues whose SensorID is in the sensorIDs array. In this case:

var dict = (from d in dt.Datas
            let dV = (from dataValue in dt.DataValues
                      where sensorIDs.Contains(dataValue.SensorID) &&
                            dataValue.DataID == d.DataID   // comparison, not assignment
                      select dataValue)
            select new { d, dV }).ToDictionary(o => o.d, o => o.dV.ToList());
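The single-pass idea can also be tried without a database. The following is only a minimal LINQ to Objects sketch, not the answer's own code: the `Data` and `DataValue` records and the `LookupSketch` class are hypothetical stand-ins for the entity classes in the question, and `ToLookup` is swapped in for the correlated subquery. It requires C# 9 or later for the record syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity classes in the question.
public record Data(int DataID, string Name);
public record DataValue(int DataID, int SensorID, double Value);

public static class LookupSketch
{
    // Filters the values once, indexes them by DataID, then looks up each Data.
    public static Dictionary<Data, List<DataValue>> Build(
        IEnumerable<Data> datas,
        IEnumerable<DataValue> dataValues,
        IEnumerable<int> sensorIDs)
    {
        var wanted = new HashSet<int>(sensorIDs);

        // A lookup returns an empty sequence for a missing key,
        // so Data rows with no matching values still get an empty list.
        ILookup<int, DataValue> byDataId = dataValues
            .Where(v => wanted.Contains(v.SensorID))
            .ToLookup(v => v.DataID);

        return datas.ToDictionary(d => d, d => byDataId[d.DataID].ToList());
    }
}
```

Calling `LookupSketch.Build(datas, dataValues, sensorIDs)` enumerates each input sequence once, instead of re-running the join for every Data row as the original loop does.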


Answer 2:

You do not need a foreach loop in this case. You can group the DataValues and join the groups to Datas to build the dictionary directly from LINQ, which should give you better performance.

dict = (from DataValue d in dt.DataValues
        where sensorIDs.Contains(d.SensorID)
        group d by d.DataID into datavalues
        join data in dt.Datas on datavalues.Key equals data.DataID
        select new
        {
            Key = data,
            Value = datavalues
        }).ToDictionary(a => a.Key, a => a.Value.ToList());

Or you can use LINQ method syntax:

dict = dt.DataValues.Where(d => sensorIDs.Contains(d.SensorID))
         .GroupBy(a => a.DataID)
         .Join(dt.Datas, a => a.Key, a => a.DataID,
               (a, b) => new { Key = b, Value = a.ToList() })
         .ToDictionary(a => a.Key, a => a.Value);
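One behavior of the group-then-join form is worth checking on sample data: because `Join` is an inner join, a Data row whose values are all filtered out does not appear in the dictionary at all. The sketch below reproduces the method-syntax query with LINQ to Objects; the `Data` and `DataValue` records and the `GroupThenJoinSketch` class are hypothetical stand-ins, not the real entity classes. It requires C# 9 or later.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity classes in the question.
public record Data(int DataID, string Name);
public record DataValue(int DataID, int SensorID, double Value);

public static class GroupThenJoinSketch
{
    // Same shape as the method-syntax query: filter, group by DataID,
    // then inner-join the groups to the Data rows.
    public static Dictionary<Data, List<DataValue>> Build(
        IEnumerable<Data> datas,
        IEnumerable<DataValue> dataValues,
        ICollection<int> sensorIDs)
    {
        return dataValues
            .Where(v => sensorIDs.Contains(v.SensorID))
            .GroupBy(v => v.DataID)
            .Join(datas,
                  g => g.Key,
                  d => d.DataID,
                  (g, d) => new { Key = d, Value = g.ToList() })
            .ToDictionary(x => x.Key, x => x.Value);
    }
}
```

If every Data row must be present as a key, even with an empty list, the subquery form from Answer 1 is the one to use instead.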


Answer 3:

You don't need a foreach loop. In general, try something like this:

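// Here dt is treated as a System.Data.DataTable: each row becomes a column-name -> value dictionary.
var columns = dt.Columns.Cast<DataColumn>();
var rows = dt.AsEnumerable()
             .Select(dataRow => columns.Select(column =>
                         new { Column = column.ColumnName, Value = dataRow[column] })
                     .ToDictionary(data => data.Column, data => data.Value))
             .ToList();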
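As a rough illustration of the row-to-dictionary conversion, here is a small self-contained sketch. The `DataTableSketch` class, the `SampleTable` helper, and the column names are made up for the example. On .NET Framework, `AsEnumerable` additionally needs a reference to System.Data.DataSetExtensions.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataTableSketch
{
    // Converts each row into a column-name -> value dictionary.
    public static List<Dictionary<string, object>> ToDictionaries(DataTable table)
    {
        var columns = table.Columns.Cast<DataColumn>().ToList();
        return table.AsEnumerable()
            .Select(row => columns.ToDictionary(c => c.ColumnName, c => row[c]))
            .ToList();
    }

    // Builds a small hypothetical table for demonstration.
    public static DataTable SampleTable()
    {
        var table = new DataTable("Datas");
        table.Columns.Add("DataID", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(1, "first");
        table.Rows.Add(2, "second");
        return table;
    }
}
```

`DataTableSketch.ToDictionaries(DataTableSketch.SampleTable())` yields one dictionary per row, keyed by column name. Note that this yields per-row dictionaries from a single table, which is a different result shape from the Dictionary<Data, List<DataValue>> the question asks for.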

Also, consider reading this: http://blogs.teamb.com/craigstuntz/2010/01/13/38525/