I've been searching for how to join two tables (Data and DataValues, one-to-many) and fill a dictionary of type Dictionary<Data, List<DataValue>>.
There may be a very large number of Data records (e.g. 500,000 or more), and each Data may have 10 to 20 DataValues, which makes the query much heavier, so performance really matters here.
Here is the code I've written:
// Passed via the arguments, for example, sensorIDs would contain:
int[] sensorIDs = { 0, 1, 2, 3, 4, 5, 6, 17, 18 };
Dictionary<Data, List<DataValue>> dict = new Dictionary<Data, List<DataValue>>();
foreach (Data Data in dt.Datas)
{
    var dValues = from d in dt.Datas
                  join dV in dt.DataValues on d.DataID equals dV.DataID
                  where (sensorIDs.Contains(dV.SensorID))
                  select dV;
    dict.Add(Data, dValues.ToList<DataValue>());
}
But this approach has a significant performance issue and takes a long time to execute.
I'm not sure whether I need to use SQL views. Any suggestions?
You're querying way too many times. You can do this in one query.
var dict = (from d in dt.Datas
            join dV in dt.DataValues on d.DataID equals dV.DataID
            where sensorIDs.Contains(dV.SensorID)
            select new { d, dV }).ToDictionary(o => o.d, o => o.dV.ToList());
In your foreach loop, you are fetching all Data and, for each of them, running the same query: the query inside the loop never references the loop variable, so every iteration returns the same result.
Edit: That wasn't very clear, but I think you want to join only the DataValues that are in the sensorIDs array. In this case:
var dict = (from d in dt.Datas
            let dV = (from dataValue in dt.DataValues
                      where sensorIDs.Contains(dataValue.SensorID) &&
                            dataValue.DataID == d.DataID
                      select dataValue)
            select new { d, dV }).ToDictionary(o => o.d, o => o.dV.ToList());
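As a rough illustration of the same "one query instead of one per Data" idea, here is a minimal sketch that runs against in-memory lists (LINQ to Objects) rather than the database. The Data and DataValue classes and the GroupJoinSketch.Build helper are hypothetical stand-ins, not the poster's actual entities, and how a provider such as LINQ to SQL translates this may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entities in the question.
public class Data { public int DataID { get; set; } }
public class DataValue { public int DataID { get; set; } public int SensorID { get; set; } }

public static class GroupJoinSketch
{
    public static Dictionary<Data, List<DataValue>> Build(
        IEnumerable<Data> datas, IEnumerable<DataValue> values, int[] sensorIDs)
    {
        // Filter the values once, then pair every Data with its matching values.
        var wanted = values.Where(v => sensorIDs.Contains(v.SensorID));
        return datas
            .GroupJoin(wanted, d => d.DataID, v => v.DataID,
                       (d, vs) => new { d, vs })
            .ToDictionary(o => o.d, o => o.vs.ToList());
    }

    public static void Main()
    {
        var datas = new[] { new Data { DataID = 1 }, new Data { DataID = 2 } };
        var values = new[]
        {
            new DataValue { DataID = 1, SensorID = 0 },
            new DataValue { DataID = 1, SensorID = 99 }, // not in sensorIDs, filtered out
            new DataValue { DataID = 2, SensorID = 17 },
        };
        var dict = Build(datas, values, new[] { 0, 17 });
        foreach (var pair in dict)
            Console.WriteLine($"{pair.Key.DataID}: {pair.Value.Count} value(s)");
    }
}
```

With GroupJoin, a Data that has no matching DataValue still gets an entry, with an empty list.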
You do not need a foreach loop in this case. You can group the values and join the groups to create the dictionary straight from LINQ, which should give you better performance.
dict = (from DataValue d in dt.DataValues
        where sensorIDs.Contains(d.SensorID)
        group d by d.DataID
        into datavalues
        join data in dt.Datas
            on datavalues.Key equals data.DataID
        select new {
            Key = data,
            Value = datavalues
        }).ToDictionary(a => a.Key, a => a.Value.ToList());
Or you can use the LINQ extension methods (method syntax):
dict = dt.DataValues.Where(d => sensorIDs.Contains(d.SensorID))
         .GroupBy(a => a.DataID)
         .Join(dt.Datas, a => a.Key, a => a.DataID,
               (a, b) => new { Key = b, Value = a.ToList() })
         .ToDictionary(a => a.Key, a => a.Value);
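For anyone who wants to try the group-then-join shape without a database, here is a minimal sketch over in-memory lists (LINQ to Objects). The Data and DataValue classes and the GroupThenJoinSketch.Build helper are hypothetical stand-ins for the real entities, and the SQL a provider generates for the query above may behave differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entities in the question.
public class Data { public int DataID { get; set; } }
public class DataValue { public int DataID { get; set; } public int SensorID { get; set; } }

public static class GroupThenJoinSketch
{
    public static Dictionary<Data, List<DataValue>> Build(
        IEnumerable<Data> datas, IEnumerable<DataValue> values, int[] sensorIDs)
    {
        return values
            .Where(v => sensorIDs.Contains(v.SensorID))   // keep only the wanted sensors
            .GroupBy(v => v.DataID)                       // one group per DataID
            .Join(datas, g => g.Key, d => d.DataID,       // attach the owning Data
                  (g, d) => new { Key = d, Value = g.ToList() })
            .ToDictionary(a => a.Key, a => a.Value);
    }

    public static void Main()
    {
        var datas = new[] { new Data { DataID = 1 }, new Data { DataID = 2 } };
        var values = new[]
        {
            new DataValue { DataID = 1, SensorID = 0 },
            new DataValue { DataID = 1, SensorID = 3 },
            new DataValue { DataID = 2, SensorID = 99 }, // not in sensorIDs, filtered out
        };
        var dict = Build(datas, values, new[] { 0, 3 });
        foreach (var pair in dict)
            Console.WriteLine($"{pair.Key.DataID}: {pair.Value.Count} value(s)");
    }
}
```

Because this is an inner join on the groups, a Data with no matching DataValue does not appear in the dictionary at all.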
You don't need a foreach loop. In general, try something like this:
var columns = dt.Columns.Cast<DataColumn>();
dt.AsEnumerable().Select(dataRow => columns.Select(column =>
        new { Column = column.ColumnName, Value = dataRow[column] })
    .ToDictionary(data => data.Column, data => data.Value));
Also, consider reading this: http://blogs.teamb.com/craigstuntz/2010/01/13/38525/