I'm trying to join two DataTables together in a similar way to this question:
Inner join of DataTables in C#
I'm trying to get the output to be a single 'combined' table, with columns from both of the original tables. They will both have a datestamp column in common.
The answer given is good for DataTables with fixed columns, but what if they are created dynamically, and can have any number of columns, how can I join them?
e.g.
T1 (datestamp, t1Column1, t1Column2, t1ColumnN...)
T2 (datestamp, t2Column1, t2Column2, t2ColumnN...)
I would like to join to create the following:
J1 (datestamp, t1Column1, t1Column2, t1ColumnN, ..., t2Column1, t2Column2, t2ColumnN...)
Is this possible?
I found a solution which doesn't rely on looping through the columns.
It uses the 'Merge' method, which I had previously dismissed as I thought both tables required the same structure.
First you need to create a primary key on the two data-tables:
// set primary key
T1.PrimaryKey = new DataColumn[] { T1.Columns["DateStamp"] };
T2.PrimaryKey = new DataColumn[] { T2.Columns["DateStamp"] };
Then add both tables to a data-set so a relationship can be added:
// add both data-tables to data-set
DataSet dsContainer = new DataSet();
dsContainer.Tables.Add(T1);
dsContainer.Tables.Add(T2);
Next add the relationship between the two key columns in the data-set:
// add a relationship between the two timestamp columns
DataRelation relDateStamp = new DataRelation("Date", new DataColumn[] { T1.Columns["DateStamp"] }, new DataColumn[] { T2.Columns["DateStamp"] });
dsContainer.Relations.Add(relDateStamp);
Finally you can now copy the first data-table into a new 'combined' version, and then merge in the second:
// populate combined data
DataTable dtCombined = new DataTable();
dtCombined = T1.Copy();
dtCombined.Merge(T2, false, MissingSchemaAction.Add);
Note: The Merge method requires the second argument to be false or else it copies the structure but not the data of the second table.
This would then combine the following tables:
T1 (2012-05-09, 111, 222)
T2 (2012-05-09, 333, 444, 555)
into a combined version based on the primary-key:
J1 (2012-05-09, 111, 222, 333, 444, 555)
I think you can adapt the answer in the linked question to use the index of the column, rather than the column name. Or you could just loop through the items in each Row, like this:
foreach(DataRow row in table.Rows)
{
foreach(DataColumn column in table.Columns)
{
object value = row[column]; // add this to your combined table
}
}
After getting tired of seeing all these inner join functions that don't reliably emulate SQL, I decided to make my own here:
private DataTable JoinDataTables(DataTable t1, DataTable t2, params Func<DataRow, DataRow, bool>[] joinOn)
{
DataTable result = new DataTable();
foreach (DataColumn col in t1.Columns)
{
if (result.Columns[col.ColumnName] == null)
result.Columns.Add(col.ColumnName, col.DataType);
}
foreach (DataColumn col in t2.Columns)
{
if (result.Columns[col.ColumnName] == null)
result.Columns.Add(col.ColumnName, col.DataType);
}
foreach (DataRow row1 in t1.Rows)
{
var joinRows = t2.AsEnumerable().Where(row2 =>
{
foreach (var parameter in joinOn)
{
if (!parameter(row1, row2)) return false;
}
return true;
});
foreach (DataRow fromRow in joinRows)
{
DataRow insertRow = result.NewRow();
foreach (DataColumn col1 in t1.Columns)
{
insertRow[col1.ColumnName] = row1[col1.ColumnName];
}
foreach (DataColumn col2 in t2.Columns)
{
insertRow[col2.ColumnName] = fromRow[col2.ColumnName];
}
result.Rows.Add(insertRow);
}
}
return result;
}
An example of how you might use this:
var test = JoinDataTables(transactionInfo, transactionItems,
(row1, row2) =>
row1.Field<int>("TransactionID") == row2.Field<int>("TransactionID"));