Inner join of DataTables in C#

2018-12-31 22:33发布

问题:

Let T1 and T2 are DataTables with following fields

T1(CustID, ColX, ColY)

T2(CustID, ColZ)

I need the joint table

TJ (CustID, ColX, ColY, ColZ)

How this can be done in C# code in a simple way? Thanks.

回答1:

If you are allowed to use LINQ, take a look at the following example. It creates two DataTables with integer columns, fills them with some records, join them using LINQ query and outputs them to Console.

    DataTable dt1 = new DataTable();
    dt1.Columns.Add(\"CustID\", typeof(int));
    dt1.Columns.Add(\"ColX\", typeof(int));
    dt1.Columns.Add(\"ColY\", typeof(int));

    DataTable dt2 = new DataTable();
    dt2.Columns.Add(\"CustID\", typeof(int));
    dt2.Columns.Add(\"ColZ\", typeof(int));

    for (int i = 1; i <= 5; i++)
    {
        DataRow row = dt1.NewRow();
        row[\"CustID\"] = i;
        row[\"ColX\"] = 10 + i;
        row[\"ColY\"] = 20 + i;
        dt1.Rows.Add(row);

        row = dt2.NewRow();
        row[\"CustID\"] = i;
        row[\"ColZ\"] = 30 + i;
        dt2.Rows.Add(row);
    }

    var results = from table1 in dt1.AsEnumerable()
                 join table2 in dt2.AsEnumerable() on (int)table1[\"CustID\"] equals (int)table2[\"CustID\"]
                 select new
                 {
                     CustID = (int)table1[\"CustID\"],
                     ColX = (int)table1[\"ColX\"],
                     ColY = (int)table1[\"ColY\"],
                     ColZ = (int)table2[\"ColZ\"]
                 };
    foreach (var item in results)
    {
        Console.WriteLine(String.Format(\"ID = {0}, ColX = {1}, ColY = {2}, ColZ = {3}\", item.CustID, item.ColX, item.ColY, item.ColZ));
    }
    Console.ReadLine();

// Output:
// ID = 1, ColX = 11, ColY = 21, ColZ = 31
// ID = 2, ColX = 12, ColY = 22, ColZ = 32
// ID = 3, ColX = 13, ColY = 23, ColZ = 33
// ID = 4, ColX = 14, ColY = 24, ColZ = 34
// ID = 5, ColX = 15, ColY = 25, ColZ = 35


回答2:

I wanted a function that would join tables without requiring you to define the columns using an anonymous type selector, but had a hard time finding any. I ended up having to make my own. Hopefully this will help anyone in the future who searches for this:

private DataTable JoinDataTables(DataTable t1, DataTable t2, params Func<DataRow, DataRow, bool>[] joinOn)
{
    DataTable result = new DataTable();
    foreach (DataColumn col in t1.Columns)
    {
        if (result.Columns[col.ColumnName] == null)
            result.Columns.Add(col.ColumnName, col.DataType);
    }
    foreach (DataColumn col in t2.Columns)
    {
        if (result.Columns[col.ColumnName] == null)
            result.Columns.Add(col.ColumnName, col.DataType);
    }
    foreach (DataRow row1 in t1.Rows)
    {
        var joinRows = t2.AsEnumerable().Where(row2 =>
            {
                foreach (var parameter in joinOn)
                {
                    if (!parameter(row1, row2)) return false;
                }
                return true;
            });
        foreach (DataRow fromRow in joinRows)
        {
            DataRow insertRow = result.NewRow();
            foreach (DataColumn col1 in t1.Columns)
            {
                insertRow[col1.ColumnName] = row1[col1.ColumnName];
            }
            foreach (DataColumn col2 in t2.Columns)
            {
                insertRow[col2.ColumnName] = fromRow[col2.ColumnName];
            }
            result.Rows.Add(insertRow);
        }
    }
    return result;
}

An example of how you might use this:

var test = JoinDataTables(transactionInfo, transactionItems,
               (row1, row2) =>
               row1.Field<int>(\"TransactionID\") == row2.Field<int>(\"TransactionID\"));

One caveat: This is certainly not optimized, so be mindful when getting to row counts above 20k. If you know that one table will be larger than the other, try to put the smaller one first and the larger one second.



回答3:

This is my code. Not perfect, but working good. I hope it helps somebody:

    static System.Data.DataTable DtTbl (System.Data.DataTable[] dtToJoin)
    {
        System.Data.DataTable dtJoined = new System.Data.DataTable();

        foreach (System.Data.DataColumn dc in dtToJoin[0].Columns)
            dtJoined.Columns.Add(dc.ColumnName);

        foreach (System.Data.DataTable dt in dtToJoin)
            foreach (System.Data.DataRow dr1 in dt.Rows)
            {
                System.Data.DataRow dr = dtJoined.NewRow();
                foreach (System.Data.DataColumn dc in dtToJoin[0].Columns)
                    dr[dc.ColumnName] = dr1[dc.ColumnName];

                dtJoined.Rows.Add(dr);
            }

        return dtJoined;
    }


回答4:

this function will join 2 tables with a known join field, but this cannot allow 2 fields with the same name on both tables except the join field, a simple modification would be to save a dictionary with a counter and just add number to the same name filds.

public static DataTable JoinDataTable(DataTable dataTable1, DataTable dataTable2, string joinField)
{
    var dt = new DataTable();
    var joinTable = from t1 in dataTable1.AsEnumerable()
                            join t2 in dataTable2.AsEnumerable()
                                on t1[joinField] equals t2[joinField]
                            select new { t1, t2 };

    foreach (DataColumn col in dataTable1.Columns)
        dt.Columns.Add(col.ColumnName, typeof(string));

    dt.Columns.Remove(joinField);

    foreach (DataColumn col in dataTable2.Columns)
        dt.Columns.Add(col.ColumnName, typeof(string));

    foreach (var row in joinTable)
    {
        var newRow = dt.NewRow();
        newRow.ItemArray = row.t1.ItemArray.Union(row.t2.ItemArray).ToArray();
        dt.Rows.Add(newRow);
    }
    return dt;
}