Efficient Way To Query Nested Data

2020-07-18 05:18发布

问题:

I have need to select a number of 'master' rows from a table, also returning for each result a number of detail rows from another table. What is a good way of achieving this without multiple queries (one for the master rows and one per result to get the detail rows).

For example, with a database structure like below:

MasterTable:
    - MasterId BIGINT
    - Name NVARCHAR(100)

DetailTable:
    - DetailId BIGINT
    - MasterId BIGINT
    - Amount MONEY

How would I most efficiently populate the data object below?

IList<MasterDetail> data;

public class Master
{
    private readonly List<Detail> _details = new List<Detail>();

    public long MasterId
    {
        get; set;
    }

    public string Name
    {
        get; set;
    }

    public IList<Detail> Details
    {
        get
        {
            return _details;
        }
    }
}

public class Detail
{
    public long DetailId
    {
        get; set;
    }

    public decimal Amount
    {
        get; set;
    }
}

回答1:

Normally, I'd go for the two grids approach - however, you might also want to look at FOR XML - it is fairly easy (in SQL Server 2005 and above) to shape the parent/child data as xml, and load it from there.

SELECT parent.*,
       (SELECT * FROM child
       WHERE child.parentid = parent.id FOR XML PATH('child'), TYPE)
FROM parent
FOR XML PATH('parent')

Also - LINQ-to-SQL supports this type of model, but you need to tell it which data you want ahead of time. Via DataLoadOptions.LoadWith:

// sample from MSDN
Northwnd db = new Northwnd(@"c:\northwnd.mdf");
DataLoadOptions dlo = new DataLoadOptions();
dlo.LoadWith<Customer>(c => c.Orders);
db.LoadOptions = dlo;

var londonCustomers =
    from cust in db.Customers
    where cust.City == "London"
    select cust;

foreach (var custObj in londonCustomers)
{
    Console.WriteLine(custObj.CustomerID);
}

If you don't use LoadWith, you will get n+1 queries - one master, and one child list per master row.



回答2:

It can be done with a single query like this:

select   MasterTable.MasterId,
         MasterTable.Name,
         DetailTable.DetailId,
         DetailTable.Amount
from     MasterTable
         inner join
         DetailTable
         on MasterTable.MasterId = DetailTable.MasterId
order by MasterTable.MasterId

Then in psuedo code

foreach(row in result)
{
   if (row.MasterId != currentMaster.MasterId)
   {
       list.Add(currentMaster);
       currentMaster = new Master { MasterId = row.MasterId, Name = row.Name };
   }
   currentMaster.Details.Add(new Detail { DetailId = row.DetailId, Amount = row.Amount});
}
list.Add(currentMaster);

There's a few edges to knock off that but it should give you the general idea.



回答3:

select < columns > from master

select < columns > from master M join Child C on M.Id = C.MasterID



回答4:

You can do it with two queries and one pass on each result set:

Query for all masters ordered by MasterId then query for all Details also ordered by MasterId. Then, with two nested loops, iterate the master data and create a new Master object foreach row in the main loop, and iterate the details while they have the same MasterId as the current Master object and populate its _details collection in the nested loop.



回答5:

Depending on the size of your dataset you can pull all of the data into your application in memory with two queries (one for all masters and one for all nested data) and then use that to programatically create your sublists for each of your objects giving something like:

List<Master> allMasters = GetAllMasters();
List<Detail> allDetail = getAllDetail();

foreach (Master m in allMasters)
    m.Details.Add(allDetail.FindAll(delegate (Detail d) { return d.MasterId==m.MasterId });

You're essentially trading memory footprint for speed with this approach. You can easily adapt this so that GetAllMasters and GetAllDetail only return the master and detail items you're interested in. Also note for this to be effective you need to add the MasterId to the detail class



回答6:

This is an alternative you might consider. It does cost $150 per developer, but time is money too...

We use an object persistence layer called Entity Spaces that generates the code for you to do exactly what you want, and you can regenerate whenever your schema changes. Populating the objects with data is transparent. Using the objects you described above would look like this (excuse my VB, but it works in C# too):

Dim master as New BusinessObjects.Master
master.LoadByPrimaryKey(43)
Console.PrintLine(master.Name)
For Each detail as BusinessObjects.Detail in master.DetailCollectionByMasterId
   Console.PrintLine(detail.Amount)
   detail.Amount *= 1.15
End For
With master.DetailCollectionByMasterId.AddNew
   .Amount = 13
End With
master.Save()