Aggregate of aggregate with EF Core Linq2Sql

2020-04-21 04:01发布

问题:

I have a ASP.NET Core 2.2 project with EF Core 2.2 Code-First DB. I have the following entities:

  • Building, which is basically an address with some other important data.
  • Floor, which contains the number of the floor. A building can have multiple floors. A floor has to have exactly one building where it's located.
  • Room, which has a number. A floor can have multiple rooms. A room has to have exactly one floor.
  • WorkGroup, which contains how many employees are int the group, whether the group is still active, and when did the group started to operate (which can be in the future).
  • RoomOccupancy, which is a join table between a WorkGroup and a room and shows in which room a WorkGroup is/was/will be.

I need a list of the buildings with the buildings name, how many floors it has, how many rooms the building has (not a floor), and how many people currently works int the building.

Currently I'm able to get all the data, but the translated SQL is not optimal and takes many trips to the DB. I was able to write one SQL select statement (with inner select) by hand for the problem, so I know this should be possible with one query.

dbContext.Buildings.Select(x=> new BuildingDatableElementDTO(){
            BuildingId = b.Id,
            Name = b.Name,
            FloorCount = b.Floors.Count(),
            //this is the part where problems start,
            //this translates to multiple SQL statements
            RoomCount = b.Floors.Sum(f=>f.Rooms.Count()),
            // I replaced the next line with
            // CurrentWorkerCount = 10, but a solution would be nice
            CurrentWorkerCount = b.Floors.Sum(f=>f.Rooms
              .Sum(r=>r.RoomOccupancies
                 .Where(o=>!o.WorkGroup.IsFinished && o.WorkGroup.StartDate < Datetime.Now).
                 .Sum(w => w.NumberOfEmployees)
                 ))),
    }).ToList();

For testing purposes I have replaced the CurrentWorkerCount lambda with CurrentWorkerCount = 10, because I can understand if it's hard to translate to SQL, but it still fails to create one SQL statement with the RoomCount.

Logging with info level shows this: "The LINQ expression '"Sum()"' could not be translated and will be evaluated locally" for every building that has at least one floor. Then I have one bigger DbCommand (too long to copy), then one DbCommand for every building, which counts the number of rooms.

I read that there are problems with the aggregates with EF Core 2.1, but I think it shouldn't be a hard task for the ORM to translate this Projection into one query.

Am I doing something wrong there or these are the capabilities of the LINQ and the EF Core? I think I could easily do that with the non-Core EF previously. I read about some workarounds for GroupBy and aggregates, but it didn't help in my case.

UPDATE

Here is the generated log (only the interesting parts). I'm using a custom solution for filtering, sorting and paging which works great with simple problems. No filtering in this example, sorting by the buildings name and basic fetch (skip 0 take 15). There is only a minimal amount of test data in the database (15 buildings one has 1 floor, another one has 2, of which one has 1 room, which has 1 workgroup with 100 employees). I also have soft delete with a global filter configured for the IsDeleted flag. I don't think these things affect the results, but here they are, maybe they do.

  • The LINQ expression '"Sum()"' could not be translated and will be evaluated locally.
  • The LINQ expression '"Sum()"' could not be translated and will be evaluated locally.
  • The LINQ expression '"Sum()"' could not be translated and will be evaluated locally.
  • The LINQ expression '"Sum()"' could not be translated and will be evaluated locally.
  • The LINQ expression '"Sum()"' could not be translated and will be evaluated locally.
  • The LINQ expression '"Sum()"' could not be translated and will be evaluated locally.
  • The LINQ expression '"Sum()"' could not be translated and will be evaluated locally.
  • The LINQ expression '"Sum()"' could not be translated and will be evaluated locally.
  • Executed DbCommand ("2"ms) [Parameters=["@__p_0='?' (DbType = Int32), @__p_1='?' (DbType = Int32)"], CommandType='Text', CommandTimeout='30']"
SELECT CONVERT(VARCHAR(36), [x].[Id]) AS [BuildingId], [x].[Name], (
    SELECT COUNT(*)
    FROM [Floors] AS [x0]
    WHERE ([x0].[IsDeleted] = 0) AND ([x].[Id] = [x0].[BuildingId])
) AS [FloorCount], [x].[Id]
FROM [Buildings] AS [x]
WHERE [x].[IsDeleted] = 0
ORDER BY [x].[Name]
OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("0"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@__Now_2='?' (DbType = DateTime2), @_outer_Id3='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT SUM([x14].[NumberOfEmployees])
    FROM [RoomOccupancys] AS [x14]
    LEFT JOIN [WorkGroups] AS [k.WorkGroup2] ON [x14].[WorkGroupId] = [k.WorkGroup2].[Id]
    WHERE (([x14].[IsDeleted] = 0) AND (([k.WorkGroup2].[IsFinished] = 0) AND ([k.WorkGroup2].[StartDate] < @__Now_2))) AND ([x13].[Id] = [x14].[RoomId])
)
FROM [Rooms] AS [x13]
WHERE ([x13].[IsDeleted] = 0) AND (@_outer_Id3 = [x13].[FloorId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("0"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("0"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@__Now_2='?' (DbType = DateTime2), @_outer_Id3='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT SUM([x14].[RemainingAmount])
    FROM [RoomOccupancys] AS [x14]
    LEFT JOIN [WorkGroups] AS [k.WorkGroup2] ON [x14].[WorkGroupId] = [k.WorkGroup2].[Id]
    WHERE (([x14].[IsDeleted] = 0) AND (([k.WorkGroup2].[IsFinished] = 0) AND ([k.WorkGroup2].[StartDate] < @__Now_2))) AND ([x13].[Id] = [x14].[RoomId])
)
FROM [Rooms] AS [x13]
WHERE ([x13].[IsDeleted] = 0) AND (@_outer_Id3 = [x13].[FloorId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("0"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("0"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("0"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])
  • Executed DbCommand ("1"ms) [Parameters=["@_outer_Id='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT (
    SELECT COUNT(*)
    FROM [Rooms] AS [x4]
    WHERE ([x4].[IsDeleted] = 0) AND ([x3].[Id] = [x4].[FloorId])
)
FROM [Floors] AS [x3]
WHERE ([x3].[IsDeleted] = 0) AND (@_outer_Id = [x3].[BuildingId])
  • Executed DbCommand ("0"ms) [Parameters=["@_outer_Id2='?' (DbType = Guid)"], CommandType='Text', CommandTimeout='30']"
SELECT [x10].[Id]
FROM [Floors] AS [x10]
WHERE ([x10].[IsDeleted] = 0) AND (@_outer_Id2 = [x10].[BuildingId])

回答1:

I read that there are problems with the aggregates with EF Core 2.1, but I think it shouldn't be a hard task for the ORM to translate this Projection into one query.

You are right that EF Core had (and still have - the latest at this time v2.2) problems translating GroupBy and aggregates (and not only). But not for "shouldn't be a hard task" - try converting arbitrary expression tree to pseudo SQL yourself and you'll quickly find that it is quite complicated task.

Anyway, EF Core query translation improves over the time, but as mentioned, is far from perfect. The showstopper in this case are nested aggregates - sum of sum/count etc. The solution is to flatten the target set and apply single aggregate. For instance, rewriting your LINQ query as follows:

dbContext.Buildings.Select(b => new //BuildingDatableElementDTO()
{
    BuildingId = b.Id,
    Name = b.Name,
    FloorCount = b.Floors.Count(),
    // (1)
    RoomCount = b.Floors.SelectMany(f => f.Rooms).Count(),
    // (2)
    CurrentWorkerCount = b.Floors
        .SelectMany(f => f.Rooms)
        .SelectMany(r => r.RoomOccupancies)
        .Select(o => o.WorkGroup)
        .Where(w => !w.IsFinished && w.StartDate < DateTime.Now)
        .Sum(w => w.NumberOfEmployees),
})
.ToList();

is translated to a single SQL (as expected):

  SELECT [e].[Id] AS [BuildingId], [e].[Name], (
      SELECT COUNT(*)
      FROM [Floors] AS [e0]
      WHERE ([e0].[IsDeleted] = 0) AND ([e].[Id] = [e0].[BuildingId])
  ) AS [FloorCount], (
      SELECT COUNT(*)
      FROM [Floors] AS [e1]
      INNER JOIN (
          SELECT [e2].[Id], [e2].[FloorId], [e2].[IsDeleted], [e2].[Name]
          FROM [Rooms] AS [e2]
          WHERE [e2].[IsDeleted] = 0
      ) AS [t] ON [e1].[Id] = [t].[FloorId]
      WHERE ([e1].[IsDeleted] = 0) AND ([e].[Id] = [e1].[BuildingId])
  ) AS [RoomCount], (
      SELECT SUM([f.Rooms.RoomOccupancies.WorkGroup].[NumberOfEmployees])
      FROM [Floors] AS [e3]
      INNER JOIN (
          SELECT [e4].*
          FROM [Rooms] AS [e4]
          WHERE [e4].[IsDeleted] = 0
      ) AS [t0] ON [e3].[Id] = [t0].[FloorId]
      INNER JOIN (
          SELECT [e5].*
          FROM [RoomOccupancies] AS [e5]
          WHERE [e5].[IsDeleted] = 0
      ) AS [t1] ON [t0].[Id] = [t1].[RoomId]
      INNER JOIN [WorkGroups] AS [f.Rooms.RoomOccupancies.WorkGroup] ON [t1].[WorkgroupId] = [f.Rooms.RoomOccupancies.WorkGroup].[Id]
      WHERE (([e3].[IsDeleted] = 0) AND (([f.Rooms.RoomOccupancies.WorkGroup].[IsFinished] = 0) AND ([f.Rooms.RoomOccupancies.WorkGroup].[StartDate] < GETDATE()))) AND ([e].[Id] = [e3].[BuildingId])
  ) AS [CurrentWorkerCount]
  FROM [Building] AS [e]
  WHERE [e].[IsDeleted] = 0


回答2:

What if you didn't use EF Navigations properties but used manual joins with LINQ to EF?

var ans2 = (from b in dbContext.Buildings
            join f in dbContext.Floors on b.Id equals f.BuildingId into fj
            from f in fj.DefaultIfEmpty()
            join r in dbContext.Rooms on f.Id equals r.FloorId into rj
            from r in rj.DefaultIfEmpty()
            join ro in dbContext.RoomOccupancies on r.Id equals ro.RoomId
            join w in dbContext.WorkGroups on ro.WorkGroupId equals w.Id into wj
            from w in wj.DefaultIfEmpty()
            where !w.IsFinished && w.StartDate < DateTime.Now
            select new BuildingDatableElementDTO() {
                BuildingId = b.Id,
                Name = b.Name,
                FloorCount = fj.Count(),
                RoomCount = rj.Count(),
                CurrentWorkerCount = wj.Sum(w => w.NumberOfEmployees)
           })
           .ToList();