Should we disable lazy loading of Entity Framework

Posted 2020-08-17 03:24

Question:

I've heard from several sources that you have to disable the lazy loading feature of EF in web applications (ASP.NET).

Now I'm really confused, because I always thought that lazy loading should always be enabled. So my question is: is it really a better idea, in terms of performance, to disable lazy loading in web apps? If so, could anyone explain the reasons and mention the pros and cons?

Answer 1:

Disabling lazy loading will prevent Select N+1 performance problems as well as recursive serialization bail-outs. However, it replaces those with another artifact: null references.

When working with web applications I do not disable lazy loading. Instead, I ensure that my controllers/APIs return ViewModels or DTOs rather than entities. When you use POCO classes to feed your views just the amount of data they need, in the structure they need, and populate them with .Select() or AutoMapper's ProjectTo<TViewModel>() via deferred execution, you no longer need to worry about lazy loading, and you get better overall performance and resource usage for your application. Technically, lazy loading can be disabled under this approach, so this is not really an argument for or against disabling it. The point is that the mere act of disabling lazy loading won't make your web application "better".

Adopting ViewModels offers a number of advantages:

  • Avoids lazy load calls or unexpected null references.
  • Sends only the data needed by the view/consumer, and nothing more. (Less data over the wire, and less information for hackers with debugging views.)
  • Builds efficient, index-able queries to the database server.
  • Provides a place for computed columns that won't trip up EF.
  • Helps reduce security issues (unexpected entity modifications) on the trip back. (A view model cannot simply be re-attached and committed to DB by a lazy developer)

So for me, the issue with web applications isn't to lazy load or not, it's simply to avoid passing entities to the client. I see far, far too many "examples" out there where they pass entities around. I don't consider it a healthy pattern.

As an example, with a view model the first question is "What data does my view actually need?" Take a product and a product category. If I send a product entity because I also need the product category name, we hit a nasty problem when the product category contains a collection of products and each of those products has a category. When we pass our product to the serializer it hits that cyclic reference, and it will either bail out (leaving a reference or collection null) or throw an exception. Before that, it works through the Select N+1: the serializer iterates over the Product properties, hits the ProductCategory reference, and triggers "SELECT * FROM ProductCategory WHERE ProductCategoryID = 3". Then as it iterates over that product category it hits another reference, and that is another SELECT, and so forth down the chain.

By using a view model you limit the data you retrieve to just what the view needs. I create a product view model which outlines the fields I care about, regardless of where the data comes from. If I want something like a product, its name, and its category name:

public class ProductViewModel
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string CategoryName { get; set; }
}

then to load it:

var viewModel = context.Products
    .Where(x => x.ProductId == productId) 
    .Select(x => new ProductViewModel
    {
        ProductId = x.ProductId,
        ProductName = x.Name,
        CategoryName = x.Category.Name
    }).Single();

Done. No lazy loads or eager loads required. This produces a single query that returns a single record with just the 3 columns we need, rather than the entire Product record and its ProductCategory.

As requirements get more complex we can introduce a view model hierarchy, but we can continue to flatten down the related data based on what the view actually needs. In some cases it might mean selecting into an anonymous type then translating those results into view models where we need to use functions etc. that EF cannot translate down to SQL. The approach is a powerful and fast alternative to relying on loading entities, but requires attention at the start to understand what the end consumer (view/API) will need from the data.
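The two-step projection described above (anonymous type first, view model second) can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory array exposed through AsQueryable() stands in for context.Products, the Price column and the FormatPrice helper are hypothetical, and a named tuple stands in for a view model class to keep the sketch short.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical stand-in for context.Products; with EF this would be a DbSet.
var products = new[]
{
    new { ProductId = 1, Name = "Widget", Price = 9.5m, CategoryName = "Tools" },
    new { ProductId = 2, Name = "Gadget", Price = 120m, CategoryName = "Toys" },
}.AsQueryable();

// Hypothetical helper of the kind a LINQ provider cannot translate to SQL.
string FormatPrice(decimal price) =>
    price.ToString("0.00", CultureInfo.InvariantCulture) + " USD";

// Step 1: project only the needed columns into an anonymous type.
// Against EF, ToList() is the point where the query is executed.
var rows = products
    .Where(p => p.Price < 100m)
    .Select(p => new { p.ProductId, p.Name, p.Price, p.CategoryName })
    .ToList();

// Step 2: finish the mapping in memory, where any C# function is allowed.
// The tuple below plays the role of the view model.
var viewModels = rows
    .Select(r => (r.ProductId, ProductName: r.Name,
                  r.CategoryName, DisplayPrice: FormatPrice(r.Price)))
    .ToList();

foreach (var vm in viewModels)
    Console.WriteLine($"{vm.ProductName} [{vm.CategoryName}]: {vm.DisplayPrice}");
```

The filter and column selection stay in the queryable part so that, with a real provider, they would run on the database server. Only the formatting happens after the rows are materialized.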



Answer 2:

In web apps you have a lot of concurrent requests from different users, and each request should be handled quickly, so you want to reduce the number of DB calls per request, because each DB call goes over the network. With lazy loading, every time you access a navigation property, EF makes another call to the DB to load the related data into your entity. During a single HTTP request you can trigger a lot of additional DB requests this way, which really hurts your application from a performance perspective. In normal scenarios, when you initially fetch your data from the DB you already know what related data you need, so you can eager load the related entities and get everything needed for that HTTP request in one DB request.



Answer 3:

I had one legacy project. I was really surprised when I checked in SQL Profiler how many requests went to the database: about 180(!) for the home page, which has only two lists with 20-30 items each. So you should understand N+1 requests very well and check for them carefully in code review. For me, lazy loading causes a lot of problems. You never know how many requests go to the database when you use that feature.



Answer 4:

Lazy loading will create the N+1 issue.

What does it mean?

It means that it will hit the DB once (at least) for each object loaded + the initial query.

Why is it bad?

Suppose you have these classes, Movie and MovieGenre, and you have 100 movies with 30 genres between them in the DB:

public class Movie
{
    public int Id { get; set; }

    public string Name { get; set; }

    public virtual MovieGenre MovieGenre { get; set; }

    public byte MovieGenreId { get; set; }
}

public class MovieGenre
{
    public byte Id { get; set; }

    public string Name { get; set; }
}

Now suppose you navigate to a page that shows all 100 movies (remember the 30 movie genres?). The DB will execute 31 queries (1 for the movies + 1 for each of the 30 movie genres), and the queries will look something like this:

Initial query (+1 part):

SELECT Id, Name, MovieGenreId
From Movie

Additional queries (N part):

-- 1
SELECT Id, Name
From MovieGenre
Where Id = 1

-- 2
SELECT Id, Name
From MovieGenre
Where Id = 2

-- 3
SELECT Id, Name
From MovieGenre
Where Id = 3

-- 4
SELECT Id, Name
From MovieGenre
Where Id = 4
.
.
.

Eager loading avoids all this mess and will use one query with the correct joins.
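To make the query count concrete, here is a small simulation that needs no database. It is a sketch under assumptions, not EF itself: in-memory collections stand in for the two tables, a counter stands in for round trips, and a dictionary models the context remembering genres it has already loaded (which is what gives 30 genre queries rather than 100). With EF, the eager version would usually be written with Include, for example context.Movies.Include(m => m.MovieGenre), where the Movies set name is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in "tables": 30 genres and 100 movies spread across them.
var genreTable = Enumerable.Range(1, 30)
    .ToDictionary(id => (byte)id, id => $"Genre {id}");
var movieTable = Enumerable.Range(1, 100)
    .Select(i => (Id: i, Name: $"Movie {i}", MovieGenreId: (byte)(i % 30 + 1)))
    .ToList();

// Lazy loading: one query for the movies, then one more query the first
// time each genre is touched while rendering the list.
int lazyQueries = 1; // SELECT Id, Name, MovieGenreId FROM Movie
var loadedGenres = new Dictionary<byte, string>(); // genres already loaded
var lazyRows = new List<string>();
foreach (var movie in movieTable)
{
    if (!loadedGenres.TryGetValue(movie.MovieGenreId, out var genreName))
    {
        lazyQueries++; // SELECT Id, Name FROM MovieGenre WHERE Id = @id
        genreName = genreTable[movie.MovieGenreId];
        loadedGenres[movie.MovieGenreId] = genreName;
    }
    lazyRows.Add($"{movie.Name} ({genreName})");
}

// Eager loading: a single joined query returns movies and genres together.
int eagerQueries = 1; // SELECT ... FROM Movie JOIN MovieGenre ...
var eagerRows = movieTable
    .Join(genreTable, m => m.MovieGenreId, g => g.Key,
          (m, g) => $"{m.Name} ({g.Value})")
    .ToList();

Console.WriteLine($"Lazy: {lazyQueries} queries, eager: {eagerQueries} query");
```

Both approaches produce the same 100 rows. The simulation counts 31 round trips for the lazy version and 1 for the eager version, matching the numbers above.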

Since you are using C#, there is a tool called Glimpse that you might want to use to understand the issue further.



Answer 5:

I think you need to start by asking a few questions before choosing one approach and rejecting the other:

  1. How big is my data set, and do I need the related data immediately? (This is specific to your needs.)
  2. Others have already described the N+1 issue; how much does it matter at the size of my data set?
  3. Is it pragmatic to make a round trip to the server to fetch related data?
  4. How fresh does the data need to be: real time, or will a cached version do?

That is my contribution on top of the important input from others.



Answer 6:

I would say that lazy loading certainly helps you avoid fetching data that is rarely used.

Imagine you have a Master => Details scenario in the web application, and over time you realize that users are not that interested in the details. (You can analyze and audit the requests initiated by users.) There would be no need to load the details for each master record up front.

On the other hand, if the details are the main thing users interact with, just use eager loading and fetch the whole Master => Detail for each record.

Apart from lazy/eager loading, make sure of the following:

Always enable paging on the server side and load a limited amount of data, or load the data on user request as they navigate page by page.

If requests are mostly reads, disable change tracking for those queries (for example, with AsNoTracking()).
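The server-side paging advice above can be sketched with Skip and Take. This is a minimal illustration under assumptions: a range of integers exposed through AsQueryable() stands in for a DbSet, and pageNumber and pageSize are hypothetical request parameters.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for a DbSet: 95 item ids exposed as IQueryable.
var items = Enumerable.Range(1, 95).AsQueryable();

int pageNumber = 2;  // 1-based page requested by the client
int pageSize = 10;

// A stable ordering is needed before Skip/Take so pages do not overlap.
var page = items
    .OrderBy(id => id)
    .Skip((pageNumber - 1) * pageSize)
    .Take(pageSize)
    .ToList();

int totalCount = items.Count();
int totalPages = (totalCount + pageSize - 1) / pageSize; // ceiling division

Console.WriteLine($"Page {pageNumber}/{totalPages}: {string.Join(", ", page)}");
```

Against a real provider, Skip and Take are applied in the database, so only one page of rows crosses the wire per request.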