What is the “N+1 selects problem” in object relati

2018-12-30 23:19发布

The "N+1 selects problem" is generally stated as a problem in Object-Relational mapping (ORM) discussions, and I understand that it has something do to with having to make a lot of database queries for something that seems simple in the object world.

Does anybody have a more detailed explanation of the problem?

16条回答
墨雨无痕
2楼-- · 2018-12-30 23:36
SELECT 
table1.*
, table2.*
INNER JOIN table2 ON table2.SomeFkId = table1.SomeId

That gets you a result set where child rows in table2 cause duplication by returning the table1 results for each child row in table2. O/R mappers should differentiate table1 instances based on a unique key field, then use all the table2 columns to populate child instances.

SELECT table1.*

SELECT table2.* WHERE SomeFkId = #

The N+1 is where the first query populates the primary object and the second query populates all the child objects for each of the unique primary objects returned.

Consider:

class House
{
    int Id { get; set; }
    string Address { get; set; }
    Person[] Inhabitants { get; set; }
}

class Person
{
    string Name { get; set; }
    int HouseId { get; set; }
}

and tables with a similar structure. A single query for the address "22 Valley St" may return:

Id Address      Name HouseId
1  22 Valley St Dave 1
1  22 Valley St John 1
1  22 Valley St Mike 1

The O/RM should fill an instance of Home with ID=1, Address="22 Valley St" and then populate the Inhabitants array with People instances for Dave, John, and Mike with just one query.

A N+1 query for the same address used above would result in:

Id Address
1  22 Valley St

with a separate query like

SELECT * FROM Person WHERE HouseId = 1

and resulting in a separate data set like

Name    HouseId
Dave    1
John    1
Mike    1

and the final result being the same as above with the single query.

The advantages to single select is that you get all the data up front which may be what you ultimately desire. The advantages to N+1 is query complexity is reduced and you can use lazy loading where the child result sets are only loaded upon first request.

查看更多
还给你的自由
3楼-- · 2018-12-30 23:41

It is much faster to issue 1 query which returns 100 results than to issue 100 queries which each return 1 result.

查看更多
呛了眼睛熬了心
4楼-- · 2018-12-30 23:41

Take Matt Solnit example, imagine that you define an association between Car and Wheels as LAZY and you need some Wheels fields. This means that after the first select, hibernate is going to do "Select * from Wheels where car_id = :id" FOR EACH Car.

This makes the first select and more 1 select by each N car, that's why it's called n+1 problem.

To avoid this, make the association fetch as eager, so that hibernate loads data with a join.

But attention, if many times you don't access associated Wheels, it's better to keep it LAZY or change fetch type with Criteria.

查看更多
怪性笑人.
5楼-- · 2018-12-30 23:43

Here's a good description of the problem - http://www.realsolve.co.uk/site/tech/hib-tip-pitfall.php?name=why-lazy

Now that you understand the problem it can typically be avoided by doing a join fetch in your query. This basically forces the fetch of the lazy loaded object so the data is retrieved in one query instead of n+1 queries. Hope this helps.

查看更多
有味是清欢
6楼-- · 2018-12-30 23:44

Supplier with a one-to-many relationship with Product. One Supplier has (supplies) many Products.

***** Table: Supplier *****
+-----+-------------------+
| ID  |       NAME        |
+-----+-------------------+
|  1  |  Supplier Name 1  |
|  2  |  Supplier Name 2  |
|  3  |  Supplier Name 3  |
|  4  |  Supplier Name 4  |
+-----+-------------------+

***** Table: Product *****
+-----+-----------+--------------------+-------+------------+
| ID  |   NAME    |     DESCRIPTION    | PRICE | SUPPLIERID |
+-----+-----------+--------------------+-------+------------+
|1    | Product 1 | Name for Product 1 |  2.0  |     1      |
|2    | Product 2 | Name for Product 2 | 22.0  |     1      |
|3    | Product 3 | Name for Product 3 | 30.0  |     2      |
|4    | Product 4 | Name for Product 4 |  7.0  |     3      |
+-----+-----------+--------------------+-------+------------+

Factors:

  • Lazy mode for Supplier set to “true” (default)

  • Fetch mode used for querying on Product is Select

  • Fetch mode (default): Supplier information is accessed

  • Caching does not play a role for the first time the

  • Supplier is accessed

Fetch mode is Select Fetch (default)

// It takes Select fetch mode as a default
Query query = session.createQuery( "from Product p");
List list = query.list();
// Supplier is being accessed
displayProductsListWithSupplierName(results);

select ... various field names ... from PRODUCT
select ... various field names ... from SUPPLIER where SUPPLIER.id=?
select ... various field names ... from SUPPLIER where SUPPLIER.id=?
select ... various field names ... from SUPPLIER where SUPPLIER.id=?

Result:

  • 1 select statement for Product
  • N select statements for Supplier

This is N+1 select problem!

查看更多
若你有天会懂
7楼-- · 2018-12-30 23:47

The N+1 query issue happens when you forget to fetch an association and then you need to access it:

List<PostComment> comments = entityManager.createQuery(
    "select pc " +
    "from PostComment pc " +
    "where pc.review = :review", PostComment.class)
.setParameter("review", review)
.getResultList();

LOGGER.info("Loaded {} comments", comments.size());

for(PostComment comment : comments) {
    LOGGER.info("The post title is '{}'", comment.getPost().getTitle());
}

Which generates the following SQL statements:

SELECT pc.id AS id1_1_, pc.post_id AS post_id3_1_, pc.review AS review2_1_
FROM   post_comment pc
WHERE  pc.review = 'Excellent!'

INFO - Loaded 3 comments

SELECT pc.id AS id1_0_0_, pc.title AS title2_0_0_
FROM   post pc
WHERE  pc.id = 1

INFO - The post title is 'Post nr. 1'

SELECT pc.id AS id1_0_0_, pc.title AS title2_0_0_
FROM   post pc
WHERE  pc.id = 2

INFO - The post title is 'Post nr. 2'

SELECT pc.id AS id1_0_0_, pc.title AS title2_0_0_
FROM   post pc
WHERE  pc.id = 3

INFO - The post title is 'Post nr. 3'

First, Hibernate executes the JPQL query, and a list of PostComment entities is fetched.

Then, for each PostComment, the associated post property is used to generate a log message containing the Post title.

Because the post association is not initialized, Hibernate must fetch the Post entity with a secondary query, and for N PostComment entities, N more queries are going to be executed (hence the N+1 query problem).

First, you need proper SQL logging and monitoring so that you can spot this issue.

Second, this kind of issue is better to be caught by integration tests. You can use an automatic JUnit assert to validate the expected count of generated SQL statements. The db-unit project already provides this functionality, and it's open source.

When you identified the N+1 query issue, you need to use a JOIN FETCH so that child associations are fetched in one query, instead of N. If you need to fetch multiple child associations, it's better to fetch one collection in the initial query and the second one with a secondary SQL query.

查看更多
登录 后发表回答