-->

Designing and Querying Product / Review system

2019-09-10 09:09发布

问题:

I created a product / review system from scratch and I´m having a hard time to do the following query in SQL Server.

My schema has different tables, for: products, reviews, categories, productPhotos and Brand. I have to query them all to find the brand and category name, photos details, Average Rating and Number of Reviews.

I´m having a hard time to get No. of reviews and average rating.

Reviews can be hidden (user has deleted) or blocked (waiting for moderation). My product table doesn't have No. of Reviews or Average Rating columns, so I need to count it on that query, but not counting the blocked and hidden ones (r.bloqueado=0 and r.hidden=0).

I have the query below, but it´s counting the blocked and hidden. If I uncomment the "and r.bloqueado=0 and r.hidden=0" part I get the right counting, but then it doesn't show products that has 0 reviews (something I need!).

select top 20 
    p.id, p.brand, m.nome, c.name, 
    count(r.product) AS NoReviews, Avg(r.nota) AS AvgRating, 
    f.id as cod_foto,f.nome as nome_foto 
from
    tblBrands AS m 
inner join 
    (tblProducts AS p 
left join 
    tblProductsReviews AS r ON p.id = r.product) ON p.brand = m.id 
left join 
    tblProductsCategorias as c on p.categoria = c.id 
left join 
    (select 
         id_product, id, nome 
     from 
         tblProductsFotos O 
     where 
         id = (SELECT min(I.id) 
               FROM tblProductsFotos I 
               WHERE I.id_product = O.id_product)) as f on p.id = f.id_product 
where 
    p.bloqueado = 0
    //Problem - and r.bloqueado=0 and r.hidden=0    
group by 
    p.id, p.brand, p.modalidade, m.nome, c.name, f.id,f.nome"

Need your advice:

I have seen other systems that has Avg Rating and No. of Reviews in the product table. This would help a lot in the complexity of this query (probably also performance), but then I have to do extra queries in every new review, blocked and hidden actions. I can easily to that. Considering that includes and updates occurs much much less than showing the products, this sounds nice. Would be a better idea to do that ?

Or is it better to find a way to fix this query ? Can you help me find a solution ?

Thanks

回答1:

For count the number of product you can use case when and sum assigning 1 there the value is not r.bloqueado=0 or r.hidden=0 and 0 for these values (so you can avoid the filter in where)

  "select top 20 p.id, p.brand, m.nome, c.name, sum(
                              case  when  r.bloqueado=0 then 0 
                                    when  r.hidden=0 then 0 
                                    else 1 
                              end )  AS NoReviews, 
  Avg(r.nota) AS AvgRating, f.id as cod_foto,f.nome as nome_foto 
  from tblBrands AS m 
  inner join (tblProducts AS p 
  left join tblProductsReviews AS r ON p.id=r.product ) ON p.brand = m.id 
  left join tblProductsCategorias as c on p.categoria=c.id 
  left join (select id_product,id,nome from tblProductsFotos O 
  where id = (SELECT min(I.id) FROM tblProductsFotos I 
            WHERE I.id_product = O.id_product)) as f on p.id = f.id_product where p.bloqueado=0
  group by p.id, p.brand, p.modalidade, m.nome, c.name, f.id,f.nome"

for avg could be you can do somethings similar



回答2:

It's very easy to lose records when combining a where clause with an outer join. Rows that do not exist in the outer table are returned as NULL. Your filter has accidentally excluded these nulls.

Here's an example that demonstrates what's happening:

/* Sample data.
 * There are two tables: product and review.
 * There are two products: 1 & 2.
 * Only product 1 has a review.
 */
DECLARE @Product TABLE
    (
        ProductId   INT
    )
;

DECLARE @Review TABLE
    (
        ReviewId    INT,
        ProductId   INT,
        Blocked     BIT
    )
;

INSERT INTO @Product
    (
        ProductId
    )
VALUES
    (1),
    (2)
;

INSERT INTO @Review
    (
        ReviewId,
        ProductId,
        Blocked
    )
VALUES
    (1, 1, 0)
;

Outer joining the tables, without a where clause, returns:

Query

-- No where.
SELECT
    p.ProductId,
    r.ReviewId,
    r.Blocked
FROM
    @Product AS p   
        LEFT OUTER JOIN @Review AS r        ON r.ProductId = p.ProductId
;

Result

ProductId   ReviewId    Blocked
1           1           0
2           NULL        NULL

Filtering for Blocked = 0 would remove the second record, and therefore ProductId 2. Instead:

-- With where.
SELECT
    p.ProductId,
    r.ReviewId,
    r.Blocked
FROM
    @Product AS p   
        LEFT OUTER JOIN @Review AS r        ON r.ProductId = p.ProductId
WHERE
    r.Blocked = 0
    OR r.Blocked IS NULL
;

This query retains the NULL value, and ProductId 2. Your example is a little more complicated because you have two fields.

SELECT
    ...
WHERE
    (
        Blocked = 0
        AND Hidden = 0
    )
    OR Blocked IS NULL
;

You do not need to check both fields for NULL, as they appear in the same table.