sql having get only the first recorded row in tabl

2019-03-03 02:13发布

问题:

Somebody already helped me with this query but I made an adaptation and I get a problem :

    SELECT 
        AVG(tyd.price) AS avg_price, COUNT(tyd.id_product) AS cnt, 
        tyd.id_marchand, tyd.id_product, 
        catalog.price AS c_price, tyd.price AS t_price, 
        tyd.amount AS t_am, pro_tyd.amount AS p_am, 
        pro_tyd.price AS p_price, catalog.img_src,  
        tyd.step, tyd.login AS tyd_l
    FROM catalog 
    INNER JOIN tyd ON catalog.id_marchand = tyd.id_marchand 
                   AND catalog.id_product =   tyd.id_product
                   AND tyd.step = "1" 
    INNER JOIN pro_tyd  ON tyd.id_marchand = pro_tyd.id_marchand 
                        AND tyd.id_product = pro_tyd.id_product
    GROUP BY 
         catalog.id_product, catalog.id_marchand
    HAVING 
         tyd.login = "user1@tyd.fr"

and it only works when tyd.login = "user3@tyd.fr" which is the lower id. doesn't work with user1 or user2...I just can't figure why...! Also it work if i group by tyd.login but in this case the AVG and COUNT function work only on a single line...

Here is the table :

id  id_marchand  id_product   login       price  amount  delay  step    time   
29      1           1       user3@tyd.fr   344     1       0      1   1343297500
120     1           1       user1@tyd.fr   54      1       0      1   1343297504
109     1           1       user10@tyd.fr  34      1       0      1   1343298598

When HAVING tyd.login = "user3@tyd.fr" it works perfectly. When user1 or user2 i got 0 lines.

thanks for your help

First subject : Sql, get the average on a group while escaping the where condition

回答1:

The problem is that your query is non deterministic. You are selecting more columns that you are grouping by where the additional columns are not determined by the columns you are grouping by. If the latter had been true, then this would fall into the ANSII standards of SQL, since it is not it is (in my opinion) a failure of MySQL that it allows the statement to run at all. Other DBMS have gone the opposite way, since they cannot determine if certain columns are functions of other columns not contained in the group by, they will now allow any statement with columns in the select list that are not contained within the group by.

To try and simplify the problem take the following dataset (Table T)

ID    Col1    Col2
1     1       1
2     1       3

Running this:

SELECT Col1, MAX(Col2) AS MaxCol2, MIN(Col2) AS MinCol2, AVG(Col2) AS AvgCol2
FROM T
GROUP BY Col1

Will always return

Col1    MaxCol2    MinCol2    AvgCol2
1       3          1          2

However, if you throw ID into the mix

SELECT ID, Col1, MAX(Col2) AS MaxCol2, MIN(Col2) AS MinCol2, AVG(Col2) AS AvgCol2
FROM T
GROUP BY Col1

There is no way of determining which ID will be returned, 1 or 2, the most likely result is

ID    Col1    MaxCol2    MinCol2    AvgCol2
1    1       3          1          2

However there is nothing defined in the SQL to state that the result could not be:

ID    Col1    MaxCol2    MinCol2    AvgCol2
2     1       3          1          2

So if if the above was the result set, and you added HAVING ID = 1 to the query, you would get no results due to the point at which the HAVING clause is applied to the data. If you were to Add ID to the GROUP BY you would end up with 2 rows, which as I understand it is not what you want, and if you were to add it to the WHERE your MIN, MAX and AVG functions would be affected. So you need to use a subquery. So in this example I would use

SELECT  T.ID, T.Col1, MaxCol2, MinCol2, AvgCol2
FROM    T
        INNER JOIN
        (   SELECT Col1, MAX(Col2) AS MaxCol2, MIN(Col2) AS MinCol2, AVG(Col2) AS AvgCol2
            FROM T
            GROUP BY Col1
        ) T2
            ON T.Col1 = T2.Col1
WHERE   ID = 1 -- OR ID = 2 DEPENDING ON REQUIREMENTS

To apply this to your situation, the database engine has deteremined that the row that will be returned for the columns not in the group by is the row containing ID = 29. So your HAVING Clause is only being applied to this row, the rows with user1@tyd.fr, and user10@tyd.fr in have already been removed from the results by the time your HAVING clause is applied. You need to perform the aggregate functions separately to the filtering.

Now I haven't fully got my head around your schema, but I am hoping I have explained the issue of the non deterministic statement well enough that you can make any ammendments required to my attempt at rewriting your query

SELECT  Avg_Price,
        Cnt,
        tyd.id_marchand, 
        tyd.id_product, 
        catalog.price AS c_price, 
        tyd.price AS t_price, 
        tyd.amount AS t_am, 
        pro_tyd.amount AS p_am, 
        pro_tyd.price AS p_price, 
        catalog.img_src,  
        tyd.step, 
        tyd.login AS tyd_l
FROM    Catalog
        INNER JOIN tyd
            ON catalog.id_marchand = tyd.id_marchand 
            AND catalog.id_product = tyd.id_product
        INNER JOIN Pro_tyd
            ON tyd.id_marchand = pro_tyd.id_marchand 
            AND tyd.id_product = pro_tyd.id_product
        INNER JOIN
        (   SELECT  ID_Marchand, ID_Product, Step, AVG(tyd.price) AS avg_price, COUNT(tyd.id_product) AS cnt
            FROM    Tyd
            WHERE   Step = '1'
            GROUP BY ID_Marchand, ID_Product, Step
        ) Agg
            ON Agg.id_marchand = pro_tyd.id_marchand 
            AND Agg.id_product = pro_tyd.id_product
            AND Agg.Step = tyd.Step
WHERE   tyd.Login = 'user1@tyd.fr'


回答2:

i just want to add that this query :

SELECT * FROM  tyd WHERE step = "1" 
GROUP BY tyd.id_product, tyd.id_marchand 
HAVING tyd.login = "user1@tyd.fr" 

Has the same problem. Works when tyd.login = "user3" but not when an other user is ask...