SQL - Using MAX in a WHERE clause

2019-04-29 05:53发布

问题:

Assume value is an int and the following query is valid:

SELECT blah
FROM table
WHERE attribute = value

Though MAX(expression) returns int, the following is not valid:

SELECT blah
FROM table
WHERE attribute = MAX(expression)

OF course the desired effect can be achieved using a subquery, but my question is why was SQL designed this way - is there some reason why this sort of thing is not allowed? Students coming from programming languages where you can always replace a data-type by a function call that returns that type find this issue confusing. Is there an explanation one can give them rather than just saying "that's the way it is"?

回答1:

It's just because of the order of operations of a query.

  1. FROM clause
  2. WHERE clause
  3. GROUP BY clause
  4. HAVING clause
  5. SELECT clause
  6. ORDER BY clause

WHERE just filters the rows returned by FROM. An aggregate function like MAX() can't have a result returned because it hasn't even been applied to anything.

That's also the reason, why you can't use aliases defined in the SELECT clause in a WHERE clause, but you can use aliases defined in FROM clause.



回答2:

A where clause checks every row to see if it matches the conditions specified.

A max computes a single value from a row set. If you put a max, or any other aggregate function into a where clause, how can SQL server figure out what rows the max function can use until the where clause has finished it filter?

This deals with the order that SQL Server processes commands in. It runs the WHERE clause before a GROUP BY or any aggregate. Since a where clause runs first, SQL Server can't tell if a row will be included in an aggregate until it processes the where. That is what the HAVING clause is for. HAVING runs after the GROUP BY and the WHERE and can include MAX since you have already filtered out the rows you don't want to use. See http://www.bennadel.com/blog/70-SQL-Query-Order-of-Operations.htm for a good explanation of the order in which SQL commands run.



回答3:

The WHERE clause is specifically designed to test conditions against raw data (individual rows of the table). However, MAX is an aggregate function over multiple rows of data. Basically, without a sub-select, the WHERE clause knows nothing about any rows in the table except for the current row. So how can you determine the maximum value over a whole bunch of rows when you don't even know what those rows are?

Yes, it's a little bit of a simplification, especially when dealing with joins, but the same principle applies. WHERE is always row-by-row, so that's all it really knows about.

Even if you have a GROUP BY clause, the WHERE clause still only processes one row at a time in the raw data before grouping. It doesn't know the value of a column in any other rows, so it has no way of knowing which row has the maximum value.



回答4:

Maybe this work

SELECT blah
FROM table
WHERE attribute = (SELECT MAX(expresion) FROM table1)


回答5:

Assuming this is MS SQL Server, the following would work.

SELECT TOP 1 blah
FROM table
ORDER BY expression DESC


标签: sql max where