Finding if current row is last row to be selected

2019-08-02 18:38发布

问题:

I am selecting list of periods from database. If current row is first row then the period starts with date and I can find the interval between period start like this:

SELECT
...
CASE WHEN row_number() OVER(ORDER BY r.created_at ASC) = 1 THEN r.created_at - r.created_at::date ELSE NULL END AS period
...
FROM mytable r

How can I do the same to last row? To find the time between the r.created_at of last row and midnight of its date.

I am aware of first and last functions in PostgreSQL (https://wiki.postgresql.org/wiki/First/last_(aggregate)), but they are aggregate functions and do not help in this case.

Edit: This question has 2 great answers. Neither of them help in my case, as this single line i presented as part of my question is part of bigger query, put together programmatically and using the solutions offered would force me to alter alot of code, which i am not willing to do at this point. Should the scaling problems hit - then i will certainly reconsider.

回答1:

This might be faster than window functions:

with r as (
    select
        min(created_at) as min_created_at,
        max(created_at) as max_created_at
    from mytable
)
select
    case when (select min_created_at from r) = created_at
    then created_at - created_at::date else null
    end as period_min,
    case when (select max_created_at from r) = created_at
    then created_at - created_at::date else null
    end as period_max
from mytable


回答2:

You could use the window functions first_value() and last_value() in a single CASE statement:

SELECT *
     , CASE WHEN ts IN ( first_value(created_at) OVER w
                       , last_value(created_at)  OVER w)
       THEN created_at::time::interval ELSE NULL END AS period
FROM   tbl
WINDOW w AS (ORDER BY created_at rows BETWEEN UNBOUNDED PRECEDING
                                          AND UNBOUNDED FOLLOWING);

Special requirement here: you need to adjust the frame definition for the last_value() call. By default it is:

RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

But you need:

ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

The first_value() call would work with the default frame, but can use the same as well.

I also simplified computation of period. Your definition coincides with the time component of a timestamp. Just cast to time to "truncate" the date part: created_at::time.
Casting to interval after that is just to return the same data type as your original query.

The result will be ordered by created_at automatically due to the current implementation of window functions. But do not rely on that. If you need sorted output, add to the end explicitly:

ORDER BY created_at