SQL: Count of rows between first and last occurren

2019-08-17 03:35发布

问题:

I want to find the count of rows between first and last occurrence of a value. However when there are five or more records of a different value between them, stop counting.

So if last occurrence is today and first occurrence is yesterday, the result would be 2 (today plus yesterday).

If last occurrence is today and first occurrence is 8 days ago AND there is no occurrence in between the two, the result would be '1'. If however there would be another occurrence 3 days ago, the result would be 4 (3+2+1 days ago plus today).

I hope that makes sense.

Here's my data

Date        City    Weather
==============================
2018-08-11  Ankara  Sun
2018-08-10  Ankara  Sun
2018-08-09  Ankara  Sun
2018-08-08  Ankara  Sun
2018-08-07  Ankara  Sun
2018-08-06  Ankara  Sun
2018-08-05  Ankara  Rain
2018-08-04  Ankara  Clouds
2018-08-03  Ankara  Rain
2018-08-02  Ankara  Sun
2018-08-01  Ankara  Sun
2018-08-11  Cairo   Clouds
2018-08-10  Cairo   Sun
2018-08-09  Cairo   Sun
2018-08-08  Cairo   Sun
2018-08-07  Cairo   Sun
2018-08-06  Cairo   Sun
2018-08-05  Cairo   Clouds
2018-08-04  Cairo   Sun
2018-08-03  Cairo   Sun
2018-08-02  Cairo   Sun
2018-08-01  Cairo   Sun

What I'm after is a query that returns for a given city and date the weather that day and the number of days since this weather first occurred. However, when there are gaps of five days or more, the count restarts from 1.

Like when queried for Ankara on the 11th Aug it would return 11 since it's been 11 days (including today) since Sun first occurred.

However for Cairo on the 11th Aug it would return 1 and not 7 because it's been 5+ days between Clouds on the 5th Aug and Clouds today.

I've tried many things with first_value(), LEAD, LAG and ROW_NUMBER but there is nothing that makes sense to how as it all failed miserably.

Here it is anyway...

select 
 city, val,datediff(day, min(datadate), '2018-10-30') + 1 as DaysPresent
from d
where val = last_val
group by city,val;

or...

select 
        date, city, weather, datediff(day,ca.prior,d.date)+1 as daysPresent
from d
cross apply (
    select min(prev.date) as prior
    from d as prev 
    where prev.city = d.city
    and prev.date between dateadd(day,-4,d.date) and dateadd(day,0,d.date)
    and prev.weather = d.weather
    ) ca

order by city,date

Expected Result

+----+---------------------+--------+---------+----------------+-------------+----------+----------------------------
|    |        date         |  city  | weather | prior_the_same | prior_types |expected  | why?
+----+---------------------+--------+---------+----------------+-------------+----------+----------------------------
|  1 | 11.08.2018 00:00:00 | Ankara | Sun     |              5 |           2 |       11 | 11t day since 1st time Sun
|  2 | 10.08.2018 00:00:00 | Ankara | Sun     |              4 |           3 |       10 | 10t day since 1st time Sun
|  3 | 09.08.2018 00:00:00 | Ankara | Sun     |              7 |           3 |        9 | 9th day since 1st time Sun
|  4 | 08.08.2018 00:00:00 | Ankara | Sun     |              7 |           3 |        8 | 8th day since 1st time Sun
|  5 | 07.08.2018 00:00:00 | Ankara | Sun     |              6 |           3 |        7 | 7th day since 1st time Sun
|  6 | 06.08.2018 00:00:00 | Ankara | Sun     |              5 |           3 |        6 | 6th day since 1st time Sun ( <5 days gap since last Sun keeps counting )
|  7 | 05.08.2018 00:00:00 | Ankara | Rain    |              2 |           3 |        3 | 3rd day since 1st time Rain
|  8 | 04.08.2018 00:00:00 | Ankara | Clouds  |              0 |           3 |        1 | 1st day Clouds
|  9 | 03.08.2018 00:00:00 | Ankara | Rain    |              0 |           2 |        1 | 1st day Rain 
| 10 | 02.08.2018 00:00:00 | Ankara | Sun     |              1 |           0 |        2 | 2nd day since 1st time Sun
| 11 | 01.08.2018 00:00:00 | Ankara | Sun     |              0 |           0 |        1 | 1st day Sun
| 12 | 11.08.2018 00:00:00 | Cairo  | Clouds  |              6 |           6 |        1 | 1st time Clouds ( >5 days gap since last Clouds resets the count )
| 13 | 10.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 |       10 | 10t day since 1st time Sun
| 14 | 09.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 |        9 | 9th day since 1st time Sun
| 15 | 08.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 |        8 | 8th day since 1st time Sun
| 16 | 07.08.2018 00:00:00 | Cairo  | Sun     |              6 |           1 |        7 | 7th day since 1st time Sun
| 17 | 06.08.2018 00:00:00 | Cairo  | Sun     |              5 |           1 |        6 | 6th day since 1st time Sun ( <5 days gap since last Sun keeps counting )
| 18 | 05.08.2018 00:00:00 | Cairo  | Clouds  |              0 |           4 |        1 | 1st time Clouds
| 19 | 04.08.2018 00:00:00 | Cairo  | Sun     |              3 |           0 |        4 | 4th day since 1st time Sun
| 20 | 03.08.2018 00:00:00 | Cairo  | Sun     |              2 |           0 |        3 | 3rd day since 1st time Sun
| 21 | 02.08.2018 00:00:00 | Cairo  | Sun     |              1 |           0 |        2 | 2nd day since 1st time Sun
| 22 | 01.08.2018 00:00:00 | Cairo  | Sun     |              0 |           0 |        1 | 1st day Sun
+----+---------------------+--------+---------+----------------+-------------+----------+----------------------------

回答1:

Latest

declare @day_range integer = 5;

select 
        t.date, t.city, t.weather
      , datediff(day,ca1.prior_dt,t.date)+1 as prior_the_same
      , twist.prior_types
      , twist.prior_mx_dt
from mytable t
cross apply (
    select count(prev.weather) as prior_types, max(prev.date) as prior_mx_dt
    from mytable as prev 
    where prev.city = t.city
    and prev.date between dateadd(day,-@day_range,t.date) and t.date
    and prev.weather <> t.weather
    ) twist
cross apply (
    select min(prev.date) as prior_dt
    from mytable as prev 
    where prev.city = t.city
    and (twist.prior_types < @day_range or prev.date >= twist.prior_mx_dt)
    and prev.weather = t.weather
    ) ca1

order by t.city, t.date DESC

result:

+----+---------------------+--------+---------+----------------+-------------+---------------------+
|    |        date         |  city  | weather | prior_the_same | prior_types |     prior_mx_dt     |
+----+---------------------+--------+---------+----------------+-------------+---------------------+
|  1 | 11.08.2018 00:00:00 | Ankara | Sun     |             11 |           0 | NULL                |
|  2 | 10.08.2018 00:00:00 | Ankara | Sun     |             10 |           1 | 05.08.2018 00:00:00 |
|  3 | 09.08.2018 00:00:00 | Ankara | Sun     |              9 |           2 | 05.08.2018 00:00:00 |
|  4 | 08.08.2018 00:00:00 | Ankara | Sun     |              8 |           3 | 05.08.2018 00:00:00 |
|  5 | 07.08.2018 00:00:00 | Ankara | Sun     |              7 |           3 | 05.08.2018 00:00:00 |
|  6 | 06.08.2018 00:00:00 | Ankara | Sun     |              6 |           3 | 05.08.2018 00:00:00 |
|  7 | 05.08.2018 00:00:00 | Ankara | Rain    |              3 |           3 | 04.08.2018 00:00:00 |
|  8 | 04.08.2018 00:00:00 | Ankara | Clouds  |              1 |           3 | 03.08.2018 00:00:00 |
|  9 | 03.08.2018 00:00:00 | Ankara | Rain    |              1 |           2 | 02.08.2018 00:00:00 |
| 10 | 02.08.2018 00:00:00 | Ankara | Sun     |              2 |           0 | NULL                |
| 11 | 01.08.2018 00:00:00 | Ankara | Sun     |              1 |           0 | NULL                |
| 12 | 11.08.2018 00:00:00 | Cairo  | Clouds  |              1 |           5 | 10.08.2018 00:00:00 |
| 13 | 10.08.2018 00:00:00 | Cairo  | Sun     |             10 |           1 | 05.08.2018 00:00:00 |
| 14 | 09.08.2018 00:00:00 | Cairo  | Sun     |              9 |           1 | 05.08.2018 00:00:00 |
| 15 | 08.08.2018 00:00:00 | Cairo  | Sun     |              8 |           1 | 05.08.2018 00:00:00 |
| 16 | 07.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 | 05.08.2018 00:00:00 |
| 17 | 06.08.2018 00:00:00 | Cairo  | Sun     |              6 |           1 | 05.08.2018 00:00:00 |
| 18 | 05.08.2018 00:00:00 | Cairo  | Clouds  |              1 |           4 | 04.08.2018 00:00:00 |
| 19 | 04.08.2018 00:00:00 | Cairo  | Sun     |              4 |           0 | NULL                |
| 20 | 03.08.2018 00:00:00 | Cairo  | Sun     |              3 |           0 | NULL                |
| 21 | 02.08.2018 00:00:00 | Cairo  | Sun     |              2 |           0 | NULL                |
| 22 | 01.08.2018 00:00:00 | Cairo  | Sun     |              1 |           0 | NULL                |

see it online: https://rextester.com/ZSHT63407


Original

with sample data of:

CREATE TABLE mytable(
   Date    DATE  NOT NULL
  ,City    VARCHAR(6) NOT NULL
  ,Weather VARCHAR(6) NOT NULL
);
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-11','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-10','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-09','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-08','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-07','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-06','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-05','Ankara','Rain');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-04','Ankara','Clouds');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-03','Ankara','Rain');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-02','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-01','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-11','Cairo','Clouds');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-10','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-09','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-08','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-07','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-06','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-05','Cairo','Clouds');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-04','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-03','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-02','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-01','Cairo','Sun');

Using this query:

declare @day_range integer = 7;
declare @ignore_range integer = 5;

select 
        t.date, t.city, t.weather
      , datediff(day,ca1.prior_dt,t.date) as prior_the_same
      , ca2.prior_types
from mytable t
cross apply (
    select min(prev.date) as prior_dt
    from mytable as prev 
    where prev.city = t.city
    and prev.date between dateadd(day,-@day_range,t.date) and t.date
    and prev.weather = t.weather
    ) ca1
cross apply (
    select count(prev.weather) as prior_types
    from mytable as prev 
    where prev.city = t.city
    and prev.date between dateadd(day,-@day_range,t.date) and t.date
    and prev.weather <> t.weather
    ) ca2
order by t.city, t.date DESC

The following is the result:

+----+---------------------+--------+---------+----------------+-------------+----------+
|    |        date         |  city  | weather | prior_the_same | prior_types |expected? |
+----+---------------------+--------+---------+----------------+-------------+----------+
|  1 | 11.08.2018 00:00:00 | Ankara | Sun     |              5 |           2 |          |
|  2 | 10.08.2018 00:00:00 | Ankara | Sun     |              4 |           3 |          |
|  3 | 09.08.2018 00:00:00 | Ankara | Sun     |              7 |           3 |          |
|  4 | 08.08.2018 00:00:00 | Ankara | Sun     |              7 |           3 |          |
|  5 | 07.08.2018 00:00:00 | Ankara | Sun     |              6 |           3 |          |
|  6 | 06.08.2018 00:00:00 | Ankara | Sun     |              5 |           3 |          |
|  7 | 05.08.2018 00:00:00 | Ankara | Rain    |              2 |           3 |          |
|  8 | 04.08.2018 00:00:00 | Ankara | Clouds  |              0 |           3 |          |
|  9 | 03.08.2018 00:00:00 | Ankara | Rain    |              0 |           2 |          |
| 10 | 02.08.2018 00:00:00 | Ankara | Sun     |              1 |           0 |          |
| 11 | 01.08.2018 00:00:00 | Ankara | Sun     |              0 |           0 |          |
| 12 | 11.08.2018 00:00:00 | Cairo  | Clouds  |              6 |           6 |          |
| 13 | 10.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 |          |
| 14 | 09.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 |          |
| 15 | 08.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 |          |
| 16 | 07.08.2018 00:00:00 | Cairo  | Sun     |              6 |           1 |          |
| 17 | 06.08.2018 00:00:00 | Cairo  | Sun     |              5 |           1 |          |
| 18 | 05.08.2018 00:00:00 | Cairo  | Clouds  |              0 |           4 |          |
| 19 | 04.08.2018 00:00:00 | Cairo  | Sun     |              3 |           0 |          |
| 20 | 03.08.2018 00:00:00 | Cairo  | Sun     |              2 |           0 |          |
| 21 | 02.08.2018 00:00:00 | Cairo  | Sun     |              1 |           0 |          |
| 22 | 01.08.2018 00:00:00 | Cairo  | Sun     |              0 |           0 |          |
+----+---------------------+--------+---------+----------------+-------------+----------+

Over more than one question you have expanded on your requirements. May I suggest you consider the above and decide if you can use the 2 calculations to arrive at the wanted final result. If you are still unable to come to a conclusion use the text table format to include the "expected result" as a new column