我有follwing数据:
Product Price StartDate EndDate
Apples 4.9 2010-03-01 00:00:00.000 2010-03-01 00:00:00.000
Apples 4.9 2010-03-02 00:00:00.000 2010-03-02 00:00:00.000
Apples 2.5 2010-03-03 00:00:00.000 2010-03-03 00:00:00.000
Apples 4.9 2010-03-05 00:00:00.000 2010-03-05 00:00:00.000
Apples 4.9 2010-03-06 00:00:00.000 2010-03-06 00:00:00.000
Apples 4.9 2010-03-09 00:00:00.000 2010-03-09 00:00:00.000
Apples 2.5 2010-03-10 00:00:00.000 2010-03-10 00:00:00.000
Apples 4.9 2010-03-11 00:00:00.000 2010-03-11 00:00:00.000
Apples 4.9 2010-03-12 00:00:00.000 2010-03-12 00:00:00.000
Apples 4.9 2010-03-13 00:00:00.000 2010-03-13 00:00:00.000
Apples 4.9 2010-03-15 00:00:00.000 2010-03-15 00:00:00.000
Apples 4.9 2010-03-16 00:00:00.000 2010-03-16 00:00:00.000
要像组product, price, min(startdate), max(startdate)
,但应在开始日期和结束日期已分组及........类似下面
期望的结果
Apples 4.9 2010-03-01 00:00:00.000 2010-03-02 00:00:00.000
Apples 2.5 2010-03-03 00:00:00.000 2010-03-03 00:00:00.000
Apples 4.9 2010-03-05 00:00:00.000 2010-03-09 00:00:00.000
Apples 2.5 2010-03-10 00:00:00.000 2010-03-10 00:00:00.000
Apples 4.9 2010-03-11 00:00:00.000 2010-03-16 00:00:00.000
我的做法。
数据:
create table t ( producte varchar(50),
price money,
start_date date,
end_date date);
insert into t values
( 'apple', 4.9, '2012-01-01', '2012-01-01' ),
( 'apple', 4.9, '2012-01-02', '2012-01-02' ),
( 'apple', 8, '2012-01-04', '2012-01-04' ),
( 'cat', 5, '2012-01-01', '2012-01-01' ),
( 'cat', 6, '2012-01-02', '2012-01-02' ),
( 'cat', 6, '2012-01-03', '2012-01-03' );
查询:
with start_dates as (
select
t.producte, t.price, t.start_date, t.end_date, t.start_date as gr_date
from
t left outer join
t t1 on
t.price = t1.price and --new
t.producte = t1.producte and
t.start_date = dateadd(day,1, t1.end_date )
where t1.producte is null
union all
select
t.producte, t.price, t.start_date,t. end_date, gr_date
from
t inner join
start_dates t1 on
t.price = t1.price and --new
t.producte = t1.producte and
t.start_date = dateadd(day,1, t1.end_date )
)
select t.producte, t.price , min( t.start_date ), max( t.end_date )
from start_dates t
group by t.producte, gr_date ,t.price
结果 :
| PRODUCTE | PRICE | COLUMN_2 | COLUMN_3 |
----------------------------------------------
| apple | 4.9 | 2012-01-01 | 2012-01-02 |
| apple | 8 | 2012-01-04 | 2012-01-04 |
| cat | 5 | 2012-01-01 | 2012-01-01 |
| cat | 6 | 2012-01-02 | 2012-01-03 |
说明
这是一个递归CTE表达。 基本查询采取inital日期为每个组的价格。 递归查询查找与这个价格最后的数据。
SELECT product, price, MIN(start_date), MAX(end_date)
FROM (
SELECT product, price, start_date, end_date,
ROW_NUMBER() OVER (PARTITION BY product ORDER BY startDate) rn1,
ROW_NUMBER() OVER (PARTITION BY product, price ORDER BY startDate) rn2
FROM mytable
) q
GROUP BY
product, price, rn2 - rn1
ORDER BY
product, MIN(start_date), price
这里是一个SQLFiddle演示
with t2 as
(
select t1.*,
(select count(Price)
from t
where startdate<t1.startdate
and Price<>t1.price
and Product=t1.Product
)
rng
from t as t1
)
select Product,Price,min(startDate),max(EndDate)
from t2 group by Product,Price,RNG
order by 3
这里有一个建议:对于每一行,你必须找到最大的一个日期为它的价格是不同的,你组上。 例如,对于2010-03-11 2010-03-16和之间的线,你必须检索日期2010-03-10,因为这是它的价格是不同的最大以前的日期(2.5对4.9)。 第一行(S)将返回一个空的日期,但不应该是一个问题。
然而,在很长的表,这种查询可以变得很慢。 因此,如果你有一定的速度问题,你应该考虑增加一列的可能性,并使用游标逐步填充:按日期您遍历它,每次看到新的价格时,你改变它的价值。 最后的分组是那么微不足道。
这里的东西:
Select Product, Price, Min(StartDate) as StartDate, PreviousDate from (
Select product, price, StartDate, (Select max (StartDate) from table_2 t3 where t3.price <> t2.price and t3.StartDate < t2.StartDate and t3.Product = t2.Product) as previousDate
from table_2 t2) SQ
Group by Product, Price, PreviousDate
Order by PreviousDate
我相信这是表现最佳的解决方案至今:
WITH Calc AS (
SELECT *,
Grp = DateAdd(day, -Row_Number()
OVER (PARTITION BY Product, Price ORDER BY StartDate), StartDate
)
FROM dbo.PriceHistory
)
SELECT Product, Price, FromDate = Min(StartDate), ToDate = Max(StartDate)
FROM Calc
GROUP BY Product, Price, Grp
ORDER BY FromDate;
尝试了这一点自己