Overlapping Data

2019-07-28 03:34发布

I have a sql query to check overlapping of product records in table PRODUCTS. In most cases query works fine except for the following.

select * from products where 
product_reg_no = 'AL-NAPT' 
and (to_date('14-Aug-2001') BETWEEN to_date('27-Aug-2001') AND to_date('30-Aug-2001')
or to_date('31-Aug-2001') BETWEEN to_date('27-Aug-2001') AND to_date('30-Aug-2001'))

How to make this query to catch all records are overlapping either partially or completely?

If required I can provide table structure with sample records.

Thanks

Update 1

I have added table structure and records here or as below:

create table products
(product_reg_no varchar2(32),
 start_date date,
 end_date date);


Insert into products
   (product_reg_no, START_DATE, END_DATE)
 Values
   ('AL-NAPT', TO_DATE('08/14/2012', 'MM/DD/YYYY'), TO_DATE('08/31/2012', 'MM/DD/YYYY'));
Insert into products
   (product_reg_no, START_DATE, END_DATE)
 Values
   ('AL-NAPT', TO_DATE('08/27/2012', 'MM/DD/YYYY'), TO_DATE('08/30/2012', 'MM/DD/YYYY'));
COMMIT;

The first record which is from August, 14 2012 to August, 31 2012 is overlapping with second record which is from August, 27 2012 to August, 30 2012. So how can I modify my query to get the overlapping?

2条回答
Viruses.
2楼-- · 2019-07-28 04:02

See Determine whether two date ranges overlap.

You need to evaluate the following, or a minor variant on it using <= instead of <, perhaps:

Start1 < End2 AND Start2 < End1

Since you're working with a single table, you need to have a self-join:

SELECT p1.*, p2.*
  FROM products p1
  JOIN products p2
    ON p1.product_reg_no != p2.product_reg_no
   AND p1.start < p2.end
   AND p2.start < p1.end;

The not equal condition ensures that you don't get a record paired with itself (though the < conditions also ensure that, but if you used <=, the not equal condition would be a good idea.

This will generate two rows for each pair of products (one row with ProductA as p1 and ProductB as p2, the other with ProductB as p1 and ProductA as p2). To prevent that happening, change the != into either < or >.


And, looking more closely at the sample data, it might be that you're really interesting in rows where the registration numbers match and the dates overlap. In which case, you can ignore my wittering about != and < or > and replace the condition with = after all.

SELECT p1.*, p2.*
  FROM products p1
  JOIN products p2
    ON p1.product_reg_no = p2.product_reg_no
   AND p1.start < p2.end
   AND p2.start < p1.end;

SQL Fiddle (unsaved) shows that this works:

SELECT p1.product_reg_no p1_reg, p1.start_date p1_start, p1.end_date p1_end,
       p2.product_reg_no p2_reg, p2.start_date p2_start, p2.end_date p2_end
  FROM products p1
  JOIN products p2
    ON p1.product_reg_no = p2.product_reg_no
   AND p1.start_date < p2.end_date
   AND p2.start_date < p1.end_date
 WHERE (p1.start_date != p2.start_date OR p1.end_date != p2.end_date);

The WHERE clause eliminates the rows that are joined to themselves. With the duplicate column names in the SELECT-list eliminated, you get to see all the data. I added a row:

INSERT INTO products (product_reg_no, start_date, end_date)
VALUES ('AL-NAPT', TO_DATE('08/27/2011', 'MM/DD/YYYY'), TO_DATE('08/30/2011', 'MM/DD/YYYY'));

This was not selected — demonstrating that it does reject non-overlapping entries.

If you want to eliminate the double rows, then you have to add another fancy criterion:

SELECT p1.product_reg_no p1_reg, p1.start_date p1_start, p1.end_date p1_end,
       p2.product_reg_no p2_reg, p2.start_date p2_start, p2.end_date p2_end
  FROM products p1
  JOIN products p2
    ON p1.product_reg_no = p2.product_reg_no
   AND p1.start_date < p2.end_date
   AND p2.start_date < p1.end_date
 WHERE (p1.start_date != p2.start_date OR p1.end_date != p2.end_date)
   AND (p1.start_date < p2.start_date OR
       (p1.start_date = p2.start_date AND p1.end_date < p2.end_date));
查看更多
祖国的老花朵
3楼-- · 2019-07-28 04:17

This is a strange query. You check if 14-Aug-2001 is between 27-Aug-2001 and 30-Aug-2001 which is always false OR 31-Aug-2001 is between 27-Aug-2001 and 30-Aug-2001 which also always is false. So your where clause will always be false.

Edit: Thanks for clarification

SQL Fiddle Demo

select   p1.product_reg_no
       , p1.start_date p1s
       , p1.end_date   p1e
       , p2.start_date p2s
       , p2.end_date   p2e
from products p1, products p2
where p1.product_reg_no = p2.product_reg_no
  and not (    p1.end_date   < p2.start_date
           and p1.start_date > p2.end_date   );

What you want is the following scenarios (1 stands for the first row 2 for the second)

1    1
 2  2

 1  1
2    2

1    1
2    2

1   1
 2   2

 1   1
2   2

That you could also be turned around and say you do not want this:

1 1
   2 2

   1 1
2 2

I assumed you also do want this

1 1
  2 2

  1 1
2 2

The WHERE clause could also be written differently

not (    p1.end_date   < p2.start_date and p1.start_date > p2.end_date   )

is the same as

        p1.end_date   >= p2.start_date or p1.start_date <= p2.end_date   

I think it was called De Morgan's law when I had that in school eons ago.

You must probably think about what would happen if you have more than 2 rows.

查看更多
登录 后发表回答