How do I create a period date range from a mysql t

2019-06-08 05:58发布

My goal is to return a start and end date having same value in a column. Here is my table. The (*) have been marked to give you the idea of how I want to get "EndDate" for every similar sequence value of A & B columns

ID | DayDate   |  A  |  B
-----------------------------------------------
1  | 2010/07/1 | 200 |  300
2  | 2010/07/2 | 200 |  300 *
3  | 2010/07/3 | 150 |  250
4  | 2010/07/4 | 150 |  250 *
8  | 2010/07/5 | 150 |  350 *
9  | 2010/07/6 | 200 |  300
10 | 2010/07/7 | 200 |  300 *
11 | 2010/07/8 | 100 |  200
12 | 2010/07/9 | 100 |  200 *

and I want to get the following result table from the above table

| DayDate   |EndDate   |  A  |  B
-----------------------------------------------
| 2010/07/1 |2010/07/2 | 200 |  300
| 2010/07/3 |2010/07/4 | 150 |  250
| 2010/07/5 |2010/07/5 | 150 |  350
| 2010/07/6 |2010/07/7 | 200 |  300
| 2010/07/8 |2010/07/9 | 100 |  200

UPDATE:

Thanks Mike, The approach of yours seems to work in your perspective of considering the following row as a mistake.

8  | 2010/07/5 | 150 |  350 * 

However it is not a mistake. The challenge I am faced with this type of data is like a scenario of logging a market price change with date. The real problem in mycase is to select all rows with the beginning and ending date if both A & B matches in all these rows. Also to select the rows which are next to previously selected, and so on like that no data is left out in the table.

I can explain a real world scenario. A Hotel with Room A and B has room rates for each day entered in to table as explained in my question. Now the hotel needs to get a report to show the price calendar in a shorter way using start and end date, instead of listing all the dates entered. For example, on 2010/07/01 to 2010/07/02 the price of A is 200 and B is 300. This price is changed from 3rd to 4th and on 5th there is a different price only for that day where the Room B is price is changed to 350. So this is considered as a single day difference, thats why start and end dates are same.

I hope this explained the scenario of the problem. Also note that this hotel may be closed for a specific time period, lets say this is an additional problem to my first question. The problem is what if the rate is not entered on specific dates, for example on Sundays the hotel do not sell these two rooms so they entered no price, meaning the row will not exist in the table.

3条回答
够拽才男人
2楼-- · 2019-06-08 06:36

I think I have found a solution which does produce the table desired.

SELECT  
  a.DayDate AS StartDate,  

  ( SELECT b.DayDate  
    FROM Dates AS b  
    WHERE b.DayDate > a.DayDate AND (b.B = a.B OR b.B IS NULL)  
    ORDER BY b.DayDate ASC LIMIT 1 
  ) AS StopDate,
a.A as A,
    a.B AS B

FROM Dates AS a 
WHERE Coalesce( 
               (SELECT c.B  
                FROM Dates AS c  
                WHERE c.DayDate <= a.DayDate  
                ORDER BY c.DayDate DESC LIMIT 1,1  
               ), -99999  
              ) <> a.B 
  AND a.B IS NOT NULL 
ORDER BY a.DayDate ASC; 

is able to generate the following table result

StartDate   StopDate    A   B
2010-07-01  2010-07-02  200 300
2010-07-03  2010-07-04  150 250
2010-07-05  NULL        150 350
2010-07-06  2010-07-07  200 300
2010-07-08  2010-07-09  100 200

But I need a way to replace the NULL with the same date of the start date.

查看更多
淡お忘
3楼-- · 2019-06-08 06:37
  • My general approach is to join the table onto itself based on DayDate = DayDate+1 and the A or B values not being equal
  • This will find the end dates for each period (where the value is going to be different on the following day)
  • The only problem is, that won't find an end date for the final period. To get around this, I selct the max date from the table and union that into my list of end dates
  • Once you have the list of end dates defined, you can join them to the original table based on the end date being greater than or equal to the original date
  • From this final list, select the minimum daydate grouped by the other fields

    select
    min(DayDate) as DayDate,EndDate,A,B from
    (SELECT DayDate, A, B, min(ends.EndDate) as EndDate
    FROM yourtable
    LEFT JOIN
    (SELECT max(DayDate) as EndDate FROM yourtable UNION
    SELECT t1.DayDate as EndDate 
    FROM yourtable t1
    JOIN yourtable t2
    ON date_add(t1.DayDate, INTERVAL 1 DAY) = t2.DayDate 
    AND (t1.A<>t2.A OR t1.B<>t2.B)) ends
    ON ends.EndDate>=DayDate
    GROUP BY DayDate, A, B) x
    GROUP BY EndDate,A,B
    
查看更多
小情绪 Triste *
4楼-- · 2019-06-08 06:39

Creating related tables allows you much greater freedom to query and extract relevant information. Here's a few links that you might find useful:

You could start with these tutorials:
http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html
http://net.tutsplus.com/tutorials/databases/sql-for-beginners/

There are also a couple of questions here on stackoverflow that might be useful:
Normalization in plain English
What exactly does database normalization do?

Anyway, on to a possible solution. The following examples use your hotel rooms analogy.

First, create a table to hold information about the hotel rooms. This table just contains the room ID and its name, but you could store other information in here, such as the room type (single, double, twin), its view (ocean front, ocean view, city view, pool view), and so on:

CREATE TABLE `room` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `name` VARCHAR(45) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE INDEX `name_UNIQUE` (`name` ASC) )
ENGINE = InnoDB;

Now create a table to hold the changing room rates. This table links to the room table through the room_id column. The foreign key constraint prevents records being inserted into the rate table which refer to rooms that do not exist:

CREATE TABLE `rate` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
  `room_id` INT UNSIGNED NOT NULL,
  `date` DATE NOT NULL,
  `rate` DECIMAL(6,2) UNSIGNED NOT NULL,
  PRIMARY KEY (`id`),
  INDEX `fk_room_rate` (`room_id` ASC),
  CONSTRAINT `fk_room_rate`
    FOREIGN KEY (`room_id` )
    REFERENCES `room` (`id` )
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;

Create two rooms, and add some daily rate information about each room:

INSERT INTO `room` (`id`, `name`) VALUES (1, 'A'), (2, 'B');

INSERT INTO `rate` (`id`, `room_id`, `date`, `rate`) VALUES
( 1, 1, '2010-07-01', 200),
( 2, 1, '2010-07-02', 200),
( 3, 1, '2010-07-03', 150),
( 4, 1, '2010-07-04', 150),
( 5, 1, '2010-07-05', 150),
( 6, 1, '2010-07-06', 200),
( 7, 1, '2010-07-07', 200),
( 8, 1, '2010-07-08', 100),
( 9, 1, '2010-07-09', 100),
(10, 2, '2010-07-01', 300),
(11, 2, '2010-07-02', 300),
(12, 2, '2010-07-03', 250),
(13, 2, '2010-07-04', 250),
(14, 2, '2010-07-05', 350),
(15, 2, '2010-07-06', 300),
(16, 2, '2010-07-07', 300),
(17, 2, '2010-07-08', 200),
(18, 2, '2010-07-09', 200);

With that information stored, a simple SELECT query with a JOIN will show you the all the daily room rates:

SELECT
    room.name,
    rate.date,
    rate.rate
FROM room
JOIN rate
ON rate.room_id = room.id;

+------+------------+--------+
| A    | 2010-07-01 | 200.00 |
| A    | 2010-07-02 | 200.00 |
| A    | 2010-07-03 | 150.00 |
| A    | 2010-07-04 | 150.00 |
| A    | 2010-07-05 | 150.00 |
| A    | 2010-07-06 | 200.00 |
| A    | 2010-07-07 | 200.00 |
| A    | 2010-07-08 | 100.00 |
| A    | 2010-07-09 | 100.00 |
| B    | 2010-07-01 | 300.00 |
| B    | 2010-07-02 | 300.00 |
| B    | 2010-07-03 | 250.00 |
| B    | 2010-07-04 | 250.00 |
| B    | 2010-07-05 | 350.00 |
| B    | 2010-07-06 | 300.00 |
| B    | 2010-07-07 | 300.00 |
| B    | 2010-07-08 | 200.00 |
| B    | 2010-07-09 | 200.00 |
+------+------------+--------+

To find the start and end dates for each room rate, you need a more complex query:

SELECT 
    id,
    room_id,
    MIN(date) AS start_date,
    MAX(date) AS end_date,
    COUNT(*) AS days,
    rate
FROM (
    SELECT
        id,
        room_id,
        date,
        rate, 
        (
            SELECT COUNT(*)
            FROM rate AS b
            WHERE b.rate <> a.rate
            AND b.date <= a.date
            AND b.room_id = a.room_id
        ) AS grouping
    FROM rate AS a
    ORDER BY a.room_id, a.date
) c
GROUP BY rate, grouping
ORDER BY room_id, MIN(date);

+----+---------+------------+------------+------+--------+
| id | room_id | start_date | end_date   | days | rate   |
+----+---------+------------+------------+------+--------+
|  1 |       1 | 2010-07-01 | 2010-07-02 |    2 | 200.00 |
|  3 |       1 | 2010-07-03 | 2010-07-05 |    3 | 150.00 |
|  6 |       1 | 2010-07-06 | 2010-07-07 |    2 | 200.00 |
|  8 |       1 | 2010-07-08 | 2010-07-09 |    2 | 100.00 |
| 10 |       2 | 2010-07-01 | 2010-07-02 |    2 | 300.00 |
| 12 |       2 | 2010-07-03 | 2010-07-04 |    2 | 250.00 |
| 14 |       2 | 2010-07-05 | 2010-07-05 |    1 | 350.00 |
| 15 |       2 | 2010-07-06 | 2010-07-07 |    2 | 300.00 |
| 17 |       2 | 2010-07-08 | 2010-07-09 |    2 | 200.00 |
+----+---------+------------+------------+------+--------+

You can find a good explanation of the technique used in the above query here:
http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data

查看更多
登录 后发表回答