Select only rows with last modified timestamp, wit

2020-03-31 06:56发布

问题:

I could not find solution for my problem. I have two tables Order and OrderDetail.

Order table (simple version)

| ID | modified  |
| 1  | 7.1.2018. |
| 2  | 10.1.2018.|
| 3  | 15.1.2018.|
| 4  | 20.1.2018.|
| 5  | 25.1.2018.|

OrderDetails (simple version)

| order_id | detail_id | base_price | buy_price | sell_price|
| 1        | 1         | 99.00      | 111.00    | 122.00    |
| 1        | 2         | 82.00      | 95.00     | 117.00    | 
| 1        | 3         | 82.00      | 95.00     | 117.00    |
| 2        | 4         | 95.00      | 108.00    | 119.00    | 
| 2        | 5         | 86.00      | 94.00     | 115.00    | 
| 2        | 1         | 82.00      | 95.00     | 117.00    |
| 3        | 1         | 92.00      | 106.00    | 116.00    | 
| 3        | 4         | 90.00      | 100.00    | 120.00    | 
| 3        | 5         | 82.00      | 95.00     | 117.00    |
| 4        | 2         | 92.00      | 106.00    | 116.00    | 
| 4        | 3         | 90.00      | 100.00    | 120.00    | 
| 4        | 1         | 82.00      | 95.00     | 117.00    |
| 5        | 1         | 92.00      | 106.00    | 116.00    | 
| 5        | 5         | 90.00      | 100.00    | 120.00    | 
| 5        | 3         | 82.00      | 95.00     | 117.00    |

How I can get rows from OrderDetails table which are connected with last modified timestamp from Order table?

Result should be:

| order_id | detail_id | base_price | buy_price | sell_price | modified  |
| 5        | 1         | 92.00      | 106.00    | 116.00     | 25.1.2018.|
| 4        | 2         | 92.00      | 106.00    | 116.00     | 20.1.2018.|
| 5        | 3         | 82.00      | 95.00     | 117.00     | 25.1.2018.|
| 3        | 4         | 90.00      | 100.00    | 120.00     | 15.1.2018.|
| 5        | 5         | 90.00      | 100.00    | 120.00     | 25.1.2018.|

I know to join tables, and to get all rows from joint tables with desired columns, but I don't know how to filter only those rows with latest timestamp from every order_id, detail_id pair. Please, any help would be really appreciated.

Edit

Query is needed for Firebird database.

Edit 2.

First sample data was somehow misleading. Please take a look again on extended tables and desirable result. I need all distinct rows (based on "details_id") with it's last modified data. How to exclude "duplicate" rows for every "detail_id" with it's older timestamp and only keep "detail_id" rows with latest timestamp???

回答1:

with x as (select o.modified, od.* 
           from orderDetails od, orders o
           where o.id=od.order_id)
 , mx as (select max(modified) as modified, detail_id
          from x group by detail_id)
Select x.* from x, mx
Where x.detail_id = mx.detail_id and x.modified=mx.modified

Here we use Common Table Expressions, so we join the two tables only once. At least we did it only once when writing the query - thus we would have less chances to do typo or copy-paste errors. We also hinted the SQL server to do the join only once and then to reuse it, but would it follow this hint or not - depends on its internal implementation.

Another good thing about CTE: it helps you to build your query incrementally, step by step, from simple to complex. Read about Read–eval–print loop at https://en.wikipedia.org/wiki/REPL
I would add a bit more of it later.

You can find many articles about CTE in Google. The Firebird implementation is documented here: https://www.firebirdsql.org/file/documentation/reference_manuals/fblangref25-en/html/fblangref25-dml-select.html#fblangref25-dml-select-cte

Since I used only very basic SQL I believe it to work in almost any practical SQL server, Firebird including.

Here is the result of the query and the output data: SQL Fiddle

PostgreSQL 9.6 Schema Setup:

create table orders
 (id integer primary key,
  modified timestamp);
create index o_m on orders(modified);  

create table OrderDetails(
  order_id integer references orders(id),
  detail_id integer not null,
  base_price float,
  buy_price float,
  sell_price float );
create index od_do on OrderDetails(detail_id, order_id);

Insert into orders values
( 1, '2018-1-07'),
( 2, '2018-1-10'),
( 3, '2018-1-15'),
( 4, '2018-1-20'),
( 5, '2018-1-25');

Insert into OrderDetails values
(   1   ,   1   ,   99.00   ,   111.00  ,   122.00  ),
(   1   ,   2   ,   82.00   ,   95.00   ,   117.00  ),
(   1   ,   3   ,   82.00   ,   95.00   ,   117.00  ),
(   2   ,   4   ,   95.00   ,   108.00  ,   119.00  ),
(   2   ,   5   ,   86.00   ,   94.00   ,   115.00  ),
(   2   ,   1   ,   82.00   ,   95.00   ,   117.00  ),
(   3   ,   1   ,   92.00   ,   106.00  ,   116.00  ),
(   3   ,   4   ,   90.00   ,   100.00  ,   120.00  ),
(   3   ,   5   ,   82.00   ,   95.00   ,   117.00  ),
(   4   ,   2   ,   92.00   ,   106.00  ,   116.00  ),
(   4   ,   3   ,   90.00   ,   100.00  ,   120.00  ),
(   4   ,   1   ,   82.00   ,   95.00   ,   117.00  ),
(   5   ,   1   ,   92.00   ,   106.00  ,   116.00  ),
(   5   ,   5   ,   90.00   ,   100.00  ,   120.00  ),
(   5   ,   3   ,   82.00   ,   95.00   ,   117.00  );

Query 1:

with x as (select o.modified, od.* 
           from orderDetails od, orders o
           where o.id=od.order_id)
 , mx as (select max(modified) as modified, detail_id
          from x group by detail_id)
Select x.* from x, mx
Where x.detail_id = mx.detail_id and x.modified=mx.modified
Order by detail_id

Results:

|             modified | order_id | detail_id | base_price | buy_price | sell_price |
|----------------------|----------|-----------|------------|-----------|------------|
| 2018-01-25T00:00:00Z |        5 |         1 |         92 |       106 |        116 |
| 2018-01-20T00:00:00Z |        4 |         2 |         92 |       106 |        116 |
| 2018-01-25T00:00:00Z |        5 |         3 |         82 |        95 |        117 |
| 2018-01-15T00:00:00Z |        3 |         4 |         90 |       100 |        120 |
| 2018-01-25T00:00:00Z |        5 |         5 |         90 |       100 |        120 |

Note, it would have different output if you would have two or more orders with the same timestamp! It seems you did not even thought about this possibility - but since it is possible, it would eventually happen.

Now, Back to CTE and REPL.

As you incrementally build your query, from the first vague idea to specific rows, it would be nice to check that the output data is exactly what you expect. "Big elephant it better to be eaten by small pieces".

And here I will show you the step by step building of the query. It would be useful if you would repeat those steps in the SQL Fiddle linked above.

First of all I created and populated the tables.

Then I issued the first queries just to check I did populated them correctly.

1: select * from orders - try this and further queries in SQL fiddle (or in IBExpert, FlameRobin, etc

2: select * from orderDetails

3: Then I issued the joining query to check if my cross-tables query really gives meaningful output. It did.

select o.modified, od.* 
from orderDetails od, orders o
where o.id=od.order_id

4: Then I wondered, can I get the last timestamp for detail out of that query? To check it what I did was the following: 1) saved the aforementioned query I did and tested earlier, and 2) wrote a secondary query on top of it. It did extracting the last change date okay. Written and tested.

with x as (select o.modified, od.* 
           from orderDetails od, orders o
           where o.id=od.order_id)
Select max(modified) as modified, detail_id
  from x group by detail_id

5: And the last step was saving the test secondary query too, and writing the final, tertiary query on top of them both, giving the final filtered data


Yet more efficient solution can be using the one-run join query (the one I introduced above in Step 3. and saved as x) with adding order by detail_id, modified desc and then using Window Functions introduced in Firebird 3.

Here is the answer to a similar question using that approach - Firebird select from table distinct one field

Window Functions are not available in Firebird 2.x though.



回答2:

This addresses the first two versions of the question.

For each detail record, you want the most recent order record. As your data is laid out, this is equivalent to the biggest order_id. It is simpler to use that than the date:

select od.*
from orderdetail od
where od.order_id = (select max(od2.order_id)
                     from orderdetail od2
                     where od2.detail_id = od.detail_id
                    );


回答3:

You can try this query. get topmost row from order table based on modified date and then inner join that row with orderdetails table.

SELECT od.*, o.modified 
FROM OrderDetails od
Inner join (Select top 1 * -- get topmost row
           from [Order] 
           order by modified desc ) O on o.id = od.order_id


标签: sql firebird