What's the most efficient way to generate this

2019-05-26 15:38发布

问题:

Given a table (daily_sales) with say 100k rows of the following data/columns:

id    rep   sales  date  
1     a     123    12/15/2011  
2     b     153    12/15/2011  
3     a      11    12/14/2011  
4     a     300    12/13/2011  
5     a     120    12/12/2011  
6     b     161    11/15/2011  
7     a       3    11/14/2011  
8     c      13    11/14/2011  
9     c      44    11/13/2011  

What would be the most efficient way to write a report (completely in SQL) showing the two most recent entries (rep, sales, date) for each name, so the output would be:

a  123  12/15/2011  
a   11  12/14/2011  
b  153  12/15/2011  
b  161  11/15/2011  
c   13  11/14/2011  
c   44  11/13/2011  

Thanks!

回答1:

For MySQL, explained in @Quassnoi's blog, an index on (name, date) and using this:

SELECT  t.*
FROM    (
        SELECT  name,
                COALESCE(
                (
                SELECT  date
                FROM    tableX ti
                WHERE   ti.name = dto.name
                ORDER BY
                        ti.name, ti.date DESC
                LIMIT 1 
                OFFSET 1                      --- this is set to 2-1 
                ), CAST('1000-01-01' AS DATE)) AS mdate
        FROM    (
                SELECT  DISTINCT name
                FROM    tableX dt
                ) dto
        ) tg
        , tableX t
WHERE   t.name >= tg.name
  AND   t.name <= tg.name
  AND   t.date >= tg.mdate


回答2:

FYI, your example is using mostly reserved words and makes it horrid for us to attempt to program against. If you've got the real table columns, gives those to us. This is postgres:

select name,value, max(date)
from the_table_name_you_neglect_to_give_us
group by 1,2

That'll give you a list of first name,value,max(date)...though I gotta ask why give us a column called value if it doesn't change in the example?

Lets say you do have an id column...we'll be consistent with your scheme and call it 'ID'...

 select b.id from
(select name,value, max(date) date
from the_table_name_you_neglect_to_give_us
group by 1,2) a
inner join the_table_name_you_neglect_to_give_us b on a.name=b.name and a.value=b.value and a.date = b.date

This gives a list of all ID's that are the max...put it together:

select name,value, max(date)
from the_table_name_you_neglect_to_give_us
group by 1,2

union all

select name,value, max(date)
from the_table_name_you_neglect_to_give_us
where id not in
(select b.id from
(select name,value, max(date) date
from the_table_name_you_neglect_to_give_us
group by 1,2) a
inner join the_table_name_you_neglect_to_give_us b on a.name=b.name and a.value=b.value      and a.date = b.date)

Hoping my syntax is right...should be close at any rate. I'd put a bracket around that entire thing then select * from (above query) order by name...gives you the order you want.



回答3:

If I understand what you mean.. Then this MIGHT be helpful:

SELECT main.name, main.value, main.date 
FROM tablename AS main
LEFT OUTER JOIN tablename AS ctr
    ON main.name = ctr.rname
    AND main.date <= ctr.rdate
GROUP BY main.name, main.date
HAVING COUNT(*) <= 2
ORDER BY main.name ASC, main.date DESC

I know the SQL is shorter than the other posts, but just give it a try first..