TSQL Select Max

2019-05-07 07:47发布

问题:

Userid   FirstName   LastName        UserUpdate 
1        Dan         Kramer          1/1/2005  
1        Dan         Kramer          1/1/2007  
1        Dan         Kramer          1/1/2009  
2        Pamella     Slattery        1/1/2005  
2        Pam         Slattery        1/1/2006  
2        Pam         Slattery        1/1/2008  
3        Samamantha  Cohen           1/1/2008  
3        Sam         Cohen           1/1/2009  

I need to extract the latest updated for all these users, basically here's what I'm looking for:

Userid   FirstName   LastName        UserUpdate  
1        Dan         Kramer          1/1/2009     
2        Pam         Slattery        1/1/2008   
3        Sam         Cohen           1/1/2009  

Now when I run the following:

SELECT Userid, FirstName, LastName, Max(UserUpdate) AS MaxDate FROM Table GROUP BY Userid, FirstName, LastName

I still get duplicates, something like this:

Userid   FirstName   LastName        UserUpdate 
1        Dan         Kramer          1/1/2009  
2        Pamella     Slattery        1/1/2005  
2        Pam         Slattery        1/1/2008  
3        Samamantha  Cohen           1/1/2008  
3        Sam         Cohen           1/1/2009 

回答1:

try:

declare @Table table (userid int,firstname varchar(10),lastname varchar(20), userupdate datetime)
INSERT @Table VALUES (1, 'Dan'         ,'Kramer'          ,'1/1/2005')  
INSERT @Table VALUES (1, 'Dan'         ,'Kramer'          ,'1/1/2007')  
INSERT @Table VALUES (1, 'Dan'         ,'Kramer'          ,'1/1/2009')  
INSERT @Table VALUES (2, 'Pamella'     ,'Slattery'        ,'1/1/2005')  
INSERT @Table VALUES (2, 'Pam'         ,'Slattery'        ,'1/1/2006')  
INSERT @Table VALUES (2, 'Pam'         ,'Slattery'        ,'1/1/2008')  
INSERT @Table VALUES (3, 'Samamantha'  ,'Cohen'           ,'1/1/2008')
INSERT @Table VALUES (3, 'Sam'         ,'Cohen'           ,'1/1/2009') 

SELECT
    dt.Userid,dt.MaxDate
        ,MIN(a.FirstName) AS FirstName, MIN(a.LastName) AS LastName
    FROM (SELECT 
              Userid, Max(UserUpdate) AS MaxDate 
              FROM @Table GROUP BY Userid
         ) dt
        INNER JOIN @Table a ON dt.Userid=a.Userid and dt.MaxDate =a.UserUpdate
    GROUP BY dt.Userid,dt.MaxDate

OUTPUT:

Userid      MaxDate                 FirstName  LastName
----------- ----------------------- ---------- --------------------
1           2009-01-01 00:00:00.000 Dan        Kramer
2           2008-01-01 00:00:00.000 Pam        Slattery
3           2009-01-01 00:00:00.000 Sam        Cohen


回答2:

You aren't getting duplicates. 'Pam' is not equal to 'Pamella' from the perspective of the database; the fact that one is a colloquial shortening of the other doesn't mean anything to the database engine. There really is no reliable, universal way to do this (since there are names that have multiple abbreviations, like "Rob" or "Bob" for "Robert", as well as abbreviations that can suit multiple names like "Kel" for "Kelly" or "Kelsie", let alone the fact that names can have alternate spellings).

For your simple example, you could simply select and group by SUBSTRING(FirstName, 1, 3) instead of FirstName, but that's just a coincidence based upon your sample data; other name abbreviations would not fit this pattern.



回答3:

Or use a subquery...

SELECT
   a.userID,
   a.FirstName,
   a.LastName,
   b.MaxDate
FROM
      myTable a
   INNER JOIN
      (   SELECT
             UserID,
             Max(ISNULL(UserUpdate,GETDATE())) as MaxDate
          FROM
             myTable
          GROUP BY
             UserID
      ) b
   ON
          a.UserID = b.UserID
      AND a.UserUpdate = b.MaxDate

The subquery (named "b") returns the following:

Userid   UserUpdate  
1        1/1/2009     
2        1/1/2008   
3        1/1/2009 

The INNER JOIN between the subquery and the original table causes the original table to be filtered for matching records only -- i.e., only records with a UserID/UserUpdate pair that matches a UserID/MaxDate pair from the subquery will be returned, giving you the unduplicated result set you were looking for:

Userid   FirstName   LastName        UserUpdate  
1        Dan         Kramer          1/1/2009     
2        Pam         Slattery        1/1/2008   
3        Sam         Cohen           1/1/2009  

Of course, this is just a work-around. If you really want to solve the problem for the long-term, you should normalize your original table by splitting it into two.

Table1:

Userid   FirstName   LastName 
1        Dan         Kramer   
2        Pam         Slattery 
3        Sam         Cohen

Table2:

Userid   UserUpdate  
1        1/1/2007     
2        1/1/2007   
3        1/1/2007  
1        1/1/2008     
2        1/1/2008   
3        1/1/2008 
1        1/1/2009     
2        1/1/2009   
3        1/1/2009 

This would be a more standard way to store data, and would be much easier to query (without having to resort to a subquery). In that case, the query would look like this:

SELECT
   T1.UserID,
   T1.FirstName,
   T1.LastName,
   MAX(ISNULL(T2.UserUpdate,GETDATE()))
FROM
      Table1 T1
   LEFT JOIN
      Table2 T2
   ON
      T1.UserID = T2.UserID
GROUP BY
   T1.UserID,
   T1.FirstName,
   T1.LastName


回答4:

Another alternative if you have SQL 2005(I think ?) or later would be to use a Common Table Expression and pull out the user id and max date from the table then join against that to get the matching firstname and lastname on the max date. NOTE - this assumes that userid + date would always be unique, the query will break if you get 2 rows with same userid and date. As others have already pointed out this is pretty awful database design - but sometimes thats life, the problem must still be solved. e.g.

declare @Table table (userid int,firstname varchar(10),lastname varchar(20), userupdate datetime) 
INSERT @Table VALUES (1, 'Dan'         ,'Kramer'          ,'1/1/2005')   
INSERT @Table VALUES (1, 'Dan'         ,'Kramer'          ,'1/1/2007')   
INSERT @Table VALUES (1, 'Dan'         ,'Kramer'          ,'1/1/2009')   
INSERT @Table VALUES (2, 'Pamella'     ,'Slattery'        ,'1/1/2005')  
INSERT @Table VALUES (2, 'Pam'         ,'Slattery'        ,'1/1/2006')   
INSERT @Table VALUES (2, 'Pam'         ,'Slattery'        ,'1/1/2008')   
INSERT @Table VALUES (3, 'Samamantha'  ,'Cohen'           ,'1/1/2008') 
INSERT @Table VALUES (3, 'Sam'         ,'Cohen'           ,'1/1/2009'); 



with cte ( userid , maxdt ) as 
    (select userid, 
            max(userupdate) 
    from @table 
    group by userid)


SELECT  dt.Userid,
        dt.firstname,
        dt.lastname,
        cte.maxdt

FROM    
    @Table dt
    join cte on cte.userid = dt.userid and dt.userupdate = cte.maxdt

Output

Userid      firstname  lastname             maxdt
----------- ---------- -------------------- -----------------------
3           Sam        Cohen                2009-01-01 00:00:00.000
2           Pam        Slattery             2008-01-01 00:00:00.000
1           Dan        Kramer               2009-01-01 00:00:00.000


标签: tsql max