Removing duplicates from a SQL query (not just “us

2020-02-10 17:37发布

It's probably simple, here is my query:

SELECT DISTINCT U.NAME, P.PIC_ID
FROM USERS U, PICTURES P, POSTINGS P1
WHERE U.EMAIL_ID = P1.EMAIL_ID AND P1.PIC_ID = P.PIC_ID AND P.CAPTION LIKE '%car%';

but this will only remove duplicates where a row has both the same u.name and p.pic_id. I want it so if there is any duplicates of the names, it just leaves out the other rows. It's a weird query, but in general, how can I apply the distinct to a single column of the SELECT clause?

5条回答
Root(大扎)
2楼-- · 2020-02-10 18:17

Arbitrarily choosing to keep the minimum PIC_ID. Also, avoid using the implicit join syntax.

SELECT U.NAME, MIN(P.PIC_ID)
    FROM USERS U
        INNER JOIN POSTINGS P1
            ON U.EMAIL_ID = P1.EMAIL_ID
        INNER JOIN PICTURES P
            ON P1.PIC_ID = P.PIC_ID
    WHERE P.CAPTION LIKE '%car%'
    GROUP BY U.NAME;
查看更多
劳资没心,怎么记你
3楼-- · 2020-02-10 18:18

If I understand you correctly, you want to list to exclude duplicates on one column only, inner join to a sub-select

select u.* [whatever joined values]
from users u
inner join
(select name from users group by name having count(*)=1) uniquenames
on uniquenames.name = u.name
查看更多
对你真心纯属浪费
4楼-- · 2020-02-10 18:22

You need to tell the query what value to pick for the other columns, MIN or MAX seem like suitable choices.

 SELECT
   U.NAME, MIN(P.PIC_ID)
 FROM
   USERS U,
   PICTURES P,
   POSTINGS P1
 WHERE
   U.EMAIL_ID = P1.EMAIL_ID AND
   P1.PIC_ID = P.PIC_ID AND
   P.CAPTION LIKE '%car%'
 GROUP BY
   U.NAME;
查看更多
叛逆
5楼-- · 2020-02-10 18:24

If I understand you correctly, you want a list of all pictures with the same name (and their different ids) such that their name occurs more than once in the table. I think this will do the trick:

SELECT U.NAME, P.PIC_ID
FROM USERS U, PICTURES P, POSTINGS P1
WHERE U.EMAIL_ID = P1.EMAIL_ID AND P1.PIC_ID = P.PIC_ID AND U.Name IN (
SELECT U.Name 
FROM USERS U, PICTURES P, POSTINGS P1
WHERE U.EMAIL_ID = P1.EMAIL_ID AND P1.PIC_ID = P.PIC_ID AND P.CAPTION LIKE '%car%';
GROUP BY U.Name HAVING COUNT(U.Name) > 1)

I haven't executed it, so there may be a syntax error or two there.

查看更多
Animai°情兽
6楼-- · 2020-02-10 18:30

Your question is kind of confusing; do you want to show only one row per user, or do you want to show a row per picture but suppress repeating values in the U.NAME field? I think you want the second; if not there are plenty of answers for the first.

Whether to display repeating values is display logic, which SQL wasn't really designed for. You can use a cursor in a loop to process the results row-by-row, but you will lose a lot of performance. If you have a "smart" frontend language like a .NET language or Java, whatever construction you put this data into can be cheaply manipulated to suppress repeating values before finally displaying it in the UI.

If you're using Microsoft SQL Server, and the transformation HAS to be done at the data layer, you may consider using a CTE (Computed Table Expression) to hold the initial query, then select values from each row of the CTE based on whether the columns in the previous row hold the same data. It'll be more performant than the cursor, but it'll be kinda messy either way. Observe:

USING CTE (Row, Name, PicID)
AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY U.NAME, P.PIC_ID),
       U.NAME, P.PIC_ID
    FROM USERS U
        INNER JOIN POSTINGS P1
            ON U.EMAIL_ID = P1.EMAIL_ID
        INNER JOIN PICTURES P
            ON P1.PIC_ID = P.PIC_ID
    WHERE P.CAPTION LIKE '%car%'
    ORDER BY U.NAME, P.PIC_ID 
)
SELECT
    CASE WHEN current.Name == previous.Name THEN '' ELSE current.Name END,
    current.PicID
FROM CTE current
LEFT OUTER JOIN CTE previous
   ON current.Row = previous.Row + 1
ORDER BY current.Row

The above sample is TSQL-specific; it is not guaranteed to work in any other DBPL like PL/SQL, but I think most of the enterprise-level SQL engines have something similar.

查看更多
登录 后发表回答