Microsoft SQL - Remove duplicate data from query r

2019-08-26 22:00发布

问题:

I am new to SQL Server and need help with one of my SQL query.

I have 2 tables (Rating and LikeDislike).

I am trying to get data from both of these tables using a LEFT JOIN like this:

SELECT distinct LD.TopicID, R.ID, R.Topic, R.CountLikes, R.CountDisLikes, LD.UserName, LD.Clikes

FROM Rating As R

LEFT JOIN LikeDislike AS LD on LD.TopicID = R.ID

The above SELECT statement displays results fine but also includes duplicates. I want to remove duplicates when the data is displayed, I tried using DISTINCT and GROUP BY, but with no luck, maybe because I am not using it correctly.

To be more clear and less confusing let me tell you what exactly each table does and what I am trying to achieve.

The Rating table has following columns (ID, Topic, CountLikes, CountDisLikes, Extra, CreatedByUser). It stores topic information and number of likes and dislikes for each topics and the UserID of the user who created that topic.

Rating table with sample data

ID  Topic               CountLikes  CountDisLikes  Extra        CreatedByUser
1   Do You Like This       211          58          YesId                2
2   Or This                17           25          This also            3
79  Testing at home        1             0          Testing at home      2
80  Testing at home again  1             0          Testing              2
82  testing dislikes       0             1          Testing              2
76  Testing part 3         7             5          Testing 3            4
77  Testing part 4         16            6          Testing 4            5

The LikeDisLike table has following columns (ID, TopicID, UserName, Clikes). TopicID is a FK to the ID column in Rating table.

LikeDislike table with sample data

ID  TopicID UserName    Clikes
213     77      2       TRUE
214     76      2       FALSE
215     77      5       TRUE
194     77      3       TRUE
195     76      3       FALSE
196     2       3       TRUE
197     1       3       FALSE

Now what I am trying to do is get information from both of this table without duplicate rows. I need to get data all the columns from Rating table + UserName and Clikes columns from LikeDislike table without any duplicate rows

Below are the results with duplicates

TopicID ID  Topic            CountLikes   CountDislikes UserName    Clikes
NULL    79  Testing at home    1           0             NULL       NULL
NULL    80  Testing at home2   1           0             NULL       NULL 
NULL    82  testing dislikes   0           1             NULL       NULL
1       1   Do You Like This   211         58            3          FALSE
2       2   Or This            17          25            3          TRUE
76      76  Testing part 3     7           5             2          FALSE
76      76  Testing part 3     7           5             3          FALSE
77      77  Testing part 4     16          6             2          TRUE
77      77  Testing part 4     16          6             3          TRUE
77      77  Testing part 4     16          6             5          TRUE

回答1:

Just like in yesterday's post, I don't think you understand what DISTINCT is suppose to return you. Because you have different values in your LikeDislike table, you are returning the DISTINCT rows.

Let's take TopicId 77 for instance. It returns 3 DISTINCT rows because you have 3 matching records in your LikeDislike table. If your desired output is a single row where the UserName and Clikes are comma delimted, that is possible -- look into using for xml and perhaps stuff (here is a recent answer on the subject). Or if you want to return the first row that matches the TopicId, then that is possible as well -- look into using a subquery with row_number.

Please let us know your desired output and we can help provide a solution.

Good luck.



标签: sql tsql