I need to select only the first row from a query that joins tables A and B. On table B exist multiple records with same name. There are not identifiers in any of the two tables. I cannot change the scheme either because I do not own the DB.
TABLE A
NAME
TABLE B
NAME
DATA1
DATA2
Select Distinct A.NAME,B.DATA1,B.DATA2
From A
Inner Join B on A.NAME = B.NAME
This gives me
NAME DATA1 DATA2
sameName 1 2
sameName 1 3
otherName 5 7
otherName 8 9
but I need to retrieve only one row per name
NAME DATA1 DATA2
sameName 1 2
otherName 5 7
I was able to do this by adding the result into a temp table with a identity column and then select the minimum id per name.
The problem here is that I require to do this in one single statement.
If you can add to a temp table and then query from that, you can do it in one go.
There are many other ways to write this, but that's how I've been doing similar things.
The tag of this question indicates that it would be a solution for DB2, but this is very similar to MS-SQL server, if so try these solutions:
Using CROSS, it will be possible to display what exists only in both tables
But it is possible to change to OUTER to display what exists in A without the obligation to exist in B
In the structure of the apply statement, it would also be possible to include an ORDER statement, since there is no indication of the order of exits in table B
You can use row number to get one row for each name, try something like below
Try to dedupe B like this
Add more OR clauses if more DATAx columns exist.
If A contains duplicates too, simply use DISTINCT as in the OP.
This will work:
If you want to select the least value of data1 and within it data2, then use this variation:
Both the queries will give one row per name.
Not sure if this will solve your problem or not, but you could try using the GROUP BY clause and group by one of the name columns.
DB2 Group by tutorial