Filtering Database Results to Top n Records for Ea

2019-08-07 06:56发布

问题:

Let's say I have two tables in my database.

TABLE:Categories
ID|CategoryName
01|CategoryA
02|CategoryB
03|CategoryC

and a table that references the Categories and also has a column storing some random number.

TABLE:CategoriesAndNumbers
CategoryType|Number
   CategoryA|24
   CategoryA|22
   CategoryC|105
   .....(20,000 records)
   CategoryB|3

Now, how do I filter out this data? So, I want to know what the 3 smallest numbers are out of each category and delete the rest. The end result would be like this:

TABLE:CategoriesAndNumbers
CategoryType|Number
   CategoryA|2
   CategoryA|5
   CategoryA|18
   CategoryB|3
   CategoryB|500
   CategoryB|1601
   CategoryC|1
   CategoryC|4
   CategoryC|62

Right now, I can get the smallest numbers between all the categories, but I would like each category to be compared individually.

EDIT: I'm using Access and here's my code so far

SELECT TOP 10 cdt1.sourceCounty, cdt1.destCounty, cdt1.distMiles
FROM countyDistanceTable as cdt1, countyTable
WHERE cdt1.sourceCounty = countyTable.countyID
ORDER BY cdt1.sourceCounty, cdt1.distMiles, cdt1.destCounty

EDIT2: Thanks to Remou, here would be the working query that solved my problem. Thank you!

DELETE
FROM CategoriesAndNumbers a
WHERE a.Number NOT IN (
    SELECT Top 3 [Number] 
    FROM CategoriesAndNumbers b 
    WHERE b.CategoryType=a.CategoryType 
    ORDER BY [Number])

回答1:

You could use something like:

SELECT a.CategoryType, a.Number
FROM CategoriesAndNumbers a
WHERE a.Number IN (
    SELECT Top 3 [Number] 
    FROM CategoriesAndNumbers b 
    WHERE b.CategoryType=a.CategoryType 
    ORDER BY [Number])
ORDER BY a.CategoryType

The difficulty with this is that Jet/ACE Top selects duplicate values where they exist, so you will not necessarily get three values, but more, if there are ties. The problem can often be solved with a key field, if one exists :

WHERE a.Number IN (
    SELECT Top 3 [Number] 
    FROM CategoriesAndNumbers b 
    WHERE b.CategoryType=a.CategoryType 
    ORDER BY [Number], [KeyField])

However, I do not think it will help in this instance, because the outer table will include ties.



回答2:

Order it by number and take 3, find out what the biggest number is and then remove rows where Number is greater than the Number.

I imagine it would need to be two seperate queries as your business tier would hold the value for the biggest number out of the 3 results and dynamically build the query to delete the rest.