I've got an SQLite database with a bunch of test/exam questions. Each question belongs to one question category.
My table looks like this:
The goal
What I'm trying to do is select 5 random questions, but the result must contain at least one question from each category.
For example, the output could be question IDs `1, 2, 5, 7, 8`, or `2, 3, 6, 7, 8`, or `8, 6, 3, 1, 7`.
ORDER BY category_id, RANDOM()
I can get a random list of questions from SQLite by executing the SQL below, but how would I make sure that the result contains a question from each of my categories?
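For reference, a plain random pick might look like the sketch below. The SQL is wrapped in Python's `sqlite3` module only so that the example is self-contained; the two-column `so_questions` schema and the eight sample rows are assumptions, not the actual table from the question. It returns five distinct questions but makes no promise about which categories they come from.

```python
import sqlite3

# Hypothetical sample data as (id, category_id): 8 questions in 4 categories.
ROWS = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2), (7, 3), (8, 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE so_questions (id INTEGER PRIMARY KEY, category_id INTEGER)")
conn.executemany("INSERT INTO so_questions VALUES (?, ?)", ROWS)


def random_questions(conn, n=5):
    """Return n random (id, category_id) rows; categories may be missing."""
    return conn.execute(
        "SELECT id, category_id FROM so_questions ORDER BY RANDOM() LIMIT ?", (n,)
    ).fetchall()


sample = random_questions(conn)
print(sample)
```

With this sample data, many of the possible results leave out category 3 or 4, which is exactly the problem described here.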
Basically, I'm looking for something like this, but for SQLite.
I would like to get only 5 results, with one (or more) from each category, so that all the categories are represented in the result set.
Bounty
Added a bounty because I'm curious whether or not it is possible to accomplish this in SQLite only. I can do it in SQLite+Java, but is there a way to do this in SQLite only? :)
The key to the answer is that there are two kinds of questions in the result: for each category, one question that must be constrained to come from that category; and some remaining questions.
First, the constrained questions: we just select one record from each category:
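A minimal sketch of what such a query could look like, run through Python's `sqlite3` module so it is self-contained. The `so_questions` schema, the sample rows, and the helper column names `constrained` and `r` are assumptions for illustration, not necessarily the exact query from this answer.

```python
import sqlite3

# Hypothetical sample data as (id, category_id): 8 questions in 4 categories.
ROWS = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2), (7, 3), (8, 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE so_questions (id INTEGER PRIMARY KEY, category_id INTEGER)")
conn.executemany("INSERT INTO so_questions VALUES (?, ?)", ROWS)

# One row per category. random() is evaluated for every row, and id is taken
# from the row in each group that happens to hold the largest random value.
CONSTRAINED_SQL = """
SELECT id, category_id, 1 AS constrained, max(random()) AS r
FROM so_questions
GROUP BY category_id
"""

constrained = conn.execute(CONSTRAINED_SQL).fetchall()
for qid, cat, flag, r in constrained:
    print(qid, cat, flag)
```

Each run should print four rows, one per category, with a randomly chosen `id` inside each category.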
(This query relies on a feature introduced in SQLite 3.7.11 (in Jelly Bean or later): in a query `SELECT a, max(b)`, the value of `a` is guaranteed to come from the record that has the maximum `b` value.)

We also have to get the non-constrained questions (filtering out the duplicates that are already in the constrained set will happen in the next step):
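As a sketch under the same assumptions (hypothetical `so_questions` schema and sample rows, helper columns `constrained` and `r`), the non-constrained part could simply tag every question with `0` and a random sort key:

```python
import sqlite3

# Hypothetical sample data as (id, category_id): 8 questions in 4 categories.
ROWS = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2), (7, 3), (8, 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE so_questions (id INTEGER PRIMARY KEY, category_id INTEGER)")
conn.executemany("INSERT INTO so_questions VALUES (?, ?)", ROWS)

# Every question appears once, flagged as non-constrained, with its own
# random value r that can later serve as the sort key.
UNCONSTRAINED_SQL = """
SELECT id, category_id, 0 AS constrained, random() AS r
FROM so_questions
"""

unconstrained = conn.execute(UNCONSTRAINED_SQL).fetchall()
print(len(unconstrained), "candidate rows")
```

This set deliberately still contains the questions that were also picked as constrained ones.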
When we combine these two queries with `UNION` and then group by the `id`, we have all the duplicates together. Selecting `max(constrained)` then ensures that for the groups that have duplicates, only the constrained question remains (while all the other questions have only one record per group anyway).

Finally, the `ORDER BY` clause ensures that the constrained questions come first, followed by some random other questions.

For earlier SQLite/Android versions, I haven't found a solution without using a temporary table (because the subquery for the constrained question must be used multiple times, but does not stay constant because of the `random()`).
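Putting the steps above together, here is a hedged sketch of both variants, again wrapped in Python's `sqlite3` module. The `so_questions` schema, the sample rows, the temporary table name `constrained_ids`, and the function names are all assumptions for illustration. The sketch uses `UNION ALL`, since the differing `constrained` and `r` values mean that no rows would be merged by a plain `UNION` anyway.

```python
import sqlite3


def make_db():
    """Build a hypothetical so_questions table: 8 questions in 4 categories."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE so_questions (id INTEGER PRIMARY KEY, category_id INTEGER)")
    conn.executemany(
        "INSERT INTO so_questions VALUES (?, ?)",
        [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2), (7, 3), (8, 4)],
    )
    return conn


def pick_single_query(conn, n=5):
    """Single-query variant; relies on the max() bare-column behaviour."""
    rows = conn.execute(
        """
        SELECT id, category_id, max(constrained) AS constrained, r
        FROM (SELECT id, category_id, 1 AS constrained, max(random()) AS r
              FROM so_questions
              GROUP BY category_id
              UNION ALL
              SELECT id, category_id, 0 AS constrained, random() AS r
              FROM so_questions)
        GROUP BY id
        ORDER BY constrained DESC, r
        LIMIT ?
        """,
        (n,),
    ).fetchall()
    return [(row[0], row[1]) for row in rows]


def pick_with_temp_table(conn, n=5):
    """Temporary-table variant; the random choice is stored so it stays fixed."""
    conn.execute("DROP TABLE IF EXISTS temp.constrained_ids")
    conn.execute(
        """
        CREATE TEMPORARY TABLE constrained_ids AS
        SELECT (SELECT id FROM so_questions
                WHERE category_id = cats.category_id
                ORDER BY random() LIMIT 1) AS id
        FROM (SELECT DISTINCT category_id FROM so_questions) AS cats
        """
    )
    # The stored ids are referenced twice: once to pick, once to exclude.
    return conn.execute(
        """
        SELECT id, category_id
        FROM (SELECT id, category_id, 1 AS c FROM so_questions
              WHERE id IN (SELECT id FROM constrained_ids)
              UNION ALL
              SELECT id, category_id, 0 AS c FROM so_questions
              WHERE id NOT IN (SELECT id FROM constrained_ids))
        ORDER BY c DESC, random()
        LIMIT ?
        """,
        (n,),
    ).fetchall()


conn = make_db()
print(pick_single_query(conn))
print(pick_with_temp_table(conn))
```

Both functions return `(id, category_id)` pairs with the per-category picks first. Note that `n` must be at least the number of categories, otherwise the `LIMIT` cuts off some of the constrained rows.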
Since it's SQLite (thus local): how slow would it be to just query until you have 5 answers and four different categories, dropping the duplicate-category rows each iteration?
I think, if each category is equally represented, it would be highly unlikely that you need more than 3 iterations, which should still take less than a second.
It's not algorithmically nice, but to me using `random()` in a SQL statement isn't algorithmically nice anyway.
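A rough sketch of the retry idea, in Python with the stdlib `sqlite3` module. This is a simplified variant that redraws the whole sample until every category is covered, rather than keeping partial results between iterations; the schema, the sample rows, and the `max_tries` cap are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE so_questions (id INTEGER PRIMARY KEY, category_id INTEGER)")
# Hypothetical sample data as (id, category_id): 8 questions in 4 categories.
conn.executemany(
    "INSERT INTO so_questions VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2), (7, 3), (8, 4)],
)


def pick_by_retry(conn, n=5, max_tries=1000):
    """Redraw n random rows until every category is represented."""
    n_categories = conn.execute(
        "SELECT COUNT(DISTINCT category_id) FROM so_questions"
    ).fetchone()[0]
    for _ in range(max_tries):
        rows = conn.execute(
            "SELECT id, category_id FROM so_questions ORDER BY random() LIMIT ?", (n,)
        ).fetchall()
        if len({cat for _, cat in rows}) == n_categories:
            return rows
    raise RuntimeError("no sample covering all categories was found")


print(pick_by_retry(conn))
```

How many redraws are needed depends heavily on how the questions are spread over the categories; with very small categories, as in this sample data, it can take noticeably more than a few attempts.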
Basically, what you are looking for is a "select top N max values" query. I spent 3-4 hours this morning searching for it (I still haven't succeeded, so you may need to wait a few more hours).
As a temporary solution, you can use `GROUP BY` as follows:
String strQuery = "SELECT * FROM so_questions GROUP BY category_id;";
The output is as follows:
I will be back with a solution for your exact requirement.