I'm trying to get data out of a table for a survey in a particular format. However all my attempts seems to hand the DB because of too many joins/too heavy on the DB.
My data looks like this:
id, user, question_id, answer_id,
1, 1, 1, 1
3, 1, 3, 15
4, 2, 1, 2
5, 2, 2, 12
6, 2, 3, 20
There are roughly 250,000 rows and each user has about 30 rows. I want the result to look like:
user0, q1, q2, q3
1, 1, NULL, 15
2, 2, 12, 20
So that each user has one row in the result, each with a separate column for each answer.
I'm using Postgres but answers in any SQL language would be appreciated as I could translate to Postgres.
EDIT: I also need to be able to deal with users not answering questions, i.e. in the example above q2 for user 1.
Consider the following demo:
Result:
user
is a reserved word. Don't use it as column name! I renamed it tousr
.You need to install the additional module tablefunc which provides the function
crosstab()
. Note that this operation is strictly per database. In PostgreSQL 9.1 you can simply:For older version you would execute a shell-script supplied in your
contrib
directory. In Debian, for PostgreSQL 8.4, that would be:Erwins answer is good, until missing answer for a user shows up. I'm going to make an assumption on you....you have a users table that has one row per user and you have a questions table that has one row per questions.
This statement will create a row for every user/question, and be in the same order. Turn it into a subquery and left join it to your data...
Plug that into Erwins crosstab statement and give him credit for the answer :P
I implemented a truly dynamic function to handle this problem without having to hard code any specific number of questions or use external modules/extensions. It also much simpler to use than
crosstab()
.You can find it here: https://github.com/jumpstarter-io/colpivot
Example that solves this particular problem:
Result: