Transposing an sql result so that one column goes

2020-01-27 07:55发布

I'm trying to get data out of a table for a survey in a particular format. However all my attempts seems to hand the DB because of too many joins/too heavy on the DB.

My data looks like this:

id, user, question_id, answer_id, 
1,   1,   1,           1
3,   1,   3,           15
4,   2,   1,           2
5,   2,   2,           12
6,   2,   3,           20

There are roughly 250,000 rows and each user has about 30 rows. I want the result to look like:

user0, q1, q2,   q3 
1,     1,  NULL, 15
2,     2,  12,   20 

So that each user has one row in the result, each with a separate column for each answer.

I'm using Postgres but answers in any SQL language would be appreciated as I could translate to Postgres.

EDIT: I also need to be able to deal with users not answering questions, i.e. in the example above q2 for user 1.

3条回答
趁早两清
2楼-- · 2020-01-27 08:20

Consider the following demo:

CREATE TEMP TABLE qa (id int, usr int, question_id int, answer_id int);
INSERT INTO qa VALUES
 (1,1,1,1)
,(2,1,2,9)
,(3,1,3,15)
,(4,2,1,2)
,(5,2,2,12)
,(6,2,3,20);

SELECT *
FROM   crosstab('
    SELECT usr::text
          ,question_id
          ,answer_id
    FROM qa
    ORDER BY 1,2')
 AS ct (
     usr text
    ,q1 int
    ,q2 int
    ,q3 int);

Result:

 usr | q1 | q2 | q3
-----+----+----+----
 1   |  1 |  9 | 15
 2   |  2 | 12 | 20
(2 rows)

user is a reserved word. Don't use it as column name! I renamed it to usr.

You need to install the additional module tablefunc which provides the function crosstab(). Note that this operation is strictly per database. In PostgreSQL 9.1 you can simply:

CREATE EXTENSION tablefunc;

For older version you would execute a shell-script supplied in your contrib directory. In Debian, for PostgreSQL 8.4, that would be:

psql mydb -f /usr/share/postgresql/8.4/contrib/tablefunc.sql
查看更多
萌系小妹纸
3楼-- · 2020-01-27 08:20

Erwins answer is good, until missing answer for a user shows up. I'm going to make an assumption on you....you have a users table that has one row per user and you have a questions table that has one row per questions.

select usr, question_id
from users u inner join questions q on 1=1
order by 1,

This statement will create a row for every user/question, and be in the same order. Turn it into a subquery and left join it to your data...

select usr,question_id,qa.answer_id
from
(select usr, question_id
from users u inner join questions q on 1=1
)a
left join qa on qa.usr = a.usr and qa.question_id = a.usr
order by 1,2

Plug that into Erwins crosstab statement and give him credit for the answer :P

查看更多
太酷不给撩
4楼-- · 2020-01-27 08:38

I implemented a truly dynamic function to handle this problem without having to hard code any specific number of questions or use external modules/extensions. It also much simpler to use than crosstab().

You can find it here: https://github.com/jumpstarter-io/colpivot

Example that solves this particular problem:

begin;

create temp table qa (id int, usr int, question_id int, answer_id int);
insert into qa values
 (1,1,1,1)
,(2,1,2,9)
,(3,1,3,15)
,(4,2,1,2)
,(5,2,2,12)
,(6,2,3,20);

select colpivot('_output', $$
    select usr, ('q' || question_id::text) question_id, answer_id from qa
$$, array['usr'], array['question_id'], '#.answer_id', null);

select * from _output;

rollback;

Result:

 usr | 'q1' | 'q2' | 'q3' 
-----+------+------+------
   1 |    1 |    9 |   15
   2 |    2 |   12 |   20
(2 rows)
查看更多
登录 后发表回答