Using CASE in PostgreSQL to affect multiple column

2019-03-12 01:43发布

问题:

I have a Postgres SELECT statement with these expressions:

,CASE WHEN (rtp.team_id = rtp.sub_team_id)
 THEN 'testing'
 ELSE TRIM(rtd2.team_name)
 END AS testing_testing
,CASE WHEN (rtp.team_id = rtp.sub_team_id)
 THEN 'test example'
 ELSE TRIM(rtd2.normal_data)
 END AS test_response
,CASE WHEN (rtp.team_id = rtp.sub_team_id)
 THEN 'test example #2'
 ELSE TRIM(rtd2.normal_data_2)
 END AS another_example

In my particular query there are 5 fields whose output depends on whether rtp.team_id = rtp.sub_team_id evaluates true. I'm repeating CASE statements with the same condition over and over.

Is there any way I can combine these CASE expressions to toggle the output of multiple columns in one shot?

回答1:

1. Standard-SQL: LEFT JOIN a single row of values

You could LEFT JOIN a row of values using the condition (thereby evaluating it once). Then you can add fallback values per column with COALESCE().

This syntax variant is shorter and slightly faster with multiple values - especially interesting for an expensive / lengthy condition:

SELECT COALESCE(x.txt1, trim(r2.team_name))     AS testing_testing
     , COALESCE(x.txt2, trim(r2.normal_data))   AS test_response
     , COALESCE(x.txt3, trim(r2.normal_data_2)) AS another_example
FROM   rtp
JOIN   rtd2 r2 ON <unknown condition> -- missing context in question
LEFT   JOIN (
   SELECT 'testing'::text         AS txt1
        , 'test example'::text    AS txt2
        , 'test example #2'::text AS txt3
   ) x ON rtp.team_id = rtp.sub_team_id;

Since the derived table x consists of a single row, joining without further conditions is fine.

Explicit type casts are necessary in the subquery. I use text in the example (which is the default for string literals anyway). Use your actual data types. The syntax shortcut value::type is Postgres-specific, use cast(value AS type) for standard SQL.

If the condition is not TRUE, all values in x are NULL, and COALESCE kicks in.

Or, since all candidate values come from table rtd2 in your particular case, LEFT JOIN to rtd2 using the original CASE condition and CROSS JOIN to a row with default values:

SELECT COALESCE(trim(r2.team_name),     x.txt1) AS testing_testing
     , COALESCE(trim(r2.normal_data),   x.txt2) AS test_response
     , COALESCE(trim(r2.normal_data_2), x.txt3) AS another_example
FROM   rtp
LEFT   JOIN rtd2 r2 ON <unknown condition>  -- missing context in question
                   AND rtp.team_id = rtp.sub_team_id
CROSS  JOIN (
   SELECT 'testing'::text         AS txt1
        , 'test example'::text    AS txt2
        , 'test example #2'::text AS txt3
   ) x;

It depends on the join conditions and the rest of the query.

2. PostgreSQL-specific

2a. Expand an array

If your various columns share the same data type, you can use an array in a subquery and expand it in the outer SELECT:

SELECT x.combo[1], x.combo[2], x.combo[3]
FROM  (
   SELECT CASE WHEN rtp.team_id = rtp.sub_team_id
            THEN '{test1,test2,test3}'::text[]
            ELSE ARRAY[trim(r2.team_name)
                     , trim(r2.normal_data)
                     , trim(r2.normal_data_2)]
          END AS combo
   FROM   rtp
   JOIN   rtd2 r2 ON <unknown condition>
   ) x;

It gets more complicated if the columns don't share the same data type. You can either cast them all to text (and optionally convert back in the outer SELECT), or you can ...

2b. Decompose a row type

You can use a custom composite type (row type) to hold values of various types and simply *-expand it in the outer SELECT. Say we have three columns: text, integer and date. For repeated use, create a custom composite type:

CREATE TYPE my_type (t1 text, t2 int, t3 date);

Or if the type of an existing table matches, you can just use the table name as composite type.

Or if you only need the type temporarily, you can create a TEMPORARY TABLE, which registers a temporary type for the duration of your session:

CREATE TEMP TABLE my_type (t1 text, t2 int, t3 date);

You could even do this for a single transaction:

CREATE TEMP TABLE my_type (t1 text, t2 int, t3 date) ON COMMIT DROP;

Then you can use this query:

SELECT (x.combo).*  -- parenthesis required
FROM  (
   SELECT CASE WHEN rtp.team_id = rtp.sub_team_id
             THEN ('test', 3, now()::date)::my_type  -- example values
             ELSE (r2.team_name
                 , r2.int_col
                 , r2.date_col)::my_type
          END AS combo
   FROM   rtp
   JOIN   rtd2 r2 ON <unknown condition>
   ) x;

Or even just (same as above, simpler, shorter, maybe less easy to understand):

SELECT (CASE WHEN rtp.team_id = rtp.sub_team_id
           THEN ('test', 3, now()::date)::my_type
           ELSE (r2.team_name, r2.int_col, r2.date_col)::my_type
        END).*
FROM   rtp
JOIN   rtd2 r2 ON <unknown condition>;

The CASE expression is evaluated once for every column this way. If the evaluation is not trivial, the other variant with a subquery will be faster.



回答2:

Not sure that it would be an improvement, but you could union the SELECT one way with itself the other way:

SELECT 
  ...,
  'testing' AS testing_testing,
  'test example' AS test_response,
  'test example #2' AS another_example, ...
FROM ...
WHERE rtp.team_id = rtp.sub_team_id AND ...
UNION 
SELECT
  ...,
  TRIM(rtd2.team_name) AS testing_testing,
  TRIM(rtd2.normal_data) AS test_response,
  TRIM(rtd2.normal_data_2) AS another_example, ...
WHERE rtp.team_id <> rtp.sub_team_id AND ...;

The column names can safely be omitted from the second query, assuming you bring them out in the same order as in the first.

You may want to make each of those a separate query using common table expressions (CTEs). If you're worried about this changing the order, you can make it a subquery and apply an ORDER BY around it.