I have to resort to raw SQL where the ORM is falling short (using Django 1.7). The problem is that most of the queries end up being 80-90% similar. I cannot figure out a robust & secure way to build queries without violating re-usability.
Is string concatenation the only way out, i.e. build parameter-less query strings using if-else
conditions, then safely include the parameters using prepared statements (to avoid SQL injection). I want to follow a simple approach for templating SQL for my project instead of re-inventing a mini ORM.
For example, consider this query:
SELECT id, name, team, rank_score
FROM
( SELECT id, name, team
ROW_NUMBER() OVER (PARTITION BY team
ORDER BY count_score DESC) AS rank_score
FROM
(SELECT id, name, team
COUNT(score) AS count_score
FROM people
INNER JOIN scores on (scores.people_id = people.id)
GROUP BY id, name, team
) AS count_table
) AS rank_table
WHERE rank_score < 3
How can I:
a) add optional WHERE
constraint on people
or
b) change INNER JOIN
to LEFT OUTER
or
c) change COUNT
to SUM
or
d) completely skip the OVER / PARTITION
clause?
Better query
For starters you can fix the syntax, simplify and clarify quite a bit:
SELECT *
FROM (
SELECT p.person_id, p.name, p.team, sum(s.score)::int AS score
,rank() OVER (PARTITION BY p.team
ORDER BY sum(s.score) DESC)::int AS rnk
FROM person p
JOIN score s USING (person_id)
GROUP BY 1
) sub
WHERE rnk < 3;
Building on my updated table layout. See fiddle below.
You do not need the additional subquery. Window functions are executed after aggregate functions, so you can nest it like demonstrated.
While talking about "rank", you probably want to use rank()
, not row_number()
.
Assuming people.people_id
is the PK, you can simplify GROUP BY
.
Be sure to table-qualify all column names that might be ambiguous
PL/pgSQL function
Then I would write a plpgsql function that takes parameters for your variable parts.
Implementing a
- c
of your points. d
is unclear, leaving that for you to add.
CREATE OR REPLACE FUNCTION f_demo(_agg text DEFAULT 'sum'
, _left_join bool DEFAULT FALSE
, _where_name text DEFAULT NULL)
RETURNS TABLE(person_id int, name text, team text, score int, rnk int) AS
$func$
DECLARE
_agg_op CONSTANT text[] := '{count, sum, avg}'; -- allowed functions
_sql text;
BEGIN
-- assert --
IF _agg ILIKE ANY (_agg_op) THEN
-- all good
ELSE
RAISE EXCEPTION '_agg must be one of %', _agg_op;
END IF;
-- query --
_sql := format('
SELECT *
FROM (
SELECT p.person_id, p.name, p.team, %1$s(s.score)::int AS score
,rank() OVER (PARTITION BY p.team
ORDER BY %1$s(s.score) DESC)::int AS rnk
FROM person p
%2$s score s USING (person_id)
%3$s
GROUP BY 1
) sub
WHERE rnk < 3
ORDER BY team, rnk'
, _agg
, CASE WHEN _left_join THEN 'LEFT JOIN' ELSE 'JOIN' END
, CASE WHEN _where_name <> '' THEN 'WHERE p.name LIKE $1' ELSE '' END
);
-- debug -- quote when tested ok
-- RAISE NOTICE '%', _sql;
-- execute -- unquote when tested ok
RETURN QUERY EXECUTE _sql
USING _where_name; -- $1
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_demo();
SELECT * FROM f_demo('sum', TRUE, '%2');
SELECT * FROM f_demo('avg', FALSE);
SELECT * FROM f_demo(_where_name := '%1_'); -- named param
SQL Fiddle
You need a firm understanding of PL/pgSQL. Else, there is just too much to explain. You'll find related answers here on SO under plpgsql for practically every detail in the answer.
All parameters are treated safely, no SQL injection possible. More:
- Define table and column names as arguments in a plpgsql function?
- Table name as a PostgreSQL function parameter
Note in particular, how a WHERE
clause is added conditionally (when _where_name
is passed) with the positional parameter $1
in the query sting. The value is passed to EXECUTE
as value with the USING
clause. No type conversion, no escaping, no chance for SQL injection. Examples:
- Row expansion via "*" is not supported here
- SQL state: 42601 syntax error at or near "11"
- Refactor a PL/pgSQL function to return the output of various SELECT queries
Use DEFAULT
values for function parameters, so you are free to provide any or none. More:
- Functions with variable number of input parameters
- The manual on calling functions
The function format()
is instrumental for building complex dynamic SQL strings in a safe and clean fashion.