I am pretty new to PG, and trying to convert from MSSQL.
I am working on a function that will return JSON results.
This one works:
Create Or Replace Function ExampleTable_SelectList()
Returns JSON As
$$
Select array_to_json(array_agg(row_to_json(t))) From
(Select id, value1, value2, From ExampleTable) t
$$ Language SQL;
Now, I want to call can update that returns a value and turn that value into JSON to return. So, this one gives an error on the set
command.
Create Or Replace Function ExampleTable_Update (id bigint, value1 text)
Returns JSON As
$$
Select row_to_json(t) From
(
Update ExampleTable
Set Value1 = value1
Where id= id
Returning Value1, Value2;
) t
$$ Language SQL;
I suspect that Postgres does not allow the UPDATE
statement as a subquery. Is there anyway around that?
I see two major problems:
1. You cannot put an UPDATE
into a subquery at all. You could solve that with a data-modifying CTE like Patrick demonstrates, but that is more expensive and verbose than necessary for the case at hand.
2. You have a potentially hazardous naming conflict, that hasn't been addressed yet.
Better query / function
Leaving the SQL function wrapper aside for the moment (we'll come back to that). You can use a simple UPDATE
with a RETURNING
clause:
UPDATE tbl
SET value1 = 'something_new'
WHERE id = 123
RETURNING row_to_json(ROW(value1, value2));
The RETURNING
clause allows arbitrary expressions involving columns of the updated row. That's shorter and cheaper than a data-modifying CTE.
The remaining problem: the row constructor ROW(...)
does not preserve column names (which is a known weakness), so you get generic keys in your JSON value:
row_to_json
{"f1":"something_new","f2":"what ever is in value2"}
In Postgres 9.3 you would need a CTE another function to encapsulate the first step or a cast to a well-defined row type. Details:
- Return as array of JSON objects in SQL (Postgres)
In Postgres 9.4 just use json_build_object()
or json_object()
:
UPDATE tbl
SET value1 = 'something_new'
WHERE id = 123
RETURNING json_build_object('value1', value1, 'value2', value2);
Or:
...
RETURNING json_object('{value1, value2}', ARRAY[value1, value2]);
Now you get original column names or whatever you chose as key names:
row_to_json
{"value1":"something_new","value2":"what ever is in value2"}
It's easy to wrap this in a function, which brings us to your second problem ...
Naming conflict
In your original function you use identical names for function parameters and column names. This is a generally a very bad idea. You would need to understand intimately which identifier comes first in which scope.
In the case at hand the result is utter nonsense:
Create Or Replace Function ExampleTable_Update (id bigint, value1 text) Returns
...
Update ExampleTable
Set Value1 = value1
Where id = id
Returning Value1, Value2;
...
$$ Language SQL;
While you seem to expect that the second instance of id
would reference the function parameter, it does not. The column name comes first within the scope of an SQL statement, the second instance references the column. resulting in an expression that is always true
except for NULL values in id
. Consequently, you would update all rows, which could lead to catastrophic loss of data.
What's worse, you might not even realize it until later, because the SQL function will return one arbitrary row as defined by the RETURNING
clause of the function (returns one row, not a set of rows).
In this particular case, you would get "lucky", because you also have value1 = value1
, which overwrites the column with its pre-existing value, effectively doing .. nothing in a very expensive way (unless triggers do something). You may be puzzled to get an arbitrary row with an unchanged value1
as result.
So, don't.
Avoid potential naming conflicts like this unless you know exactly what you are doing (which obviously isn't the case). One convention I like is to prepend an underscore for parameter and variable names in functions, while column names never start with an underscore. In many cases you can just use positional references to be unambiguous: $1
, $2
, ..., but that sidesteps only one half of the issue. Any method is good as long as you avoid naming conflicts. I suggest:
CREATE OR REPLACE FUNCTION foo (_id bigint, _value1 text)
RETURNS json AS
$func$
UPDATE tbl
SET value1 = _value1
WHERE id = _id
RETURNING json_build_object('value1', value1, 'value2', value2);
$func$ LANGUAGE sql;
Also note that this returns the actual column value in value1
after the UPDATE
, which may or may not be the same as your input parameter _value1
. There could be database rules or triggers interfering ...
You need to put the UPDATE
statement in a CTE:
CREATE OR REPLACE FUNCTION ExampleTable_Update (id bigint, value1 text) RETURNS json AS $$
WITH t(Value2) AS (
UPDATE ExampleTable
SET Value1 = $2
WHERE id = $1
RETURNING Value2)
SELECT row_to_json($2, Value2)
FROM t;
$$ LANGUAGE sql;
Note that I am using positional parameters $1
and $2
for the function parameters. The names of these parameters are the same as the column names in the table and that is generally a bad idea because of the potential for name-resolution conflicts; see Erwin Brandstetter's answer for a more elaborate explanation.