I'm writing a polymorphic PL/pgSQL function that iterates over an array. I am interested in using FOREACH
, however I cannot figure out how to declare a temporary variable with the right type.
My function is below, for more information see the comment on line 4.
CREATE OR REPLACE FUNCTION uniq(ary anyarray) RETURNS anyarray AS $$
DECLARE
ret ary%TYPE := '{}';
v ???; -- how do I get the element type of @ary@?
BEGIN
IF ary IS NULL THEN
return NULL;
END IF;
FOREACH v IN ARRAY ary LOOP
IF NOT v = any(ret) THEN
ret = array_append(ret, v);
END IF;
END LOOP;
RETURN ret;
END;
$$ LANGUAGE plpgsql;
Answer to primary question
AFAIK, you cannot declare a variable of a polymorphic type without a "template" variable or parameter.
There are related examples in the manual at the end of the chapter Declaring Function Parameters, but this trick is not covered: add another IN
, INOUT
or OUT
parameter with data type ANYELEMENT
to the function definition. It resolves to the matching element type automatically and can be (ab)used as variable inside the function body directly or as template for more variables:
CREATE OR REPLACE FUNCTION uniq1(ary ANYARRAY, v ANYELEMENT = NULL)
RETURNS anyarray AS
$func$
DECLARE
ret ary%TYPE := '{}';
some_var v%TYPE; -- we could declare more variables now
-- but we don't need to
BEGIN
IF ary IS NULL THEN
RETURN NULL;
END IF;
FOREACH v IN ARRAY ary LOOP -- instead, we can use v directly
IF NOT v = any(ret) THEN
ret := array_append(ret, v);
END IF;
END LOOP;
RETURN ret;
END
$func$ LANGUAGE plpgsql;
Related:
- Can I make a plpgsql function return an integer without using a variable?
Copying types like that only works in the DECLARE
section and is different type casting. It is explained in the manual here.
Assign a default value, so the added parameter does not have to be included in the function call: ANYELEMENT
= NULL
Call (unchanged):
SELECT uniq1('{1,2,1}'::int[]);
SELECT uniq1('{foo,bar,bar}'::text[]);
Better function
I would actually use an OUT parameter for convenience and invert the test logic:
CREATE OR REPLACE FUNCTION uniq2(ary ANYARRAY, elem ANYELEMENT = NULL
, OUT ret ANYARRAY)
RETURNS anyarray AS
$func$
BEGIN
IF ary IS NULL
THEN RETURN;
ELSE ret := '{}'; -- init
END IF;
FOREACH elem IN ARRAY ary LOOP
IF elem = ANY(ret) THEN -- do nothing
ELSE
ret := array_append(ret, elem);
END IF;
END LOOP;
END
$func$ LANGUAGE plpgsql;
But this still does not cover all cases containing NULL elements.
Proper function
To work for NULL elements as well:
CREATE OR REPLACE FUNCTION uniq3(ary ANYARRAY, elem ANYELEMENT = NULL
, OUT ret ANYARRAY)
RETURNS anyarray AS
$func$
BEGIN
IF ary IS NULL
THEN RETURN;
ELSE ret := '{}'; -- init
END IF;
FOREACH elem IN ARRAY ary LOOP
IF elem IS NULL THEN -- special test for NULL
IF array_length(array_remove(ret, NULL), 1) = array_length(ret, 1) THEN
ret := array_append(ret, NULL);
END IF;
ELSIF elem = ANY(ret) THEN -- do nothing
ELSE
ret := array_append(ret, elem);
END IF;
END LOOP;
END
$func$ LANGUAGE plpgsql;
Checking for NULL in an array is a bit of a pain:
- How to determine if NULL is contained in an array in Postgres?
All of these functions are just proof of concept. I would use neither. Instead:
Superior solutions with plain SQL
In Postgres 9.4 use WITH ORDINALITY
to preserve original order of elements.
Detailed explanation:
- PostgreSQL unnest() with element number
Basic code for single value:
SELECT ARRAY (
SELECT elem
FROM (
SELECT DISTINCT ON (elem) elem, i
FROM unnest('{1,2,1,NULL,4,NULL}'::int[]) WITH ORDINALITY u(elem, i)
ORDER BY elem, i
) sub
ORDER BY i) AS uniq;
Returns:
uniq
------------
{1,2,NULL,4}
About DISTINCT ON
:
- Select first row in each GROUP BY group?
Built into a query:
SELECT *
FROM test t
, LATERAL (
SELECT ARRAY (
SELECT elem
FROM (
SELECT DISTINCT ON (elem) elem, i
FROM unnest(t.arr) WITH ORDINALITY u(elem, i)
ORDER BY elem, i
) sub
ORDER BY i) AS arr
) a;
This has a tiny corner case: it returns an empty array a NULL array. To cover all bases:
SELECT t.*, CASE WHEN t.arr IS NULL THEN NULL ELSE a.arr END AS arr
FROM test t
, LATERAL (
SELECT ARRAY (
SELECT elem
FROM (
SELECT DISTINCT ON (elem) elem, ord
FROM unnest(t.arr) WITH ORDINALITY u(elem, ord)
ORDER BY elem, ord
) sub
ORDER BY ord) AS arr
) a;
Or:
SELECT *
FROM test t
LEFT JOIN LATERAL (
SELECT ARRAY (
SELECT elem
FROM (
SELECT DISTINCT ON (elem) elem, i
FROM unnest(t.arr) WITH ORDINALITY u(elem, i)
ORDER BY elem, i
) sub
ORDER BY i) AS arr
) a ON t.arr IS NOT NULL;
In Postgres 9.3 or older you can substitute with generate_subscripts()
:
SELECT *
FROM test t
, LATERAL (
SELECT ARRAY (
SELECT elem
FROM (
SELECT DISTINCT ON (t.arr[i]) t.arr[i] AS elem, i
FROM generate_subscripts(t.arr, 1) i
ORDER BY t.arr[i], i
) sub
ORDER BY i
) AS arr
) a;
We need this in sqlfiddle, which currently only supports pg 9.3, so WITH ORDINALITY
is not available:
SQL Fiddle.
I do not know how to declare a variable of base type of an anyarray
argument (the documentation makes no mention of such a possibility).
You can use FOR LOOP
with integer variable instead:
CREATE OR REPLACE FUNCTION uniq(ary anyarray) RETURNS anyarray AS $$
DECLARE
ret ary%TYPE := '{}';
i int;
BEGIN
IF ary IS NULL THEN
return NULL;
END IF;
FOR i IN array_lower(ary, 1) .. array_upper(ary, 1) LOOP
IF NOT ary[i] = any(ret) THEN
ret = array_append(ret, ary[i]);
END IF;
END LOOP;
RETURN ret;
END;
$$ LANGUAGE plpgsql;
However, loops and variables may not be necessary:
create or replace function uniq_without_loop(arr anyarray)
returns anyarray language plpgsql as $$
begin
return (
select array_agg(distinct elem)
from unnest(arr) elem);
end $$;
The version of the above function that leaves the array order unchanged:
create or replace function unsorted_uniq_without_loop(arr anyarray)
returns anyarray language plpgsql as $$
begin
return (
select array_agg(elem)
from (
select elem
from (
select distinct on(elem) elem, row_number() over ()
from unnest(array[arr]) elem
) sub
order by row_number
) sub);
end $$;