Switching from FOR loops in plpgsql to set-based S

2019-02-26 00:22发布

问题:

I've got quite heavy query with FOR loop to rewrite and would like to do it simpler, using more SQL instead of plpgsql constructions. The query looks like:

FOR big_xml IN SELECT unnest(xpath('//TAG1', my_xml)) LOOP
   str_xml = unnest(xpath('/TAG2/TYPE/text()', big_xml));

      FOR single_xml IN SELECT unnest(xpath('/TAG2/single', big_xml)) LOOP

         CASE str_xml::INT
           WHEN 1
           THEN
            INSERT INTO tab1(id, xml) VALUES (1, single_xml);
           WHEN 2
           THEN
            INSERT INTO tab2(id, xml) VALUES (1, single_xml);
           WHEN 3
            [...] 
           WHEN 11
            [...]
           ELSE 
              RAISE EXCEPTION 'something'
           END CASE;
      END LOOP;
    END LOOP;

RETURN xmlelement(NAME "out", xmlforest(1 AS out));

I've started to rewrite it for better performance with:

INSERT INTO tab1(id, xml)
  SELECT 1, unnest(xpath('/TAG2/single', (SELECT unnest(xpath('//TAG1', my_xml)))); 

But I'm not sure how to deal with those CASE ... INSERT statements. Any ideas? Or maybe my approach is completely wrong?

edited 25.02.14: PostgreSQL 9.3.1

回答1:

To insert into different tables, depending on your data, you need some procedural code. But you can make do with a single loop and a nested unnest():

DEFINE
   int_xml    int;
   single_xml xml;
BEGIN
   FOR r IN                     -- or you can use two variables
      SELECT xpath('/TAG2/single', big_xml))[1]::text::int AS int_xml
           , unnest(xpath('/TAG2/TYPE/text()', big_xml)) AS single_xml
      FROM  (SELECT unnest(xpath('//TAG1', my_xml)) AS big_xml) sub
   LOOP
      CASE int_xml
      WHEN 1 THEN
         INSERT INTO tab1(id, xml) VALUES (1, single_xml);
      WHEN 2 THEN
         INSERT INTO tab2(id, xml) VALUES (1, single_xml);
      WHEN 3 THEN
          ...

      ELSE 
         RAISE EXCEPTION 'something'
      END CASE;
   END LOOP;
END

Or take a temporary table or CTE and append multiple INSERT statements. Should be faster for lots of rows.

WITH cte AS (
   SELECT xpath('/TAG2/single', big_xml))[1]::text::int AS int_xml
        , unnest(xpath('/TAG2/TYPE/text()', big_xml)) AS single_xml
   FROM  (SELECT unnest(xpath('//TAG1', my_xml)) AS big_xml) sub
), i1 AS (
   INSERT INTO tab1(id, xml)
   SELECT 1, single_xml
   FROM   cte
   WHERE  int_xml = 1
), i2 AS (
   INSERT INTO tab2(id, xml)
   SELECT 1, single_xml
   FROM   cte
   WHERE  int_xml = 2
),
   ...
)
SELECT int_xml INTO my_var
FROM   cte
WHERE  int_xml <> ALL ({'1','2','3','11'}::int[])
LIMIT  1;

IF FOUND THEN
   RAISE EXCEPTION 'One of the int_xml did not fit: %!', my_var;
END IF;

The last bit compensates for the exception in your original.

If your big_xml is real big, make sure you provide enough temporary buffers for the CTE. If the general setting is too low, increase the temp_buffers setting for this session only. Details in this related answer:
How can I insert common data into a temp table from disparate schemas?