Merging Concatenating JSON(B) columns in query

Posted 2019-02-02 21:08

Question:

Using Postgres 9.4, I am looking for a way to merge two (or more) json or jsonb columns in a query. Consider the following table as an example:

 id | json1        | json2
----+--------------+-------------------
 1  | {'a':'b'}    | {'c':'d'}
 2  | {'a1':'b2'}  | {'f':{'g' : 'h'}}

Is it possible to have the query return the following:

 id | json
----+-------------------------------
 1  | {'a':'b', 'c':'d'}
 2  | {'a1':'b2', 'f':{'g' : 'h'}}

Unfortunately, I can't define a function as described here. Is this possible with a "traditional" query?

Answer 1:

Here is the complete list of built-in functions that can be used to create JSON objects in PostgreSQL: http://www.postgresql.org/docs/9.4/static/functions-json.html

  • row_to_json and json_object do not allow you to define your own keys, so they can't be used here
  • json_build_object expects you to know in advance how many keys and values the object will have; that is the case in your example, but it will not be in the real world
  • json_object looks like a good tool for this problem, but it forces us to cast our values to text, so we can't use it either

Well... OK, so we can't use any of the classic functions.

Let's take a look at some aggregate functions and hope for the best... http://www.postgresql.org/docs/9.4/static/functions-aggregate.html

json_object_agg is the only aggregate function that builds objects, so it is our only chance to tackle this problem. The trick is to find the correct way to feed it.

Here is my test table and data:

CREATE TABLE test (
  id    SERIAL PRIMARY KEY,
  json1 JSONB,
  json2 JSONB
);

INSERT INTO test (json1, json2) VALUES
  ('{"a":"b", "c":"d"}', '{"e":"f"}'),
  ('{"a1":"b2"}', '{"f":{"g" : "h"}}');

After some trial and error, here is a query that uses json_object_agg to merge json1 and json2 in PostgreSQL 9.4:

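-- Explode both columns into one (id, key, value) row per top-level key.
WITH all_json_key_value AS (
  SELECT id, t1.key, t1.value FROM test, jsonb_each(json1) AS t1
  UNION
  SELECT id, t1.key, t1.value FROM test, jsonb_each(json2) AS t1
)
-- Rebuild one JSON object per id from the combined pairs.
SELECT id, json_object_agg(key, value)
FROM all_json_key_value
GROUP BY id;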

EDIT: for PostgreSQL 9.5+, look at Zubin's answer below



Answer 2:

In Postgres 9.5+ you can merge JSONB like this:

select json1 || json2;

Or, if it's JSON, coerce to JSONB if necessary:

select json1::jsonb || json2::jsonb;

Or:

select COALESCE(json1::jsonb||json2::jsonb, json1::jsonb, json2::jsonb);

(Without the COALESCE, a NULL in either json1 or json2 makes the whole result NULL.)

For example:

select data || '{"foo":"bar"}'::jsonb from photos limit 1;
                               ?column?
----------------------------------------------------------------------
 {"foo": "bar", "preview_url": "https://unsplash.it/500/720/123"}
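The NULL behaviour mentioned above can be checked without any table. The following is a minimal sketch using made-up literals; the column aliases are only for illustration.

```sql
-- || on jsonb returns NULL as soon as one operand is NULL;
-- COALESCE falls back to whichever side is not NULL.
SELECT
  NULL::jsonb || '{"c":"d"}'::jsonb            AS plain_concat,
  COALESCE(NULL::jsonb || '{"c":"d"}'::jsonb,
           NULL::jsonb,
           '{"c":"d"}'::jsonb)                 AS with_coalesce;
-- plain_concat is NULL, with_coalesce is {"c": "d"}
```

An alternative with the same effect, if you prefer to default each side separately, would be COALESCE(json1, '{}') || COALESCE(json2, '{}').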

Kudos to @MattZukowski for pointing this out in a comment.



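Answer 3:

You can also transform the JSON into text, concatenate, replace and convert back to JSON. This is fragile (it breaks if either value is an empty object, or if a string value contains }{), but using the same data from Clément you can do: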

SELECT replace(
    (json1::text || json2::text), 
    '}{', 
    ', ')::json 
FROM test

You could also concatenate the json1 values of all rows into a single JSON object with:

SELECT regexp_replace(
    array_agg((json1))::text,
    '}"(,)"{|\\| |^{"|"}$', 
    '\1', 
    'g'
)::json
FROM test


Answer 4:

Although this question was answered some time ago, note that when json1 and json2 contain the same key, the key appears twice in the resulting document, which does not seem to be best practice.

You can therefore use this jsonb_merge function with PostgreSQL 9.5:

CREATE OR REPLACE FUNCTION jsonb_merge(jsonb1 JSONB, jsonb2 JSONB)
    RETURNS JSONB AS $$
    DECLARE
      result JSONB;
    BEGIN
       -- Tag each key/value pair with its source document (1 or 2).
       -- ORDER BY jsb feeds the pairs of jsonb2 last; the cast to jsonb keeps
       -- only the last value of a duplicate key, so jsonb2 wins.
       result = (
    SELECT json_object_agg(key, value ORDER BY jsb)::jsonb
    FROM
      (SELECT key, 1::int AS jsb, value FROM jsonb_each(jsonb1)
       UNION ALL
       SELECT key, 2::int AS jsb, value FROM jsonb_each(jsonb2)) AS t1
           );
       RETURN result;
    END;
    $$ LANGUAGE plpgsql;

The following query returns the concatenated jsonb columns, where the keys in json2 are dominant over the keys in json1:

select id, jsonb_merge(json1, json2) from test


Answer 5:

FYI, if someone's using jsonb in >= 9.5 and they only care about top-level elements being merged without duplicate keys, then it's as easy as using the || operator:

select '{"a1": "b2"}'::jsonb || '{"f":{"g" : "h"}}'::jsonb;
      ?column?           
-----------------------------
 {"a1": "b2", "f": {"g": "h"}}
(1 row)
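To make "only top-level elements" concrete, here is a small sketch with made-up literals in which both sides share the keys "a" and "k". The right-hand operand wins for each shared key, and the nested object under "a" is replaced as a whole rather than merged.

```sql
-- Shallow merge: values from the right operand replace those on the left.
SELECT '{"a": {"x": 1}, "k": 1}'::jsonb
    || '{"a": {"y": 2}, "k": 2}'::jsonb AS merged;
-- merged: {"a": {"y": 2}, "k": 2}
```

If the nested "x" must survive, a recursive merge function is needed instead of ||.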


Answer 6:

This function merges nested JSON objects recursively (it relies on jsonb_object_agg and ||, so it needs PostgreSQL 9.5+):

create or replace function jsonb_merge(CurrentData jsonb,newData jsonb)
 returns jsonb
 language sql
 immutable
as $jsonb_merge_func$
 select case jsonb_typeof(CurrentData)
   when 'object' then case jsonb_typeof(newData)
     when 'object' then (
       select    jsonb_object_agg(k, case
                   when e2.v is null then e1.v
                   when e1.v is null then e2.v
                   when e1.v = e2.v then e1.v 
                   else jsonb_merge(e1.v, e2.v)
                 end)
       from      jsonb_each(CurrentData) e1(k, v)
       full join jsonb_each(newData) e2(k, v) using (k)
     )
     else newData
   end
   when 'array' then CurrentData || newData
   else newData
 end
$jsonb_merge_func$;
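
A quick usage check can make the recursive behaviour concrete. This sketch assumes the function above has been created; the input literals are made up for illustration.

```sql
-- Hypothetical inputs: a nested object under "a" and an array under "b".
SELECT jsonb_merge(
  '{"a": {"x": 1}, "b": [1]}'::jsonb,
  '{"a": {"y": 2}, "b": [2]}'::jsonb
) AS merged;
-- merged: {"a": {"x": 1, "y": 2}, "b": [1, 2]}
```

Objects are merged key by key at every level, arrays are concatenated, and for any other conflicting value the second argument wins.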


Answer 7:

CREATE OR REPLACE FUNCTION jsonb_merge(pCurrentData jsonb, pMergeData jsonb, pExcludeKeys text[])
RETURNS jsonb IMMUTABLE LANGUAGE sql
AS $$
    SELECT json_object_agg(key,value)::jsonb
    FROM (
        WITH to_merge AS (
            SELECT * FROM jsonb_each(pMergeData) 
        )
        SELECT *
        FROM jsonb_each(pCurrentData)
        WHERE key NOT IN (SELECT key FROM to_merge)
        AND (pExcludeKeys IS NULL OR key <> ALL(pExcludeKeys))
        UNION ALL
        SELECT * FROM to_merge
    ) t;
$$;

SELECT jsonb_merge('{"a": 1, "b": 9, "c": 3, "e":5}'::jsonb, '{"b": 2, "d": 4}'::jsonb, '{"c","e"}'::text[]) as jsonb