Postgres infinite self join

2019-08-25 10:32发布

问题:

So i have an article, and "comments" on the article..

the comment allows people to reply.. and you could reply to the reply.. so on and so forth, meaning the deepest tree root would be N

Quick mockup of what the tables look like

Comments(id, news_id, user_id, body, likes)

Replies(id, parent_id) --> id here is = Comments.id

User(id, username, password)

News(id, title, body, image)

Is there a way to query the Postgres DB to give me a result of something like

So anything inside the Replies table that has null parent_id is a "main" comment (aka isn't a reply).. I would love if possible if the children fields gets populated within itself (i.e. a reply of a reply)

Is this even possible with Postgres? Or Am i supposed to be fetching all Replies joining them with Comments and then iterating through each one trying to find it's proper desitination?

Btw, i'm using GoLang for my backend and the Gorm package to access my postgres db

EDIT: I'm using this query

with recursive commentss as (
  select r.id, r.parent, array[r.id] as all_parents, 
         c.body, u.username 
    from replies r 
          inner join comments c 
                  on c.id = r.id 
                join users u 
                  on u.id = c.user_refer 
   where (parent <> '') IS NOT TRUE 
   union all 
  select r.id, r.parent, c.all_parents || r.id, 
         co.body, u.username 
    from replies r 
          join comments co 
            on co.id = r.id 
          join users u 
            on u.id = co.user_refer 
          join commentss c 
            on r.parent = c.id 
               and r.id <> ALL (c.all_parents)
  ) 
   select * from commentss order by all_parents;

Which results to :

Which is a step closer.. however what i need is to have a JSON object returned looking like

comments: [
  {
    comment_id: ...,
    username: ...,
    comment_body: ....,
    comment_likes: ....,
    children: [...]
  },
  {
    .....
  }
]

Where the first items inside the comments object would be the comments that are NOT a reply, and the children field should be populated with the replied comments.. and the comments inside the children should also have their children populated to replies to that reply

回答1:

Hoping that this is your expected result. (I did something similar here: https://stackoverflow.com/a/52076212/3984221)

demo: db<>fiddle

Table comments:

id  body          user_id  likes  
--  ------------  -------  -----  
a   foo           1        1      
b   foofoo        1        232    
c   foofoofoo     1        23232  
d   fooFOO        1        53     
e   cookies       1        864    
f   bar           1        44     
g   barbar        1        54     
h   barBAR        1        222    
i   more cookies  1        1      

Table replies

id  parent_id  
--  ---------  
a   (null)     
b   a          
c   b          
d   a          
e   (null)     
f   (null)     
g   f          
h   f          
i   (null)     

Result:

{
    "comments": [{
        "children": [],
        "username": "Mike Tyson",
        "comment_id": "i",
        "comment_body": "more cookies",
        "comment_likes": 1
    },
    {
        "children": [{
            "children": [],
            "username": "Mike Tyson",
            "comment_id": "b",
            "comment_body": "foofoo",
            "comment_likes": 232
        },
        {
            "children": [{
                "children": [],
                "username": "Mike Tyson",
                "comment_id": "c",
                "comment_body": "foofoofoo",
                "comment_likes": 23232
            }],
            "username": "Mike Tyson",
            "comment_id": "d",
            "comment_body": "fooFOO",
            "comment_likes": 53
        }],
        "username": "Mike Tyson",
        "comment_id": "a",
        "comment_body": "foo",
        "comment_likes": 1
    },
    {
        "children": [],
        "username": "Mike Tyson",
        "comment_id": "e",
        "comment_body": "cookies",
        "comment_likes": 864
    },
    {
        "children": [{
            "children": [],
            "username": "Mike Tyson",
            "comment_id": "g",
            "comment_body": "barbar",
            "comment_likes": 54
        },
        {
            "children": [],
            "username": "Mike Tyson",
            "comment_id": "h",
            "comment_body": "barBAR",
            "comment_likes": 222
        }],
        "username": "Mike Tyson",
        "comment_id": "f",
        "comment_body": "bar",
        "comment_likes": 44
    }]
}

Query:

Recursion:

WITH RECURSIVE parent_tree AS (
    SELECT 
        id, 
        NULL::text[] as parent_id,
        array_append('{comments}'::text[], (row_number() OVER ())::text) as path, 
        rc.children  
    FROM replies r
    LEFT JOIN LATERAL (SELECT parent_id, ARRAY_AGG(id) as children FROM replies WHERE parent_id = r.id GROUP BY parent_id) rc ON rc.parent_id = r.id
    WHERE r.parent_id IS NULL 

    UNION

    SELECT 
        r.id, 
        array_append(pt.parent_id, r.parent_id), 
        array_append(array_append(pt.path, 'children'), (row_number() OVER (PARTITION BY pt.parent_id))::text),
        rc.children      
    FROM parent_tree pt
    JOIN replies r ON r.id = ANY(pt.children)
    LEFT JOIN LATERAL (SELECT parent_id, ARRAY_AGG(id) as children FROM replies WHERE parent_id = r.id GROUP BY parent_id) rc ON rc.parent_id = r.id
), json_objects AS (
   SELECT c.id, jsonb_build_object('children', '[]'::jsonb, 'comment_id', c.id, 'username', u.name, 'comment_body', c.body, 'comment_likes', c.likes) as jsondata
   FROM comments c
   LEFT JOIN users u ON u.id = c.user_id
)
SELECT 
    parent_id, 
    path,
    jsondata
FROM parent_tree pt 
LEFT JOIN json_objects jo ON pt.id = jo.id
ORDER BY parent_id NULLS FIRST

The only recursion part is within CTE parent_tree. Here I am searching for the parents and build a path. This path is needed for inserting the json data later at the right position.

The second CTE (json_objects) builds a json object for each comment with an empty children array where the children can be inserted later.

The LATERAL join searches the replies table for children of the current id and gives an array with their ids.

The ORDER BY clause at the end is important. With this it is ensured that all upper nodes come before the lower nodes (their children). Otherwise the input into the global json object could fail later because a necessary parent could not exist at the right moment.

Building the final JSON object:

CREATE OR REPLACE FUNCTION json_tree() RETURNS jsonb AS $$
DECLARE
    _json_output jsonb;
    _temprow record;
BEGIN

    SELECT 
        jsonb_build_object('comments', '[]'::jsonb) 
    INTO _json_output;

    FOR _temprow IN
        -- <query above>
    LOOP
        SELECT jsonb_insert(_json_output, _temprow.path, _temprow.jsondata) INTO _json_output;
    END LOOP;

    RETURN _json_output;
END;
$$ LANGUAGE plpgsql;

It is not possible to build the json object within the recursion because within the query the jsondata object is not a global variable. So if I would add b as child into a in one recursion branch, it wouldn't exist in another branch where I would add c as child.

So it is necessary to generate a global variable. This could be done in a function. With the calculated path and child objects it is really simple to build the final json together: looping through the result set and add the json object into the path of the global object.