I would like to ask get some help about understanding how "with recursive" works.
More precisely WHY the anchor query (the non-recursive term) isn't replicated into the sub call of the CTE. I tried my best to understand alone but I'm not sure.
First of all let's take the example of PostgreSQL which is the simplest one I found (make the sum of 1 to 100) :
WITH RECURSIVE t(n) AS (
VALUES (1)
UNION ALL
SELECT n+1 FROM t WHERE n < 100)
SELECT sum(n) FROM t;
My code walkthrough ( I used links below) :
Evaluate the non-recursive term. For UNION [...].
Include all remaining rows in the result of the recursive query, and also place them in a temporary working table.
So long as the working table is not empty, repeat these steps:
Evaluate the recursive term, substituting the current contents of the working table for the recursive self-reference. For UNION [...].
Include all remaining rows in the result of the recursive query, and also place them in a temporary intermediate table.
Replace the contents of the working table with the contents of the intermediate table, then empty the intermediate table."
LVL 0 :
non-recursive part
- CTE : (N) 1
- WORKING TABLE : (N) 1
recursive part
- CTE : (N) 1
- WORKING TABLE : (N) 1
- INTERMEDIATE TABLE (N) 2
(this is the part I mess around I think) - substitution of WORKING TABLE
So the recursive t will use WORKING TABLE to do SELECT n+1 and put the result in INTERMEDIATE TABLE.
UNION ALL
- CTE : (N) 1 2
- WORKING TABLE : (N) 2
- INTERMEDIATE TABLE : CLEANED
Then we go into the next lvl by the call of t right?
(because END condition WHERE n < 100 = FALSE)
LVL 1 :
We know coz postgreSQL says it "So long as the working table is not empty, repeat the recursive steps"
So it will repeat the step 2. and 3. (if i'm correct) until END condition then do the SUM.
BUT if I just walkthrough the call of the next lvl of t should we not do VALUES(1) first ?
I'm really confused about how it is possible.
Best regards,
Falt4rm
- PostgreSQL : query-with
- Microsoft Technet : structure of CTE
There is no "recursion" taking place here and I think that this is where you get confused.
From the PostgreSQL documentation: http://www.postgresql.org/docs/9.4/static/queries-with.html
Note: Strictly speaking, this process is iteration not recursion,
but RECURSIVE is the terminology chosen by the SQL standards committee.
To paraphrase this sentence, a WITH RECURSIVE
can be viewed as a simple WHILE
loop.
WITH RECURSIVE t(n) AS (
VALUES (1)
UNION ALL
SELECT n+1 FROM t WHERE n < 100
)
SELECT * FROM t;
Here is some custom-made pseudo-code to explain this process in detail
# Step 1: initialisation
LET cte_result = EMPTY
LET working_table = VALUES (1)
LET intermediate_table = EMPTY
# Step 2: result initialisation, merge initialisation into cte_result
cte_result = cte_result UNION working_table
# Step 3: iteration test
WHILE (working_table is not empty) DO
# Step 4: iteration select, we substitute the self-reference with working_table
intermediate_table = SELECT n+1 FROM working_table WHERE n < 100
# Step 5: iteration merge, merge the iteration result into cte_result
cte_result = cte_result UNION intermediate_table
# Step 6: iteration end, prepare for next iteration
working_table = intermediate_table
intermediate_table = EMPTY
END WHILE
# Step 7: return
RETURN cte_result
And using an example
# Step 1: initialisation
cte_result: EMPTY | working_table: 1 | intermediate_table: EMPTY
# Step 2: result initialisation
cte_result: 1 | working_table: 1 | intermediate_table: EMPTY
# Step 3: iteration test
count(working_table) = 1 # OK
# Step 4: iteration select
cte_result: 1 | working_table: 1 | intermediate_table: 2
# Step 5: iteration merge
cte_result: 1, 2 | working_table: 1 | intermediate_table: 2
# Step 6: iteration end
cte_result: 1, 2 | working_table: 2 | intermediate_table: EMPTY
# Step 3: iteration test
count(working_table) = 1 # OK
# Step 4: iteration select
cte_result: 1, 2 | working_table: 2 | intermediate_table: 3
# Step 5: iteration merge
cte_result: 1, 2, 3 | working_table: 2 | intermediate_table: 3
# Step 6: iteration end
cte_result: 1, 2, 3 | working_table: 3 | intermediate_table: EMPTY
# … 97 more iterations and you get this state
cte_result: 1, 2, …, 100 | working_table: 100 | intermediate_table: EMPTY
# Step 3: iteration test
count(working_table) = 1 # OK
# Step 4: iteration select, the iteration query does not return any rows due to the WHERE clause
cte_result: 1, 2, …, 100 | working_table: 100 | intermediate_table: EMPTY
# Step 5: iteration merge, nothing is merged into the cte_result
cte_result: 1, 2, …, 100 | working_table: 100 | intermediate_table: EMPTY
# Step 6: iteration end
cte_result: 1, 2, …, 100 | working_table: EMPTY | intermediate_table: EMPTY
# Step 3: iteration test
count(working_table) = 0 # STOP
# Step 7: return
cte_result: 1, 2, …, 100
So the result of the CTE is all numbers from 1 to 100.