I need help with a problem regarding data saved in a parent-children model table and a report I need to build upon it. I've already tried searching for topics about parent-children issues, but I couldn't find anything useful in my scenario.
What I have
A Microsoft SQL Server 2000 database server.
A categories
table, which has four columns: category_id
, category_name
, father_id
and visible
; the categories have x root categories (where x is variable), and could be y level deep (where y is variable), if a category is a root level one it has father_id
null otherwise it's filled with the id of the father category.
A sales
table, which has z columns, one of which is category_id
, a foreign key to categories.category_id
; a sale must always have a category, and it could be linked anywhere in the aforementioned y level.
What I need
I've been asked a report displaying only the root (first level) categories, and the quantity of sales belongings to each of these, or their children, no matter how deep. I.e. if one of the root categories is food
, which has a children category named fruit
, which has a children category named apple
, I need to count every item belonging to food
or fruit
or apple
.
Couldn't you use the nested set data model?
I know of the nested set model, but I already have the table this way, and migrating it to the nested set model would be a pain (let alone I didn't even fully grasp how nested set works), not counting the changes needed in the application using the database. (If someone thinks this is still the least pain way, please explain why and how the current data could be migrated.)
Couldn't you use CTE (Common Table Expressions)?
No, it's a Microsoft SQL Server 2000, and Common Table Expressions are introduced in the 2005 edition.
Thanks in advance, Andrea.
SQL 2000 Based solution
DECLARE @Stack TABLE (
StackID INTEGER IDENTITY
, Category VARCHAR(20)
, RootID INTEGER
, ChildID INTEGER
, Visited BIT)
INSERT INTO @Stack
SELECT [Category] = c.category_name
, [RootID] = c.category_id
, [ChildID] = c.category_id
, 0
FROM Categories c
WHILE EXISTS (SELECT * FROM @Stack WHERE Visited = 0)
BEGIN
DECLARE @StackID INTEGER
SELECT @StackID = MAX(StackID) FROM @Stack
INSERT INTO @Stack
SELECT st.Category
, st.RootID
, c.category_id
, 0
FROM @Stack st
INNER JOIN Categories c ON c.father_id = st.ChildID
WHERE Visited = 0
UPDATE @Stack
SET Visited = 1
WHERE StackID <= @StackID
END
SELECT st.RootID
, st.Category
, COUNT(s.sales_id)
FROM @Stack st
INNER JOIN Sales s ON s.category_id = st.ChildID
GROUP BY st.RootID, st.Category
ORDER BY st.RootID
SQL 2005 Based solution
A CTE should get you what you want
- Select each category from Categories to be the root item
- recursively add each child of every root item
INNER JOIN
the results with your sales table. As every root is in the result of the CTE, a simple GROUP BY
is sufficient to get a count for each item.
SQL Statement
;WITH QtyCTE AS (
SELECT [Category] = c.category_name
, [RootID] = c.category_id
, [ChildID] = c.category_id
FROM Categories c
UNION ALL
SELECT cte.Category
, cte.RootID
, c.category_id
FROM QtyCTE cte
INNER JOIN Categories c ON c.father_id = cte.ChildID
)
SELECT cte.RootID
, cte.Category
, COUNT(s.sales_id)
FROM QtyCTE cte
INNER JOIN Sales s ON s.category_id = cte.ChildID
GROUP BY cte.RootID, cte.Category
ORDER BY cte.RootID
Something like this?
CREATE TABLE #SingleLevelCategoryCounts
{
category_id,
count,
root_id
}
CREATE TABLE #ProcessedCategories
{
category_id,
root_id
}
CREATE TABLE #TotalTopLevelCategoryCounts
{
category_id,
count
}
INSERT INTO #SingleLevelCategoryCounts
SELECT
category_id, SUM(*), category_id
FROM
Categories
INNER JOIN Sales ON Categories.category_id = sales.category_id
WHERE
Categories.father_id IS NULL
GROUP BY
Categories.category_id
WHILE EXISTS (SELECT * FROM #SingleLevelCategoryCounts)
BEGIN
IF NOT EXISTS(SELECT * FROM #TopLevelCategoryCounts)
BEGIN
INSERT INTO #TopLevelCategoryCounts
SELECT
root_id, count
FROM
#SingleLevelCategoryCounts
END
ELSE
BEGIN
UPDATE top
SET
top.count = top.count + level.count
FROM
#TopLevelCategoryCounts top
INNER JOIN #SingleLevelCategoryCounts level ON top.category_id = level.count
END
INSERT INTO #ProcessedCategories
SELECT category_id, root_id FROM #SingleLevelCategoryCounts
DELETE #SingleLevelCategoryCounts
INSERT INTO #SingleLevelCategoryCounts
SELECT
category_id, SUM(*), pc.root_id
FROM
Categories
INNER JOIN Sales ON Categories.category_id = sales.category_id
INNER JOIN #ProcessedCategories pc ON Categories.father_id = pc.category_id
WHERE
Categories.category_id NOT IN
(
SELECT category_id in #ProcessedCategories
)
GROUP BY
Categories.category_id
END