I have a big product catalog tree which currently contains ~36 thousand categories and ~1 milion products (i.e leafs). It is structured like this (maximum depth is 5):
Cat1
|_Cat11
| |_Cat111
| | |_Cat1111
| | | |_Product1
| | | |_...
| | |_Cat1112
| | | |_Product1
| | | |_...
| | |_Cat1113
| | | |_Product1
| | | |_...
| |_Cat112
| |_Cat1121
| | |_Product1
| | |_...
| |_Cat1122
| | |_Product1
| | |_...
| |_Cat1123
| |_Product1
| |_...
|_Cat12
| |_Cat121
| | |_Cat1211
| | |_Cat1212
| | |_Cat1213
| |_Cat122
| |_Cat1221
| |_Cat1222
| |_Cat1223
|_...
Cat2
|...
When searching this catalog (using SQL Server Freetext search) I get a bunch of categories and products back very fast. Some searches get hits on very many products. I want the result to be aggregated and presented with a total of hits for each category currently expanded. Like this (2 examples at different levels):
**Ex1 (first level)**
Cat1(563)
|
Cat2(332)
|
Cat8(2)
**Ex2 (second level)**
Cat1
|_Cat12(102)
|_Cat14(201)
|_...
What I have tried so far is to store all parent/child relations in Redis (stored as sets). Then to get the aggregated result I simply traverse from the products (from the search result) up through its parents up to the currently expanded category (or rather its immediate children) to find which categories to present and to count number of products below it matching the search. If I have about 5000 products in the search result then this takes about 20 seconds. Way to long.
What whould be a better way to accomplish this? One way would be to have all 1 million products already aggregated on each category but that would require 36 million keys and would probably require to much RAM. I currently use 500Mb already.