Best way to create Combination of records (Order d

Posted 2019-09-06 20:29

I hope I don't butcher the explanation of my question:

I've got a table that has hundreds of rows. Each row is a recipe with nutritional information, for example:

recipe_table:

id      | calories | protein | carbs | fat
recipe1 | 100      | 20g     | 10g   | 2g
recipe2 | 110      | 10g     | 12g   | 12g
recipe3 | 240      | 20g     | 1g    | 23g
....

I needed to create a new table (recipe_index) that would show every possible combination of every recipe in recipe_table as a set of 3, so it would look something like:

recipe_index:

id1     | id2     | id3     | calories | protein | carbs | fat
recipe1 | recipe2 | recipe3 | 450      | 50g     | 23g   | 37g
....

Basically it allows me to query recipe_index and say "what 3 recipe combinations come to a total value that's between 440 calories and 460 calories"
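As a minimal sketch of the kind of lookup described above, the following uses Python's built-in sqlite3 module with an in-memory database. The first row is the example from this question; the second row (recipe4 and its totals) is hypothetical and only there so the filter has something to exclude. The gram values are stored as plain integers here, which is an assumption about the schema.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recipe_index ("
    " id1 TEXT, id2 TEXT, id3 TEXT,"
    " calories INTEGER, protein INTEGER, carbs INTEGER, fat INTEGER)"
)
conn.executemany(
    "INSERT INTO recipe_index VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("recipe1", "recipe2", "recipe3", 450, 50, 23, 37),  # row from the question
        ("recipe1", "recipe2", "recipe4", 610, 45, 40, 30),  # hypothetical row
    ],
)

# "Which 3-recipe combinations total between 440 and 460 calories?"
matches = conn.execute(
    "SELECT id1, id2, id3, calories FROM recipe_index"
    " WHERE calories BETWEEN 440 AND 460"
).fetchall()
print(matches)
```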

My current code works for 3 meals, but it produces about 450,000 records in recipe_index. I need to do the same thing for 4, 5 and 6 meals as well, so I will end up with many millions of records. Is there a more efficient way of doing this? Perhaps I need to look into partitioning the table, with one partition for each range?

My current SQL code:

INSERT INTO recipe_index
SELECT DISTINCT '3' AS nummeals, t1.id AS id1, t2.id AS id2, t3.id AS id3, 0 AS id4,
       t1.calories_ps + t2.calories_ps + t3.calories_ps AS calories,
       t1.protein_ps + t2.protein_ps + t3.protein_ps AS protein,
       t1.carbohydrate_ps + t2.carbohydrate_ps + t3.carbohydrate_ps AS carbohydrate,
       t1.fat_ps + t2.fat_ps + t3.fat_ps AS fat
FROM recipes t1
INNER JOIN recipes t2 ON t1.Id < t2.Id
INNER JOIN recipes t3 ON t2.Id < t3.Id
WHERE t1.image <> '' AND t2.image <> '' AND t3.image <> ''

If I missed anything obvious please let me know

1 Answer

Answered 2019-09-06 20:52

You would do this with a join. In order to prevent duplicates, you want a condition where the recipe ids are in order (this also prevents one recipe from appearing three times):

select r1.id, r2.id, r3.id,
       (r1.calories + r2.calories + r3.calories) as calories,
       (r1.protein + r2.protein + r3.protein) as protein,
       (r1.carbs + r2.carbs + r3.carbs) as carbs,
       (r1.fat + r2.fat + r3.fat) as fat
from recipe_table r1 join
     recipe_table r2
     on r1.id < r2.id join
     recipe_table r3
     on r2.id < r3.id;

The only difference from your query is that the distinct is not necessary, because the ordering prevents duplicates.
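The claim about distinct can be checked on a toy table. This is only a sketch: it uses Python's sqlite3 module, five hypothetical recipes with made-up calorie values, and integer ids. It compares the ordered join with a join that merely requires the three ids to differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recipe_table (id INTEGER PRIMARY KEY, calories INTEGER)")
# Five hypothetical recipes with ids 1..5.
conn.executemany(
    "INSERT INTO recipe_table VALUES (?, ?)",
    [(i, 100 * i) for i in range(1, 6)],
)

# Ordered join: each set of three recipes appears exactly once.
ordered_rows = conn.execute(
    "SELECT r1.id, r2.id, r3.id"
    " FROM recipe_table r1"
    " JOIN recipe_table r2 ON r1.id < r2.id"
    " JOIN recipe_table r3 ON r2.id < r3.id"
).fetchall()

# For comparison: ids only required to differ, so every set shows up
# once per ordering (3! = 6 times).
unordered_rows = conn.execute(
    "SELECT r1.id, r2.id, r3.id"
    " FROM recipe_table r1"
    " JOIN recipe_table r2 ON r1.id <> r2.id"
    " JOIN recipe_table r3 ON r3.id <> r1.id AND r3.id <> r2.id"
).fetchall()

print(len(ordered_rows), len(set(ordered_rows)), len(unordered_rows))
```

With five rows the ordered join returns 10 rows, all unique, while the comparison join returns 60, so the ordering condition is doing the de-duplication that distinct would otherwise have to do.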

The problem you are facing is that there are a lot of combinations, so there are millions of combinations of 4 recipes. I'm guessing you are starting with 77 or so recipes. The number of ordered selections of 4 of them is 77*76*75*74 (about 32.5 million); with the id ordering each set is counted only once, which divides that by 4! = 24 and leaves about 1.35 million rows. The count grows quickly for 5 and 6 combos.
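To put numbers on that growth, here is a short sketch using Python's math module (math.comb and math.perm need Python 3.8 or later). The value 77 is only the guess made above, not a known row count. perm counts ordered selections, and comb counts each set once, which is what the id ordering in the join produces.

```python
import math

n = 77  # guessed number of recipes, taken from the paragraph above

# Rows produced by the ordered join for k = 3..6 recipes per combination.
rows_by_k = {k: math.comb(n, k) for k in range(3, 7)}

# Ordered selections, i.e. what you would get without the id ordering.
ordered_by_k = {k: math.perm(n, k) for k in range(3, 7)}

for k in range(3, 7):
    print(f"k={k}: {rows_by_k[k]:>12,} sets, {ordered_by_k[k]:>18,} ordered")
```

Under that guess, 6-recipe combinations already exceed 237 million rows, so the table size is driven by the combinatorics rather than by how the query is written.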
