Database design for products with multiple categor

2020-02-15 15:44发布

问题:

Suppose you are designing a searchable database for a webstore, and have a situation like the following:


Each product will have multiple, different categories.

For example, consider a MARKER as a product.

E.g.: The same MARKER might have as its category all three of the following:

  • TOY
  • SCHOOL-STATIONERY
  • ART-ACCESSORY

Now, each category has a corresponding set of any number of parameters (let's call them features) that apply only to that category.

Thus, for each of its three product-category representations, the MARKER above will have multiple, different product-feature-names and respective product-feature-values.

E.g.: We might have these category-based feature-names and product-specific feature-values:

  • PRODUCT-NAME::: CATEGORY-NAME::: Feature1-name = Feature1-value, etc.
  • MARKER::: TOY::: AgeGroup = 6-12, Toxic = No, ChristmasSpecial = Yes
  • MARKER::: SCHOOL-STATIONERY::: BulkAvailability = No
  • MARKER::: ART-ACCESSORY::: Use = Fine-drawing, CompatibleSurface = All

What would be the best / most-optimal design for this type of situation?


My thought is to use three tables, but I don't know if this is the most efficient for retrieving data later (perhaps this should be done in two tables, or even just one?):

PRODUCT-TABLE

id, product_name

CATEGORY-TABLE

id, fk_product_id, fk_category_name

FEATURE-TABLE

id, fk_category_id, feature_name, feature_value

回答1:

Just have the database say what's so.

yn = yes,no
age_range = 0-4,4-6,6-12,...
use = Fine-Drawing,Canoeing,...
surface_group = All,PaperOrSkin,PaperOrWall,...
PRODUCT(p,n) -- product [p] is named [n]
TOY(p,AgeGroup,Toxic,ChristmasSpecial,...)
    -- [p] is a toy for age range [AgeGroup] and whether it's toxic is [Toxic] and whether it's on Christmas special is [ChristmasSpecial] and ...
SCHOOL-STATIONERY(p,BulkAvailability,...)
    -- [p] is school stationery and its bulk availability is [BulkAvailability] and ...
ART-ACCESSORY(p,Use,CompatibleSurface,...)
    -- [p] is an art accessory with use [u] and is compatible with surfaces in surface group [CompatibleSurface] and ...

An SQL query combines conditions and tables. The meaning of the query is combined from the conditions and the table meanings given above.

See this stackoverflow post from today or zillions of others on EAV and OTLT. Or this. Also, just learn about database design. Maybe starting here.

Notice things are simpler if yes/no choices are separated:

TOY(p,AgeGroup,...) -- [p] is a toy for age range [AgeGroup] and ...
TOY-TOXIC(p) -- toy [p] is toxic
TOY-XMAS-SPECIAL(p) -- toy [p] is on Christmas special
SCHOOL-STATIONERY(p,...) -- [p] is school stationery and ...
SCHOOL-STATIONERY-BULK-AVAILABLE(p) -- school stationery [p] is available in bulk

Probably you also want to sometimes not have labels for groups of things but simply state about the things or the first and last things in a range:

age = 0,1,2,...
surface = Paper,Skin,Wall,...
TOY(p,...) -- [p] is a toy and ...
TOY(p,MinAge,MaxAge) -- [p] is a toy with age minimum [MinAge] and maximum [MaxAge]
ART-ACCESSORY(p,Use,...) -- [p] is an art accessory with use [u] and ...
ART-ACCESSORY-COMPATIBLE-SURFACE(p,CompatibleSurface) -- accessory [p] is compatible with surface [CompatibleSurface]

You can always change the currently-hardwired info (type sets and type, column and table names). You can always write generic queries that don't even know what is hardwired because all the hardwired names are values in the tables constituting DBMS metadata.