Suppose you are designing a searchable database for a webstore, and have a situation like the following:
Each product
will have multiple, different categories
.
For example, consider a MARKER as a product
.
E.g.: The same MARKER might have as its category
all three of the following:
- TOY
- SCHOOL-STATIONERY
- ART-ACCESSORY
Now, each category
has a corresponding set of any number of parameters (let's call them features
) that apply only to that category.
Thus, for each of its three product-category
representations, the MARKER above will have multiple, different product-feature-names
and respective product-feature-values
.
E.g.: We might have these category-based feature-names
and product-specific feature-values
:
- PRODUCT-NAME::: CATEGORY-NAME::: Feature1-name = Feature1-value, etc.
- MARKER::: TOY::: AgeGroup = 6-12, Toxic = No, ChristmasSpecial = Yes
- MARKER::: SCHOOL-STATIONERY::: BulkAvailability = No
- MARKER::: ART-ACCESSORY::: Use = Fine-drawing, CompatibleSurface = All
What would be the best / most-optimal design for this type of situation?
My thought is to use three tables, but I don't know if this is the most efficient for retrieving data later (perhaps this should be done in two tables, or even just one?):
PRODUCT-TABLE
id, product_name
CATEGORY-TABLE
id, fk_product_id, fk_category_name
FEATURE-TABLE
id, fk_category_id, feature_name, feature_value
Just have the database say what's so.
yn = yes,no
age_range = 0-4,4-6,6-12,...
use = Fine-Drawing,Canoeing,...
surface_group = All,PaperOrSkin,PaperOrWall,...
PRODUCT(p,n) -- product [p] is named [n]
TOY(p,AgeGroup,Toxic,ChristmasSpecial,...)
-- [p] is a toy for age range [AgeGroup] and whether it's toxic is [Toxic] and whether it's on Christmas special is [ChristmasSpecial] and ...
SCHOOL-STATIONERY(p,BulkAvailability,...)
-- [p] is school stationery and its bulk availability is [BulkAvailability] and ...
ART-ACCESSORY(p,Use,CompatibleSurface,...)
-- [p] is an art accessory with use [u] and is compatible with surfaces in surface group [CompatibleSurface] and ...
An SQL query combines conditions and tables. The meaning of the query is combined from the conditions and the table meanings given above.
See this stackoverflow post from today or zillions of others on EAV and OTLT. Or this. Also, just learn about database design. Maybe starting here.
Notice things are simpler if yes/no choices are separated:
TOY(p,AgeGroup,...) -- [p] is a toy for age range [AgeGroup] and ...
TOY-TOXIC(p) -- toy [p] is toxic
TOY-XMAS-SPECIAL(p) -- toy [p] is on Christmas special
SCHOOL-STATIONERY(p,...) -- [p] is school stationery and ...
SCHOOL-STATIONERY-BULK-AVAILABLE(p) -- school stationery [p] is available in bulk
Probably you also want to sometimes not have labels for groups of things but simply state about the things or the first and last things in a range:
age = 0,1,2,...
surface = Paper,Skin,Wall,...
TOY(p,...) -- [p] is a toy and ...
TOY(p,MinAge,MaxAge) -- [p] is a toy with age minimum [MinAge] and maximum [MaxAge]
ART-ACCESSORY(p,Use,...) -- [p] is an art accessory with use [u] and ...
ART-ACCESSORY-COMPATIBLE-SURFACE(p,CompatibleSurface) -- accessory [p] is compatible with surface [CompatibleSurface]
You can always change the currently-hardwired info (type sets and type, column and table names). You can always write generic queries that don't even know what is hardwired because all the hardwired names are values in the tables constituting DBMS metadata.