Saving, organizing and querying products, options/

2019-03-21 14:50发布

问题:

First of all, let me make clear that I'm not asking for any code; I just wan't some general ideas/guidance/opinions about how could I implement what I'm about to ask.

I'm starting to build an online e-commerce system (Yii2 + MongoDB, so, PHP + NoSQL), and there are two requisites that I'm not entirely sure how to implement without creating a huge mess in both my code and the database.

Both requisites are related, so I'll explain them both as one.

As any other serious e-commerce, it would have categories. And also, as any other serious e-commerce, each product will have tags or options. Let me explain a little bit further what I call tags/options.

Those are the available options that a user could select when buying a product, for example, the color or the size, the material, etc.

  1. Categories

There would be multiple general categories as well as other sub-categories. For example, Electronics could be a general category and sub-categories would be Computers and Smart TVs. Then, Motherboards and RAM could be sub-categories of Computers.

This by itself could be easily stored in a database, but here it goes the problem:

  • Each product should appear when listing any of the categories it belongs to, or the upper categories. That means that if I (as the final user) browse all the items in Computers category, I should see NVIDIA GTX670 which belongs to the subcategory Graphic cards of the category Computers.

I could save each product the following way:

{
    _id: asdasfwetrw34tw34t245y45y,
    name: "NVIDIA GTX670",
    price: 99.50,
    ...
    ...
    categories: [
        "Electronics", //<-- just the ID of that group
        "Computers", //<-- just the ID of that group
        "Graphic cards" //<-- just the ID of that group
    ]
}

But:

  • I'm not sure how fast would be a query for retrieving all the items of a certain category (and, of course, the items of all sub-categories).
  • I'm not sure what other cons would that method have, so, please feel free to recommend any alternative schema for storing this.


2. Tags/options

This is where the real headache is.

Each option could belong to 0 or more categories and subcategories, so the category Woman fashion could have the options size and color, but the category Sunglasses (subcategory of Woman fashion) could have only color, or even another set of options, completely different from Woman fashion.

Furthermore, the values inside each option (red, green, blue in the color option) could appear in random categories. So Woman fashion would have colors like Strawberry Red and Tangerine, while Cars would have Carbon and Black metallic.

Also, there would be several types of options:

  • Completely static (like size, which could be only S or M, but never both. In any case, the administrator won't be able to write a custom size, like Kind of small; he would be able to just select what it's already in the database).
  • Static that can combine together (like colors, which could be red or green, or a combination of colors that the admin chooses).
  • Free-input (like dimensions or weight, that would, ideally, be input fields and dropdown values to join with. For example [10] | (mg||kg|tons) or [20] (cm|m|km|miles)).


I could save each option like this:

{
    option: "Color",
    type: "Static with combinations"
    values: [
        {
            value: "Red",
            categories: [
                "Sunglasses"
            ]
        },
        {
            value: "Green",
            categories: [
                "Sunglasses",
                "T-Shirts"
            ]
        },
        {
            value: "Black metallic",
            categories: [
                "Cars"
            ]
        }
    ],
    categories: [
        "Woman fashion", //<-- only the ID of this group
        "Cars" //<-- only the ID of this group
    ]
}

But I'm worried about how big a single option could turn to be, when there are 30 categories and each value of the option is set to appear in random categories.
Also I just don't see it clean enough, but maybe that is just me.

Anyways, as with the previous point, please feel free to suggest anything to can come up with, I'll greatly appreciate any feedback that you can give me.

回答1:

I'm running a e-commerce website, too. Here's my advice on how I implement the features you mentioned. Hope it helps.

  • Categories

I organize them in a flat structure, in your case it would be:

    {_id: 1, name: "Electronics", parentId: 0, idPath: "/0/1/" ...}
    {_id: 2, name: "Computers", parentId: 1, idPath: "/0/1/2/", ...}
    {_id: 3, name: "Graphic Cards", parentId: 2, idPath: "/0/1/2/3/", ...}

And the product now needs to be only in the leaf categories. In your case:

    {
        _id: asdasfwetrw34tw34t245y45y,
        name: "NVIDIA GTX670",
        price: 99.50,
        ...
        ...
        categoryIds: [3]
    }

The product can be in multiple categories of course, so the categoryIds remains an array. Here's the tricky part. When you list the Electronics category, you can find all its subcategories by:

    db.categories.find({idPath: /^\/0\/1/})

idPath index works here so it's going to be fast. when you find out all the sub-categories, you can easily find all the products in them (build index on the categoryIds of Product collection).

Or alternatively, you can read all the categories into memory and build a hash table with the key->categoryId, value->[all the subcategories]. Your categories usually won't change frequently and you won't have a lot of categories. Thus it's going to be fine.

  • Tags/Options

First of all I think there's something wrong with your category. Women fashion is something generic, you should put your product into something more specific, and the options should be there too. For example, there maybe a category coat which has the size & color, other than women fashion. While there may still be color option in women fashion because it's a common characteristic of all subcategories.
If you think about it, why all the subcategories are organized in one parent category? because they have something in common. That common part should be the common options of the parent category. that is to say, there should be a inheritance between all the parent category and subcategories. For example:

women fashion: color
|-coat: size
|-sun glasses: shape

Then coat would finally has 2 options color & size. sun glasses: color & shape. When you are viewing women fashion, there's only 1 option color. It filters the subcategories too because they inherit from women fashion.
As to the values of color, my idea is only use the standard colors Strawberry Red is actually red, Tangerine is actually orange. You don't really want them to appear when you filter the products. Otherwise there would be too many options, definitely not good for user experience.
However, besides color option from the category, my site also has something called customizable options. These options are only defined on the products. They never appear when you viewing category. Here you can have Strawberry Red & Tangerine. In my opinion, these are not 'natural' properties of a product. They are only used to make the user feel more comfortable when viewing the product. Thus also you can have option of this kind like Tangerine with figure etc.
One more thing about options. you may want to mark which options are supposed to be used for filtering products. For example color is definitely one. While dimension may be not.

About types of options. Yours are fine if it's enough for you. I have a lot more types like Number, String, Single Choice, Multiple Choices. I also plan to implement the Unit. Tricky part of Unit is that for example

1GB = 1024MB = 1024*1024B

So when you get a hard disk of 1GB and 1TB, you may want to do a conversion before filtering products. This is off the topic I'll get back to your question.

Note that although the options of different categories have the same name. They are not likely the same thing. Material of Coat and Furniture are 2 different things. So I tend to define different options for different categories. Thus there maybe color for toys, and color for women fashion. This does not conflict with the inheritance mentioned above because from some level, the subcategories begin to share the same options. This is completely related to how you organize your category structure. And if you want to change category structure or move products some time, it would be painful. So be careful when you define your categories.

That's all that comes up in my mind. I'm afraid I'm not native English speaker thus you may find some part of my answer hard to understand. Feel free to let me know.