Why is multi-value field a bad idea in relational

2020-06-23 06:15发布

问题:

Having been working with Mongodb and Solr/Lucene, I am starting to wonder why multi-value field for relational databases are (generally) considered an bad idea?

I am aware of the theoretical foundation of relational database and normalization. In practice, however, I ran into many use cases where I end up using an meta table of key-value pairs to supplement the main table, such as in the cases of tagging, where I wish I don't have to make multiple joins to look up the data. Or where requirements suddenly changed from having to support an single author to multiple authors per article.

So, what are some disadvantages of having multi-value fields or did the vendor choose not to support it since it not part of the SQL standard?

回答1:

The main disadvantage is query bias. The phenomenon that such databases tend to get designed with one particular kind of query in mind, and turn out to be difficult to handle when other queries need to be written.

Suppose you have Students and Courses, and you model all of that so that you can say, in a single row in a single table, "John Doe takes {French, Algebra, Relational Theory}" and "Jane Doe takes {German, Functional Computing, Relational Theory}".

That makes it easy to query "what are all the courses followed by ...", but try and imagine what it would take to produce the answer to "what are all the students who follow Relational Theory".

Try and imagine all the things the system should itself be doing to give such a query (if it were possible to write it) any chance of performing reasonably ...



回答2:

The query bias is assuming that SQL is a always a good query language. The fact is it is sometimes an excellent query language, but it has never been one size fits all. Multivalue databases allow you to pack multiple values and handle 'alternate perspective' queries. Examples of MVDBs: UniData http://u2.rocketsoftware.com/products/u2-unidata, OpenInsight http://www.revelation.com/, Reality http://www.northgate-is.com/. There are many others. Their query languages support what you are looking to do.



回答3:

I think this has its roots in the fact that there is no simple, standard way to map a collection to a column in the Relational world. A mutifield value is basically a simple collection (an array of strings in most use cases), which is difficult to represent as a column. Some RDBMS support this by using a delimiter but then again, it starts to feel like an anti-pattern even if the DB driver lets you use multi-value fields in a relational database. Databases like MongoDB rely on a JSON-like structure to define the data, where collections are easily mapped and retrieved.