可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I'm pretty new to databases, so forgive me if this is a silly question.
In modern databases, if I use an index to access a row, I believe this will be O(1) complexity. But if I do a query to select another column, will it be O(1) or O(n)? Does the database have to iterate through all the rows, or does it build a sorted list for each column?
回答1:
Actually, I think access based on an index will be O(log(n)), because you'll still be searching down through a B-Tree-esque organization to get to your record.
回答2:
To answer your literal question, yes if there is no index on a column, the database engine will have to look at all rows.
In the more interesting case of selecting by multiple columns, both with and without index, the situation becomes more complex: If the Query Optimizer chooses to use the index, then it'll first select rows based on the index and then apply a filter with the remaining constraints. Thus reducing the second filtering operation from O(number of rows) to O(number of selected rows by index). The ratio between these two number is called selectivity and an important statistic when choosing which index to use.
回答3:
Indexes are per column, so if you use a where clause on a un-indexed column it will do a so called tablescan which is O(n).
回答4:
I don't know the answer, but keep in mind that big-O notation only gives you an indication of performance for data-set sizes which are arbitrarily large.
For example, the bottleneck for database performance is typically disk seeks. Therefore, performance is greatly increased if the working data-set can be kept in memory. Big-O notation won't tell you anything about such optimizations, because they are only relevant for finite data-sets.
回答5:
B-trees do not yield O(logN), that is the complexity of a binary tree.
A B-Tree is organised such that it has an entire block per node, thus once a node is found a single I/O operation can read an entire block.
With the number of items per node = blocking factor (#records/block){bfr}, a B-Tree optimized search will yield O(log bfr÷2 +1 N) I/O operations instead of O(N) I/O operations seeking a record by key.
回答6:
You have indexes. Clustered indexes are physically sorted on the disk, you can have only one per table. Unclustered indexes are logically sorted and you can have many of those (careful not to abuse it either, it might slow down write actions).
If there is no index on your column then I believe it's the good old row by row method.
回答7:
There are different types of indexes, different execution plans and different implementations for different databases. Most of the code of relations database is in search-optimising algorithms. There is not a single answer to your question. You can use a tool to visualise the execution plan when you want to know how a query is going to be executed.