I came across this post on Stack Overflow. The first answer says something like: "A clustered index has all the data for the table, while a nonclustered index only has the column plus the location of the clustered index, or the row if it is on a heap (a table without a clustered index)." How can a nonclustered index have the location of the clustered index? It only contains the column values sorted as nodes in a B-tree, with each node pointing to the row where the column has that node value, right?
Assuming you're talking about SQL Server, and also assuming that you have a clustered index on your table (as you should), the nonclustered index has the columns that you define in your CREATE INDEX statement, plus the column(s) that make up your clustered index (if present). That clustering key value is the "pointer" to where the actual data is located.
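As a rough mental model, here is a toy Python sketch of that layout. It is not how SQL Server actually organizes pages; the table contents, the column names, and the helper `build_nonclustered_index` are all hypothetical and only meant to show that each nonclustered entry carries the indexed value plus the clustering key.

```python
# Toy model: the clustered index IS the table, addressed by the clustering key.
# Assumption: the clustering key is a hypothetical customer_id column.
clustered_index = {
    1: {"customer_id": 1, "name": "Dana", "city": "Oslo"},
    2: {"customer_id": 2, "name": "Alex", "city": "Rome"},
    3: {"customer_id": 3, "name": "Chris", "city": "Bonn"},
}


def build_nonclustered_index(table, column):
    """Return sorted (column value, clustering key) pairs.

    Each entry holds only the indexed column plus the clustering key,
    which acts as the "pointer" back to the full row.
    """
    return sorted((row[column], key) for key, row in table.items())


name_index = build_nonclustered_index(clustered_index, "name")
print(name_index)  # [('Alex', 2), ('Chris', 3), ('Dana', 1)]
```

Note that the entries are sorted by name, not by customer_id, and that none of them contains the city: to get it, you have to follow the clustering key back into `clustered_index`.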
If the query executor seeks through your nonclustered index for a value and finds a match, then:
- either that value is all you care about, and you just get back that value
- or the nonclustered index also has some included columns (in the leaf-level page) and with those the query can be satisfied (all columns requested are present), so you get back the values you asked for
- or the values you want are not all in the nonclustered index leaf-level page (that's especially true if you do SELECT * all the time); then the query executor has to take the clustering key value from the nonclustered index and go back to the clustered index, do what's called a key lookup, seek through the clustered index, and find the associated data page where the complete row is stored. Now the query executor can return the values you've asked for.
For a pretty good explanation, see this blog post here.
Or see this blog post in a whole series on SQL Server indexes, which explains the "bookmarks" stored in a nonclustered index leaf-level page, too.
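The three outcomes of a seek can be sketched with a small Python toy model. This is only an illustration under assumptions: the data, the `city` included column, and the `query` function are hypothetical, and a sorted list with `bisect` stands in for the B-tree.

```python
from bisect import bisect_left

# Clustered index: full rows, addressed by the clustering key (customer_id).
clustered_index = {
    1: {"customer_id": 1, "name": "Dana", "city": "Oslo", "age": 41},
    2: {"customer_id": 2, "name": "Alex", "city": "Rome", "age": 29},
    3: {"customer_id": 3, "name": "Chris", "city": "Bonn", "age": 35},
}

# Nonclustered index on name, with city as an included column.
# Leaf entries: (name, clustering key, included columns).
name_index = sorted(
    ((row["name"], key, {"city": row["city"]})
     for key, row in clustered_index.items()),
    key=lambda entry: entry[:2],
)


def query(name, wanted_columns):
    """Seek the nonclustered index; do a key lookup only if it is needed."""
    names = [entry[0] for entry in name_index]
    pos = bisect_left(names, name)  # stands in for the index seek
    if pos == len(names) or names[pos] != name:
        return None, "no match"
    _, clustering_key, included = name_index[pos]
    leaf = {"name": name, "customer_id": clustering_key, **included}
    if all(col in leaf for col in wanted_columns):
        # Covered: everything requested is already in the leaf entry.
        return {col: leaf[col] for col in wanted_columns}, "index only"
    # Not covered: follow the clustering key back to the full row.
    row = clustered_index[clustering_key]
    return {col: row[col] for col in wanted_columns}, "key lookup"


print(query("Alex", ["name"]))          # satisfied by the key column alone
print(query("Alex", ["name", "city"]))  # satisfied via the included column
print(query("Alex", ["name", "age"]))   # needs a key lookup
```

Only the last call touches `clustered_index`; the first two are answered from the nonclustered leaf entry.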
It's pretty easy to imagine it like this:
You have a table of customers, for example customer(id, name, age, address). On this table you have a clustered index on age. This means your data is sorted by age on disk. This is very beneficial when you want to do range queries on age, for example fetching all customers whose age lies between two values.
Then the data can be fetched from disk with only a few sequential reads. If the index were unclustered, you would have to make one disk access (including the seek for the data) for every matching customer tuple.
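To make the difference concrete, here is a small Python sketch that counts how many pages a range query on age touches in each layout. It is a toy model under assumptions: the page size, the generated data, and the `pages_touched` helper are hypothetical, and it counts distinct pages rather than individual disk seeks.

```python
PAGE_SIZE = 4  # rows per page (hypothetical, tiny for illustration)

# 16 customers as (id, age); ages 20..35 are scattered relative to id.
customers = [(i, (i * 7) % 16 + 20) for i in range(16)]


def pages_touched(rows, lo, hi):
    """Count distinct pages holding rows with lo <= age <= hi."""
    return len({
        pos // PAGE_SIZE
        for pos, (_, age) in enumerate(rows)
        if lo <= age <= hi
    })


stored_by_id = customers                                   # not clustered on age
clustered_on_age = sorted(customers, key=lambda c: c[1])   # clustered on age

print(pages_touched(clustered_on_age, 24, 27))  # 1
print(pages_touched(stored_by_id, 24, 27))      # 3
```

With the data clustered on age, the four matching rows sit next to each other on one page; stored in id order, the same four rows are spread over three pages.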
Maybe for your application you also need to access the customers by id. Without an additional index on id, you would have to scan the entire file to find a particular id, because the data is sorted by age and you have no index on id. To avoid that, you create a second index on id. Now you can search for an id in this index, and the leaf of the index that contains the customer you are looking for points to the place in your (age-clustered) data on disk where you find the tuple. This way you don't have to read the whole table and need far fewer disk accesses (in general 2 for the index lookup + 1 for fetching the tuple).
EDIT: I didn't see that you were talking about the same column. One thing I could imagine is that you create a clustered index on one column for the reason described above, and another combined index on this column and another one, for example. This can be useful for an index-only lookup, where you have all the required attributes in the index and don't need to do a page fetch at all. Another reason would be to have a clustered B+-tree index for range queries and a hash index for equality queries. But I think the benefit here would be negligible.
Hope this helped!