“Clustered index” and “Order by Clause”

2020-06-03 06:39发布

问题:

Is there any difference between Clustered Index and Order by Clause?

I have to populate the Dropdown from the Master Table and following is the query.

Select Id, Name from Table Order by Name

Should I use the Order by Clause Or Clustered Index for the above task?

EDIT

Below is the schema of the table

IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[lookup]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[lookup](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [varchar](50) NULL,
 CONSTRAINT [PK_lookup_ID] PRIMARY KEY NONCLUSTERED
(
    [Id] ASC
)WITH (PAD_INDEX  = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
END
GO

IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[lookup]') AND name = N'IX_lookup_Name')
CREATE CLUSTERED INDEX [IX_lookup_Name] ON [dbo].[lookup]
(
    [Name] ASC
)WITH (PAD_INDEX  = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]

I have a clustered index on Name as well. But right now it is not showing in the schema. Sorry, I don't know why.

回答1:

Apples and Oranges. A clustered index is a storage option. An ORDER BY is a query option. If you need ordered results, the only way to get them is to add an ORDER BY clause to your query. Period.

An index may help the query optimizer generate a more efficient plan and leverage the index as a means to satisfy the ORDER BY requirement. But by no means will the presence of an index, clustered or non-clustered, guarantee any ordering of a result.

So you absolutely need ORDER BY in your queries. You also may consider an index by Name column to help this the query. Whether an index will be used or not, it depends on many more factors. you should read Designing Indexes and The Tipping Point.



回答2:

A index allows for quick searching filtering "WHERE CLAUSE" but also has the added bonus in that data will be sorted.

example

This is how data would be saved in the table.

ID    Name
1     Jack
2     Bob
3     Jill

If you add a clustered index on Name(ASC) this is how it will be saved (primary keys are always stored along with each indexed for look up infomation)

2     Bob
1     Jack
3     Jill

So using your SQL

Select Id, Name from Table Order by Name

For a select without the clustered index, the database will check to see if an index exists that can help do its work quicker. It will not find any so it will select the data from the table, sort it, and then return.

For select with the clustered index, the database will check to see if an index exists that can help do its work quicker. It will find the index on name that is sorted ASC. It can just select the ID and Name from the index, and then return as it knows that the data is sorted already.

So without the index on name the database has to sort the data every time the query is run. With the index, the sort happens when data is Inserted or Updated (which slows updates a little bit) The difference is that it only needs to sort once and not every time.



回答3:

If Id is your primary key(that is common scenario) and is used in joins, you should create clustered index on Id. But for searching performance gain, you should create nonclustered index on Name which will include Id.



回答4:

A clustered index and an order by clause are two completely different things. A clustered index decides how the rows in the stored table are sorted. An order by clause decides how the results of the query should be ordered.

A nonclustered index creates another "shadow table" in the DB storage, which is ordered on the indexed columns. It also contains the primary key, so that it can quickly find the right row in the "real" table. A best practice for database design is to create a clustered index on the primary key (unless there are reasons against it). Any other columns that need to be indexed can be handled in nonclustered indexes.

To optimize performance it is much more important that the where clause conditions can use an index than that the order by can.



回答5:

My first question is: What is the business use case? If the answer is "display the rows in name order", then ORDER BY Name.

Since you mentioned that you already have a non-clustered index on Name anyway, you should be good to go.

I also imagine that you will be filtering data on 'Name' anyway, so you'll already be making use of the index.

My second thought: Are you prematurely optimizing this? Is the table going to have thousands or millions of rows? If not, you probably won't notice whether or not an index exists anyway. And if you do have thousands of rows, how well will the dropdown box behave without filtering?

We can do a lot of guessing, so it is always best if you profile the queries in your environment.

In general, you put a CLUSTERED INDEX on incrementing values (IDENTITY, creation dates, etc) or if the data is relatively static. Not every table requires a clustered index.