Which is faster/best? SELECT * or SELECT column1,

2018-12-31 03:30发布

I've heard that SELECT * is generally bad practice to use when writing SQL commands because it is more efficient to SELECT columns you specifically need.

If I need to SELECT every column in a table, should I use

SELECT * FROM TABLE

or

SELECT column1, colum2, column3, etc. FROM TABLE

Does the efficiency really matter in this case? I'd think SELECT * would be more optimal internally if you really need all of the data, but I'm saying this with no real understanding of database.

I'm curious to know what the best practice is in this case.

UPDATE: I probably should specify that the only situation where I would really want to do a SELECT * is when I'm selecting data from one table where I know all columns will always need to be retrieved, even when new columns are added.

Given the responses I've seen however, this still seems like a bad idea and SELECT * should never be used for a lot more technical reasons that I ever though about.

标签: sql database
30条回答
何处买醉
2楼-- · 2018-12-31 03:45

Given your specification that you are selecting all columns, there is little difference at this time. Realize, however, that database schemas do change. If you use SELECT * you are going to get any new columns added to the table, even though in all likelihood, your code is not prepared to use or present that new data. This means that you are exposing your system to unexpected performance and functionality changes.

You may be willing to dismiss this as a minor cost, but realize that columns that you don't need still must be:

  1. Read from database
  2. Sent across the network
  3. Marshalled into your process
  4. (for ADO-type technologies) Saved in a data-table in-memory
  5. Ignored and discarded / garbage-collected

Item #1 has many hidden costs including eliminating some potential covering index, causing data-page loads (and server cache thrashing), incurring row / page / table locks that might be otherwise avoided.

Balance this against the potential savings of specifying the columns versus an * and the only potential savings are:

  1. Programmer doesn't need to revisit the SQL to add columns
  2. The network-transport of the SQL is smaller / faster
  3. SQL Server query parse / validation time
  4. SQL Server query plan cache

For item 1, the reality is that you're going to add / change code to use any new column you might add anyway, so it is a wash.

For item 2, the difference is rarely enough to push you into a different packet-size or number of network packets. If you get to the point where SQL statement transmission time is the predominant issue, you probably need to reduce the rate of statements first.

For item 3, there is NO savings as the expansion of the * has to happen anyway, which means consulting the table(s) schema anyway. Realistically, listing the columns will incur the same cost because they have to be validated against the schema. In other words this is a complete wash.

For item 4, when you specify specific columns, your query plan cache could get larger but only if you are dealing with different sets of columns (which is not what you've specified). In this case, you do want different cache entries because you want different plans as needed.

So, this all comes down, because of the way you specified the question, to the issue resiliency in the face of eventual schema modifications. If you're burning this schema into ROM (it happens), then an * is perfectly acceptable.

However, my general guideline is that you should only select the columns you need, which means that sometimes it will look like you are asking for all of them, but DBAs and schema evolution mean that some new columns might appear that could greatly affect the query.

My advice is that you should ALWAYS SELECT specific columns. Remember that you get good at what you do over and over, so just get in the habit of doing it right.

If you are wondering why a schema might change without code changing, think in terms of audit logging, effective/expiration dates and other similar things that get added by DBAs for systemically for compliance issues. Another source of underhanded changes is denormalizations for performance elsewhere in the system or user-defined fields.

查看更多
高级女魔头
3楼-- · 2018-12-31 03:46

Select is equally efficient (in terms of velocity) if you use * or columns.

The difference is about memory, not velocity. When you select several columns SQL Server must allocate memory space to serve you the query, including all data for all the columns that you've requested, even if you're only using one of them.

What does matter in terms of performance is the excecution plan which in turn depends heavily on your WHERE clause and the number of JOIN, OUTER JOIN, etc ...

For your question just use SELECT *. If you need all the columns there's no performance difference.

查看更多
大哥的爱人
4楼-- · 2018-12-31 03:46

If you need every column then just use SELECT * but remember that the order could potentially change so when you are consuming the results access them by name and not by index.

I would ignore comments about how * needs to go get the list - chances are parsing and validating named columns is equal to the processing time if not more. Don't prematurely optimize ;-)

查看更多
永恒的永恒
5楼-- · 2018-12-31 03:47

Specifying the column list is usually the best option because your application won't be affected if someone adds/inserts a column to the table.

查看更多
千与千寻千般痛.
6楼-- · 2018-12-31 03:47

The result is too huge. It is slow to generate and send the result from the SQL engine to the client.

The client side, being a generic programming environment, is not and should not be designed to filter and process the results (e.g. the WHERE clause, ORDER clause), as the number of rows can be huge (e.g. tens of millions of rows).

查看更多
看风景的人
7楼-- · 2018-12-31 03:47

What everyone above said, plus:

If you're striving for readable maintainable code, doing something like:

SELECT foo, bar FROM widgets;

is instantly readable and shows intent. If you make that call you know what you're getting back. If widgets only has foo and bar columns, then selecting * means you still have to think about what you're getting back, confirm the order is mapped correctly, etc. However, if widgets has more columns but you're only interested in foo and bar, then your code gets messy when you query for a wildcard and then only use some of what's returned.

查看更多
登录 后发表回答