varchar(max) everywhere?

2019-01-03 02:34发布

Is there any problem with making all your Sql Server 2008 string columns varchar(max)? My allowable string sizes are managed by the application. The database should just persist what I give it. Will I take a performance hit by declaring all string columns to be of type varchar(max) in Sql Server 2008, no matter what the size of the data that actually goes into them?

7条回答
做自己的国王
2楼-- · 2019-01-03 03:17

By using VARCHAR(MAX) you are basically telling SQL Server "store the values in this field how you see best", SQL Server will then choose whether to store values as a regular VARCHAR or as a LOB (Large object). In general if the values stored are less than 8,000 bytes SQL Server will treat values as a regular VARCHAR type.

If the values stored are too large then the column is allowed to spill off the page in to LOB pages, exactly as they do for other LOB types (text, ntext and image) - if this happens then additional page reads are required to read the data stored in the additional pages (i.e. there is a performance penatly), however this only happens if the values stored are too large.

In fact under SQL Server 2008 or later data can overflow onto additional pages even with the fixed length data types (e.g. VARCHAR(3,000)), however these pages are called row overflow data pages and are treated slightly differently.

Short version: from a storage perspective there is no disadvantage of using VARCHAR(MAX) over VARCHAR(N) for some N.

(Note that this also applies to the other variable-length field types NVARCHAR and VARBINARY)

FYI - You can't create indexes on VARCHAR(MAX) columns

查看更多
beautiful°
3楼-- · 2019-01-03 03:17

I asked the similar question earlier. got some interesting replies. check it out here There was one site that had a guy talking about the detriment of using wide columns, however if your data is limited in the application, my testing disproved it. The fact you can't create indexes on the columns means I wouldn't use them all the time (personally i wouldn't use them that much at all, but i'm a bit of a purist in that regard). However if you know there isn't much stored in them, i don't think they are that bad. If you do any sorting on columns a recordset with a varchar(max) in it (or any wide column being char or varchar), then you could suffer performance penalties. these could be resolved (if required) by indexes, but you can't put indexes on varchar(max). If you want to future proof your columns, why not just put them to something reasonable. eg a name column be 255 characters instead of max... that kinda thing.

查看更多
Animai°情兽
4楼-- · 2019-01-03 03:26

For this question specifically a few points I don't see mentioned.

  1. On 2005/2008/2008 R2 if a LOB column is included in an index this will block online index rebuilds.
  2. On 2012 the online index rebuild restriction is lifted but LOB columns cannot participate in the new functionality Adding NOT NULL Columns as an Online Operation.
  3. Locks can be taken out longer on rows containing columns of this data type. (more)

A couple of other reasons are covered in my answer as to why not varchar(8000) everywhere.

  1. Your queries may end up requesting huge memory grants not justified by the size of data.
  2. On table with triggers it can prevent an optimisation where versioning tags are not added.
查看更多
贪生不怕死
5楼-- · 2019-01-03 03:30

Ideally, you should only allow what you need. Meaning if you're certain a particular column (say a username column) is never going to be more than 20 characters long, using a VARCHAR(20) vs. a VARCHAR(MAX) lets the database optimize queries and data structures.

From MSDN: http://msdn.microsoft.com/en-us/library/ms176089.aspx

Variable-length, non-Unicode character data. n can be a value from 1 through 8,000. max indicates that the maximum storage size is 2^31-1 bytes.

Are you really going ever going to come close to 2^31-1 bytes for these columns?

查看更多
beautiful°
6楼-- · 2019-01-03 03:31

Indexes can not be over 900 bytes wide for one. So you can probably never create an index. If your data is less then 900 bytes, use varchar(900).

This is one downside: because it gives

  • really bad searching performance
  • no unique constraints
查看更多
forever°为你锁心
7楼-- · 2019-01-03 03:35

There is another reason to avoid using varchar(max) on all columns. For the same reason we use check constraints (to avoid filling tables with junk caused by errant software or user entries), we would want to guard against any faulty process that adds much more data than intended. For example, if someone or something tried to add 3,000 bytes into a City field, we would know for certain that something is amiss and would want to stop the process dead in its tracks to debug it at the earliest possible point. We would also know that a 3000-byte city name could not possibly be valid and would mess up reports and such if we tried to use it.

查看更多
登录 后发表回答