The weird length of varchar and nvarchar in T-SQL

2019-08-04 09:46发布

问题:

I had a question about the Transcat SQL, below is the sql code;

DECLARE @main nVARCHAR(max);

--there are over 4000 characters to @main 
set @main = '01234567890123456789...';
Print len(@main)

the length of @main is correct, e.g. 4007 however, if I change the code to below:

DECLARE @main nVARCHAR(max);

--there are over 4000 characters to @main 
set @main = N'01234567890123456789...';
Print len(@main)

the length of @main will always be 4000, it is weird, I don't understand. Another thing is if I change the code to below:

DECLARE @main nVARCHAR(max), @split nVARCHAR(500);

--there are 500 characters to @split 
set @split = '01234567890123456789...';
set @main = @split + @split + @split + @split + @split + @split + @split + @split + @split + @split;
Print len(@main)

The length of @main is 4000, why? No matter I add letter N before '01234567890123456789...'? if I change the @split to varchar(500) DECLARE @main nVARCHAR(max), @split VARCHAR(500);

--there are 500 characters to @split 
set @split = '01234567890123456789...';
-- 10 @split
set @main = @split + @split + @split + @split + @split + @split + @split + @split + @split + @split;
Print len(@main)

The length of @main is correct, no matter I add letter N before ''01234567890123456789...' or not, the length of @main is always 5000

回答1:

You're running into issues with implicit types of string expressions. With the exception of MAX lengths, a VARCHAR type can hold 8000 chars (limited by the size of a page in SQL SERVER) and, because NVARCHAR stores double-byte chars, it can hold half the number of chars (4000). By default string expressions (literals, concatenated strings etc) are evaluated using either a max length of 8000 for varchar or a max len of 4000 for nvarchar. You have to explicitly cast the expression to a MAX type to get the behavior you expect.

In your first example, your string expression '01234567890123456789...' is implicitly evaluated as a VARCHAR(8000) which is then implicitly converted to NVARCHAR(MAX) so it correctly retains the number of characters since your expression only had 4007 chars.

When you convert your string expression to unicode (i.e. N'01234567890123456789...' it evaluates the string as NVARCHAR(4000) before assigning the value to your NVARCHAR(MAX) variable.

Change your second example to this:

DECLARE @main nVARCHAR(max);

--there are over 4000 characters to @main 
set @main = CAST(N'01234567890123456789...' AS NVARCHAR(MAX));
Print len(@main)

And it will behave as you expect.

Likewise when you're concatenating the @split variables, SQL Server is evaluating the expression as NVARCHAR(4000) you have to cast the leftmost value as NVARCHAR(MAX) in order to have the expression evaluated as NVARCHAR(MAX)

DECLARE @main nVARCHAR(max), @split nVARCHAR(500);

--there are 500 characters to @split 
set @split = '01234567890123456789...';
set @main = CAST(@split as NVARCHAR(MAX)) + @split + @split + @split + @split + @split + @split + @split + @split + @split;
Print len(@main)

It will work.