Len() vs datalength() in SQL Server 2005

2019-02-02 03:58发布

问题:

Recently I faced a issue when using len() in a query to find out the length of a query, len() was not counting the trailing spaces in the value. But datalength() is counting the trailing spaces also.

Does this means that if I'm doing some operation which deals with the actual length of the value then I have to use dalalength() over len().

ex: If I need the value of a particular value to be is of 10 character length. ie If the value is 3 character length I've to append 7 spaces to it.

Regards

回答1:

Yes that's exactly what you must do. If you want to just get number of characters excluding blanks you would use LEN() function, while in all other cases DATALENGTH().

Even LEN() documentation has an information that to get number of bytes to represent the extension you should use DATALENGTH()

Here are the links to MSDN docs:

LEN()

DATALENGTH()



回答2:

Be careful. DATALENGTH returns the number of bytes used, not the number of characters.



回答3:

len counts the number of characters used not the storage required, this will be even more evident when you use nvarchar instead of varchar

len does not count trailing spaces either

take a look at this

declare @v nchar(5)
select @v ='ABC  '


select len(@v),datalength(@v)

and the output for len is 3 while the output for datalength =10



回答4:

Just use Replace():

SELECT LEN(REPLACE(N'4 Trailing Spaces:    ', ' ', '_'))

This will replace trailing spaces with a character that LEN() will actually count.

The problem with DataLength() is you have to keep track of wether your string is Unicode (nChar, nVarChar) or ASCII (Char, VarChar) to know if you also need to divide the datalength of a Unicode string by 2.



回答5:

I was about to use the Len(Replace('blah blah ',' ','_') suggestion when it struck me it may be more efficient to use. Just posting in case someone stumbles upon this thread as I did.

len('blah blah ' + '.')-1



回答6:

Unfortunately there is no perfect solution that I am aware of.

One of the proposed solutions, LEN(string + '.')-1 returns wrong results (-1) if the string is Unicode of size 4000 or non-Unicode and of size 8000. That is because the concatenation is ignored. You can overcome this if you want, by casting the string to a MAX-size string: LEN(CAST(string as nvarchar(max)) + '.')-1, but is it worth it?

As mentioned by others, DATALENGTH(string) returns the number of bytes used for storage. For Unicode strings, it may not be enough to divide the result by 2: Unicode surrogate characters can take more than 16 bits.

All in all, be mindful of the limitations of each approach and choose whatever you believe will cause you less issues.