I have quite a large nvarchar which I wish to pass to the HashBytes function. I get the error:
"String or binary would be truncated. Cannot insert the value NULL into column 'colname', tbale 'table'; column does not allow nulls. UPDATE fails. The statement has been terminated."
Being ever resourceful, I discovered this was due to the HashBytes function having a maximum limit of 8000 bytes. Further searching showed me a 'solution' where my large varchar would be divided and hashed seperately and then later combined with this user defined function:
function [dbo].[udfLargeHashTable] (@algorithm nvarchar(4), @InputDataString varchar(MAX))
RETURNS varbinary(MAX)
AS
BEGIN
DECLARE
@Index int,
@InputDataLength int,
@ReturnSum varbinary(max),
@InputData varbinary(max)
SET @ReturnSum = 0
SET @Index = 1
SET @InputData = convert(binary,@InputDataString)
SET @InputDataLength = DATALENGTH(@InputData)
WHILE @Index <= @InputDataLength
BEGIN
SET @ReturnSum = @ReturnSum + HASHBYTES(@algorithm, SUBSTRING(@InputData, @Index, 8000))
SET @Index = @Index + 8000
END
RETURN @ReturnSum
END
which I call with:
set @ReportDefinitionHash=convert(int,dbo.[udfLargeHashTable]('SHA1',@ReportDefinitionForLookup))
Where @ReportDefinitionHash is int, and @ReportDefinitionForLookup is the varchar
Passing a simple char like 'test' produces a different int with my UDF than a normal call to HashBytes would produce.
Any advice on this issue?
If you can't create a function and have to use something that already exists in the DB:
can be made to work using the syntax:
Taken from: http://www.sqlnotes.info/2012/01/16/generate-md5-value-from-big-data/
This can be used as function body, too:
The idea si to
HASH
each4000
part of theNVARCHAR(MAX)
string and concatanate the results. Then toHASH
the latter result.Just use this function (taken from Hashing large data strings with a User Defined Function):
And the results are as following:
Output:
I've taken the accepted answer, and modified it a bit with the following improvements:
With these changes, the functions can now be used in persisted computed columns as they are now marked deterministic when created.
tested and working select master.sys.fn_repl_hash_binary(someVarbinaryMaxValue) moreover not complicated :)
It seems the easiest solution is to write a recursive hashing algorithm that parses the input text value into sub
varchar(8000)
segments. I arbitrarily chose to slice the input string into 7500 character segments The hashing algorithm returns avarbinary(20)
which can easily be converted into avarchar(20)