SQL Server Trace file create unique identifier for

2019-08-01 15:33发布

问题:

To analyse my imported trace file it would like to have a unique value for Select Distinct TextData from myImportedTraceFile

I tried using hashbyte although i am not sure if MD5 is the right tool to create a unique identifier. Even if this were the case (please tell me if it is so) then i still have the problem that

  • Using HASHBYTES('MD5', CAST(TextData AS varchar(7999))) As TextData_HashBytes cuts a few rows of (see this reply)

What can i do to create a unique identifier for every unique value (Select Distinct TextData from ..) in the column TextData?

Update

Based on the post from Dan i created this testcase

Drop Table #Temp

Create Table #Temp
(    
   A int, 
   B NText
)

Insert Into #Temp ( A, B) 
    Select 1, 'some space' UNION ALL      
    Select 2, ' some space' UNION ALL              
    Select 3, ' some space '  UNION ALL                 
    Select 4, 'some space   '  UNION ALL                 
    Select 5, '  some space '  UNION ALL                 
    Select 6, '  some space  '

 -- this returns 6 rows
 Select 
      HASHBYTES('MD5', CAST(B AS nvarchar(MAX)))
      , CAST(B AS nvarchar(MAX)) as B from #Temp;

 -- this returns 3 rows
SELECT NEWID() AS UniqueID, B FROM 
( Select DISTINCT CAST(B AS nvarchar(MAX)) AS B 
    FROM #Temp 
) sq 

These three rows are the result

'  some space ' -- 2sp B + 1sp E --> row 5
' some space'   -- 1sp B + 0sp E --> row 2
'some space   ' -- 0sp B + 3sp E --> row 4

It is unclear how row 1 (0sp), 3 (1sp B+E) and 6 (2sp B+E) are handled. So some whitespace is removed other not.

回答1:

You could use a derived table with SELECT DISTINCT:

SELECT NEWID() AS UniqueID, TextData
FROM (
    SELECT DISTINCT CAST(TextData AS nvarchar(MAX)) AS TextData
    FROM myImportedTraceFile
    ) AS UniqueQueries;