Get hash values from SQL Server and Oracle and com

2019-04-30 00:35发布

问题:

Is it possible to generate hash code from both database server and compare them? How to write the following pseudo SQL in SQL Server? Especially the two getHash functions which accept mutliple numeric/float columns in SQL server and oracle.

select s.PK
from sqltable s
    join openquery(oracleLinkedServer, 
      'select PK, getHash(Column1, floatColumn2, ..., floatColumnN) oracleHash 
       from oracleTable') o on o.PK = s.PK
where
    getHash(Column1, floatColumn2, ..., floatColumnN) <> oracleHash

回答1:

In SQL Server:

select upper(substring(sys.fn_sqlvarbasetostr(hashbytes('MD5','A')),3,32));

result:

7FC56270E7A70FA81A5935B72EACBE29

In Oracle :

select rawtohex(
    DBMS_CRYPTO.Hash (
        UTL_I18N.STRING_TO_RAW ('A', 'AL32UTF8'),
        2)
    ) from dual;

result:

7FC56270E7A70FA81A5935B72EACBE29

Make sure your strings are exactly the same (case sensitive). Here I used 'A' as a simple example, but it could be any string really.

If you avoid data type differences by converting to a big string, you should be able to produce the same md5 hash on different platforms. Note that SQL Server prepended a '0x' to the hash to denote hex representation, which I stripped with the substring.



回答2:

In SQL Server you have hashbytes(); in Oracle you have DBMS_CRYPTO.Hash(). You should be able to use them to calc an MD5 hash on both sides, though I am not positive the hashes will match... its worth a shot.

There are other ways to compare tables but to answer your question these are the two native functions on either platform.



回答3:

You can use CHECKSUM() in SQL Server to compute a multi-column hash.



回答4:

If you have limited access to the Oracle system, and haven't been granted execute permissions on DBMS_CRYPTO.Hash(), you can still use DBMS_OBFUSCATION_TOOLKIT.MD5() to generate the same hashed value on both Oracle and SQL Server. You just have to make sure to convert the string to the same code page as the machine running SQL Server. I believe the default on most North American Windows boxes is windows-1252 ANSI Latin 1; Western European (Windows) (from https://docs.microsoft.com/en-us/windows/desktop/intl/code-page-identifiers)

I don't particularly like having function calls in my WHERE clauses, so I put the SQL side hash into a CTE, and then join to the OPENQUERY results.

;WITH sqlHash AS
(
    SELECT s.PK
        ,UPPER(CONVERT(CHAR(32), HASHBYTES('MD5', s.Column1 + '|' + s.Column2 + '|' + s.ColumnN), 2)) AS sqlHash
    FROM sqltable s
)
SELECT s.PK
FROM sqlHash s
JOIN OPENQUERY
     ( oracleLinkedServer, '
     SELECT 
        PK
        ,CONVERT(UPPER(RAWTOHEX(
            SYS.DBMS_OBFUSCATION_TOOLKIT.MD5(input_string => "oraColumn1" || ''|'' || "oraColumn2" || ''|'' || "oraColumnN")
            )),''AL32UTF8'',''WE8MSWIN1252'') AS oracleHash
       FROM oracleTable' ) o ON o.PK = s.PK
WHERE s.sqlHash <> o.oracleHash

A couple of things to point out here that might not be immediately apparent.

First, I'm adding a pipe character, |, between each column in the concatenated text. This is differentiate combinations like "digital" + "aaron" and "digi" + "talaaron" from producing the same hash, as the string would become "digital|aaron" and "digi|talaaron" which should produce different hashed values.

Second, when we get the SQL side has in the CTE, the value comes back as a VARBINARY. If we just displayed this value, there would be a leading 0x. But we want to strip this off, and also convert the value to something a bit more useful than a VARBINARY. So we convert to CHAR(32) with a style of 2.