Is it possible to generate hash code from both database server and compare them? How to write the following pseudo SQL in SQL Server? Especially the two getHash
functions which accept mutliple numeric/float columns in SQL server and oracle.
select s.PK
from sqltable s
join openquery(oracleLinkedServer,
'select PK, getHash(Column1, floatColumn2, ..., floatColumnN) oracleHash
from oracleTable') o on o.PK = s.PK
where
getHash(Column1, floatColumn2, ..., floatColumnN) <> oracleHash
In SQL Server:
select upper(substring(sys.fn_sqlvarbasetostr(hashbytes('MD5','A')),3,32));
result:
7FC56270E7A70FA81A5935B72EACBE29
In Oracle :
select rawtohex(
DBMS_CRYPTO.Hash (
UTL_I18N.STRING_TO_RAW ('A', 'AL32UTF8'),
2)
) from dual;
result:
7FC56270E7A70FA81A5935B72EACBE29
Make sure your strings are exactly the same (case sensitive). Here I used 'A' as a simple example, but it could be any string really.
If you avoid data type differences by converting to a big string, you should be able to produce the same md5 hash on different platforms. Note that SQL Server prepended a '0x' to the hash to denote hex representation, which I stripped with the substring.
In SQL Server you have hashbytes(); in Oracle you have DBMS_CRYPTO.Hash(). You should be able to use them to calc an MD5 hash on both sides, though I am not positive the hashes will match... its worth a shot.
There are other ways to compare tables but to answer your question these are the two native functions on either platform.
You can use CHECKSUM()
in SQL Server to compute a multi-column hash.
If you have limited access to the Oracle system, and haven't been granted execute permissions on DBMS_CRYPTO.Hash()
, you can still use DBMS_OBFUSCATION_TOOLKIT.MD5()
to generate the same hashed value on both Oracle and SQL Server. You just have to make sure to convert the string to the same code page as the machine running SQL Server. I believe the default on most North American Windows boxes is windows-1252 ANSI Latin 1; Western European (Windows)
(from https://docs.microsoft.com/en-us/windows/desktop/intl/code-page-identifiers)
I don't particularly like having function calls in my WHERE
clauses, so I put the SQL side hash into a CTE, and then join to the OPENQUERY
results.
;WITH sqlHash AS
(
SELECT s.PK
,UPPER(CONVERT(CHAR(32), HASHBYTES('MD5', s.Column1 + '|' + s.Column2 + '|' + s.ColumnN), 2)) AS sqlHash
FROM sqltable s
)
SELECT s.PK
FROM sqlHash s
JOIN OPENQUERY
( oracleLinkedServer, '
SELECT
PK
,CONVERT(UPPER(RAWTOHEX(
SYS.DBMS_OBFUSCATION_TOOLKIT.MD5(input_string => "oraColumn1" || ''|'' || "oraColumn2" || ''|'' || "oraColumnN")
)),''AL32UTF8'',''WE8MSWIN1252'') AS oracleHash
FROM oracleTable' ) o ON o.PK = s.PK
WHERE s.sqlHash <> o.oracleHash
A couple of things to point out here that might not be immediately apparent.
First, I'm adding a pipe character, |
, between each column in the concatenated text. This is differentiate combinations like "digital" + "aaron"
and "digi" + "talaaron"
from producing the same hash, as the string would become "digital|aaron"
and "digi|talaaron"
which should produce different hashed values.
Second, when we get the SQL side has in the CTE, the value comes back as a VARBINARY
. If we just displayed this value, there would be a leading 0x
. But we want to strip this off, and also convert the value to something a bit more useful than a VARBINARY
. So we convert to CHAR(32)
with a style of 2
.