Is ora_hash deterministic?

Posted 2019-01-24 05:08

Question:

I am working with an Oracle database and I need to be able to partition the data in a table. I understand that Oracle has an ORA_HASH function that can partition the data into buckets. Is the ORA_HASH function deterministic?

In my program I will be making several different database queries with each query asking for a different bucket number.

For example, in one query I might ask for the first two buckets:

SELECT * FROM sales WHERE ORA_HASH(cust_id, 9) in (0,1);

In a subsequent query I might ask for the 2nd and 3rd buckets:

SELECT * FROM sales WHERE ORA_HASH(cust_id, 9) in (1,2);

In the above example, will ORA_HASH always divide the table into the exact same 10 buckets? Assume that the data in the table hasn't changed. Will the second bucket (bucket 1) be identical in both queries?

The documentation says that the seed value enables Oracle to return different results for the same data set. So I am assuming that if I don't specify a seed value, ORA_HASH will be deterministic. See the documentation.

Answer 1:

Jon Heller's answer has some more details, so go upvote his answer. Since this is still the accepted answer, I'll inline part of his response:

ORA_HASH is definitely deterministic for data types that can be used for partitioning, such as NUMBER, VARCHAR, DATE, etc.

But ORA_HASH is not deterministic for at least some of the other data types, such as CLOB.



Answer 2:

ORA_HASH is definitely deterministic for data types that can be used for partitioning, such as NUMBER, VARCHAR, DATE, etc.

But ORA_HASH is not deterministic for at least some of the other data types, such as CLOB.


My answer is based on this Jonathan Lewis article about ORA_HASH.

Jonathan Lewis doesn't explicitly say that ORA_HASH is deterministic, but he does mention that ORA_HASH "seems to be the function used internally – with a zero seed – to determine which partition a row belongs to in a hash partitioned table". And if it's used for hash partitioning, then it must be deterministic for those data types; otherwise the same key could map to different partitions and partition-wise joins wouldn't work.
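
One way to sanity-check this against the question's own table is to pin the seed explicitly and compare per-bucket row counts across runs. This is only a minimal sketch: it assumes the sales table and cust_id column from the question, and that cust_id is one of the partitionable types such as NUMBER or VARCHAR2. The third argument, 0, is the documented default seed, written out here only to make it explicit.

```sql
-- Per-bucket row counts for the 10 buckets (0..9), with the seed pinned to 0.
-- Assumes the sales table and cust_id column from the question.
SELECT ORA_HASH(cust_id, 9, 0) AS bucket,
       COUNT(*)                AS row_count
FROM   sales
GROUP  BY ORA_HASH(cust_id, 9, 0)
ORDER  BY bucket;
```

Rerunning this on unchanged data should return the same counts every time. Matching counts are consistent with determinism but do not prove it.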

To show that ORA_HASH can be non-deterministic for some data types, run the query below. It's from a comment on the same article:

with src as (select to_clob('42') val from dual connect by level<=5)
select val,ora_hash(val,7) from src order by 2;

Surprisingly, the same issue occurs with dbms_sqlhash.gethash.
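
For contrast with the CLOB query above, the same five rows can be hashed as VARCHAR2 instead. This is a minimal sketch that only swaps to_clob for a cast to VARCHAR2; the bucket column alias is added here for readability. If ORA_HASH is deterministic for VARCHAR2, as stated at the top of this answer, all five rows should show the same bucket value.

```sql
-- Same shape as the CLOB test, but val is a VARCHAR2 rather than a CLOB.
with src as (select cast('42' as varchar2(2)) val from dual connect by level <= 5)
select val, ora_hash(val, 7) as bucket from src order by 2;
```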