I am working with an Oracle database and I need to be able to partition the data in a table. I understand that Oracle has an ora_hash function that can partition the data into buckets. Is the ora_hash function deterministic?
In my program I will be making several different database queries with each query asking for a different bucket number.
For example, in one query I might ask for the first two buckets:
SELECT * FROM sales WHERE ORA_HASH(cust_id, 9) in (0,1);
In a subsequent query I might ask for the second and third buckets:
SELECT * FROM sales WHERE ORA_HASH(cust_id, 9) in (1,2);
In the above example, will ora_hash always divide the table into the exact same 10 buckets? Assume that the data in the tables hasn't changed. Will the second bucket (bucket 1) be identical in both queries?
There is documentation that suggests that the seed value enables Oracle to return different results for the same data set. So I am assuming that if I don't supply a seed value, then ora_hash will be deterministic. See the documentation.
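As a rough way to check the seed assumption empirically, the sketch below reuses the sales table and cust_id column from the queries above. It relies on the documented default seed of 0 for ORA_HASH; it is only a sanity check on one data set, not a proof of determinism.

```sql
-- Rows where the two-argument call and an explicit zero seed disagree.
-- If the default seed is 0, as documented, this should return no rows.
SELECT cust_id
FROM   sales
WHERE  ORA_HASH(cust_id, 9) <> ORA_HASH(cust_id, 9, 0);

-- Row count per bucket (buckets are numbered 0 through 9).
-- Running this twice against unchanged data and comparing the output
-- gives a quick check that the bucket assignment is stable.
SELECT   ORA_HASH(cust_id, 9) AS bucket,
         COUNT(*)             AS row_count
FROM     sales
GROUP BY ORA_HASH(cust_id, 9)
ORDER BY bucket;
```

Matching counts across runs do not strictly prove that each row stays in the same bucket, so treat this as a quick check only.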
ORA_HASH is definitely deterministic for data types that can be used for partitioning, such as NUMBER, VARCHAR, DATE, etc.

But ORA_HASH is not deterministic for at least some of the other data types, such as CLOB.

My answer is based on this Jonathan Lewis article about ORA_HASH.

Jonathan Lewis doesn't explicitly say it is deterministic, but he does mention that ORA_HASH "seems to be the function used internally – with a zero seed – to determine which partition a row belongs to in a hash partitioned table". And if it's used for hash partitioning then it must be deterministic, or else partition-wise joins wouldn't work.

To show that ORA_HASH can be non-deterministic for some data types, run the below query. It's from a comment in the same article:

Surprisingly, this same issue happens with dbms_sqlhash.gethash.

Jon Heller's answer has some more details, so go upvote his answer. Since this is still the accepted answer, I'll inline part of his response: