is it possible to retrieve statistics about the minimal or maximal value of a numeric column in Oracle 10g? I have found the table USER_TAB_COL_STATISTICS having a LOW_VALUE and HIGH_VALUE column, but I am not sure whether those are the values I am looking for.
I need to find an efficient way to ask the DBS for those statistics. Using a regular MIN(a) and MAX(a) query would be too slow on large tables.
Thanks in advance.
Yes, LOW_VALUE and HIGH_VALUE will tell you the minimum and maximum values in the column but:
If you index the column then MIN(a) and MAX(a) should be very fast as in this example where T1 has 50000 rows and is indexed on OBJECT_ID:
The result is the same if you select the MAX instead of the MIN. However, if you select the MIN and MAX in a single select statement the result is different:
This suggests that it may be better to get them separately, though I haven't conclusively proved that.
The other answers here (using an index fast full scan; or examining the
user_tab_columns
statistics) are excellent.Here's another method that might be suitable - if you're only interested in a rough estimate, you can use the
SAMPLE
clause (and adjust the sample size up or down depending on how accurate you need it):This takes a 1% sample from the table. It will generally sample different rows each time it is run, so don't expect the results to be identical run-to-run. If you want it to run quicker, you can have lower sample sizes, e.g.
SAMPLE(0.01)
, or if you want to sample half the table,SAMPLE(50)
.The advantage of this approach over the "analyze, then-query-user-tab-cols" approach is that the analyze runs queries like this anyway in order to generate the statistics - so doing it this way may mean less work overall.
An example with a table containing numbers from 1 up to 1234:
If you analyze the table, the low_value and high_value columns contain the right numbers.
They are raw, so they cannot be read easily. Using the utl_raw.cast_to_number function makes them readable:
However, be careful: the numbers may be inaccurate when updates have taken place between the time the statistics were gathered and the time the query ran.
Regards, Rob.
In my case the column of interest has
TIMESTAMP
type and it seems there's noUTL_RAW.CAST_TO_TIMESTAMP
function.It helped to use the trick from http://www.oaktable.net/content/convert-rawhex-timestamp-0 to convert Oracle
RAW
type toTIMESTAMP
: