UPDATE:
ORACLE VERSION 10G
I have a list of records in Oracle
as follows, these are actually sections of various books
The records are generated in the below format
[main topic].[sub topic].[first level section] ..... .[last level section]
Sections
--------
1
7.1
6.2
7.1
7.4
6.8.3
6.8.2
10
1.1
7.6
6.1
11
8.3
8.5
1.1.2
6.4
6.6
8.4
1.1.6
6.8.1
7.7.1
7.5
7.3
I want to order this like as follows
1
1.1
1.1.2
1.1.6
6.2
6.4
6.5
6.6
6.7
6.8.1
6.8.2
6.8.3
7.2
7.3
7.4
7.5
7.6
7.7.1
7.7.2
8.3
8.4
8.5
10
But as the field is not a numeric
datatype
the sorting results in something like this
1
10
1.1
1.1.2
1.1.6
....
.....
8.5
How can I sort them. I am unable to convert them to number due to multiple number of decimal points.
Is there any function in oracle
that supports such a sorting technique
Solution without regexp and functions (suppose
t
is a table with source data):SQLFiddle example
Main idea is to calculate number, wich indicates priority of each row. Suppose, we have
33.17.21.2
value. This string may be treated as a number in hypotetical numeral system with baseQ
like a hexadecimal numbers represents IPv4 address, and then converted to a numeric representation:33*(Q^3) + 17*(Q^2) + 21*(Q^1) + 2*(Q^0)
For example, if
Q=100
then number from exmple is33*1000000 + 17*10000 + 21*100 + 2*1 = 33172102
First trouble with this approach is that each level numbers required to be less than choosed
Q
value. It's by design and can't be eleminated.Next is that we don't know how many levels at all, we have
7.1
and2.2.2.2.2.2
, and shorter one most come first. Therefore while calculating value it starts from some fixed powerN
and then degrades power ofQ
, so in case ofQ=100
andN=3
sequence of multipilers starts with this numbers:1000000, 10000, 100, 1, 1/100, 1/10000, 1/1000000, ...
In code above
Q=1000
andN=10
, but this may be changed depending on required parameters. Number of levels limited by choosedQ
value and precision of Oraclenumber
type. Theoretically it's possible to build expression for longer strings by splitting string into parts.Rest of the code is just hierarchical query for splitting string to sequence of numbers.
Update
Same approach may be used easily just for strings:
'20'
comes before'8'
because information about second digit are missing. If we pad both values to some fixed length it ordered as expected:'008' < '020'
, so it's possible to deal with strings only:With string length limitation of 4000 chars and 9 digits on each level with single separation symbol (
'-'
in example above) it's possible to handle 400 levels of hierarchy.Main disadvantage of this method is a memory consumption and comparison speed. From other side, lack of a conversion to a number makes it compatible even with mixed chapter numbering( things like
'13.3.a.vii'
or'III.A.13.2'
(Ooops ... roman numerals handled improperly)In case of decimal-number-only numbering variant with strings may be compacted by translation of numbers to hexadecimal representation. With 4 hex symbols it's possible to handle 16535 numbers on each level, and with 8 symbols - full 32-bit number which more than enough for most applications.
P.S. Of course, it's possible to use all expressions above in select list to examine calculated values instead of using it in
order by
.In case the number of level is fix (e.g. max. 4) you can use this one:
When the maximum depth is known, you can split the section in sub-sections:
If the maximum depth of sub-sections is unknown (but presumably less than a couple hundred on 8-bit character databases or less than a few thousands in ANSI-character databases), you could define a function that converts your unsortable digits into sortable characters:
Here is the solution I've ended up with for general case (when number of dots is not known) - leave the first dot as it is, and replace all others with zeroes, so you will have just a float number on which you can apply order by:
It can be rewritten more ellegantly using regular expresions, but I'm not really familiar with them, so just used basic Oracle functions :)
Simplest I think... Copy and run to see the output:
You can try this out -
Best part - Don't need to worry about the depth of levels
How it works:-
Since your Sections are of string type, the sorting is done based on ASCII codes. Also, the most significant part of your section is the first set of digist before the first DOT.
Now, its the 0 which would create all the problems - so replace the '0' with anything which has ASCII value higher than that of 9.
I've tested it with some basic combinations (including higher depths) - go ahead and test it properly before using it.