I have a PostgreSQL 8.4 database that is created with the da_DK.utf8 locale.
dbname=> show lc_collate;
lc_collate
------------
da_DK.utf8
(1 row)
When I select something from a table where I order on a character varying column I get a strange behaviour IMO. When ordering the result PostgreSQL ignores dashes that prefixes the value, e.g.:
select name from mytable order by name asc;
May return something like
name
----------------
Ad...
Ae...
Ag...
- Ak....
At....
The dash prefix seems to be ignored.
I can fix this issue by converting the column to latin1 when ordering:
select name from mytable order by convert_to(name, 'latin1') asc;
The I get the expected result as:
name
----------------
- Ak....
Ad...
Ae...
Ag...
At....
Why does the dash prefix get ignored by default? Can that behavior be changed?
This is because
da_DK.utf8
locale defines it this way. Linux locale aware utilities, for examplesort
will also work like this.Your
convert_to(name, 'latin1')
will break if it finds a character which is not in Latin 1 character set, for example€
, so it isn't a good workaround.You can use
order by convert_to(name, 'SQL_ASCII')
, which will ignore locale defined sort and simply use byte values.Ugly hack edit:
This will sort first anything which starts with ASCII non-letter. This is very ugly, because sorting further in string would behave strange, but it can be good enough for you.
A workaround that will work in my specific case is to replace dashes with exclamation points. I happen to know that I will never get exclamation points and it will be sorted before any letters or digits.
It will certainly affect performance so I may look into creating a special column for sorting but I really don't like that either...
I don't know how seems ordering rules for Dutch, but for Polish special characters like space, dashes etc are not "counted" in sorting in most dictionaries. Some good sort routines do the same and ignores such special characters. Probably in Dutch there is similar rule, and this rule is implemented by Ubuntu locale aware sort function.