I want to sanitise some input and replace several characters with acceptable input, e.g. a Danish 'å' with 'aa'.
This is easily done using several statements, e.g. /æ/ae/, /å/aa/, /ø/oe/, but due to tool limitations, I want to be able to do this in a single regular expression.
I can catch all of the relevant cases (/[(æ)(ø)(å)(Æ)(Ø)(Å)]/) but the replacement does not work as I want it to (though it probably works exactly as intended):
$ temp="RødgrØd med flæsk"
$ echo $temp
RødgrØd med flæsk
$ echo $temp | sed 's/[(æ)(ø)(å)(Æ)(Ø)(Å)]/(ae)(oe)(aa)(Ae)(Oe)(Aa)/g'
R(ae)(oe)(aa)(Ae)(Oe)(Aa)dgr(ae)(oe)(aa)(Ae)(Oe)(Aa)d med fl(ae)(oe)(aa)(Ae)(Oe)(Aa)sk
(first echo line is to show that it isn't an encoding issue)
Just as an aside, the tool issue is that I should like to also use the same regex in a Sublime Text 2 snippet.
Anyone able to discern what is wrong with my regex statement?
Thanks in advance.
With
you'll do the trick.
So, translate into what you need
This might work for you (GNU sed):
It works by adding a lookup table to the end of the line, looping until all keys are replaced then removes the lookup table.
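A minimal sketch of that lookup-table idea, assuming GNU sed and the six mappings from the question. The table layout (key followed by its two-letter replacement) and the function name to_ascii_lookup are illustrative choices, not necessarily the exact command meant above:

```shell
# Sketch only: relies on GNU sed behaviour (\n in the replacement,
# '.' matching a newline in the pattern space, back-references with -E).
#
# 1. Append a newline plus a "key value" lookup table to the line.
# 2. Loop: take the leftmost special character before the newline (\1),
#    find the same character in the table, capture the two letters after
#    it (\3), and put those letters in place of the character.
# 3. When nothing is left to replace, strip the newline and the table.
to_ascii_lookup() {
  sed -E \
    -e 's/$/\næae øoe åaa ÆAe ØOe ÅAa/' \
    -e ':a' \
    -e 's/(æ|ø|å|Æ|Ø|Å)(.*\n.*\1(..))/\3\2/' \
    -e 'ta' \
    -e 's/\n.*//'
}

printf '%s\n' 'RødgrØd med flæsk' | to_ascii_lookup
```

Each pass replaces exactly one character, so the loop runs once per special character on the line. Characters inside the table itself are never rewritten because the pattern requires a newline after the matched character.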
Split it up into several sed statements, separated by ';':
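As a rough sketch, assuming the six mappings from the question (the function name danish_to_ascii is only illustrative):

```shell
# One sed invocation, six substitutions chained with ';'.
# Each s command runs in turn on the same line; the g flag replaces
# every occurrence, not just the first.
danish_to_ascii() {
  sed 's/æ/ae/g;s/ø/oe/g;s/å/aa/g;s/Æ/Ae/g;s/Ø/Oe/g;s/Å/Aa/g'
}

printf '%s\n' 'RødgrØd med flæsk' | danish_to_ascii
# prints: RoedgrOed med flaesk
```

This is still a single sed script, but not a single regular expression, so it may not carry over to a tool that accepts only one pattern and one replacement.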