The character ̈ (Unicode U+0308, combining diaeresis) cannot be represented in the “Western (ISO Latin 9)” encoding.
I need to replace three of these special characters in many txt files. Ideally I would have one single regex that I can use in the find&replace function of TextWrangler (similar to BBEdit), the editor I run on my Mac.
Here are the 3 special chars:
- ä into ä
- ö into ö
- ü into ü
(Please note that the first letter of each pair consists of two characters, e.g. the a and the ̈ (Unicode U+0308), and is therefore not compatible with Western ISO Latin.)
I tried regex (groups) but was not successful. In TextWrangler I use the find&replace function (with the grep/regex option enabled):
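To make the two-character point concrete, here is a small Python sketch. It is an illustration only and not part of the original question. It assumes the troublesome letters are a base letter followed by U+0308, and it uses the standard library's NFC normalization, a swapped-in technique rather than a TextWrangler feature, to show that the composed form is a single code point that Latin 9 can encode. The helper name `can_encode_latin9` is hypothetical.

```python
import unicodedata

decomposed = "a\u0308"   # 'a' followed by COMBINING DIAERESIS (U+0308)
precomposed = "\u00e4"   # single code point: LATIN SMALL LETTER A WITH DIAERESIS


def can_encode_latin9(text: str) -> bool:
    """Return True if text can be encoded as ISO 8859-15 (Latin 9)."""
    try:
        text.encode("iso8859_15")
        return True
    except UnicodeEncodeError:
        return False


# NFC composes base letter + combining mark into one code point where possible.
composed = unicodedata.normalize("NFC", decomposed)

print([hex(ord(c)) for c in decomposed])  # two code points
print(can_encode_latin9(decomposed))      # the combining mark has no Latin 9 byte
print(composed == precomposed)
print(can_encode_latin9(composed))
```

If the files can be processed outside the editor, normalizing each file's text this way would cover all three letters at once, assuming nothing else in the files depends on the decomposed form.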
FIND: (ä|ö|ü)+
REPLACE: \1ä , \2ö , \3ü
Any idea?
Brief
I've just tested this with Notepad++, although I'm not sure whether it will work in Mac text editors such as TextWrangler.
This method is a conditional replacement using a dictionary in regex: the replacement for each character is looked up in a list appended to the end of the file. It's more of a hack, but it does work, assuming the text editor supports it. Once you're done, remove the dictionary from the bottom of the file.
Code
(ä|ö|ü)(?=[\s\S]*Dictionary:[\s\S]*\1=([^\s=:]+))
Replacement
Results
Input
Input - Modified
This input includes the dictionary at the end
Output
Explanation
- (ä|ö|ü) Capture any one of the three characters into capture group 1
- (?=[\s\S]*Dictionary:[\s\S]*\1=([^\s=:]+)) Positive lookahead ensuring what follows matches:
  - [\s\S]* Match any character any number of times
  - Dictionary: Match this text literally (this can be changed to anything, but you should make sure it is a unique string that won't be present anywhere else in your input)
  - [\s\S]* Match any character any number of times
  - \1 Match the same text as most recently matched by the first capture group
  - = Match the equal sign character literally
  - ([^\s=:]+) Capture one or more of any character not present in the set (not whitespace, = or :) into capture group 2
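The pattern explained above can also be tried outside an editor. The following is a minimal Python sketch, not the original Notepad++ setup. It assumes the input uses the decomposed letters (base letter plus U+0308), that the string "Dictionary:" does not occur anywhere else in the text, and that the replacement is simply capture group 2. The function name `replace_with_dictionary` and the exact dictionary layout are hypothetical.

```python
import re

# Decomposed letters: base letter followed by COMBINING DIAERESIS (U+0308).
A, O, U = "a\u0308", "o\u0308", "u\u0308"

# Group 1 is the decomposed letter; the lookahead finds "<letter>=<replacement>"
# after the "Dictionary:" marker and captures the replacement into group 2.
PATTERN = re.compile(
    "(" + "|".join([A, O, U]) + ")"
    + r"(?=[\s\S]*Dictionary:[\s\S]*\1=([^\s=:]+))"
)

# Dictionary appended to the text: decomposed key = precomposed value.
DICTIONARY = f"\nDictionary:\n{A}=\u00e4\n{O}=\u00f6\n{U}=\u00fc\n"


def replace_with_dictionary(text: str) -> str:
    """Append the dictionary, substitute via group 2, then strip the dictionary."""
    work = text + DICTIONARY
    replaced = PATTERN.sub(r"\2", work)
    # The keys inside the dictionary are not followed by another "Dictionary:",
    # so they stay untouched and the block can be cut off again.
    return replaced[: replaced.rindex("\nDictionary:")]


sample = "Ma\u0308dchen, scho\u0308n, u\u0308ber"
result = replace_with_dictionary(sample)
print(result)
```

Because every match rescans the rest of the text for the dictionary, this approach gets slow on large files; it is meant to show the mechanism rather than to be a tuned tool.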